Linux Kernel 4.4.x (Ubuntu 16.04) – ‘double-fdput()’ bpf(BPF_PROG_LOAD) Privilege Escalation

Exploit Database

Author: Google Security Research

Date: 2016-05-04
Category:
- local
Platform:
- linux
Source: https://www.exploit-db.com/exploits/39772/

Source: https://bugs.chromium.org/p/project-zero/issues/detail?id=808

In Linux >=4.4, when the CONFIG_BPF_SYSCALL config option is set and the
kernel.unprivileged_bpf_disabled sysctl is not explicitly set to 1 at runtime,
unprivileged code can use the bpf() syscall to load eBPF socket filter programs.
These conditions are fulfilled in Ubuntu 16.04.

When an eBPF program is loaded using bpf(BPF_PROG_LOAD, ...), the first
function that touches the supplied eBPF instructions is
replace_map_fd_with_map_ptr(), which looks for instructions that reference eBPF
map file descriptors and looks up pointers for the corresponding map files.
This is done as follows:

	/* look for pseudo eBPF instructions that access map FDs and
	 * replace them with actual map pointers
	 */
	static int replace_map_fd_with_map_ptr(struct verifier_env *env)
	{
		struct bpf_insn *insn = env->prog->insnsi;
		int insn_cnt = env->prog->len;
		int i, j;

		for (i = 0; i < insn_cnt; i++, insn++) {
			[checks for bad instructions]

			if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) {
				struct bpf_map *map;
				struct fd f;

				[checks for bad instructions]

				f = fdget(insn->imm);
				map = __bpf_map_get(f);
				if (IS_ERR(map)) {
					verbose("fd %d is not pointing to valid bpf_map\n",
						insn->imm);
					fdput(f);
					return PTR_ERR(map);
				}

				[...]
			}
		}
		[...]
	}

__bpf_map_get contains the following code:

	/* if error is returned, fd is released.
	 * On success caller should complete fd access with matching fdput()
	 */
	struct bpf_map *__bpf_map_get(struct fd f)
	{
		if (!f.file)
			return ERR_PTR(-EBADF);
		if (f.file->f_op != &bpf_map_fops) {
			fdput(f);
			return ERR_PTR(-EINVAL);
		}

		return f.file->private_data;
	}

The problem is that when the caller supplies a file descriptor number referring
to a struct file that is not an eBPF map, both __bpf_map_get() and
replace_map_fd_with_map_ptr() will call fdput() on the struct fd. If
__fget_light() detected that the file descriptor table is shared with another
task and therefore the FDPUT_FPUT flag is set in the struct fd, this will cause
the reference count of the struct file to be over-decremented, allowing an
attacker to create a use-after-free situation where a struct file is freed
although there are still references to it.

A simple proof of concept that causes oopses/crashes on a kernel compiled with
memory debugging options is attached as crasher.tar.

One way to exploit this issue is to create a writable file descriptor, start a
write operation on it, wait for the kernel to verify the file's writability,
then free the writable file and open a read-only file that is allocated in the
same place before the kernel writes into the freed file, allowing an attacker
to write data to a read-only file. By writing to, e.g., /etc/crontab, root
privileges can then be obtained.

There are two problems with this approach:

The attacker should ideally be able to determine whether a newly allocated
struct file is located at the same address as the previously freed one. Linux
provides a syscall that performs exactly this comparison for the caller:
kcmp(getpid(), getpid(), KCMP_FILE, uaf_fd, new_fd).

In order to make exploitation more reliable, the attacker should be able to
pause code execution in the kernel between the writability check of the target
file and the actual write operation. This can be done by abusing the writev()
syscall and FUSE: The attacker mounts a FUSE filesystem that artificially delays
read accesses, then mmap()s a file containing a struct iovec from that FUSE
filesystem and passes the result of mmap() to writev(). (Another way to do this
would be to use the userfaultfd() syscall.)

writev() calls do_writev(), which looks up the struct file * corresponding to
the file descriptor number and then calls vfs_writev(). vfs_writev() verifies
that the target file is writable, then calls do_readv_writev(), which first
copies the struct iovec from userspace using import_iovec(), then performs the
rest of the write operation. Because import_iovec() performs a userspace memory
access, it may have to wait for pages to be faulted in - and in this case, it
has to wait for the attacker-owned FUSE filesystem to resolve the pagefault,
allowing the attacker to suspend code execution in the kernel at that point
arbitrarily.

An exploit that puts all this together is in exploit.tar. Usage:

user@host:~/ebpf_mapfd_doubleput$ ./compile.sh
user@host:~/ebpf_mapfd_doubleput$ ./doubleput
starting writev
woohoo, got pointer reuse
writev returned successfully. if this worked, you'll have a root shell in <=60 seconds.
suid file detected, launching rootshell...
we have root privs now...
root@host:~/ebpf_mapfd_doubleput# id
uid=0(root) gid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),128(sambashare),999(vboxsf),1000(user)

This exploit was tested on an Ubuntu 16.04 Desktop system.

Fix: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7

Proof of Concept: https://bugs.chromium.org/p/project-zero/issues/attachment?aid=232552

Exploit-DB Mirror: https://gitlab.com/exploit-database/exploitdb-bin-sploits/-/raw/main/bin-sploits/39772.zip