I have implemented a JNA bridge to FDK-AAC. Source code can be found in here
When bench-marking my code, I can get hundreds of successful runs on the same input, and then occasionally a C-level crash that'll kill the entire process, causing a core-dump to be generated:
Looking at the core dump, it looks like this:
#1 0x00007f3e92e00f5d in __GI_abort () at abort.c:90
#2 0x00007f3e92e4928d in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f3e92f70528 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:181
#3 0x00007f3e92e5064a in malloc_printerr (action=<optimized out>, str=0x7f3e92f6cdee "corrupted size vs. prev_size", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5426
#4 0x00007f3e92e5304a in _int_free (av=0x7f3de0000020, p=<optimized out>, have_lock=0) at malloc.c:4337
#5 0x00007f3e92e5744e in __GI___libc_free (mem=<optimized out>) at malloc.c:3145
#6 0x00007f3e113921e9 in FDKfree (ptr=0x7f3de009df60) at libSYS/src/genericStds.cpp:233
#7 0x00007f3e1130d7d3 in Free_AacEncoder (p=0x7f3de0115740) at libAACenc/src/aacenc_lib.cpp:407
#8 0x00007f3e1130fbb3 in aacEncClose (phAacEncoder=0x7f3de0115740) at libAACenc/src/aacenc_lib.cpp:1395
This back/stack trace error is reproducible if I run repeat benchmark enough times , though I'm having a hard time understanding what might be the cause for such error? Memory allocated to pointer 0x7f3de009df60
is allocated inside the CPP/C code as well and I can guarantee the same instance that's allocated is being freed. The benchmark is, of course - single-threaded.
After reading these:
security checks && internal functions
I'm still having a hard time understanding - what might be a real (non-exploitation, but rather error)) scenario that causes me to get the above error? and why does it happen very scarcely?
Current suspicion:
Running a detailed backtrace, I get this input:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
set = {__val = {4, 6378670679680, 645636045657660056, 90523359816, 139904561311072, 292199584, 139903730612120, 139903730611784, 139904561311088, 1460617926600, 47573685816, 4119199860131166208,
139904593745464, 139904553224483, 139904561311136, 288245657}}
pid = <optimized out>
tid = <optimized out>
#1 0x00007f3e92e00f5d in __GI_abort () at abort.c:90
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x7f3de026db10, sa_sigaction = 0x7f3de026db10}, sa_mask = {__val = {139903730540556, 19, 30064771092, 812522497172832284, 139903728706672, 1887866374039011357,
139900298780168, 3775732748407067896, 763430436865, 35180077121538, 4119199860131166208, 139904561311552, 139904553065676, 1, 139904561311584, 139904561312192}}, sa_flags = 4096,
sa_restorer = 0x14}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007f3e92e4928d in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f3e92f70528 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:181
ap = {{gp_offset = 40, fp_offset = 32574, overflow_arg_area = 0x7f3e11adf1d0, reg_save_area = 0x7f3e11adf160}}
fd = <optimized out>
list = <optimized out>
nlist = <optimized out>
cp = <optimized out>
written = <optimized out>
#3 0x00007f3e92e5064a in malloc_printerr (action=<optimized out>, str=0x7f3e92f6cdee "corrupted size vs. prev_size", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5426
buf = "00007f3de009e9f0"
cp = <optimized out>
ar_ptr = <optimized out>
ptr = <optimized out>
str = 0x7f3e92f6cdee "corrupted size vs. prev_size"
action = <optimized out>
#4 0x00007f3e92e5304a in _int_free (av=0x7f3de0000020, p=<optimized out>, have_lock=0) at malloc.c:4337
size = 2720
fb = <optimized out>
nextchunk = 0x7f3de009e9f0
nextsize = 736
nextinuse = <optimized out>
prevsize = <optimized out>
bck = <optimized out>
fwd = <optimized out>
errstr = 0x0
locked = <optimized out>
#5 0x00007f3e92e5744e in __GI___libc_free (mem=<optimized out>) at malloc.c:3145
ar_ptr = <optimized out>
p = <optimized out>
hook = <optimized out>
#6 0x00007f3e113921e9 in FDKfree (ptr=0x7f3de009df60) at libSYS/src/genericStds.cpp:233
No locals.
#7 0x00007f3e1130d7d3 in Free_AacEncoder (p=0x7f3de0115740) at libAACenc/src/aacenc_lib.cpp:407
No locals.
#8 0x00007f3e1130fbb3 in aacEncClose (phAacEncoder=0x7f3de0115740) at libAACenc/src/aacenc_lib.cpp:1395
hAacEncoder = 0x7f3de009df60
err = AACENC_OK
0x7f3de009df60
.nextchunk
is 0x7f3de009e9f0
, which is only 2704 bytes after the current pointer which is being released.OK, so I've managed to overcome this issue.
First of all - A practical cause to "corrupted size vs. prev_size" is quite simple - memory chunk control structure fields in the adjacent following chunk are being overwritten due to out-of-bounds access by the code. if you allocate x
bytes for pointer p
but wind up writing beyond x
in regards to the same pointer, you might get this error, indicating the current memory allocation (chunk) size is not the same as what's found in the next chunk control structure (due to it being overwritten).
As for the cause for this memory leak - structure mapping done in the Java/JNA layer implied different #pragma
related padding/alignment from what dll/so was compiled with. This in turn, caused data to be written beyond the allocated structure boundary. Disabling that alignment made the issues go away. (Thousands of executions without a single crash!).