After a recent update of my application on Google Play, I started receiving a lot of crash reports, all of them from Samsung devices running Android 5. Lower Android versions work fine, and devices from other manufacturers with Android 5 work fine too.
I don't have any device on which I could reproduce the issue, so I can't bisect. I am trying to deduce what could be wrong from the crash report and from the list of changes since my last working version (which is unfortunately long).
All the crash reports look like this (only the addresses vary slightly between devices):
Build fingerprint: 'samsung/kltektt/kltektt:5.0/LRX21T/G900KKTU1BOB1:user/release-keys'
Revision: '15'
ABI: 'arm'
pid: 26265, tid: 26265, name: mt.AnnelidsDemo >>> cz.gdmt.AnnelidsDemo <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x76f57e84
r0 00000800 r1 0000004b r2 b4aa9f9a r3 00000000
r4 1426e019 r5 76f57e80 r6 0000012c r7 76e6b040
r8 00000019 r9 76f57d54 sl 000007ff fp b4e1b330
ip b4aa9f70 sp bea94b50 lr b4bc72c1 pc b4c0d9b8 cpsr 00070030
backtrace:
#00 pc 001099b8 /system/lib/libart.so (art::TypeLookupTable::Lookup(char const*) const+59)
#01 pc 000c32bd /system/lib/libart.so (art::ClassLinker::LookupClassFromImage(char const*, art::gc::space::ImageSpace*)+64)
#02 pc 000d27c1 /system/lib/libart.so (art::ClassLinker::DefineClass(char const*, art::Handle<art::mirror::ClassLoader>, art::DexFile const&, art::DexFile::ClassDef const&)+320)
#03 pc 000d2d89 /system/lib/libart.so (art::ClassLinker::FindClassInPathClassLoader(art::ScopedObjectAccessAlreadyRunnable&, art::Thread*, char const*, art::Handle<art::mirror::ClassLoader>)+452)
#04 pc 001fe20b /system/lib/libart.so (art::VMClassLoader_findLoadedClass(_JNIEnv*, _jclass*, _jobject*, _jstring*)+254)
#05 pc 0001b179 /system/framework/arm/boot.oat
I found out that art::TypeLookupTable is Samsung's modification of ART and there are no sources available.
Both this version and the last working version are built with the same Android SDK and NDK (target is android-19). There are no changes in the Java code, but there are a lot of changes in the native code and in the data. I started using LTO when building the native code. I also started using the -z (Zopfli) parameter of zipalign.
My application uses JNI, so that is probably the first suspect. However, CheckJNI doesn't report any problems. The same code runs cleanly, without any crashes, on other Android devices, on iOS and on Linux. It doesn't show any errors in valgrind. So I think random memory corruption is unlikely.
I think my Java code is OK, but even if it had errors, it shouldn't cause a segfault in the Java runtime...
Users are reporting that the application crashes during start, before even showing anything.
I asked on the Samsung developers forum, so far without any response.
I have two questions:
1. The backtrace starts in boot.oat and continues in libart.so. What is happening in boot.oat? Is it possible that it crashes even before reaching any of my code? (That would indicate a bug in Samsung's ART.)
2. Any idea what could be wrong and what I could try?
Together with one other developer, who was getting the same crash in his application, we discovered that it is triggered by the -z parameter of the zipalign tool (recompress using Zopfli).
The exact same APK crashes when aligned and recompressed with Zopfli and doesn't crash when aligned without recompressing.
I can only guess that Samsung made some modifications to Android 5 and introduced some weird bug in the code that reads the APK. Until that is fixed or I have a better explanation, not using -z in zipalign solves the problem.
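If you want to check that two builds really differ only in how the entries are compressed, a small script can compare them. The following is a minimal sketch, not part of the original investigation: it uses Python's standard zipfile module, and the function name compare_archives is made up for illustration. It does not reproduce the crash; it only confirms that the uncompressed contents match while the compressed sizes differ.

```python
import zipfile


def compare_archives(path_a, path_b):
    """Compare two ZIP/APK files entry by entry.

    Returns (same_content, size_diffs):
      same_content is True when both archives contain the same entry names
      with the same CRC-32 and uncompressed size.
      size_diffs maps an entry name to (compressed size in A, compressed
      size in B) for every shared entry whose compressed size differs.
    """
    with zipfile.ZipFile(path_a) as za, zipfile.ZipFile(path_b) as zb:
        a = {info.filename: info for info in za.infolist()}
        b = {info.filename: info for info in zb.infolist()}

    same_content = a.keys() == b.keys() and all(
        a[name].CRC == b[name].CRC and a[name].file_size == b[name].file_size
        for name in a
    )
    size_diffs = {
        name: (a[name].compress_size, b[name].compress_size)
        for name in a.keys() & b.keys()
        if a[name].compress_size != b[name].compress_size
    }
    return same_content, size_diffs
```

Calling it as compare_archives("app-plain.apk", "app-zopfli.apk") (both file names are hypothetical) should return True for same_content if the recompression changed nothing but the compressed streams, with size_diffs listing the entries that were recompressed.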