Friday, January 9, 2009

Disassembling DEX files

One of the most remarkable features of the Dalvik virtual machine (the workhorse under the Android system) is that it does not use Java bytecode. Instead, a homegrown format called DEX was introduced, and not even the bytecode instructions are the same as Java bytecode instructions. There was some discussion about whether this makes Dalvik a Java virtual machine at all. My personal opinion is that this is a religious and legal dispute. Dalvik opcodes are clearly designed to support only the Java language. Compiling programs written in a language other than Java to Dalvik bytecode is certainly possible, as was demonstrated with Java bytecode, but neither Java bytecode nor Dalvik bytecode makes any effort to support any language other than Java. This is in contrast with the .NET virtual machine, where at least a claim has been made that the VM supports multiple languages, even though any virtual machine has limitations that prevent some particular language from running on it.

Android comes with a disassembler called dexdump. The location of this tool is not intuitive: it runs on the Linux platform that hosts Android. Launch the emulator and issue the following commands:

adb shell
dexdump

In order to use the tool, one has to move the DEX file to the Android platform (e.g. with adb push in the case of the emulator). Then one can say:

dexdump -d classes.dex

The output of this tool is not very easy to use, however. Take, for example, the bytecode compiled from a method containing a switch statement:


000418: 2b02 0c00 0000 |0000: packed-switch v2, 0000000c // +0000000c
00041e: 12f0 |0003: const/4 v0, #int -1 // #ff
000420: 0f00 |0004: return v0
000422: 1220 |0005: const/4 v0, #int 2 // #2
000424: 28fe |0006: goto 0004 // -0002
000426: 1250 |0007: const/4 v0, #int 5 // #5
000428: 28fc |0008: goto 0004 // -0004
00042a: 1260 |0009: const/4 v0, #int 6 // #6
00042c: 28fa |000a: goto 0004 // -0006
00042e: 0000 |000b: nop // spacer
000430: 0001 0300 faff ffff 0500 0000 0700 ... |000c: packed-switch-data (10 units)


The jump table used by the packed-switch instruction is not disassembled at all; it is not even dumped entirely. The same problem applies to fill-array-data tables, and there are further restrictions.
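To make the jump table concrete, here is a minimal Java sketch of how a packed-switch payload can be decoded. It assumes the payload layout from the Dalvik bytecode documentation (a 0x0100 identifier, a 16-bit entry count, a 32-bit first key, then one 32-bit target per case, all little-endian, with targets relative to the address of the switch instruction). The class name and the payload bytes in main are hypothetical; the dump above is truncated, so its values are not reproduced here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PackedSwitchDump {
    /** Decodes a packed-switch payload and returns one line per case. */
    public static String[] decode(byte[] payload, int switchAddress) {
        ByteBuffer buf = ByteBuffer.wrap(payload).order(ByteOrder.LITTLE_ENDIAN);
        int ident = buf.getShort() & 0xffff;
        if (ident != 0x0100) {
            throw new IllegalArgumentException("not a packed-switch payload");
        }
        int size = buf.getShort() & 0xffff;   // number of cases
        int firstKey = buf.getInt();          // key of the first case
        String[] lines = new String[size];
        for (int i = 0; i < size; i++) {
            // Each target is an offset relative to the switch instruction itself.
            int target = switchAddress + buf.getInt();
            lines[i] = String.format("case %d -> %04x", firstKey + i, target);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical payload for illustration only (10 code units = 20 bytes).
        byte[] payload = {
            0x00, 0x01,             // ident 0x0100
            0x03, 0x00,             // size = 3
            0x00, 0x00, 0x00, 0x00, // first_key = 0
            0x05, 0x00, 0x00, 0x00, // case 0 target: +5
            0x07, 0x00, 0x00, 0x00, // case 1 target: +7
            0x09, 0x00, 0x00, 0x00  // case 2 target: +9
        };
        for (String line : decode(payload, 0)) {
            System.out.println(line);
        }
    }
}
```

With the switch instruction at address 0, this prints one line per case with the absolute address of its branch target, which is the information dexdump leaves out.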

I therefore decided to create a more comfortable disassembler, and here is the first cut.

Access the dedexer project's page on SourceForge.

This tool is easier to use than dexdump for many reasons. For starters, it is a standard Java program that runs on the usual JVMs. Its output format is much more readable and is familiar to those who know the Jasmin syntax. For example, the previous fragment is disassembled like this by dedexer:


.method public calc1(I)I
packed-switch v2,0
ps418_422 ; case 0
ps418_426 ; case 1
ps418_42a ; case 2
default: ps418_default
ps418_default:
const/4 v0,15
l420:
return v0
ps418_422:
const/4 v0,2
goto l420
ps418_426:
const/4 v0,5
goto l420
ps418_42a:
const/4 v0,6
goto l420
nop
.end method
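The two listings print the literal of the first const/4 differently: dexdump shows -1, dedexer shows 15. Both are readings of the same 4-bit field. The sketch below shows the difference, assuming the 11n instruction format from the Dalvik documentation (opcode in the low byte, destination register in the next four bits, literal in the top four bits) and assuming that the hex column of the dexdump listing gives bytes in file order, so that "12f0" is the 16-bit unit 0xf012. The class and method names are made up for illustration.

```java
public class Const4Literal {
    /** Destination register (field A) of a const/4 code unit. */
    public static int destRegister(int codeUnit) {
        return (codeUnit >> 8) & 0xf;
    }

    /** Literal (field B) read as an unsigned 4-bit value. */
    public static int unsignedLiteral(int codeUnit) {
        return (codeUnit >> 12) & 0xf;
    }

    /** Literal (field B) sign-extended from 4 bits to 32 bits. */
    public static int signedLiteral(int codeUnit) {
        int nibble = (codeUnit >> 12) & 0xf;
        return (nibble << 28) >> 28; // arithmetic shift copies the sign bit down
    }

    public static void main(String[] args) {
        int unit = 0xf012; // bytes 12 f0 in file order
        System.out.println("v" + destRegister(unit)
                + " unsigned=" + unsignedLiteral(unit)
                + " signed=" + signedLiteral(unit));
        // prints: v0 unsigned=15 signed=-1
    }
}
```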


In addition, a separate file is created for each class, along with a directory structure that mirrors the package structure.
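As a rough illustration of that layout, the following sketch maps a DEX type descriptor such as Lcom/example/Foo; to an output file under a target directory. It is not taken from the dedexer source: the class name, the method name and the choice of the .ddx extension are assumptions made for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ClassOutputPath {
    /** Maps a descriptor like "Lcom/example/Foo;" to outputDir/com/example/Foo.ddx. */
    public static Path outputFor(Path outputDir, String descriptor) {
        if (!descriptor.startsWith("L") || !descriptor.endsWith(";")) {
            throw new IllegalArgumentException("not a class descriptor: " + descriptor);
        }
        // Strip the leading 'L' and trailing ';' to get the slash-separated name.
        String internalName = descriptor.substring(1, descriptor.length() - 1);
        // The package part becomes the directory path (extension is an assumption).
        return outputDir.resolve(internalName + ".ddx");
    }

    public static void main(String[] args) {
        System.out.println(outputFor(Paths.get("out"), "Lcom/example/Foo;"));
    }
}
```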

This is not a full decompiler, however. One has to know the Dalvik opcodes in order to work with the tool. This opcode list has been extended and maintained as dedexer was developed and is now in sync with the disassembler. You will see some unknown opcodes in the list. I have not encountered those instructions "out in the wild" and the disassembler does not recognize them either. If you see any of those, send me the DEX file so that I can analyse it!

This is a simple tool and is not without limitations. The most painful one is that the tool does not process the debug and annotation information in the DEX file. The array data dump could also be better. I am sure that the feature most people would like to see is a bridge toward Java class files, but that is far away. Jasmin will be able to generate Java class files once the backward conversion from Dalvik opcodes to Java bytecode is provided, but that is a complex task, so don't hold your breath. The release condition I set for myself was that the tool be able to disassemble the DEX file in framework.jar. It is able to, so I guess the tool may be of use to others too. Enjoy!

44 comments:

danfuzz said...

Dexdump is supposed to ship as a host binary with SDK releases, though I know it got inadvertently skipped in at least one release. In any case, you can grab the source and compile it yourself if you want.

Also, we would gladly accept patches to improve its output.

strazz said...

great stuff gabor - keep up the good work :)

Unknown said...

Hi, I am trying to follow the example in the blog to dexdump a dex file.

1. I have an android eclipse project called 'HelloAndroid'. I run it via 'Debug' on the emulator.

2. Then I go to a Linux shell, where I did
> adb shell
and then
# dexdump -d HelloAndroid.dex
Processing 'HelloAndroid.dex'...
ERROR: unable to open 'HelloAndroid.dex': No such file or directory
ERROR: DEX parse failed


Can you please tell me why it can't find the dex file?

Thank you.

Gabor Paller said...

Hello, Wade,

The DEX file is not normally accessible to dexdump. I don't know Android package deployer well enough to say whether it is available in some temporary location when the package is opened but it normally resides inside the APK file in the device's file system. The APK file is actually a simple ZIP archive that contains (among other things) a file called classes.dex. I recommend unpacking the APK file on the PC side, moving classes.dex into some directory on the device and launching dexdump in that directory.
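Since the APK is a ZIP archive, the unpacking step on the PC side can be done with the standard java.util.zip classes. The following is a minimal sketch, not part of dedexer; the class name and the command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ExtractDex {
    /** Copies classes.dex out of the APK; returns false if the entry is missing. */
    public static boolean extract(Path apk, Path target) throws IOException {
        try (ZipFile zip = new ZipFile(apk.toFile())) {
            ZipEntry entry = zip.getEntry("classes.dex");
            if (entry == null) {
                return false;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ExtractDex <file.apk> <output.dex>");
            return;
        }
        boolean found = extract(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(found ? "classes.dex extracted" : "no classes.dex in archive");
    }
}
```

The extracted file can then be pushed to the device for dexdump or processed directly on the PC.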

Gabor Paller said...

Or even better: run the dedexer tool on the PC and find the DEX file in your development environment (in my setup, dx places the resulting DEX file in [project root]/bin/classes.dex).

Unknown said...

Thank you for your help.

You mentioned "unpacking the APK file on the PC side, moving classes.dex into some directory on the device and launching dexdump in that directory"

But if I am developing under eclipse for my android application, where can I find the apk file?

And where you said 'move the classes.dex in to a directory on device', how can I do it?

Thank you.

strazz said...

wade, gabor;

on the device, the dex files that have been run can be found in /data/dalvik-cache

wade;

an apk file is just like a jar file and can be extracted like one also :)

Rednoah said...

Hi Gabor:

Thanks for providing dedexer, it's a great help for analyzing dex code.

But I found a problem with how dedexer translates the "array-length" opcode. Its output differs from the code dumped by Dexdump and from the definition of the dex opcode.

For example:
Dedexer dumps the instruction as
"array-length v19"
whereas Dexdump dumps it as "array-length v3, v0".
"v19" does not exist anywhere in my dex code, and we can't tell which array the length belongs to.

Could you take a look at this portion ?

Thanks.

Dan said...

I have a related question to this thread. I wish to unpack the AndroidManifest.xml file from the .apk file. I am working in the Java JDK and am not creating an Android project but rather a tool to verify the contents of the AndroidManifest.xml file. The tool will run on a Windows PC. I can open the .apk file and extract the manifest but am unable to figure out how to parse the manifest itself. Has anyone ever used kxml.jar for parsing a file in WBXML? Does anyone know whether the manifest is even encoded in WBXML?

Here is a snippet of what I am doing so far:

**
// Needs java.io.ByteArrayOutputStream, java.io.InputStream and java.util.zip.ZipEntry
while( entries.hasMoreElements() ) {
    ZipEntry entry = (ZipEntry)entries.nextElement();

    System.out.println( "\nDebug: " + entry.getName() );
    if( entry.getName().equalsIgnoreCase( "AndroidManifest.xml" ) ) {

        InputStream stream = zippy.getInputStream( entry );
        // A growable buffer: a fixed 1024-byte array overflows on larger manifests
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int ch;
        while( (ch = stream.read() ) != -1 ) {
            buffer.write( ch );
        }
        stream.close();

        // Now we have all our bytes in data array
        byte[] data = buffer.toByteArray();
        analyze( data );
    }
}
**

That analyze method is actually empty right now, since I am stuck on trying to decipher the format of the manifest. I see this reference here -> http://www.w3.org/TR/wbxml/ but wish to know if anyone else out there has already been down this road or not.

any help is greatly appreciated.

Anonymous said...

Thanks for this tool! By way of feedback...

I tried ddx1.4.jar from sourceforge on Mac OSX. It works, but only with Java 1.6 (required setting the version order in a preference panel). It does not run with Java 1.5 (the class file version inside the jar prevents it from running).

Gabor Paller said...

"It works, but only with Java 1.6 "

Probably because I compiled it with 1.6. There should be no problem if you recompile with 1.5.

alexdonnini said...

Hello Gabor,

When I try to run dedexer, it generates the following error (see below). Am I making a mistake somewhere, can you help me resolve this issue?

Thanks Alex
alexdonnini@ieee,org

Exception in thread "main" java.lang.UnsupportedClassVersionError: Bad version number in .class file
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)

Gabor Paller said...

alexdonnini, you are probably trying to run it with a JVM version earlier than 1.6.0, which is the version I compile with.

If you stick to your JVM version, recompile Dedexer and it will work.
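To check which Java version a class file was compiled for, one can read the version fields at the start of the file. The sketch below assumes the standard class file header (the 0xCAFEBABE magic, then a 16-bit minor and a 16-bit major version, big-endian), where major version 49 corresponds to Java 1.5 and 50 to Java 1.6. The class name and the in-memory header in main are for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassVersion {
    /** Reads the class file header and returns the major version number. */
    public static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a Java class file");
        }
        data.readUnsignedShort();        // minor version, not needed here
        return data.readUnsignedShort(); // major version
    }

    public static void main(String[] args) throws IOException {
        // Illustrative header of a class compiled for Java 1.6 (major version 50).
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, // minor version 0
            0x00, 0x32  // major version 50
        };
        int major = majorVersion(new ByteArrayInputStream(header));
        // For the 1.x series, the minor release number is major - 44.
        System.out.println("major version " + major + " (Java 1." + (major - 44) + ")");
        // prints: major version 50 (Java 1.6)
    }
}
```

A JVM refuses classes whose major version is higher than the one it supports, which is what the UnsupportedClassVersionError above reports.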

Unknown said...

Hello Gabor, I am trying the dedexer tool and I appreciate all the work you have done. I tried your tool with several examples and for some of them it does not work. Apparently the problem is that those dex files have a wrong magic number. I get the following exception trace:

I/O error: Value read: 0x31; value expected: 0x33; file offset: 0x00000005
java.io.IOException: Value read: 0x31; value expected: 0x33; file offset: 0x00000005
at hu.uw.pallergabor.dedexer.DexParser.parseExpected8Bit(DexParser.java:274)
at hu.uw.pallergabor.dedexer.DexSignatureBlock.parse(DexSignatureBlock.java:19)
at hu.uw.pallergabor.dedexer.Dedexer.run(Dedexer.java:66)
at hu.uw.pallergabor.dedexer.Dedexer.main(Dedexer.java:14)

Do you have any idea what causes this problem? Have you run into it before?
If needed, I can send you the files that dedexer fails on.
Thanks in advance

Gabor Paller said...

Could you send me an example that does not work? gaborpaller@gmail.com
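Errors of this kind come from comparing the first bytes of the file with the expected signature. A quick way to see what a file really is before sending it to a disassembler is to look at those bytes directly. The sketch below assumes the usual 8-byte signatures: "dex\n" followed by three version digits and a zero byte for DEX files, "dey\n" in the same layout for optimized DEX files, and "PK" at the start of ZIP archives such as APK files. In that layout, offset 5 is the second version digit. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class DexHeaderSniffer {
    /** Describes a file based on its first bytes. */
    public static String describe(byte[] header) {
        if (header.length >= 2 && header[0] == 'P' && header[1] == 'K') {
            return "ZIP archive (APK or JAR); extract classes.dex first";
        }
        if (header.length >= 8 && header[3] == '\n' && header[7] == 0) {
            String tag = new String(header, 0, 3, StandardCharsets.US_ASCII);
            String version = new String(header, 4, 3, StandardCharsets.US_ASCII);
            if (tag.equals("dex")) {
                return "DEX file, format version " + version;
            }
            if (tag.equals("dey")) {
                return "optimized DEX (ODEX) file, format version " + version;
            }
        }
        return "unrecognized header";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DexHeaderSniffer <file>");
            return;
        }
        byte[] header = new byte[8];
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // A single read is normally enough for the first 8 bytes of a file.
            int n = in.read(header);
            header = Arrays.copyOf(header, Math.max(n, 0));
        }
        System.out.println(describe(header));
    }
}
```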

Unknown said...

I want to extract Java files from the classes.dex file. I have copied the classes.dex file into the tools directory, and when I use the command dedex -d classes.dex in adb shell, I get the error dedex:not found. How should I go about it?

Gabor Paller said...

1. You are using the tool wrong, check out dedexer.sourceforge.net for the manual.
2. There is no way of "extracting Java files from classes.dex". The best I am aware of is disassembly to Dalvik bytecode which is an assembly-like language. That's what dedexer does.

Anonymous said...

How difficult is it to de-dex back into class files?
This way, "plain" java decompilers will be able to get back the java code out of the class files..
At first look, the generated dex file seems quite plain to convert (and at second look, I don't understand their register assignment scheme...)

Gabor Paller said...

it is not by chance that dedexer produces jasmin-like code. the vision was that this code could be transformed into proper jasmin code, jasmin could produce class files that could be decompiled. the problem of turning back register-based code to stack-based code is not trivial, however. dx is a 100kloc program.

Unknown said...

I want to ask: can I see the source code of Android apps? For example, I extracted an Android app, but the source does not appear, only XML files, PNG files and the classes.dex file. How can I see the source code in the class file? Email me at netnote.adm@gmail.com

Gabor Paller said...

Dimaz, as far as I know, there is no way back to Java. Use dedexer if you are able to cope with Dalvik bytecode.

Here is a user-friendly presentation to start with.

Furious ksc91u said...

Is there any assembler available?

I have some dex files that could not be disassembled by smali but could be disassembled by ddx.

Now I want to convert it back to bytecode.

Yet Another Coder said...

How can we recompile the files we get through disassembly into classes.dex to recreate the .apk?

Anonymous said...

@Gabor

I am planning to make an API monitor, that is, a tool to monitor the API calls made by an application.
So my question is: in what way can your tool help me?
Will I be able to see the API calls directly?

Unknown said...

Hello, this looks like an interesting tool. Is it possible to run it on Android, i.e. to be able to access the dex bytecode at runtime? Thanks.

Gabor Paller said...

Pavel, I am not sure what you mean by accessing the dex bytecode at runtime. Dedexer is a pretty average Java program, so it can be ported to Android, although I can't yet see the motivation.

peter Jerry said...

I read through the report and it was a great source of information. Thanks for providing the great read!

Anonymous said...

Gabor, where can I get dedexer?

Gabor Paller said...

Anonymous, you can get it from here.

Vinit Kumar said...

Hi guys! I tried the ddx tool to decompile the wireless apk file, but when I run the command it gives me the following error:

java -jar ddx1.14.jar -d /home/Vinit/dex/ wireles.apk

I/O error: Value read: 0x50; value expected: 0x64; file offset: 0x00000000

So please give me a solution for this error.

Gabor Paller said...

Hi, Vinit! You have to extract the classes.dex from the APK file before you decompile it. Read this introductory presentation.

Anonymous said...

When I try to use dexdump from the adb shell, I get "dexdump: permission denied"

I have the dex file in the sdcard folder i.e. /mnt/sdcard

Does the device have to be rooted in order to execute dexdump?

Unknown said...

Paid decompile project?

I am willing to pay someone for their time. I am looking for someone to decompile an Android apk on my behalf.

I have been looking for the proper code to create my Android application for well over two months now, without finding a viable solution. I am simply not finding the proper answers to build my application however I did find another finished Android project which is very close to what I want my application to do. I no longer have the time to continue researching code without having anything to show for it.

The Android application I am building is less than 30% similar to the Android application that I have found and want to decompile and use as a base to start from. I would need the .apk file decompiled so I can open the file in Eclipse and view the full codebase and content of all:
src--> .java files
gen--> .java files
.jar files used in the application
and all the .xml files of the:
res--> layout and
res--> values, as well as the
manifest.xml

You can reach me at flygirlsfo@gmail.com

Anonymous said...

Hello Gabor,

Thank you for the informative tools, but I have tried many times to work with dedexer and it doesn't work. I installed Java and it runs successfully. However, when I open dedexer, nothing happens.

Can you please explain how I can install Dedexer? Please email me at azab_moutaz@yahaoo.com

Gabor Paller said...

Anonymous, have you tried to follow the instructions on the dedexer home page?

dedexer.sourceforge.net

If you launch it without parameters, ddx dumps a help message. At least "that" should happen.

Steven said...

I'm getting this exact same error (including hex codes) every time I run this on any odex file:
I/O error: Value read: 0x36; value expected: [0x35,0x33]; file offset: 0x00000006

Any idea what's going wrong?

Gabor Paller said...

Steven, it looks like you have an old ODEX file. Could you send me one or two of those ODEX files?

gaborpaller at gmail.com

Alexandre said...

You can also use EASY APK DISSASSEMBLER

It's a great app and an easy way to do it.

You can download it here :

- http://code.google.com/p/easy-apk-dissassembler/

Unknown said...

I have a dex file with Unknown Instruction 0x09 at offset somewhere.... Would you mind checking it out?

Unknown said...

Hi,
I notice you didn't deal with instructions (that are in my DEX file):
03 32x move/16 vAAAA, vBBBB
09 32x move-object/16 vAAAA, vBBBB

I tried to add these two instructions to your source file DexInstructionParser.java, handling instruction 03 the same way as
02 22x move/from16 vAA, vBBBB
and instruction 09 like
08 22x move-object/from16 vAA, vBBBB
but just use read16Bit() for the first register...
However, then it continues to throw exception of java.lang.ArrayIndexOutOfBoundsException

So is there something I am missing to add support for the unknown instructions 0x09 and 0x03? And is the ArrayIndexOutOfBoundsException related to the unknown instructions 3e..43 10x (unused)?

Thanks for your reply!
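For reference, the two formats quoted in this comment differ in more than the width of the first register. In the Dalvik instruction format notation, 22x takes two 16-bit code units (AA|op, BBBB), while 32x takes three (ØØ|op, AAAA, BBBB), so the high byte of the first unit is unused and both registers follow as full 16-bit units. The sketch below decodes both from an array of code units. It is not taken from DexInstructionParser.java, the names are hypothetical, and the thread does not establish whether this difference is what causes the exception.

```java
public class MoveFormats {
    /** Format 22x: AA|op BBBB (two code units). */
    public static String decode22x(String mnemonic, int[] units, int pc) {
        int vA = (units[pc] >> 8) & 0xff;  // 8-bit register in the high byte
        int vB = units[pc + 1] & 0xffff;   // 16-bit register in the second unit
        return mnemonic + " v" + vA + ",v" + vB;
    }

    /** Format 32x: 00|op AAAA BBBB (three code units). */
    public static String decode32x(String mnemonic, int[] units, int pc) {
        // The high byte of units[pc] is unused in this format and is skipped.
        int vA = units[pc + 1] & 0xffff;
        int vB = units[pc + 2] & 0xffff;
        return mnemonic + " v" + vA + ",v" + vB;
    }

    public static void main(String[] args) {
        int[] from16 = {0x0508, 0x0102};          // opcode 08, AA = 5, BBBB = 258
        int[] wide = {0x0009, 0x0100, 0x0003};    // opcode 09, AAAA = 256, BBBB = 3
        System.out.println(decode22x("move-object/from16", from16, 0));
        System.out.println(decode32x("move-object/16", wide, 0));
    }
}
```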

Gabor Paller said...

Monica, could you send me an example DEX file? gaborpaller at gmail.com

Unknown said...

Hi Gaborpaller, I have sent you the DEX file to your email.... Thanks for checking it out....

Omkar said...

Hey,
after reading this answer on Stack Overflow ("With Dedexer, you can disassemble the .dex file into dalvik bytecode (.ddx). Decompiling towards Java isn't possible as far as I know. You can read about dalvik bytecode"), I read your blog.
I am quite impressed with your work and with this answer, because you're the one saying that it isn't (currently) possible, which seems true.
I would like to know a few things about this tool, as it is going to be useful in my project.
I will contact you soon at gaborpaller@gmail.com.
So please help me if possible.
Thanks in advance.