That's sad news because the DEX format is not particularly well documented. More exactly: undocumented. There are some descriptions floating on the Internet but they are obsolete and inaccurate. Conveniently, the dx tool in Android SDK has some less used options that effectively document this format.
Dx is the utility that turns Java class files into DEX files. Every Android developer uses it regularly, although not everybody may be aware of its existence because the tool is invoked by automatically generated make/ant files. Dx has an option that dumps the content of the DEX file in human-readable format while generating the DEX file. This is the batch script I use to get that dump:
dx --dex --verbose --verbose-dump --dump-to=%BASEDIR%\%1\dexdump.txt --output=%BASEDIR%\%1\classes.dex %BASEDIR%\%1
Put your Java files into a subdirectory (e.g. test1) and invoke the batch script with the name of the directory. Beside the familiar classes.dex, dexdump.txt will be generated. This dump file is so verbose that reverse engineering of the DEX file format becomes something of a feasible project.
With using some test Java classes, the official Android opcode list and a lot of time, my first step was to document the Android opcodes. This is the bytecode the Dalvik virtual machine uses instead of the Java bytecode. If you are familiar with Java bytecode, you will see that the opcode set is pretty similar. Significant difference is that the Dalvik opcodes are register-based while Java bytecode is stack-based.
Click here to access the Dalvik opcode list.
The next step will be to put together a DEX disassembler. That will take some time, see you in 2009 with that!