androguard code tracing

2022-04-22

androguard Usage

This is the most common way we use androguard.

from androguard.misc import AnalyzeAPK


apk, dalvikvmformat, analysis = AnalyzeAPK(apk_filepath)
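As a rough sketch of what the three returned objects offer, the helper below collects a few facts from them. The name `summarize` is made up for illustration, and the accessors it calls (`get_package`, `get_permissions`, `get_classes`, `get_strings`) are assumed to behave as they do in androguard 3.x, so check them against the version you have installed.

```python
def summarize(apk, dalvikvmformat, analysis):
    """Collect a few facts from the objects returned by AnalyzeAPK.

    `summarize` is a hypothetical helper, not part of androguard.
    """
    return {
        # Package name and permissions come from AndroidManifest.xml.
        "package": apk.get_package(),
        "permissions": sorted(apk.get_permissions()),
        # AnalyzeAPK returns a list with one DalvikVMFormat per DEX file.
        "dex_count": len(dalvikvmformat),
        # The Analysis object spans every DEX file at once.
        "class_count": sum(1 for _ in analysis.get_classes()),
        "string_count": sum(1 for _ in analysis.get_strings()),
    }


# Usage (needs androguard and a real APK path):
# from androguard.misc import AnalyzeAPK
# apk, dalvikvmformat, analysis = AnalyzeAPK(apk_filepath)
# print(summarize(apk, dalvikvmformat, analysis))
```

The helper takes the objects as arguments instead of calling AnalyzeAPK itself, so it can be tried with simple stand-in objects before pointing it at a real APK.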

Entry Point

Let's start from AnalyzeAPK.

AnalyzeAPK (androguard/misc.py)

AnalyzeAPK is responsible for generating the three objects below.

  • apk
  • dalvikvmformat
  • analysis

apk object

The apk object is generated by the class APK(). From it you can get information such as the package name and the permissions declared in AndroidManifest.xml.

androguard/misc.py

a = APK(_file, raw=raw)

androguard/core/bytecodes/apk.py

class APK():

It parses the files in the APK through Python's zipfile module. The APK object is responsible for extracting AndroidManifest.xml from the APK.
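Because an APK is a ZIP archive, the first step can be tried with the standard library alone. The following is a minimal sketch, not androguard's code: `read_manifest_bytes` and `build_toy_apk` are made-up names, and the archive is a toy ZIP that only imitates an APK's layout.

```python
import io
import zipfile


def read_manifest_bytes(apk_bytes):
    """Return (file names, raw AndroidManifest.xml bytes) from an APK.

    The manifest is still in Android's binary XML format at this point
    and needs a decoder before it can be read as text.
    """
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        names = archive.namelist()
        manifest = archive.read("AndroidManifest.xml")
    return names, manifest


def build_toy_apk():
    """Build a tiny in-memory ZIP that only imitates an APK's layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"<binary xml placeholder>")
        archive.writestr("classes.dex", b"placeholder dex bytes")
    return buffer.getvalue()


names, manifest = read_manifest_bytes(build_toy_apk())
print(names)
print(len(manifest), "bytes of manifest data")
```

With a real APK, the bytes returned for the manifest are binary XML, which is why a separate decoding step is needed after extraction.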

Implementation function:

androguard/core/bytecodes/apk.py

if not skip_analysis:
    self._apk_analysis()

def _apk_analysis(self):
    ...  # body elided

ARSC class

for decoding resources.arsc

AXML class

for decoding AndroidManifest.xml and all other XML files

Inside _apk_analysis, the following information is extracted:

  • AXML
  • Permissions
  • Each tag in xml

The following video records the process of making an apk object.


dalvikvmformat Object

The DalvikVMFormat corresponds to the DEX file found inside the APK file. You can get classes, methods or strings from the DEX file.

androguard/misc.py

d = []
dx = Analysis()
for dex in a.get_all_dex():
    df = DalvikVMFormat(dex, using_api=a.get_target_sdk_version())
    dx.add(df)
    d.append(df)
    df.set_decompiler(decompiler.DecompilerDAD(d, dx))

dx.create_xref()

return a, d, dx

Analysis Object

The Analysis object should be used instead of working with the DalvikVMFormat objects directly, as it contains special classes that link information about the classes.dex and can even handle many DEX files at once.

The Analysis object contains a lot of information about (multiple) DalvikVMFormat objects. Its features include, for example, XREFs between Classes, Methods, Fields, and Strings. Another part is the creation of BasicBlocks, which is important for the Androguard decompiler.

quark/androguard/misc.py

dx = Analysis()

androguard/core/analysis/analysis.py

class Analysis:

Find information inside DEX file:

  • DalvikVMFormat objects
  • classname
  • string
  • methods

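EXTERNAL method

A method that is not defined in the analyzed DEX files, such as an Android framework API method. Here it is called the native API.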

XREFs

Cross-references between Classes, Methods, Fields, and Strings.
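To make the idea of a method cross-reference concrete, here is a small sketch that lists the callers of a method. `callers_of` is a hypothetical helper, and it assumes the Analysis object offers `find_methods()` and that each result offers `get_xref_from()` returning tuples that start with the calling class and calling method, as in androguard 3.x. Treat those details as assumptions to verify against your installed version.

```python
def callers_of(analysis, class_regex, method_regex):
    """List (caller class, caller method) pairs for the matching methods.

    Hypothetical helper; `analysis` is expected to look like the
    Analysis object returned by AnalyzeAPK.
    """
    callers = set()
    for method in analysis.find_methods(classname=class_regex,
                                        methodname=method_regex):
        # Each cross-reference is a tuple whose first two items are the
        # calling class and the calling method.
        for ref in method.get_xref_from():
            caller_class, caller_method = ref[0], ref[1]
            callers.add((caller_class.name, caller_method.name))
    return sorted(callers)


# Usage (needs androguard and a real APK path):
# from androguard.misc import AnalyzeAPK
# a, d, dx = AnalyzeAPK(apk_filepath)
# print(callers_of(dx, "Landroid/telephony/SmsManager;", "sendTextMessage"))
```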


Initialize d and dx to prepare for the code below.

for dex in a.get_all_dex():
    df = DalvikVMFormat(dex, using_api=a.get_target_sdk_version())
    dx.add(df)
    d.append(df)
    df.set_decompiler(decompiler.DecompilerDAD(d, dx))

dx.create_xref()

On each iteration, dex holds the raw data of one of the classes DEX files.

get_all_dex()

dexre = re.compile(r"classes(\d*).dex")
return filter(lambda x: dexre.match(x), self.get_files())

get_all_dex gets all the file names in the zip and selects the DEX files with a regular expression. A match may be classes.dex or classes[0-9]+.dex.
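The selection step can be reproduced with the standard library. This is only a sketch under stated assumptions: `iter_dex` and `build_multidex_zip` are made-up names, the archive is a toy ZIP, and reading each matching entry's bytes is my assumption about how the raw DEX data is obtained. The regular expression is the one from the excerpt.

```python
import io
import re
import zipfile

# Same pattern as in the excerpt. Note that match() only anchors at the
# start of the name, and the dot before "dex" is not escaped.
DEX_NAME = re.compile(r"classes(\d*).dex")


def iter_dex(apk_bytes):
    """Yield (name, raw bytes) for every entry whose name matches."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        for name in archive.namelist():
            if DEX_NAME.match(name):
                yield name, archive.read(name)


def build_multidex_zip():
    """Build a toy in-memory ZIP with two DEX-like entries."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("AndroidManifest.xml", "classes.dex",
                     "classes2.dex", "assets/classes.dex"):
            archive.writestr(name, name.encode())
    return buffer.getvalue()


dex_names = [name for name, _ in iter_dex(build_multidex_zip())]
print(dex_names)
```

Only the two top-level entries are selected. The copy under assets/ is skipped because the match starts at the beginning of the file name.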

quark/androguard/misc.py

df = DalvikVMFormat(dex, using_api=a.get_target_sdk_version())

DalvikVMFormat takes the binary content of each DEX as a parameter, and initializes the class DalvikVMFormat according to the API version used.

androguard/core/bytecodes/dvm.py

class DalvikVMFormat(bytecode.BuffHandle):
    """
    This class can parse a classes.dex file of an Android application (APK).

    :param buff: a string which represents the classes.dex file
    :param decompiler: associate a decompiler object to display the java source code
    :type buff: bytes
    :type decompiler: object

    example::

        d = DalvikVMFormat( read("classes.dex") )
    """

    def __init__(self, buff, decompiler=None, config=None, using_api=None):
        # to allow to pass apk object ==> we do not need to pass additionally target version
        if isinstance(buff, APK):
            self.api_version = buff.get_target_sdk_version()
            buff = buff.get_dex()  # getting dex from APK file
        elif using_api:
            self.api_version = using_api
        else:
            self.api_version = CONF["DEFAULT_API"]

        super().__init__(buff)
        self._flush()

        self.CM = ClassManager(self)
        self.CM.set_decompiler(decompiler)

        self._preload(buff)
        self._load(buff)

This class can parse a classes.dex file of an Android application (APK).
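The three-way choice of API version at the top of `__init__` can be isolated in a small standalone sketch. Everything here is illustrative: `ToyAPK` and `resolve_input` are made-up names, and `DEFAULT_API` is a placeholder for `CONF["DEFAULT_API"]`, whose real value is not shown in the excerpt.

```python
DEFAULT_API = 16  # hypothetical placeholder for CONF["DEFAULT_API"]


class ToyAPK:
    """Hypothetical stand-in for the APK class, for illustration only."""

    def __init__(self, target_sdk, dex_bytes):
        self._target_sdk = target_sdk
        self._dex_bytes = dex_bytes

    def get_target_sdk_version(self):
        return self._target_sdk

    def get_dex(self):
        return self._dex_bytes


def resolve_input(buff, using_api=None, default_api=DEFAULT_API):
    """Mirror the branch order of __init__; return (api_version, dex bytes)."""
    if isinstance(buff, ToyAPK):
        # An APK object carries its own target SDK and DEX content.
        return buff.get_target_sdk_version(), buff.get_dex()
    if using_api:
        # Raw bytes plus an explicit API level from the caller.
        return using_api, buff
    # Raw bytes only: fall back to the configured default.
    return default_api, buff


print(resolve_input(ToyAPK(30, b"dex data")))
print(resolve_input(b"dex data", using_api=28))
print(resolve_input(b"dex data"))
```

AnalyzeAPK takes the second path: it passes the raw DEX bytes together with `using_api=a.get_target_sdk_version()`.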

androguard/core/bytecodes/dvm.py

  • ClassManager

This class is used to access all elements (strings, type, proto …) of the dex format based on their offset or index.

  • set_decompiler

Sets the decompiler used for the APK, such as Jadx or DAD.

The last step:

androguard/core/analysis/analysis.py

dx.create_xref()

Create Class, Method, String, and Field cross-references for all classes in the Analysis.

If you are using multiple DEX files, this function must be called after all DEX files have been added. If you call the function after every DEX file, it will only work the first time.

This function is already called inside AnalyzeAPK, so we don't need to call it again.
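A toy model helps show why the cross-references are built once, after every DEX file has been added. `ToyAnalysis` below is not androguard's Analysis class. It is a hypothetical stand-in that records, for each method, which methods call it, and treats a method with no definition in any added DEX as external.

```python
from collections import defaultdict


class ToyAnalysis:
    """Toy model of cross-reference building; not androguard's Analysis."""

    def __init__(self):
        self.calls = {}                    # method -> methods it calls
        self.xref_from = defaultdict(set)  # method -> methods calling it

    def add(self, dex_calls):
        """Register the call table of one (toy) DEX file."""
        self.calls.update(dex_calls)

    def create_xref(self):
        """Build the caller sets from everything added so far."""
        self.xref_from.clear()
        for caller, callees in self.calls.items():
            for callee in callees:
                self.xref_from[callee].add(caller)

    def is_external(self, method):
        """A method with no definition in any added DEX is external."""
        return method not in self.calls


# Two toy DEX files: a method in the first calls a method in the second.
dex1 = {"Main.onCreate": ["Helper.send"]}
dex2 = {"Helper.send": ["SmsManager.sendTextMessage"]}

dx = ToyAnalysis()
dx.add(dex1)
dx.add(dex2)
dx.create_xref()  # called once, after every DEX file is added

print(sorted(dx.xref_from["Helper.send"]))
print(dx.is_external("SmsManager.sendTextMessage"))
```

In this toy model, building the cross-references after adding only dex1 would make Helper.send look external, because its definition lives in dex2. The sketch illustrates the ordering concern only and does not reproduce androguard's internal behavior.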

The following video records the process of making a DalvikVMFormat object.

Important

An APK is considered to be decompiled only when the get_source_method function in DecompilerDAD is called.

In fact, we only use the information in the DEX file, the method cross-references, and the Android permission information. We do not decompile the APK.