
So, You Want to Build an Anti-Virus Engine?



Android malware analysis engines are nothing new, and every antivirus company has its own secrets for building one. With Python and curiosity, we developed a malware scoring system from the perspective of Taiwan Criminal Law in an easy but solid way.

Criminal law has an order theory of crime, which explains the stages of committing a crime. For example, the crime of murder consists of five stages: determination, conspiracy, preparation, start, and practice. The later the stage, the more certain we are that the crime is actually being committed.

Following this principle, we developed our own order theory of Android malware. It defines five stages for judging whether a malicious activity is actually being carried out:

  1. Permission requested.
  2. Native API call.
  3. Certain combination of native API.
  4. Calling sequence of native API.
  5. APIs that handle the same register.

We not only define malicious activities and their stages, but also develop weights and thresholds for calculating the threat level of a malware sample.
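To make the idea of stages, weights, and thresholds concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the doubling of weights per stage, the per-behavior weights, the threshold values, and all function names are hypothetical and are not taken from the engine presented in the talk.

```python
# Stage names follow the five stages listed above.
STAGES = (
    "permission_requested",
    "native_api_call",
    "api_combination",
    "calling_sequence",
    "same_register",
)


def stage_weights():
    """Assumed weighting: each later stage counts twice as much as the previous one."""
    raw = [2 ** i for i in range(len(STAGES))]  # 1, 2, 4, 8, 16
    total = sum(raw)
    return [w / total for w in raw]


def confidence(stages_reached):
    """Confidence (0..1) that a behavior is practiced, given how many stages it reached."""
    return sum(stage_weights()[:stages_reached])


def total_score(behaviors):
    """behaviors: iterable of (behavior_weight, stages_reached) pairs (hypothetical format)."""
    return sum(weight * confidence(stages) for weight, stages in behaviors)


def threat_level(score, thresholds=((3.0, "high"), (1.0, "moderate"))):
    """Map a score to a label using assumed thresholds, checked from highest to lowest."""
    for limit, label in thresholds:
        if score >= limit:
            return label
    return "low"


# Two hypothetical behaviors: one reaches all five stages, one stops at stage 3.
sample = [(4.0, 5), (2.0, 3)]
print(threat_level(total_score(sample)))
```

The point of the sketch is only that a behavior confirmed at a later stage contributes far more to the final score than one that merely requests a permission, which mirrors the order theory described above.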

Malware keeps evolving with new techniques that make reverse engineering harder, and obfuscation is one of the most commonly used. In this talk, we present a Dalvik bytecode loader that applies the order theory of Android malware to neglect (that is, see past) certain cases of obfuscation.

Inspired by the design principles of the CPython interpreter, our Dalvik bytecode loader provides two main functionalities: (1) finding the cross-references and calling sequence of native APIs, and (2) tracing bytecode registers. Combining these functionalities (yes, the order theory) not only neglects obfuscation but also fits the design of our malware scoring system.
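The two functionalities above can be illustrated with a toy sketch. It is not the loader from the talk: the `Ins` tuple, the hand-written instruction lists, the function names, and the shortened API strings are all hypothetical, and the register semantics are heavily simplified (only `move-result*`, `move*`, `const*`, and `invoke*` opcodes are modelled).

```python
from collections import namedtuple

# Hypothetical, simplified instruction record: opcode, registers, call target or literal.
Ins = namedtuple("Ins", "opcode registers target")

GET_ID = "Landroid/telephony/TelephonyManager;->getDeviceId()"
SEND_SMS = "Landroid/telephony/SmsManager;->sendTextMessage()"


def call_indices(method, api):
    """Positions in the method where the given API is invoked."""
    return [i for i, ins in enumerate(method)
            if ins.opcode.startswith("invoke") and ins.target == api]


def in_sequence(method, first, second):
    """Stage 4: some call to `first` appears before some call to `second`."""
    firsts, seconds = call_indices(method, first), call_indices(method, second)
    return bool(firsts) and bool(seconds) and min(firsts) < max(seconds)


def shares_register(method, first, second):
    """Stage 5: the result of `first` reaches a register that `second` uses."""
    for start in call_indices(method, first):
        tracked = set()
        for i in range(start + 1, len(method)):
            ins = method[i]
            if ins.opcode.startswith("move-result"):
                if i == start + 1:
                    tracked.add(ins.registers[0])      # result of `first`
                else:
                    tracked.discard(ins.registers[0])  # overwritten by another result
            elif ins.opcode.startswith("move"):
                dst, src = ins.registers               # move vA, vB copies vB into vA
                if src in tracked:
                    tracked.add(dst)
                else:
                    tracked.discard(dst)
            elif ins.opcode.startswith("const"):
                tracked.discard(ins.registers[0])      # register overwritten by a literal
            elif ins.opcode.startswith("invoke") and ins.target == second:
                if tracked & set(ins.registers):
                    return True
    return False


PLAIN = [
    Ins("invoke-virtual", ("v1",), GET_ID),
    Ins("move-result-object", ("v2",), None),
    Ins("invoke-virtual", ("v0", "v2"), SEND_SMS),
]

# Same behavior with junk instructions and an extra register copy in between.
OBFUSCATED = [
    Ins("invoke-virtual", ("v1",), GET_ID),
    Ins("move-result-object", ("v2",), None),
    Ins("const-string", ("v5",), "junk"),
    Ins("move-object", ("v7", "v2"), None),
    Ins("invoke-static", ("v5",), "Lcom/example/Junk;->noop()"),
    Ins("invoke-virtual", ("v0", "v7"), SEND_SMS),
]

# The calls happen in order, but the device ID is overwritten before the SMS is sent.
UNRELATED = [
    Ins("invoke-virtual", ("v1",), GET_ID),
    Ins("move-result-object", ("v2",), None),
    Ins("const-string", ("v2",), "hello"),
    Ins("invoke-virtual", ("v0", "v2"), SEND_SMS),
]

for name, method in (("plain", PLAIN), ("obfuscated", OBFUSCATED), ("unrelated", UNRELATED)):
    print(name, in_sequence(method, GET_ID, SEND_SMS), shares_register(method, GET_ID, SEND_SMS))
```

In this toy model, junk instructions and extra register copies do not hide the data flow from one API to the other, while a sample that merely calls both APIs in order without passing data between them stops at stage 4.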

Further, we will show a case study of Android malware and demonstrate how obfuscation techniques are ineffective against our engine. Last but not least, we will open-source everything (Malware Scoring System, Dalvik Bytecode Loader) during our presentation.

Audience

  1. Who is this talk for?
     - Anyone who is interested in cyber security, or who wants to know how to build an anti-virus engine with Python.

  2. What background knowledge or experience do you expect the audience to have?
     - A little Android application development and malware analysis.

  3. What do you expect the audience to learn or do after watching the talk?
     - The Dalvik bytecode loader is written as a Python module, so the audience can use it to speed up their own malware analysis.
     - The malware scoring system applies not only to Android malware but also to PE or ELF files on other operating systems. The audience can borrow our ideas to extend their own work.
     - Everything is open-sourced.


1. Introduction to the Malware Scoring System. First, we will detail how we distill Criminal Law into simple principles, for example the principles that define crime and penalty, and the order theory of crime. Then we will detail how we developed the order theory of Android malware and the other theories that make up the malware scoring system.

2. Design Logic of the Dalvik Bytecode Loader. With the malware scoring system in place, we will discuss the design logic of our Dalvik bytecode loader, which includes our obfuscation-neglect module and our bytecode register tracing module. We will also detail why the order theory of Android malware succeeds at neglecting obfuscation.

3. Quark Engine Practice - Case Study of Android Malware. Next, we will put our engine into practice with a case study of an Android malware sample. We will also demonstrate our obfuscation-neglect technique against obfuscated malware.

4. Future Work. Here, we will discuss the limitations of our engine, for example the challenges facing our Dalvik bytecode loader. We will also share our plans to implement more detection techniques that counter malware's detection-evasion methods.
