CodexGigas Malware DNA Profiling Search Engine

Luciano Martins and Rodrigo Cetera

Much like a human fingerprint, every piece of malware carries its own digital fingerprint that differentiates it from others. As a result, malware often attempts to hide its true nature by deleting or altering this information to avoid detection by antivirus companies and malware researchers.

Because malware developers go to great lengths to obfuscate the characteristics of their creations, researchers and malware analysts often find it difficult to identify multiple characteristics and correlation points.

Through our studies we created an algorithm and an accompanying search engine that generates a unique thumbprint for each sample, catalogues it, and compares it against millions of other samples that may have similar features or abilities.

By analyzing malware capabilities, the algorithm builds characteristic families into which a new sample can be categorized and thereby identified by its specific behavior. Comparing a new sample against previously catalogued ones enables early identification and detection of new malware.
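
As a rough illustration of capability-based comparison, the sketch below catalogues each sample as a set of capability labels and searches the catalog by set overlap. It is not the CodexGigas algorithm, which this abstract does not specify. The Jaccard similarity measure, the Catalog class, the 0.5 threshold, and the capability names are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap of two feature sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


@dataclass
class Catalog:
    """Hypothetical in-memory catalog: sample id -> set of capability labels."""
    samples: dict = field(default_factory=dict)

    def add(self, sample_id: str, capabilities) -> None:
        self.samples[sample_id] = frozenset(capabilities)

    def search(self, capabilities, threshold: float = 0.5):
        """Return (sample_id, score) pairs scoring at least threshold, best first."""
        query = frozenset(capabilities)
        scored = ((sid, jaccard(query, feats)) for sid, feats in self.samples.items())
        hits = [(sid, score) for sid, score in scored if score >= threshold]
        return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Illustrative capability labels only; real features would come from analysis plugins.
catalog = Catalog()
catalog.add("sample_a", {"keylogging", "persistence_registry", "http_c2"})
catalog.add("sample_b", {"file_encryption", "shadow_copy_delete"})

# A new sample sharing two of its three capabilities with sample_a.
matches = catalog.search({"keylogging", "http_c2", "screen_capture"})
print(matches)
```

A linear scan like this would not scale to tens of millions of samples; a real system would need an index or a compact signature scheme, which is outside the scope of this sketch.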

In this presentation we will show the results of our studies and highlight commonalities that become visible only when a sample is compared against a catalog of 35 million samples, equivalent to roughly 23.5 TB of binary data.

We will demonstrate the results of our work and the techniques used to derive them. The framework, the analysis plugins, and the portal will be released as open source.