MAGIC™ - Malware Genomic Correlation
Automatically generate Yara rules based on shared malware code.
Cythereal MAGIC™ is a transformative anti-malware defense technology with no counterpart. MAGIC™ takes in malware uploaded by security analysts or drawn from commercial malware feeds and turns it into actionable intelligence for defense, hunting, and incident response. More specifically, MAGIC™ helps analysts:
- Use malware as a query to search for similar malware from the same actor or with a related purpose
- Use their AV quarantine to identify coordinated malware campaigns and get an early warning of targeted attacks
- Assess the potential of the campaigns to evade their anti-virus systems, and
- Generate Yara rules to hunt for polymorphic variants of a malware sample
In other words, MAGIC™ converts malware into a source of actionable intelligence that analysts can use to hunt other malware. MAGIC™ reduces the hours and days spent chasing IoCs to detect campaigns and waiting for crowd-sourced Yara rules to a simple three-step exercise: upload, download, hunt.
The MAGIC™ Difference
MAGIC™ is a machine-learning-based malware analysis system produced from over a decade of advanced research sponsored by the US DoD. It differs from other machine learning systems in that:
- It uses both static and dynamic analysis to extract features from malware
- It uses a patent-pending method to deobfuscate polymorphic variants and normalize their code
- It uses “semantic features” that make it resilient against many classes of code transformation and compiler optimizations
- It provides a “content-based search” wherein one can upload malware to find other similar malware
- It automatically separates the malicious code in a malware sample from benign code, such as shared libraries
- It creates Yara rules from the bytecode of malicious functions, enabling detection of malware that may not yet have been created
Innovations Packed In MAGIC™
The most significant innovation powering MAGIC™ is a method for computing “normalized semantics of code” such that two different sequences of instructions performing the same work have exactly the same string representation of their semantics. This representation, which we call the Malware Genome, enables very fast and very accurate searches for polymorphic variants of malware across very large malware repositories.
Second, MAGIC™ can even search for payloads carried in packed malware by automatically unpacking the malware and indexing the payload code as well. MAGIC™’s unpacker, a significant innovation in its own right, unpacks over 80% of the malware in our daily malware feed.
Third, MAGIC™ uses a recent advance in formal methods, Abstract Symbolic Automata, to construct Yara rules from the bytecode of a collection of polymorphic variants of a function.
Finally, MAGIC™ marries highly rigorous formal analyses with equally fuzzy statistical analyses to create a highly scalable malware analysis system with a tremendous amount of power.
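As a rough illustration of turning several variants of a function into one Yara rule, the sketch below uses a much simpler technique than Abstract Symbolic Automata: it aligns equal-length byte sequences and replaces every position where the variants disagree with a `??` wildcard in a Yara hex string. The function names, the rule name, and the sample bytes are hypothetical and chosen only for the example.

```python
# Simplified stand-in for rule generation: position-wise wildcard merging.
# This is NOT the Abstract Symbolic Automata approach described above.

def merge_variants(variants):
    """Merge equal-length byte strings into a Yara hex pattern with ?? wildcards."""
    if not variants:
        raise ValueError("at least one variant is required")
    length = len(variants[0])
    if any(len(v) != length for v in variants):
        raise ValueError("this sketch only handles variants of equal length")

    tokens = []
    for column in zip(*variants):  # bytes at the same offset in every variant
        if len(set(column)) == 1:
            tokens.append(f"{column[0]:02X}")  # all variants agree on this byte
        else:
            tokens.append("??")                # variants differ: wildcard
    return " ".join(tokens)


def make_yara_rule(name, variants):
    """Wrap the merged pattern in a minimal Yara rule."""
    pattern = merge_variants(variants)
    return (
        f"rule {name}\n"
        "{\n"
        "    strings:\n"
        f"        $code = {{ {pattern} }}\n"
        "    condition:\n"
        "        $code\n"
        "}\n"
    )


# Two hypothetical variants of one function that differ in a single byte.
variants = [
    bytes.fromhex("558BEC83EC10"),
    bytes.fromhex("558BEC83EC20"),
]
rule_text = make_yara_rule("toy_shared_function", variants)
```

The resulting rule matches both inputs and any other byte sequence that differs only at the wildcard position, which is the sense in which a rule derived from known variants can also catch variants not yet seen.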
Clusters of malware originating from the same code base, as identified by MAGIC™
The Power of MAGIC™
While it is often challenging to reason about how and why a machine learning system works, that is not the case with MAGIC™. The use of formal analysis gives us mathematical certainty that programs sharing a large part of their genome do in fact perform similar functions. Thus, MAGIC™-generated searches, clusters, and Yara rules are guaranteed to have low false positives.
MAGIC™’s unparalleled ability to cluster malware originating from the same code base has also been independently evaluated by experts contracted by the US DoD. Several security companies have found that MAGIC™-generated Yara rules have almost no false positives, yet cast a net wide enough to catch new malware families through the code they share with other malware.
MAGIC™ is implemented on an elastic cloud architecture, performing end-to-end analysis of new malware in about four minutes. MAGIC™ provides a RESTful API to easily integrate with and automate the workflow of a security operations center.
537 Cajundome Blvd
Lafayette, LA 70506
+1 (504) 335-1910