APT CLASS

Attributing a piece of malware to its creator typically requires threat intelligence to reach a sufficient level of confidence. Binary attribution is harder still, because it relies mostly on disassembling binaries to extract relevant features and build a fingerprint that identifies the author.

To date, most research focuses on source code authorship attribution and on applying similar techniques to benign and malicious binaries. However, this approach gives malicious authors an opportunity to attack authorship attribution models, because of the stark differences between source code and binaries, and between benign and malicious authors.

Our survey (joint work with S3Lab) explores the style of threat actors and the adversarial techniques they use to remain anonymous. We examine the adversarial impact on state-of-the-art methods for binary authorship attribution. Through this approach, we identify key findings and explore the open research challenges in identifying authorship style within malicious binaries.

One major challenge is the lack of a ground truth dataset of malware and authors. To mitigate this issue for the community, we publish alongside this survey a meta-information dataset of 17,513 malware samples labeled with 275 threat actor groups. This is the largest and most diverse dataset to date. Additionally, we identify a further 5,630 malicious samples currently linked to unknown groups.

Access

To request access to the dataset, please complete the following form:

> FORM TO REQUEST ACCESS <

We have already granted access to people from the following institutions (alphabetical order):

Amadeus IT Group, Spain
Beijing University of Posts and Telecommunications, China
Ben Gurion University, Israel
Bern University of Applied Sciences, Switzerland
BlackTruffle Security
Cybergeeks[.]tech
Delhi Technological University, India
DSO National Laboratories
FortiGuard Labs
Fraunhofer FKIE, Germany
Georgia Tech Research Institute, USA
Global Infotek, Inc, USA
Grammatech, USA
Hacettepe University
Harfanglab, France
Hasso-Plattner-Institut
HRL Laboratories, USA
Illinois Institute of Technology
IMDEA Software Institute
Indian Institute of Technology Kanpur, India
Information Sciences Institute, University of Southern California, USA
InQuest
Institute of Information Engineering
International Business Machines (IBM), USA
Jawaharlal Nehru University
Jinan University, China
Kennesaw State University, USA
Kudu Dynamics, USA
Lancaster University, UK
Mahidol University
Nanjing University of Posts and Telecommunications
Nanyang Technological University - NTU Singapore
National University of Sciences and Technology, Islamabad, Pakistan
National University of Singapore
NATO
Naval Research Laboratory, USA
Norwich University
OpenAnalysis Inc
Osaka Electro-Communication University, Japan
PolySwarm - Malware Intelligence
Purdue University
Recorded Future, USA
Rice University, USA
Ritsumeikan University
Royal Holloway, University of London, UK
Ruhr-Universität Bochum, Germany
Sabancı University, Turkey
Shahid Beheshti University, Iran
SRI International
The MITRE Corporation
TU Wien, Austria
UC Berkeley, USA
Universidad Técnica Particular de Loja
University Institute of Information Technology, PMAS, Pakistan
University of Chinese Academy of Sciences, China
University of Colorado
University of Illinois, USA
University of Kent, UK
University of New Brunswick, Canada
University of Saskatchewan, Canada
University of Southern California
Unknown Cyber Inc
Westphalian University, Germany
Wrexham Glyndwr University
Wright State University
Wuhan University, China
Zeropoint Dynamics, USA
Zetier

Papers

Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets
Jason Gray, Daniele Sgandurra, Lorenzo Cavallaro, Jorge Blasco Alis
CSUR 2024 · ACM Computing Surveys, 2024

@article{Grayetal2024,
  author     = {Gray, Jason and Sgandurra, Daniele and Cavallaro, Lorenzo and Blasco Alis, Jorge},
  title      = {Identifying Authorship in Malicious Binaries: Features, Challenges \& Datasets},
  journal    = {ACM Comput. Surv.},
  issue_date = {August 2024},
  publisher  = {Association for Computing Machinery},
  address    = {New York, NY, USA},
  volume     = {56},
  number     = {8},
  month      = {apr},
  year       = {2024},
  articleno  = {212},
  numpages   = {36},
  url        = {https://doi.org/10.1145/3653973},
  doi        = {10.1145/3653973},
  issn       = {0360-0300},
}

People

Jason Gray, Ph.D. Student, King's College London & Royal Holloway, University of London
Daniele Sgandurra
Lorenzo Cavallaro, Full Professor of Computer Science, Chair in Cybersecurity (Systems Security), King's College London
Jorge Blasco Alis, Associate Professor, Universidad Politécnica de Madrid