On the feasibility of binary authorship characterization

Saed Alrabaee, Mourad Debbabi, Lingyu Wang

Research output: Contribution to journal, Article, peer-review

10 Citations (Scopus)

Abstract

This work aims to develop an automatic tool that can perform the laborious and error-prone reverse engineering task of binary authorship characterization, i.e., determining clues related to the author(s) of a piece of binary code. Software code written by human programmers reflects the author's educational background, level of expertise, and coding traits. Accordingly, these may be characterized by identifying meaningful features and examining them. Binary authorship characterization reveals information that can be extremely useful for security applications such as digital forensics, malware triage, and binary vulnerability tracking. This paper proposes a system, BinChar, that captures various aspects of author style, including code trait characteristics, code structure characteristics, and code behavior characteristics. For the purpose of detection, a Convolutional Neural Network (CNN) is used. The results generated by the CNN are evaluated more precisely using Bayesian calibration. We tested BinChar in identifying the characteristics of the authors of program binaries. Also, we applied it to almost 500 GB of malware samples provided by the Kaggle Microsoft Malware Classification Challenge, to demonstrate that BinChar is an appropriate tool for characterizing malware families. As an illustration, we report a case study in which we determine the author characteristics of the Mirai botnet and compare them with the author characteristics of 360,000 malware samples.

Original language: English
Pages (from-to): S3-S11
Journal: Digital Investigation
Volume: 28
Publication status: Published - Apr 2019

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Information Systems
  • Computer Science Applications
  • Medical Laboratory Technology
  • Law
