Amir R. Khakpour and Alex Liu
March, 2009
This paper concerns the fundamental problem of identifying the content nature of a flow, namely text, binary, or encrypted, for the first time. We propose Iustitia, a tool for identifying flow nature on the fly. The key observation behind Iustitia is that text flows have the lowest entropy and the encrypted flows have the highest entropy, where the entropy of binary flows stands in between. The basic idea of Iustitia is to classify flows using machine learning techniques where a feature is the entropy of every certain number of consecutive bytes. The key features of Iustitia are high speed (10% of average packet inter-arrival time) and high accuracy (86%).
You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format.