13th Working Conference on Reverse Engineering (WCRE 2006)
Extracting Output Formats from Executables
Benevento, Italy
October 23-October 27
ISBN: 0-7695-2719-1
We describe the design and implementation of FFE/x86 (File-Format Extractor for x86), an analysis tool that works on stripped executables (i.e., neither source code nor debugging information need be available) and extracts output data formats, such as file formats and network packet formats. We first construct a Hierarchical Finite State Machine (HFSM) that over-approximates the output data format. An HFSM defines a language over the operations used to generate output data. We use Value-Set Analysis (VSA) and Aggregate Structure Identification (ASI) to annotate HFSMs with information that partially characterizes some of the output data values. VSA determines an over-approximation of the set of addresses and integer values that each data object can hold at each program point, and ASI analyzes memory accesses in the program to recover information about the structure of aggregates. A series of filtering operations is performed to over-approximate an HFSM with a finite-state machine, which can result in a final answer that is easier to understand. Our experiments with FFE/x86 uncovered a possible bug in the image-conversion utility png2ico.
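As a rough illustration of what it means for a state machine to define a language over output operations, the Python sketch below models a flat finite-state machine whose transitions are labeled with write operations. It is not FFE/x86's implementation: the toy format, the state names, and the operation labels (write4, write8, write2) are all hypothetical, and the hierarchy, VSA, and ASI annotations described above are omitted.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy format (an assumption for illustration only):
# one 4-byte header, any number of 8-byte records, one 2-byte trailer.
# Transitions map (current state, operation label) -> next state.
Transitions = Dict[Tuple[str, str], str]

TOY_FSM: Transitions = {
    ("start", "write4"): "body",  # header
    ("body", "write8"): "body",   # zero or more records
    ("body", "write2"): "done",   # trailer
}
ACCEPTING: Set[str] = {"done"}


def accepts(ops: List[str],
            transitions: Transitions = TOY_FSM,
            start: str = "start",
            accepting: Set[str] = ACCEPTING) -> bool:
    """Return True if the sequence of output operations is in the FSM's language."""
    state = start
    for op in ops:
        key = (state, op)
        if key not in transitions:
            return False  # no transition for this operation: sequence rejected
        state = transitions[key]
    return state in accepting
```

In this sketch, a sequence such as write4, write8, write8, write2 is accepted, while a sequence that starts with a record or stops after the header is rejected. A machine that over-approximates a format accepts every sequence the program can actually emit, and possibly some that it cannot.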
Citation:
Junghee Lim, Thomas Reps, Ben Liblit, "Extracting Output Formats from Executables," 13th Working Conference on Reverse Engineering (WCRE 2006), pp. 167-178, 2006