“read-only” process (it does not require any participation in the communication).
Identify the fixed and dynamic fields of all the messages.
Group equivalent messages according to their field structure.
The following picture shows the sequence alignment of two messages.
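As a rough illustration of how an alignment separates fixed from dynamic fields, the sketch below aligns two messages with Python's standard `difflib.SequenceMatcher`. This is a stand-in chosen for brevity, not necessarily the alignment algorithm Netzob uses, and the function name `split_fields` and the sample messages are hypothetical.

```python
from difflib import SequenceMatcher


def split_fields(msg_a: bytes, msg_b: bytes):
    """Align two messages and label each aligned chunk.

    Returns a list of (kind, chunk_a, chunk_b) tuples, where kind is
    'fixed' when both messages carry the same bytes and 'dynamic' otherwise.
    Assumption: difflib stands in for the real alignment algorithm.
    """
    matcher = SequenceMatcher(None, msg_a, msg_b, autojunk=False)
    fields = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        kind = "fixed" if tag == "equal" else "dynamic"
        fields.append((kind, msg_a[i1:i2], msg_b[j1:j2]))
    return fields


# Two made-up messages of the same type
for kind, a, b in split_fields(b"USER alice\r\n", b"USER bob\r\n"):
    print(kind, a, b)
# fixed b'USER ' b'USER '
# dynamic b'alice' b'bob'
# fixed b'\r\n' b'\r\n'
```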
The following picture shows a grouping of similar messages based on the result of the clustering process.
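To make the idea of grouping similar messages concrete, here is a minimal sketch that assigns each message to the first group whose representative is similar enough. The greedy strategy, the `difflib` similarity ratio, the threshold value and the name `group_similar` are all assumptions made for illustration; the clustering method actually used may differ.

```python
from difflib import SequenceMatcher


def group_similar(messages, threshold=0.6):
    """Greedy grouping: compare each message with the first member of
    every existing group and join the first group that is similar enough.

    Assumption: the similarity measure and threshold are illustrative only.
    """
    groups = []
    for msg in messages:
        for group in groups:
            ratio = SequenceMatcher(None, group[0], msg, autojunk=False).ratio()
            if ratio >= threshold:
                group.append(msg)
                break
        else:
            # No existing group is close enough: start a new one
            groups.append([msg])
    return groups


# Made-up trace containing two kinds of messages
trace = [b"USER alice\r\n", b"PASS secret\r\n", b"USER bob\r\n", b"PASS hunter2\r\n"]
for group in group_similar(trace):
    print(group)
```

The result of a greedy pass depends on message order and on the threshold, which is why a real tool would typically rely on a more robust clustering step.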
Abstraction is the process of substituting the dynamic fields with their representation as a regex. An example of abstraction is shown in the following picture.
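The abstraction step can be sketched as follows: fixed fields are kept as literals, and each dynamic field is replaced by a capture group bounded by the shortest and longest value observed. The input format (a list of `(kind, observed_values)` pairs) and the name `abstract` are hypothetical, and the exact regex form Netzob produces is not specified here.

```python
import re


def abstract(fields):
    """Build a compiled bytes regex from a list of (kind, values) pairs.

    kind is 'fixed' or 'dynamic'; values is the list of byte strings
    observed for that field across the messages of a group.
    Assumption: a dynamic field is abstracted as '(.{min,max})'.
    """
    parts = []
    for kind, values in fields:
        if kind == "fixed":
            # Literal bytes: escape anything that is special in a regex
            parts.append(re.escape(values[0]))
        else:
            lengths = [len(v) for v in values]
            parts.append(b"(.{%d,%d})" % (min(lengths), max(lengths)))
    return re.compile(b"".join(parts), re.DOTALL)


pattern = abstract([
    ("fixed", [b"USER "]),
    ("dynamic", [b"alice", b"bob"]),
    ("fixed", [b"\r\n"]),
])
match = pattern.fullmatch(b"USER carol\r\n")
print(match.group(1))  # b'carol'
```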
This function shows a graphical representation of the distribution of bytes per offset for each message of the current group. It helps to identify the entropy variation of each field. Entropy variation, combined with the byte distribution, helps the user infer the field type.
[INCLUDE GRAPH]
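The quantity behind such a graph can be sketched as the Shannon entropy of the byte values observed at each offset across the messages of a group. The function name `entropy_per_offset` and the sample data are hypothetical, and the sketch assumes a non-empty list of messages; it only computes the numbers and does not draw anything.

```python
import math
from collections import Counter


def entropy_per_offset(messages):
    """Shannon entropy, in bits, of the byte values seen at each offset.

    Messages shorter than a given offset simply do not contribute to it.
    Assumption: `messages` is a non-empty list of bytes objects.
    """
    longest = max(len(m) for m in messages)
    entropies = []
    for offset in range(longest):
        column = [m[offset] for m in messages if offset < len(m)]
        total = len(column)
        counts = Counter(column)
        entropies.append(
            sum((c / total) * math.log2(total / c) for c in counts.values())
        )
    return entropies


group = [b"\x01\x00\x10", b"\x01\x00\x20", b"\x01\x01\x30", b"\x01\x01\x40"]
print(entropy_per_offset(group))  # [0.0, 1.0, 2.0]
```

An offset with zero entropy behaves like a constant (for example a magic number), a low value suggests a flag or type field, and a value close to the maximum suggests an identifier, counter, or payload data.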
And from the environment...
The function “Find Size Fields”, as its name suggests, finds the fields that contain a length value, together with the associated payload. It does this for each group. Netzob supports several encodings of the size field: big-endian and little-endian binary values of 1, 2 and 4 bytes. The algorithm used to find the size fields and their associated payloads is described in table XXX.
[INCLUDE ALGORITHM]
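Until the algorithm is included, the general idea can be illustrated with a simplified brute-force search. This sketch is not the algorithm referred to above: it assumes that the payload covered by a size field runs from the end of that field to the end of the message, and the name `size_field_candidates` is hypothetical.

```python
def size_field_candidates(messages):
    """Return the (offset, width, byteorder) triples for which, in every
    message, the integer stored at that position equals the number of
    bytes that follow the field.

    Assumption: the payload extends to the end of the message.
    """
    candidates = None
    for msg in messages:
        found = set()
        for width in (1, 2, 4):
            # Endianness is meaningless for a single byte
            orders = ("big",) if width == 1 else ("big", "little")
            for offset in range(len(msg) - width + 1):
                remaining = len(msg) - (offset + width)
                raw = msg[offset:offset + width]
                for byteorder in orders:
                    if int.from_bytes(raw, byteorder) == remaining:
                        found.add((offset, width, byteorder))
        # Keep only the positions that hold for all messages so far
        candidates = found if candidates is None else candidates & found
    return sorted(candidates or [])


# Made-up messages: 1-byte tag, 2-byte big-endian length, payload
group = [b"\xaa\x00\x03abc", b"\xaa\x00\x05hello"]
print(size_field_candidates(group))  # [(1, 2, 'big'), (2, 1, 'big')]
```

Both a 2-byte field at offset 1 and a 1-byte field at offset 2 are reported here, because the high byte is zero in every sample. Messages with longer payloads would be needed to tell the two apart.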
The following picture illustrates the function applied to an example trace: the IP and UDP payloads are automatically extracted from an Ethernet frame.