depthcharge.hunter

This module provides the Hunter base class and its associated implementations.

In Depthcharge parlance, a “Hunter” is a class that searches data for items of interest and either provides information about located instances or produces an artifact using the located data (e.g., a Stratagem).

Base Class

class depthcharge.hunter.Hunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

The Hunter class provides a foundation upon which specific Hunter implementations can be built.

It provides default find() and finditer() implementations, and defines a fallback build_stratagem() implementation that raises StratagemNotRequired.

A Hunter subclass implementation can either:

  1. Implement a private _search_at() and allow the base implementation to do the rest.

  2. Override these methods entirely.

When adding a new Hunter, the former is preferable because offset validation logic, support for gaps in the search range (to skip), and displaying progress bars during the search all come for free.

Constructor Arguments:

  • data - Memory or flash dump data to search

  • address - Memory address corresponding to the start of data

  • start_offset - Offset within data to begin searching. A negative value implies 0 (default).

  • end_offset - Inclusive upper bound offset for the search. A negative value implies the last element in data.

  • gaps - A list of regions within data that should be skipped during searches. (One may want to exclude regions of data actively modified by the running bootloader.) List entries may be either (address, length) tuples or range objects. Caller-provided gaps must not overlap.
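As a rough illustration of the gaps argument, the sketch below normalizes a mixed list of (address, length) tuples and range objects into sorted range objects and rejects overlapping entries. The helper name normalize_gaps is hypothetical and is not part of Depthcharge. It only mirrors the rules stated above, and it assumes that range entries are expressed as addresses.

```python
def normalize_gaps(gaps):
    """Convert (address, length) tuples or range objects to sorted ranges.

    Hypothetical helper for illustration only; raises ValueError on overlap.
    """
    ranges = []
    for gap in gaps or []:
        if isinstance(gap, range):
            ranges.append(gap)
        else:
            address, length = gap
            ranges.append(range(address, address + length))

    # Sort by start address so that only neighbors need to be compared
    ranges.sort(key=lambda r: r.start)
    for prev, cur in zip(ranges, ranges[1:]):
        if cur.start < prev.stop:
            raise ValueError('gaps overlap: {} and {}'.format(prev, cur))
    return ranges


gaps = normalize_gaps([(0x4FF5_0000, 0x100), range(0x4FF4_F800, 0x4FF4_F900)])
for gap in gaps:
    print('skip 0x{:08x}-0x{:08x}'.format(gap.start, gap.stop - 1))
```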


_search_at(target, start_offset: int, end_offset: int, **kwargs) → dict

Private Method: Not to be called directly by API users

Hunter implementations that wish to leverage the aforementioned facilities of the default find(), finditer(), and build_stratagem() methods should implement this.

The implementation should search for the target information within the Hunter’s data (provided via its constructor). The search should be performed within the bounds of start_offset and end_offset, which will be guaranteed to be within the bounds of data.

If a result is found, this method should return a dictionary per the description of find().

If no result is found and the implementation searched the entire range of [start_offset, end_offset] in a single call, it should raise a HunterResultNotFound exception.

Otherwise, if no result is found and the implementation only checked that no match was present at start_offset, it should return None. The parent class will take care of repeatedly calling _search_at() to complete a search of the full range.
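The contract described above can be sketched with a toy base class. Everything in this example (ToyHunter, ToyByteHunter, ResultNotFound) is a hypothetical stand-in written for illustration. It is not Depthcharge's implementation and omits gaps handling and progress bars.

```python
class ResultNotFound(Exception):
    """Stand-in for HunterResultNotFound in this sketch."""


class ToyHunter:
    """Hypothetical base class that drives _search_at() one offset at a time."""

    def __init__(self, data: bytes, address: int):
        self._data = data
        self._address = address

    def _search_at(self, target, start_offset, end_offset, **kwargs):
        raise NotImplementedError

    def find(self, target, start=-1, end=-1, **kwargs):
        start = 0 if start < 0 else start
        end = len(self._data) - 1 if end < 0 else end
        for offset in range(start, end + 1):
            # None means "nothing at this offset"; keep advancing
            result = self._search_at(target, offset, end, **kwargs)
            if result is not None:
                return result
        raise ResultNotFound('target not found')


class ToyByteHunter(ToyHunter):
    """Checks only start_offset and returns None when nothing matches there."""

    def _search_at(self, target, start_offset, end_offset, **kwargs):
        stop = start_offset + len(target)
        if stop - 1 > end_offset or self._data[start_offset:stop] != target:
            return None
        return {
            'src_off': start_offset,
            'src_addr': self._address + start_offset,
            'src_size': len(target),
        }


hunter = ToyByteHunter(b'\x00\x00U-Boot\x00', 0x8000_0000)
result = hunter.find(b'U-Boot')
print(result)
```

A subclass that searches the whole range in one call would instead raise the not-found exception itself, which simply propagates through find() in this sketch.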

property name: str

Hunter class name

find(target, start=-1, end=-1, **kwargs) → dict

Search for target within the Hunter’s data (provided via the constructor) and return information about the first result encountered as a dictionary with the following key-value pairs:

Key (str)    Type    Description
src_off      int     0-based index into data where target was found
src_addr     int     Absolute address of the located target
src_size     int     Size of the located target, in bytes

Subclasses of Hunter may return additional entries in their result dictionary. (Refer to the specific subclass documentation for any additional entries.)

Note that the above key-value scheme is intentionally consistent with that used in Stratagem entries, and the respective Stratagem Specification returned by an Operation’s stratagem_spec() method.

The start_offset and end_offset parameters provided to the Hunter constructor can be overridden using this method’s start and end arguments, respectively. The default negative values mean, “Use the offsets that the object was created with.”

A HunterResultNotFound exception is raised if the target could not be found.

finditer(target, start=-1, end=-1, **kwargs)

Return a generator over all find() results for target in the data of interest.

The start and end arguments, if greater than or equal to zero, can be used to override the start_offset and end_offset parameters provided to the Hunter constructor.

Example:

for result in my_hunter.finditer(my_target):
    offset = result['src_off']
    size   = result['src_size']
    hexstr = my_data[offset:offset + size].hex()

    msg = '{:d}-byte result found at 0x{:08x}: {:s}'
    print(msg.format(size, result['src_addr'], hexstr))

build_stratagem(target_payload: bytes, start=-1, end=-1, **kwargs)

Produce a Stratagem that can be used to create target_payload, given data (provided earlier via the Hunter constructor).

The start and end parameters can be used to override the start_offset and end_offset parameters that the Hunter was created with. If left as negative values, the Hunter’s defaults will be used.

The nature of the resulting Stratagem is tightly coupled with the Operation it will be used by. Refer to specific implementations for more information, as well as any **kwargs that can be used to configure the Stratagem creation.

If a particular Hunter does not produce Stratagem objects, invoking this method will raise a StratagemNotRequired exception.

Implementations

class depthcharge.hunter.CommandTableHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

The CommandTableHunter searches a U-Boot memory dump for instances of linker lists containing the cmd_tbl_s structures that define console commands. Within the U-Boot source code, console commands are declared using U_BOOT_CMD and related preprocessor macros. (More background information can be found in U-Boot’s README.commands.)

Constructor

The CommandTableHunter constructor requires that an arch= keyword argument be provided. This argument may be either a string or a value returned by Architecture.get():

my_hunter = CommandTableHunter(mem_dump, 0x4FF4_F000, arch='arm')

The address parameter should generally refer to the corresponding data’s post-relocation address. This address information is used to determine whether a potential cmd_tbl_s structure contains valid information. For example, char * pointers are “dereferenced” (within the confines of the provided image) to confirm that they lead to NULL-terminated ASCII strings.

If the data address is not known, pass check_ptrs=False to find() and finditer(). Be aware that this will very likely result in a (potentially large) number of false positives.
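To make the pointer “dereferencing” idea concrete, here is a minimal sketch of such a check. The function points_to_ascii_string and its max_len parameter are hypothetical, and the example assumes 32-bit little-endian pointers. It is not the check that CommandTableHunter actually performs.

```python
import struct


def points_to_ascii_string(data: bytes, address: int, ptr: int, max_len=256) -> bool:
    """Return True if ptr leads to a NULL-terminated printable ASCII string in data."""
    offset = ptr - address
    if not 0 <= offset < len(data):
        return False  # Pointer falls outside the provided image
    end = data.find(b'\x00', offset, offset + max_len)
    if end <= offset:
        return False  # No terminator within max_len, or an empty string
    return all(0x20 <= byte < 0x7f for byte in data[offset:end])


BASE = 0x4FF4_F000
# A command name followed by two candidate pointers (the second one is bogus)
image = b'bootm\x00' + struct.pack('<II', BASE, 0x1234_5678)
name_ptr, bogus_ptr = struct.unpack_from('<II', image, 6)

print(points_to_ascii_string(image, BASE, name_ptr))
print(points_to_ascii_string(image, BASE, bogus_ptr))
```

Without a correct base address, the subtraction in the first line of the function is meaningless, which is why the check has to be disabled when the address is unknown.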

Motivation

The presence of these command tables can serve as an indicator that CONFIG_CMDLINE is enabled. This evidence can be used to justify further analyses focusing on how a console can be accessed, if it is not otherwise obviously exposed or protected with standard functionality. For example, does vendor-specific code hide the U-Boot console unless a particular GPIO pin is asserted? Is a custom functionality akin to CONFIG_AUTOBOOT_STOP_STR and friends used to gate access to the console?

The presence of multiple unique command tables within a U-Boot memory dump can also be quite interesting! This can be indicative of a situation in which different commands are exposed based upon different authorization levels (implemented by vendor-specific code). An example of this can be found in examples/symfonisk_unlock_bypass.py, where pointers to an unauthenticated “insecure” command table are redirected to a post-authentication “secure” command table.

Once the locations and layout of command tables in memory are known, they can be patched at runtime to insert alternative functionality (provided that a MemoryWriter is available). From there, one can instrument the operating environment as desired to further explore a SoC, its RAM (from a prior operating state), storage media, and peripherals.

find(target, start=-1, end=-1, **kwargs) → dict

This Hunter searches for a command table containing the command specified by target. If target is None or an empty string, the first table found in the search range will be returned.

The following keyword arguments can be used to further configure the search behavior.

Name            Type    Default    Description
threshold       int     5          Number of valid-looking consecutive cmd_tbl_s entries to observe before a table is considered valid.
check_ptrs      bool    True       Follow pointers in order to confirm whether data at a given search location is a command table entry. Setting this to False will result in false positives.
longhelp        bool    None       Denotes state of U-Boot’s CONFIG_SYS_LONGHELP. Keep this set to None if you do not know it; the Hunter will attempt to infer the right value.
autocomplete    bool    None       Denotes state of U-Boot’s CONFIG_AUTO_COMPLETE. Keep this set to None if you do not know it; the Hunter will attempt to infer the right value.

finditer(target, start=-1, end=-1, **kwargs)

Returns an iterator that provides each command table found in the search range.

Refer to find() for a description of target and supported keyword arguments.

Otherwise, it behaves as described in Hunter.finditer().

classmethod result_str(result: dict) → str

Convert find() and finditer() result dictionaries to a string suitable for printing to a user.

classmethod result_summary_str(result: dict) → str

This method is similar to result_str(), but returns a shorter summary string.

class depthcharge.hunter.ConstantHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

This is a simple Hunter that searches for fixed data values (of type bytes).

Example use-cases include searching for file format “magic” values (e.g. Device Tree’s d00dfeed), tables (e.g., CRC32, SHA1, SHA256 LUTs), or opcodes near code or data of interest.

Its constructor and methods are implemented according to the descriptions in Hunter.

find(target, start=-1, end=-1, **kwargs) → dict

Searches for the constant specified by target.

finditer(target, start=-1, end=-1, **kwargs) dict

Returns an iterator over all occurrences of target in the search range.
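The behavior of a constant search can be approximated in a few lines with bytes.find(). The generator below is a simplified stand-in (the name iter_constant is hypothetical) that yields result dictionaries shaped like those described in Hunter.find(). It ignores gaps and offset overrides.

```python
def iter_constant(data: bytes, address: int, target: bytes):
    """Yield a result dictionary for each occurrence of target in data."""
    offset = data.find(target)
    while offset != -1:
        yield {
            'src_off': offset,
            'src_addr': address + offset,
            'src_size': len(target),
        }
        offset = data.find(target, offset + 1)


FDT_MAGIC = bytes.fromhex('d00dfeed')
dump = b'\xff' * 16 + FDT_MAGIC + b'\x00' * 12 + FDT_MAGIC

results = list(iter_constant(dump, 0x8000_0000, FDT_MAGIC))
for result in results:
    print('magic at 0x{:08x}'.format(result['src_addr']))
```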

class depthcharge.hunter.CpHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

The CpHunter class searches for a series of cp command invocations that can be performed to result in the desired payload.

find(target, start=-1, end=-1, **kwargs)

CpHunter does not implement this method. Raises OperationNotSupported.

finditer(target, start=-1, end=-1, **kwargs)

CpHunter does not implement this method. Raises OperationNotSupported.

build_stratagem(target_payload: bytes, start=-1, end=-1, **kwargs)

Produce a Stratagem for use with CpMemoryWriter. This implementation attempts to reduce the total number of cp operations by searching for common substrings.
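The general idea of covering a payload with copies of substrings found in existing data can be sketched with a naive greedy search. This is an assumption-laden illustration, not CpHunter's algorithm: plan_copies is a hypothetical name, the tuple layout is invented for the example, and the quadratic search is only practical for small inputs.

```python
def plan_copies(data: bytes, payload: bytes):
    """Greedily cover payload with the longest chunks that occur in data.

    Returns a list of (src_off, dst_off, size) tuples, one per copy operation.
    """
    plan = []
    dst_off = 0
    while dst_off < len(payload):
        size = len(payload) - dst_off
        src_off = -1
        # Try the longest remaining chunk first, then shrink until it is found
        while size > 0:
            src_off = data.find(payload[dst_off:dst_off + size])
            if src_off != -1:
                break
            size -= 1
        if src_off == -1:
            raise LookupError('byte 0x{:02x} not present in data'.format(payload[dst_off]))
        plan.append((src_off, dst_off, size))
        dst_off += size
    return plan


data = b'console=ttyS0,115200 root=/dev/mmcblk0p2 rw'
payload = b'root=/dev/ttyS0'

plan = plan_copies(data, payload)
rebuilt = b''.join(data[src:src + size] for src, _, size in plan)
print(plan, rebuilt == payload)
```

Fewer, longer copies mean fewer console commands on the target, which is the motivation for preferring long common substrings.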

class depthcharge.hunter.EnvironmentHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

An EnvironmentHunter searches a memory or flash dump for instances of U-Boot environments, which store collections of variable definitions. This Hunter identifies three types of U-Boot environments:

  1. Built-in defaults used when no (valid) environment is present in non-volatile storage.

  2. A valid environment stored in non-volatile storage, prefixed with a CRC32 header value.

  3. A “redundant” environment (see CONFIG_SYS_REDUNDAND_ENVIRONMENT), which includes both a CRC32 header and a flags value used to determine the active environment copy.

The constructor supports two keyword arguments that place a lower and upper bound on the sizes (in number of entries) of environment instances returned by find() and finditer():

  • min_entries (default: 5)

  • max_entries (default: None)

find(target, start=-1, end=-1, **kwargs) → dict

If target is an empty string or None, the first environment encountered in the search range is returned. Otherwise, only an environment containing the string or bytes specified by target will be returned.

In addition to the standard keys returned by Hunter.find(), EnvironmentHunter includes the additional items:

  • arch - The architecture of the platform. (This will inform the endianness of the checksum.)

  • type - One of: ‘Built-in environment’, ‘Stored environment’, ‘Stored redundant environment’

  • crc - The CRC32 checksum of the environment (if one of the ‘Stored’ types)

  • flags - Only present for ‘Stored redundant environment’. Contains a monotonically increasing (modulo 256) integer value used to determine which redundant environment is “freshest”.

  • dict - The environment contents as a dictionary, with the variable names as keys.

  • raw - The raw environment data as NULL-terminated ASCII strings (type: bytes).
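The relationship between the raw, dict, and crc items can be illustrated with a small sketch. It assumes a little-endian CRC32 header and, for brevity, computes the checksum over only the variable data shown. In a real dump the checksum typically covers the entire fixed-size environment region, padding included. The parse_environment helper is hypothetical.

```python
import struct
import zlib


def parse_environment(raw: bytes) -> dict:
    """Split NULL-terminated 'name=value' strings into a dictionary."""
    env = {}
    for entry in raw.split(b'\x00'):
        if not entry:
            break  # An empty string marks the end of the environment
        name, _, value = entry.partition(b'=')
        env[name.decode('ascii')] = value.decode('ascii')
    return env


raw = b'bootdelay=3\x00baudrate=115200\x00bootcmd=run distro_bootcmd\x00\x00'
stored = struct.pack('<I', zlib.crc32(raw)) + raw  # CRC32 header + contents

(crc,) = struct.unpack_from('<I', stored, 0)
crc_ok = crc == zlib.crc32(stored[4:])
env = parse_environment(stored[4:])
print(crc_ok, env['bootcmd'])
```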

finditer(target, start=-1, end=-1, **kwargs)

Returns an iterator that provides each environment instance found in the search range.

Refer to find() for a description of target and supported keyword arguments.

Otherwise, it behaves as described in Hunter.finditer().

class depthcharge.hunter.FDTHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

This Hunter searches for Flattened Device Tree (also see elinux.org) instances within a memory or flash dump.

If the Device Tree Compiler (dtc) is installed, results will include both the binary representation of the device tree (dtb) as well as a source representation (dts).

find(target, start=-1, end=-1, **kwargs) → dict

If a device tree containing a specific string or byte sequence is desired, this can be specified via target. Otherwise it can be left as None to search for any device tree.

The returned dictionary will contain the binary representation of the device tree in a 'dtb' entry. If the Device Tree Compiler is installed on your machine, a source representation will additionally be provided in a 'dts' entry.

A no_dts=True keyword argument can be used to prevent the FDTHunter from attempting to convert a DTB to a DTS.
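For intuition about what locating a device tree involves, the sketch below scans for the big-endian FDT magic value and reads the totalsize field that follows it in the header. The function find_fdt_candidates is hypothetical, and a real search would validate far more of the header than this. It is shown only as a starting point.

```python
import struct

FDT_MAGIC = 0xd00dfeed
FDT_HEADER_SIZE = 40  # Size of a version 17 header: ten 32-bit fields


def find_fdt_candidates(data: bytes):
    """Yield (offset, totalsize) for each plausible FDT header in data."""
    magic = struct.pack('>I', FDT_MAGIC)
    offset = data.find(magic)
    while offset != -1:
        if offset + 8 <= len(data):
            (totalsize,) = struct.unpack_from('>I', data, offset + 4)
            # Only plausible if the header and the whole blob fit in the dump
            if FDT_HEADER_SIZE <= totalsize <= len(data) - offset:
                yield offset, totalsize
        offset = data.find(magic, offset + 1)


blob = struct.pack('>II', FDT_MAGIC, 48) + bytes(40)
dump = b'\xff' * 32 + blob + b'\xff' * 16 + struct.pack('>I', FDT_MAGIC)

candidates = list(find_fdt_candidates(dump))
for offset, totalsize in candidates:
    print('candidate DTB at offset {:d}, {:d} bytes'.format(offset, totalsize))
```

The trailing magic value in the example is rejected because no complete header follows it.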

finditer(target, start=-1, end=-1, **kwargs)

Returns an iterator that provides each Flattened Device Tree instance found in the search range.

Refer to find() for a description of target and supported keyword arguments.

class depthcharge.hunter.ReverseCRC32Hunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

The ReverseCRC32Hunter searches for CRC32 preimages in order to allow the U-Boot crc32 console command to be exploited as an arbitrary memory write primitive that produces a payload 4 bytes at a time.

It effectively answers the question, “What series of CRC32 operations would I need to perform to produce my desired binary payload?” This answer is provided in the form of a Stratagem, which is in turn used by the CRC32MemoryWriter to perform the actual memory writes.

Constructor

The constructor conforms to the Hunter definition and supports two additional keyword arguments.

The endianness keyword argument specifies the byte order of the target device. It defaults to sys.byteorder, which may not match your target. It can be specified as either 'big' or 'little'.

A revlut_maxlen keyword argument can be used to override the maximum size of individual reverse lookup table (RLUT) entries. The default value is 256.

In general, increasing the revlut_maxlen value uses more host memory when attempting to find a CRC32 preimage, but increases the likelihood of success. The lower limit of this value is 1 and its upper bound is defined by the size of data. Continue reading to get a better understanding of what exactly this parameter controls.

Implementation Details

The implementation of ReverseCRC32Hunter leverages the malleability property of CRC32, which is just one reason why it is never suitable for validating data susceptible to forgery or tampering threats. ReverseCRC32Hunter uses a simplification of the technique presented in Listing 6 of the paper listed below. In our simplified case we’re effectively “appending” 4 bytes to a zero-length input to achieve a chosen CRC32 output. The value we end up “appending” is the 4-byte inverse of our chosen CRC32 output. This simplified algorithm implementation resides in depthcharge/revcrc32.py.

Reversing CRC - Theory and Practice
by Martin Stigge, Henryk Plötz, Wolf Müller, Jens-Peter Redlich
HU Berlin Public Report, SAR-PR-2006-05, May 2006

It is not necessarily the case, however, that the 4-byte inverse of our desired CRC32 value exists within a given ROM or memory dump (i.e., our domain of potential CRC32 inputs). Instead, we iteratively perform this reverse CRC32 operation on each 4-byte result until we find an input that does. When later writing memory using the crc32 U-Boot console command (via CRC32MemoryWriter), each iteration incurs some serial console overhead and execution time on the target device. When focusing only on 4-byte inputs, this can result in extraordinarily long execution times.
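The 4-byte inverse and its repeated application can be demonstrated on a host with nothing but zlib. The sketch below is an independent illustration of the table-driven reversal, assuming zlib's CRC-32 parameters. The names crc32_preimage and replay are hypothetical, and this is not the code found in depthcharge/revcrc32.py.

```python
import zlib

POLY = 0xEDB88320  # Reflected CRC-32 polynomial, as used by zlib.crc32


def _make_table():
    table = []
    for i in range(256):
        entry = i
        for _ in range(8):
            entry = (entry >> 1) ^ POLY if entry & 1 else entry >> 1
        table.append(entry)
    return table


TABLE = _make_table()
# Each table entry has a unique most-significant byte, which identifies its index
INDEX_BY_TOP_BYTE = {entry >> 24: i for i, entry in enumerate(TABLE)}


def crc32_preimage(word: bytes, byteorder='little') -> bytes:
    """Return 4 bytes whose CRC32, stored in byteorder, equals word."""
    state = int.from_bytes(word, byteorder) ^ 0xFFFFFFFF  # Undo the final XOR
    for _ in range(4):  # Step the CRC register back over four input bytes
        i = INDEX_BY_TOP_BYTE[state >> 24]
        state = (((state ^ TABLE[i]) << 8) | i) & 0xFFFFFFFF
    # Undo the initial register value; message bytes are consumed low byte first
    return (state ^ 0xFFFFFFFF).to_bytes(4, 'little')


def replay(start: bytes, iterations: int, byteorder='little') -> bytes:
    """Apply CRC32 repeatedly, storing each 4-byte result as a target would."""
    word = start
    for _ in range(iterations):
        word = zlib.crc32(word).to_bytes(4, byteorder)
    return word


target = b'\xde\xad\xbe\xef'
word = target
for _ in range(3):  # Walk three steps back from the desired word
    word = crc32_preimage(word)

assert replay(word, 3) == target
print(word.hex(), '->', replay(word, 3).hex())
```

Each backward step yields another candidate input. The search stops once a candidate is found somewhere in the input data.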

To reduce execution time on the target (i.e. the number of CRC32 iterations performed) we can make a trade off in the form of increased memory consumption on our host machine at the time when the Stratagem (intended for use with CRC32MemoryWriter) is being created. To achieve this, we use a reverse-lookup table (RLUT) that maps CRC32 values to the shortest byte sequence in our input data that produces them.

When the ReverseCRC32Hunter constructor is invoked it will produce this RLUT. This will generally take some time, so a progress bar and ETA are shown.

The RLUT construction is performed over the constructor’s data argument (excluding any regions specified by the gaps keyword argument) for a sliding window of size N. This is repeated for all values of N in the inclusive range shown below.

[1, min(revlut_maxlen, len(data)) ]

As shown above, this is where the revlut_maxlen keyword argument described earlier comes into play. In general, results will be “better” with a larger revlut_maxlen value. The “best” value is one that maximizes RAM utilization on one’s host (without triggering a MemoryError, of course). The memory requirements grow (roughly) cubically with respect to this parameter. Bear in mind that the memory consumption of these RLUTs is on the order of GiB.
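A stripped-down version of the sliding-window construction might look like the following. This is a sketch only: build_rlut is a hypothetical name, gaps are not handled, and storing (offset, length) pairs in a plain dictionary is an assumption made for readability rather than a description of Depthcharge's internal layout.

```python
import zlib


def build_rlut(data: bytes, revlut_maxlen=256):
    """Map each CRC32 value to the (offset, length) of the shortest input producing it."""
    rlut = {}
    for length in range(1, min(revlut_maxlen, len(data)) + 1):
        for offset in range(len(data) - length + 1):
            crc = zlib.crc32(data[offset:offset + length])
            # Lengths are visited in increasing order, so the first hit is the shortest
            rlut.setdefault(crc, (offset, length))
    return rlut


data = b'Once upon a midnight dreary, while I pondered, weak and weary'
rlut = build_rlut(data, revlut_maxlen=8)

offset, length = rlut[zlib.crc32(b'midnight')]
print(offset, length, data[offset:offset + length])
```

Even this toy version shows why the table grows quickly: every additional window length adds up to one entry per offset in data.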

Below is a visual example of this process, with a simplified 4-iteration solution. The left side shows the path from a desired payload back to a byte sequence present in the input data. The right side denotes the runtime sequence of CRC32 operations performed on the target device to produce a 4-byte portion of the desired payload.

../_images/crc32-stratagem.png

To produce an N-length payload (where N is divisible by 4), the whole process is repeated for each 4-byte word in the desired payload. Depthcharge makes two simplifying assumptions:

  1. The input data is static; it does not change at runtime.

  2. The location where payloads are written does not overlap the input data.

Both of these are reasonable assumptions; the input data can be carved to meet these requirements, or the ReverseCRC32Hunter constructor can be invoked with the gaps= keyword to exclude memory regions as needed. Under these assumptions, each 4-byte word in the produced output can be computed in parallel. Depthcharge uses Python’s multiprocessing module to distribute tasks to multiple workers, with a default worker count equal to the system’s CPU count.

Finally, one additional optimization is used to reduce the total number of CRC32 operations that need to be performed on a target device at runtime. Envision a case where a 4-byte sequence X = [aa bb cc dd] occurs 5 times within a payload, and the produced Stratagem requires 3,500 CRC32 iterations to produce a single instance of the 4-byte value X. The naive approach would be to perform a total of 17,500 CRC32 operations to write all 5 copies of X, 80% of which are redundant. Therefore, to eliminate this unnecessary overhead, build_stratagem() identifies the unique 4-byte words in a payload, and distributes only these to the parallel workers.

When a result is produced for a word occurring only once in the desired payload, no additional behavior is necessary. However, if a word (X) occurs multiple times, the produced Stratagem entry is split into multiple entries.

  1. The produced Stratagem is appended as-is, but with the iterations value reduced by 1. The in-progress result is written to the payload location for this reduced number of iterations.

  2. For each of the 4 remaining occurrences of X, a special Stratagem entry is created. Instead of having a src_addr pointing to an address within the input data, it instead has a tsrc_off value that denotes that the input can be found at the location where the result of Step 1 was written. Only a single iteration is required to produce X.

  3. A “finalizing” Stratagem entry is appended that performs one last CRC32 operation on the location storing the result of Step 1, with the output written to the same address.

With this approach, the same desired payload can be achieved, but with only 3,504 CRC32 operations.
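The arithmetic behind these figures is easy to check. The helper below (a hypothetical name, written only for this worked example) counts operations for a single repeated word under the three-step scheme just described.

```python
def crc32_operation_counts(iterations: int, occurrences: int):
    """Return (naive, optimized) CRC32 operation totals for one repeated word."""
    naive = iterations * occurrences
    # Step 1: shared prefix of (iterations - 1) operations
    # Step 2: one operation for each remaining occurrence
    # Step 3: one finalizing operation on the location used in Step 1
    optimized = (iterations - 1) + (occurrences - 1) + 1
    return naive, optimized


naive, optimized = crc32_operation_counts(iterations=3500, occurrences=5)
print(naive, optimized)  # 17500 3504
```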

To avoid confusion, Stratagem entries produced by ReverseCRC32Hunter do not contain a src_off key. In cases where a tsrc_off (Target buffer source offset) is used, src_addr is set to -1.

Example

If you still have a healthy dose of skepticism about the high-level approach used here, worry not! An example demonstrating this algorithm is provided in examples/reverse_crc32_algo_poc.py. This example uses build_stratagem() to produce the following string, using Edgar Allan Poe’s The Raven as input data. In order to allow this example to be used without a target device, the Stratagem is “executed” by simply making calls to zlib.crc32, rather than by passing the Stratagem to CRC32MemoryWriter().

Payload String:

"NCC Group - Depthcharge \n<https://github.com/nccgroup/depthcharge>"

find(target, start=-1, end=-1, **kwargs) → dict

Search for a sequence of CRC32 operations, performed over the data provided to the ReverseCRC32Hunter constructor, that results in target.

The target parameter must be a bytes object with a length of 4.

The start and end parameters operate according to Hunter.find().

A max_iterations keyword argument can be provided to limit the maximum number of operations to allow when searching for a sequence of CRC32 operations that result in the target value. The default is 4096.

As discussed in the ReverseCRC32Hunter “Implementation Details” section, increasing the allowed maximum number of iterations will increase the amount of time required to deploy a payload with MemoryWriter. To keep this reasonably low, the revlut_maxlen parameter passed to the ReverseCRC32Hunter constructor may need to be increased.

When a result is found, it is returned in a dictionary that has the keys described in Hunter.find(), and an additional iterations key.

In the event that no result is found, a HunterResultNotFound exception is raised.

build_stratagem(target_payload: bytes, start=-1, end=-1, **kwargs)

Given a target binary payload, return the sequence of CRC32 operations that can be performed to create this payload, one 4-byte word at a time. This result is returned in the form of a Stratagem. Refer to the ReverseCRC32Hunter Implementation Details section for details about how this works.

The length of target_payload must be a multiple of 4 bytes. Otherwise, a StratagemCreationFailed exception is raised.

This implementation assumes that the location where the target_payload is being written does not overlap the memory space being searched for viable operations. It is the user’s responsibility to provide a gaps list to the constructor to avoid this.

If a Stratagem for the desired payload could not be created because no solution was found, a HunterResultNotFound exception is raised.

Optional keyword arguments:

  • max_iterations - Maximum number of CRC32 operations to allow per 4-byte word. Default: 4096

  • num_procs - Number of concurrent processes to use during search. Default: System’s CPU count

Tip

When possible, consider using ROM code as the Hunter’s input data. This will allow the Stratagem to remain usable across any changes in the target system’s U-Boot build.

class depthcharge.hunter.StringHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)

The StringHunter can be used to search for NULL-terminated ASCII strings within a binary RAM or flash dump (i.e., data should be of type bytes), via regular expressions.

As every good little reverse engineer knows, strings can be very telling about the nature of code. For example, they could hint at the use of HABv4 functionality to authenticate images on NXP i.MX-based platforms.


find(target, start=-1, end=-1, **kwargs) → dict

The target argument should contain a regular expression pattern of type str or bytes. The following keyword arguments may be used to constrain the length of the desired string.

Name       Type    Default    Description
min_len    int     -1         If > 0, places a lower bound on the string to locate
max_len    int     -1         If > 0, places an upper bound on the string to locate

If one is only looking for any printable string within a search range, target can be specified as None or an empty string. The above keyword arguments can be used to constrain results.

The start and end parameters are used as described in Hunter.find().
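A regular-expression-driven string search with the same length constraints can be sketched as follows. The iter_strings function and PRINTABLE_RUN pattern are hypothetical. They illustrate the concept on a plain bytes object and are not StringHunter's implementation.

```python
import re

# A run of printable ASCII characters that is immediately followed by a NULL byte
PRINTABLE_RUN = re.compile(rb'[\x20-\x7e]+(?=\x00)')


def iter_strings(data: bytes, pattern=None, min_len=-1, max_len=-1):
    """Yield (offset, string) for NULL-terminated printable ASCII strings in data."""
    if isinstance(pattern, str):
        pattern = pattern.encode('ascii')
    for match in PRINTABLE_RUN.finditer(data):
        text = match.group()
        if min_len > 0 and len(text) < min_len:
            continue
        if max_len > 0 and len(text) > max_len:
            continue
        if pattern and not re.search(pattern, text):
            continue
        yield match.start(), text.decode('ascii')


dump = (b'\x01\x02HAB Configuration: 0x%02x\x00'
        b'\xffnot terminated\xffU-Boot 2020.04\x00')

matches = list(iter_strings(dump, r'HAB|U-Boot'))
long_only = list(iter_strings(dump, min_len=20))
print(matches)
print(long_only)
```

Passing no pattern returns every printable string, in which case the length bounds are the only filter.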

finditer(target, start=-1, end=-1, **kwargs)

Returns an iterator that provides each string found in the search range.

Refer to find() for a description of target and supported keyword arguments.

string_at(address, min_len=-1, max_len=-1, allow_empty=False) → str

Attempt to determine if the specified address contains a NULL-terminated ASCII string, with optional length constraints.

If an ASCII string is located at the specified address, it is returned sans NULL byte. Otherwise, HunterResultNotFound is raised.

An IndexError is raised if address is outside the bounds of the data parameter originally provided to the constructor.

Exceptions

exception depthcharge.hunter.HunterResultNotFound

It may be the case that a Hunter cannot find or produce a result. HunterResultNotFound is raised to indicate this situation and provides more context within its message text.

In general, potential reasons this exception may be thrown include:

  • The requested item is not present in the provided data.

  • Provided parameters overconstrained the search. Relaxing constraints may be necessary to yield a result.