depthcharge.hunter
This module provides the Hunter base class and its associated implementations.
In Depthcharge parlance, a “Hunter” is a class that searches data for items of interest and either provides information about located instances or produces an artifact using the located data (e.g., a Stratagem).
Base Class
- class depthcharge.hunter.Hunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
The Hunter class provides a foundation upon which specific Hunter implementations can be built. It provides default find() and finditer() implementations, and defines a fallback build_stratagem() implementation that raises StratagemNotRequired.
A Hunter subclass implementation can either:
Implement a private _search_at() and allow the base implementation to do the rest.
Override these methods entirely.
When adding a new Hunter, the former is preferable because offset validation logic, support for gaps in the search range (to skip), and displaying progress bars during the search all come for free.
Constructor Arguments:
data - Memory or flash dump data to search
address - Memory address corresponding to the start of data
start_offset - Offset within data to begin searching. A negative value implies 0 (default).
end_offset - Inclusive upper bound offset for the search. A negative value implies the last element in data.
gaps - A list of regions within data that should be skipped during searches. (One may want to exclude regions of data actively modified by the running bootloader.) List entries may be either (address, length) tuples or range objects. Caller-provided gaps must not overlap.
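The gaps argument accepts two entry forms, so a Hunter has to bring them into one representation before searching. The following is a minimal sketch of that idea, not Depthcharge's implementation: normalize_gaps is a hypothetical helper, and it treats the first tuple element as a plain integer position without deciding whether it is an address or an offset.

```python
def normalize_gaps(gaps):
    """Convert (start, length) tuples or range objects into sorted ranges.

    Hypothetical helper for illustration; not part of Depthcharge's API.
    """
    ranges = []
    for gap in gaps or []:
        if isinstance(gap, range):
            ranges.append(gap)
        else:
            start, length = gap
            ranges.append(range(start, start + length))

    ranges.sort(key=lambda r: r.start)

    # Neighboring gaps may touch, but must not share any position.
    for prev, cur in zip(ranges, ranges[1:]):
        if cur.start < prev.stop:
            raise ValueError(f'Gaps overlap: {prev} and {cur}')
    return ranges


gaps = normalize_gaps([(0x200, 0x40), range(0x100, 0x180)])
print(gaps)
```

Sorting first keeps the overlap check to a single pass over neighboring entries.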
- _search_at(target, start_offset: int, end_offset: int, **kwargs) -> dict
Private Method: Not to be called directly by API users
Hunter implementations that wish to leverage the aforementioned facilities of the default find(), finditer(), and build_stratagem() methods should implement this.
The implementation should search for the target information within the Hunter’s data (provided via its constructor). The search should be performed within the bounds of start_offset and end_offset, which are guaranteed to be within the bounds of data.
If a result is found, this method should return a dictionary per the description of find().
If no result is found and the implementation searched the entire range of [start_offset, end_offset] in a single call, it should raise a HunterResultNotFound exception.
Otherwise, if no result is found and the implementation only checked that no match was present at start_offset, it should return None. The parent class will take care of repeatedly calling _search_at() to complete a search of the full range.
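The contract above can be sketched with a toy base class that repeatedly calls _search_at() and skips gaps. This is an illustration only: MiniHunter, MiniConstantHunter, and NotFound are hypothetical stand-ins rather than Depthcharge classes, gaps are assumed to be range objects expressed as offsets, and matches that straddle a gap boundary are not handled.

```python
class NotFound(Exception):
    """Stand-in for HunterResultNotFound in this sketch."""


class MiniHunter:
    """Toy base class that drives _search_at() one offset at a time."""

    def __init__(self, data: bytes, address: int, gaps=()):
        self._data = data
        self._address = address
        self._gaps = list(gaps)  # Assumed: range objects, in offsets

    def _search_at(self, target, start_offset, end_offset):
        raise NotImplementedError

    def find(self, target, start=0, end=None):
        end = len(self._data) - 1 if end is None else end  # Inclusive bound
        offset = start
        while offset <= end:
            gap = next((g for g in self._gaps if offset in g), None)
            if gap is not None:
                offset = gap.stop  # Skip the excluded region
                continue
            result = self._search_at(target, offset, end)
            if result is not None:
                return result
            offset += 1  # None means "nothing at this offset"; move on
        raise NotFound(target)


class MiniConstantHunter(MiniHunter):
    """Subclass that only checks for a match at start_offset."""

    def _search_at(self, target, start_offset, end_offset):
        stop = start_offset + len(target)
        if stop - 1 > end_offset or self._data[start_offset:stop] != target:
            return None
        return {
            'src_off': start_offset,
            'src_addr': self._address + start_offset,
            'src_size': len(target),
        }


hunter = MiniConstantHunter(b'..AB..AB', 0x1000, gaps=[range(0, 4)])
print(hunter.find(b'AB'))
```

Because the subclass only answers "is there a match at this offset?", the gap handling and range walking stay in one place, which is the benefit the text attributes to implementing _search_at().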
- property name: str
Hunter class name
- find(target, start=-1, end=-1, **kwargs) -> dict
Search for target within the Hunter’s data (provided via the constructor) and return information about the first result encountered as a dictionary with the following key-value pairs:
| Key (str) | Type | Description |
|---|---|---|
| src_off | int | 0-based index into data where target was found |
| src_addr | int | Absolute address of the located target |
| src_size | int | Size of the located target, in bytes |
Subclasses of Hunter may return additional entries in their result dictionary. (Refer to the specific subclass documentation for any additional entries.)
Note that the above key-value scheme is intentionally consistent with that used in Stratagem entries, and the respective Stratagem Specification returned by an Operation’s stratagem_spec() method.
The start_offset and end_offset parameters provided to the Hunter constructor can be overridden using this method’s start and end arguments, respectively. The default negative values mean, “Use the offsets that the object was created with.”
A HunterResultNotFound exception is raised if the target could not be found.
- finditer(target, start=-1, end=-1, **kwargs)
Return a generator over all find() results for target in the data of interest.
The start and end arguments, if greater than or equal to zero, can be used to override the start_offset and end_offset parameters provided to the Hunter constructor.
Example:
    for result in my_hunter.finditer(my_target):
        offset = result['src_off']
        size = result['src_size']
        # my_data is the same bytes object that was passed to the Hunter
        hexstr = my_data[offset:offset + size].hex()
        msg = '{:d}-byte result found at 0x{:08x}: {:s}'
        print(msg.format(size, result['src_addr'], hexstr))
- build_stratagem(target_payload: bytes, start=-1, end=-1, **kwargs)
Produce a Stratagem that can be used to create target_payload, given data (provided earlier via the Hunter constructor).
The start and end parameters can be used to override the start_offset and end_offset parameters that the Hunter was created with. If left as negative values, the Hunter’s defaults will be used.
The nature of the resulting Stratagem is tightly coupled with the Operation it will be used by. Refer to specific implementations for more information, as well as any **kwargs that can be used to configure the Stratagem creation.
If a particular Hunter does not produce Stratagem objects, invoking this method will raise a StratagemNotRequired exception.
Implementations
- class depthcharge.hunter.CommandTableHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
The CommandTableHunter searches a U-Boot memory dump for instances of linker lists containing the cmd_tbl_s structures that define console commands. Within the U-Boot source code, console commands are declared using U_BOOT_CMD and related preprocessor macros. (More background information can be found in U-Boot’s README.commands.)
Constructor
The CommandTableHunter constructor requires that an arch= keyword argument be provided. This argument may be either a string or a value returned by Architecture.get():
    my_hunter = CommandTableHunter(mem_dump, 0x4FF4_F000, arch='arm')
The address parameter should generally refer to the corresponding data’s post-relocation address. This address information is used to determine whether a potential cmd_tbl_s structure contains valid information. For example, char * pointers are “dereferenced” (within the confines of the provided image) to confirm that they lead to NULL-terminated ASCII strings.
If the data address is not known, pass check_ptrs=False to find() and finditer(). Be aware that this will very likely result in a (potentially large) number of false positives.
Motivation
The presence of these command tables can serve as an indicator that CONFIG_CMDLINE is enabled. This evidence can be used to justify further analyses focusing on how a console can be accessed, if it is not otherwise obviously exposed or protected with standard functionality. For example, does vendor-specific code hide the U-Boot console unless a particular GPIO pin is asserted? Is custom functionality akin to CONFIG_AUTOBOOT_STOP_STR and friends used to gate access to the console?
The presence of multiple unique command tables within a U-Boot memory dump can also be quite interesting! This can be indicative of a situation in which different commands are exposed based upon different authorization levels (implemented by vendor-specific code). An example of this can be found in examples/symfonisk_unlock_bypass.py, where pointers to an unauthenticated “insecure” command table are redirected to a post-authentication “secure” command table.
Once the locations and layout of command tables in memory are known, they can be patched at runtime to insert alternative functionality (provided that a MemoryWriter is available). From there, one can instrument the operating environment as desired to further explore an SoC, its RAM (from a prior operating state), storage media, and peripherals.
- find(target, start=-1, end=-1, **kwargs) -> dict
This Hunter searches for a command table containing the command specified by target. If target is None or an empty string, the first table found in the search range will be returned.
The following keyword arguments can be used to further configure the search behavior.
| Name | Type | Default | Description |
|---|---|---|---|
| threshold | int | 5 | Number of valid-looking consecutive cmd_tbl_s entries to observe before a table is considered valid. |
| check_ptrs | bool | True | Follow pointers in order to confirm whether data at a given search location is a command table entry. Setting this to False will result in false positives. |
| longhelp | bool | None | Denotes state of U-Boot’s CONFIG_SYS_LONGHELP. Keep this set to None if you do not know it; the Hunter will attempt to infer the right value. |
| autocomplete | bool | None | Denotes state of U-Boot’s CONFIG_AUTO_COMPLETE. Keep this set to None if you do not know it; the Hunter will attempt to infer the right value. |
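To make the roles of threshold and pointer checking concrete, here is a heavily simplified sketch. It is not the CommandTableHunter algorithm: it assumes a 32-bit little-endian target, a fixed 24-byte entry whose first word is the command name pointer, and it validates only that one pointer. The names find_table, c_string_at, and entry_size are hypothetical.

```python
import struct


def c_string_at(data: bytes, offset: int):
    """Return the printable ASCII string at offset, or None."""
    end = data.find(b'\x00', offset)
    if end <= offset:
        return None  # No terminator found, or an empty string
    raw = data[offset:end]
    if all(0x20 <= b < 0x7f for b in raw):
        return raw.decode('ascii')
    return None


def find_table(data: bytes, address: int, entry_size=24, threshold=5):
    """Locate `threshold` or more consecutive entries whose first word
    points at a printable string inside the image (assumed layout)."""
    def name_at(off):
        if off + entry_size > len(data):
            return None
        (ptr,) = struct.unpack_from('<I', data, off)
        if not address <= ptr < address + len(data):
            return None  # Pointer leads outside of the provided image
        return c_string_at(data, ptr - address)

    for off in range(0, len(data) - entry_size + 1, 4):
        names = []
        cur = off
        while (name := name_at(cur)) is not None:
            names.append(name)
            cur += entry_size
        if len(names) >= threshold:
            return {'src_off': off, 'src_addr': address + off,
                    'src_size': len(names) * entry_size, 'names': names}
    return None


# Synthetic image: a string pool followed by six fake table entries.
base = 0x4FF4_F000
strings = b'bootm\x00go\x00help\x00md\x00mw\x00reset\x00'
strings += b'\x00' * (-len(strings) % 4)  # Pad to a 4-byte boundary
name_offsets = [0, 6, 9, 14, 17, 20]
table = b''.join(struct.pack('<6I', base + off, 0, 0, 0, 0, 0)
                 for off in name_offsets)
image = strings + table
result = find_table(image, base)
print(result['names'])
```

Requiring several consecutive plausible entries is what keeps a single stray pointer-like word from being reported as a table; dropping the pointer check removes the strongest filter, which is why false positives become likely.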
- finditer(self, target, start=-1, end=-1, **kwargs)
Returns an iterator that provides each command table found in the search range.
Refer to find() for a description of target and supported keyword arguments. Otherwise, it behaves as described in Hunter.finditer().
- classmethod result_str(result: dict) -> str
Convert find() and finditer() result dictionaries to a string suitable for printing to a user.
- classmethod result_summary_str(result: dict) -> str
This method is similar to result_str(), but returns a shorter summary string.
- class depthcharge.hunter.ConstantHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
This is a simple Hunter that searches for fixed data values (of type bytes).
Example use-cases include searching for file format “magic” values (e.g., Device Tree’s d00dfeed), tables (e.g., CRC32, SHA1, SHA256 LUTs), or opcodes near code or data of interest.
Its constructor and methods are implemented according to the descriptions in Hunter.
- find(target, start=-1, end=-1, **kwargs) -> dict
Searches for the constant specified by target.
- finditer(target, start=-1, end=-1, **kwargs) -> dict
Returns an iterator over all occurrences of target in the search range.
- class depthcharge.hunter.CpHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
The CpHunter class searches for a series of cp command invocations that can be performed to result in the desired payload.
- find(target, start=-1, end=-1, **kwargs)
CpHunter does not implement this method. Raises OperationNotSupported.
- finditer(target, start=-1, end=-1, **kwargs)
CpHunter does not implement this method. Raises OperationNotSupported.
- build_stratagem(target_payload: bytes, start=-1, end=-1, **kwargs)
Produce a Stratagem for use with CpMemoryWriter. This implementation attempts to reduce the total number of cp operations by searching for common substrings.
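The idea of covering a payload with as few copies as possible can be illustrated with a greedy longest-match search. This is a sketch under stated assumptions, not the CpHunter algorithm: plan_copies and apply_copies are hypothetical names, the (src_off, dst_off, size) tuples are not the real Stratagem entry format, and a greedy choice is not guaranteed to be optimal.

```python
def plan_copies(data: bytes, payload: bytes):
    """Greedily cover payload with the longest chunks found in data.

    Returns a list of (src_off, dst_off, size) copy operations.
    """
    ops = []
    dst = 0
    while dst < len(payload):
        size = len(payload) - dst
        # Shrink the candidate chunk until it occurs somewhere in data.
        while size > 0:
            src = data.find(payload[dst:dst + size])
            if src != -1:
                break
            size -= 1
        if size == 0:
            raise LookupError(f'Byte {payload[dst]:#04x} not present in data')
        ops.append((src, dst, size))
        dst += size
    return ops


def apply_copies(data: bytes, ops, length: int) -> bytes:
    """Replay the copy operations into a fresh buffer."""
    out = bytearray(length)
    for src, dst, size in ops:
        out[dst:dst + size] = data[src:src + size]
    return bytes(out)


data = b'the quick brown fox'
payload = b'fox quick'
ops = plan_copies(data, payload)
print(ops)
print(apply_copies(data, ops, len(payload)) == payload)
```

Replaying the plan on the host, as apply_copies does here, is a cheap way to confirm a plan before any commands are sent to a device.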
- class depthcharge.hunter.EnvironmentHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
An EnvironmentHunter searches a memory or flash dump for instances of U-Boot environments, which store collections of variable definitions. This Hunter identifies three types of U-Boot environments:
Built-in defaults used when no (valid) environment is present in non-volatile storage.
A valid environment stored in non-volatile storage, prefixed with a CRC32 header value.
A “redundant” environment (see CONFIG_SYS_REDUNDAND_ENVIRONMENT), which includes both a CRC32 header and a flags value used to determine the active environment copy.
The constructor supports two keyword arguments that place a lower and upper bound on the sizes (in number of entries) of environment instances returned by find() and finditer():
min_entries (default: 5)
max_entries (default: None)
- find(target, start=-1, end=-1, **kwargs) -> dict
If target is an empty string or None, the first environment encountered in the search range is returned. Otherwise, only an environment containing the string or bytes specified by target will be returned.
In addition to the standard keys returned by Hunter.find(), EnvironmentHunter includes the additional items:
arch - The architecture of the platform. (This will inform the endianness of the checksum.)
type - One of: ‘Built-in environment’, ‘Stored environment’, ‘Stored redundant environment’
crc - The CRC32 checksum of the environment (if one of the ‘Stored’ types)
flags - Only present for ‘Stored redundant environment’. Contains a monotonically increasing (modulo 256) integer value used to determine which redundant environment is “freshest”.
dict - The environment contents as a dictionary, with the variable names as keys.
raw - The raw environment data as NULL-terminated ASCII strings (type: bytes).
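As a rough illustration of the ‘Stored environment’ case, the sketch below validates a CRC32 header and splits the data into a dictionary. It is not EnvironmentHunter's code: parse_stored_env is a hypothetical helper, the blob is assumed to be exactly one environment (4-byte CRC32 followed by the data region the checksum covers), the redundant flags byte is not handled, and the result only mimics some of the keys listed above.

```python
import struct
import zlib


def parse_stored_env(blob: bytes, byteorder: str = 'little') -> dict:
    """Parse a non-redundant stored environment: CRC32 header + data."""
    crc_format = '<I' if byteorder == 'little' else '>I'
    (stored_crc,) = struct.unpack_from(crc_format, blob)
    raw = blob[4:]
    if zlib.crc32(raw) != stored_crc:
        raise ValueError('CRC32 mismatch; not a valid stored environment')

    env = {}
    for entry in raw.split(b'\x00'):
        if not entry:
            break  # An empty entry marks the end of the variable list
        name, _, value = entry.decode('ascii').partition('=')
        env[name] = value

    return {'type': 'Stored environment', 'crc': stored_crc,
            'dict': env, 'raw': raw}


# Synthetic 64-byte data region, zero-padded after the last variable.
raw_env = b'bootdelay=3\x00baudrate=115200\x00'.ljust(64, b'\x00')
blob = struct.pack('<I', zlib.crc32(raw_env)) + raw_env
result = parse_stored_env(blob)
print(result['dict'])
```

The byteorder parameter reflects the note above that the platform architecture informs the endianness of the checksum.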
- finditer(self, target, start=-1, end=-1, **kwargs)
Returns an iterator that provides each environment instance found in the search range.
Refer to find() for a description of target and supported keyword arguments. Otherwise, it behaves as described in Hunter.finditer().
- class depthcharge.hunter.FDTHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
This Hunter searches for Flattened Device Tree (also see elinux.org) instances within a memory or flash dump.
If the Device Tree Compiler (dtc) is installed, results will include both the binary representation of the device tree (dtb) as well as a source representation (dts).
- find(target, start=-1, end=-1, **kwargs) -> dict
If a device tree containing a specific string or byte sequence is desired, this can be specified via target. Otherwise it can be left as None to search for any device tree.
The returned dictionary will contain the binary representation of the device tree in a 'dtb' entry. If the Device Tree Compiler is installed on your machine, a source representation will additionally be provided in a 'dts' entry.
A no_dts=True keyword argument can be used to prevent the FDTHunter from attempting to convert a DTB to a DTS.
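Locating a device tree blob largely comes down to finding the d00dfeed magic value and reading the big-endian totalsize field that follows it. The sketch below shows only that step and is not FDTHunter's implementation: iter_fdt is a hypothetical name, validation is limited to a size sanity check, and no conversion to DTS is attempted.

```python
import struct

FDT_MAGIC = 0xD00DFEED
FDT_HEADER = struct.Struct('>10I')  # Ten big-endian 32-bit header fields


def iter_fdt(data: bytes, address: int = 0):
    """Yield candidate device tree blobs found in data."""
    magic = FDT_MAGIC.to_bytes(4, 'big')
    offset = data.find(magic)
    while offset != -1:
        if offset + FDT_HEADER.size <= len(data):
            fields = FDT_HEADER.unpack_from(data, offset)
            totalsize = fields[1]
            # Reject hits whose claimed size cannot fit in the dump.
            if FDT_HEADER.size <= totalsize <= len(data) - offset:
                yield {
                    'src_off': offset,
                    'src_addr': address + offset,
                    'src_size': totalsize,
                    'dtb': data[offset:offset + totalsize],
                }
        offset = data.find(magic, offset + 1)


# Synthetic dump: padding, one 48-byte fake blob, padding, a stray magic.
fake_dtb = FDT_HEADER.pack(FDT_MAGIC, 48, 40, 44, 40, 17, 16, 0, 4, 4)
fake_dtb += b'\x00' * 8
dump = b'\xaa' * 16 + fake_dtb + b'\xbb' * 16 + FDT_MAGIC.to_bytes(4, 'big')
results = list(iter_fdt(dump, address=0x8000_0000))
print([(hex(r['src_addr']), r['src_size']) for r in results])
```

The trailing magic value in the synthetic dump is skipped because a full header does not fit after it, which shows why the magic alone is not sufficient evidence.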
- class depthcharge.hunter.ReverseCRC32Hunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
The ReverseCRC32Hunter searches for CRC32 preimages in order to allow the U-Boot crc32 console command to be exploited as an arbitrary memory write primitive that produces a payload 4 bytes at a time.
It effectively answers the question, “What series of CRC32 operations would I need to perform to produce my desired binary payload?” This answer is provided in the form of a Stratagem, which is in turn used by the CRC32MemoryWriter to perform the actual memory writes.
Constructor
The constructor conforms to the Hunter definition and supports two additional keyword arguments.
The endianness keyword argument specifies the byte order of the target device. It defaults to sys.byteorder, which may not match your target. It can be specified as either 'big' or 'little'.
A revlut_maxlen keyword argument can be used to override the maximum size of individual reverse lookup table (RLUT) entries. The default value is 256.
In general, increasing the revlut_maxlen value uses more host memory when attempting to find a CRC32 preimage, but increases the likelihood of success. The lower limit of this value is 1 and its upper bound is defined by the size of data. Continue reading to get a better understanding of what exactly this parameter controls.
Implementation Details
The implementation of ReverseCRC32Hunter leverages the malleability property of CRC32, which is just one reason why it is never suitable for validation of data susceptible to forgery or tampering threats.
ReverseCRC32Hunter uses a simplification of the technique presented in Listing 6 of the paper listed below. In our simplified case we’re effectively “appending” 4 bytes to a zero-length input to achieve a chosen CRC32 output. The value we end up “appending” is the 4-byte inverse of our chosen CRC32 output. This simplified algorithm implementation resides in depthcharge/revcrc32.py.
Reversing CRC - Theory and Practice
by Martin Stigge, Henryk Plötz, Wolf Müller, Jens-Peter Redlich
HU Berlin Public Report, SAR-PR-2006-05, May 2006
It is not necessarily the case, however, that the 4-byte inverse of our desired CRC32 value exists within a given ROM or memory dump (i.e., our domain of potential CRC32 inputs). Instead, we iteratively perform this reverse CRC32 operation on each 4-byte result until we find an input that does. When later writing memory using the crc32 U-Boot console command (via CRC32MemoryWriter), each iteration incurs some amount of serial console overhead and execution time on the target device. When focusing only on 4-byte inputs, this can result in extraordinarily long execution times.
To reduce execution time on the target (i.e., the number of CRC32 iterations performed) we can make a trade-off in the form of increased memory consumption on our host machine at the time when the Stratagem (intended for use with CRC32MemoryWriter) is being created. To achieve this, we use a reverse-lookup table (RLUT) that maps CRC32 values to the shortest byte sequence in our input data that produces them.
When the ReverseCRC32Hunter constructor is invoked it will produce this RLUT. This will generally take some time, so a progress bar and ETA are shown.
The RLUT construction is performed over the constructor’s data argument (excluding any regions specified by the gaps keyword argument) for a sliding window of size N. This is repeated for all values of N in the inclusive range shown below.
[1, min(revlut_maxlen, len(data))]
This is where the revlut_maxlen keyword argument described earlier comes into play. In general, results will be “better” with a larger revlut_maxlen value. The “best” value is one that maximizes RAM utilization on one’s host (without triggering a MemoryError, of course). The memory requirements (roughly) grow cubically with respect to this parameter. Bear in mind that the memory consumption of these RLUTs is on the order of GiB.
Figure: a visual example of this process, with a simplified 4-iteration solution. The left side shows the path from a desired payload back to a byte sequence present in the input data. The right side denotes the runtime sequence of CRC32 operations performed on the target device to produce a 4-byte portion of the desired payload.
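The pieces described in this section (the 4-byte CRC32 inverse, the RLUT, and the backward walk until an RLUT hit) can be sketched in a few dozen lines. This is an independent illustration rather than the code in depthcharge/revcrc32.py: inverse4, build_rlut, and find_chain are hypothetical names, the RLUT stores (offset, size) pairs instead of byte sequences, gaps are ignored, and 4-byte words are assumed to be stored little-endian.

```python
import zlib


def _make_table():
    """Standard reflected CRC32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = _make_table()
# Every table entry has a distinct top byte, which makes reversal possible.
INDEX_BY_TOP_BYTE = {value >> 24: i for i, value in enumerate(TABLE)}


def inverse4(crc: int) -> bytes:
    """Return the 4-byte input whose CRC32 equals crc."""
    reg = crc ^ 0xFFFFFFFF
    indices = []
    for _ in range(4):  # Walk the CRC register backwards, byte by byte
        i = INDEX_BY_TOP_BYTE[reg >> 24]
        indices.append(i)
        reg = ((reg ^ TABLE[i]) << 8) & 0xFFFFFFFF
    reg = 0xFFFFFFFF
    out = bytearray()
    for i in reversed(indices):  # Replay forwards from the initial state
        out.append((reg ^ i) & 0xFF)
        reg = TABLE[i] ^ (reg >> 8)
    return bytes(out)


def build_rlut(data: bytes, revlut_maxlen: int = 256) -> dict:
    """Map CRC32 value -> (offset, size) of the shortest window producing it."""
    rlut = {}
    for size in range(1, min(revlut_maxlen, len(data)) + 1):
        for off in range(len(data) - size + 1):
            rlut.setdefault(zlib.crc32(data[off:off + size]), (off, size))
    return rlut


def find_chain(target: bytes, rlut, max_iterations=4096, byteorder='little'):
    """Return (offset, size, iterations) yielding target via repeated CRC32."""
    value = int.from_bytes(target, byteorder)
    for iterations in range(1, max_iterations + 1):
        if value in rlut:
            return (*rlut[value], iterations)
        # No window produces this value; step back to its 4-byte preimage.
        value = int.from_bytes(inverse4(value), byteorder)
    raise LookupError('No solution within max_iterations')


data = b'Once upon a midnight dreary, while I pondered, weak and weary'
rlut = build_rlut(data, revlut_maxlen=8)
print(zlib.crc32(inverse4(0xDEADBEEF)) == 0xDEADBEEF)
```

Because window sizes are visited in increasing order, setdefault() keeps the shortest window for each CRC32 value. Every additional RLUT entry raises the chance that the backward walk terminates early, which is the host-memory versus target-time trade-off described above.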
To produce an N-length payload (where N is divisible by 4), the whole process is repeated for each 4-byte word in the desired payload. Depthcharge makes two simplifying assumptions:
The input data is static; it does not change at runtime.
The location where payloads are written does not overlap the input data.
Both of these are reasonable assumptions; the input data can be carved to meet these requirements, or the ReverseCRC32Hunter constructor can be invoked with the gaps= keyword to exclude memory regions as needed. Under these assumptions, each 4-byte word in the produced output can be computed in parallel. Depthcharge uses Python’s multiprocessing module to distribute tasks to multiple workers, with a default worker count equal to the system’s CPU count.
Finally, one additional optimization is used to reduce the total number of CRC32 operations that need to be performed on a target device at runtime. Envision a case where a 4-byte sequence X = [aa bb cc dd] occurs 5 times within a payload, and the produced Stratagem requires 3,500 CRC32 iterations to produce a single instance of the 4-byte value X. The naive approach would be to perform a total of 17,500 CRC32 operations to write all 5 copies of X, 80% of which are redundant. Therefore, to eliminate this unnecessary overhead, build_stratagem() identifies the unique 4-byte words in a payload, and distributes only these to the parallel workers.
When a result is produced for a word occurring only once in the desired payload, no additional behavior is necessary. However, if a word (X) occurs multiple times, the produced Stratagem entry is split into multiple entries:
1. The produced Stratagem entry is appended as-is, but with the iterations value reduced by 1. The in-progress result is written to the payload location for this reduced number of iterations.
2. For each of the 4 remaining occurrences of X, a special Stratagem entry is created. Instead of having a src_addr pointing to an address within the input data, it has a tsrc_off value that denotes that the input can be found at the location where the result of Step 1 was written. Only a single iteration is required to produce X.
3. A “finalizing” Stratagem entry is appended that performs one last CRC32 operation on the location storing the result of Step 1, with the output written to the same address.
With this approach, the same desired payload can be achieved, but with only 3,504 CRC32 operations.
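The operation counts quoted above follow directly from the three steps. The sketch below only reproduces that arithmetic; crc32_op_counts is a hypothetical helper and does not build Stratagem entries.

```python
from collections import Counter


def crc32_op_counts(words, iterations_for):
    """Compare naive vs. de-duplicated CRC32 operation totals.

    words: the payload split into 4-byte words
    iterations_for: maps each unique word to its required iteration count
    """
    naive = optimized = 0
    for word, occurrences in Counter(words).items():
        iterations = iterations_for[word]
        naive += iterations * occurrences
        if occurrences == 1:
            optimized += iterations
        else:
            # (iterations - 1) shared operations, one operation per extra
            # copy, and one finalizing operation on the first copy.
            optimized += (iterations - 1) + (occurrences - 1) + 1
    return naive, optimized


x = bytes.fromhex('aabbccdd')
print(crc32_op_counts([x] * 5, {x: 3500}))  # (17500, 3504)
```

For a word that occurs n times and needs k iterations, the cost drops from n * k to k + n - 1 operations.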
To avoid confusion, Stratagem entries produced by ReverseCRC32Hunter do not contain a src_off key. In cases where a tsrc_off (Target buffer source offset) is used, src_addr is set to -1.
Example
If you still have a healthy dose of skepticism about the high-level approach used here, worry not! An example demonstrating this algorithm is provided in examples/reverse_crc32_algo_poc.py. This example uses build_stratagem() to produce the following string, using Edgar Allan Poe’s The Raven as input data. In order to allow this example to be used without a target device, the Stratagem is “executed” by simply making calls to zlib.crc32, rather than by passing the Stratagem to CRC32MemoryWriter().
Payload String:
"NCC Group - Depthcharge \n<https://github.com/nccgroup/depthcharge>"
- find(target, start=-1, end=-1, **kwargs) -> dict
Search for a sequence of CRC32 operations, performed over the data provided to the ReverseCRC32Hunter constructor, that results in target.
The target parameter must be a bytes object with a length of 4.
The start and end parameters operate according to Hunter.find().
A max_iterations keyword argument can be provided to limit the maximum number of operations to allow when searching for a sequence of CRC32 operations that result in the target value. The default is 4096.
As discussed in the ReverseCRC32Hunter “Implementation Details” section, increasing the allowed maximum number of iterations will increase the amount of time required to deploy a payload with MemoryWriter. To keep this reasonably low, the revlut_maxlen parameter passed to the ReverseCRC32Hunter constructor may need to be increased.
When a result is found, it is returned in a dictionary that has the keys described in Hunter.find(), and an additional iterations key.
In the event that no result is found, a HunterResultNotFound exception is raised.
- build_stratagem(target_payload: bytes, start=-1, end=-1, **kwargs)
Given a target binary payload, return the sequence of CRC32 operations that can be performed to create this payload, one 4-byte word at a time. This result is returned in the form of a Stratagem. Refer to the ReverseCRC32Hunter Implementation Details section for details about how this works.
The length of target_payload must be a multiple of 4 bytes. Otherwise, a StratagemCreationFailed exception is raised.
This implementation assumes that the location where the target_payload is being written does not overlap the memory space being searched for viable operations. It is the user’s responsibility to provide a gaps list to the constructor to avoid this.
If a Stratagem for the desired payload could not be created because no solution was found, a HunterResultNotFound exception is raised.
Optional keyword arguments:
max_iterations - Maximum number of CRC32 operations to allow per 4-byte word. Default: 4096
num_procs - Number of concurrent processes to use during search. Default: System’s CPU count
- Tip
When possible, consider using ROM code as the Hunter’s input data. This will allow a Stratagem to remain usable across any changes in the target system’s U-Boot build.
- class depthcharge.hunter.StringHunter(data: bytes, address: int, start_offset=-1, end_offset=-1, gaps=None, **kwargs)
The StringHunter can be used to search for NULL-terminated ASCII strings within a binary RAM or flash dump (i.e., data should be of type bytes), via regular expressions.
As every good little reverse engineer knows, strings can be very telling about the nature of code. For example, they could hint at the use of HABv4 functionality to authenticate images on NXP i.MX-based platforms.
- find(target, start=-1, end=-1, **kwargs) -> dict
The target argument should contain a regular expression pattern of type str or bytes. The following keyword arguments may be used to constrain the length of the desired string.
| Name | Type | Default | Description |
|---|---|---|---|
| min_len | int | -1 | If > 0, places a lower bound on the length of the string to locate |
| max_len | int | -1 | If > 0, places an upper bound on the length of the string to locate |
If one is only looking for any printable string within a search range, target can be specified as None or an empty string. The above keyword arguments can be used to constrain results.
The start and end parameters are used as described in Hunter.find().
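A regular-expression search over NULL-terminated printable strings, with the length constraints described above, might look roughly like the following. This is a sketch and not StringHunter's implementation: iter_strings is a hypothetical name, the set of characters treated as printable is an assumption, and the pattern is applied with search semantics, which may differ from Depthcharge's behavior.

```python
import re


def iter_strings(data: bytes, pattern=None, min_len=-1, max_len=-1):
    """Yield (offset, text) for NUL-terminated printable ASCII strings."""
    if isinstance(pattern, str):
        pattern = pattern.encode('ascii')
    regex = re.compile(pattern) if pattern else None

    # A run of printable characters (plus tab, newline, CR) ending in NUL.
    for match in re.finditer(rb'([\t\n\r\x20-\x7e]+)\x00', data):
        raw = match.group(1)
        if min_len > 0 and len(raw) < min_len:
            continue
        if max_len > 0 and len(raw) > max_len:
            continue
        if regex is not None and not regex.search(raw):
            continue
        yield match.start(1), raw.decode('ascii')


dump = (b'\x01\x02HAB Configuration: 0x%02x\x00'
        b'\xff\xfeok\x00U-Boot 2019.04\x00')
print(list(iter_strings(dump, 'HAB')))
print([text for _, text in iter_strings(dump, min_len=4)])
```

Passing None as the pattern returns every printable string, which mirrors the "any printable string" case described above.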
- finditer(self, target, start=-1, end=-1, **kwargs)
Returns an iterator that provides each string found in the search range.
Refer to find() for a description of target and supported keyword arguments.
- string_at(address, min_len=-1, max_len=-1, allow_empty=False) -> str
Attempt to determine if the specified address contains a NULL-terminated ASCII string, with optional length constraints.
If an ASCII string is located at the specified address, it is returned sans NULL byte. Otherwise, HunterResultNotFound is raised.
An IndexError is raised if address is outside the bounds of the data parameter originally provided to the constructor.
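The behavior described for string_at() can be approximated by a standalone function. This is a sketch rather than the method itself: it takes data and its base address as explicit arguments, ResultNotFound is a stand-in for HunterResultNotFound, and the definition of "printable" is an assumption.

```python
class ResultNotFound(Exception):
    """Stand-in for HunterResultNotFound in this sketch."""


def string_at(data: bytes, base: int, address: int,
              min_len=-1, max_len=-1, allow_empty=False) -> str:
    """Return the NUL-terminated ASCII string at address, sans NUL."""
    offset = address - base
    if not 0 <= offset < len(data):
        raise IndexError(f'Address {address:#x} is outside of data')

    end = data.find(b'\x00', offset)
    if end == -1:
        raise ResultNotFound('No NULL terminator found')

    raw = data[offset:end]
    printable = all(0x20 <= b < 0x7f or b in b'\t\n\r' for b in raw)
    if (not printable
            or (not raw and not allow_empty)
            or (min_len > 0 and len(raw) < min_len)
            or (max_len > 0 and len(raw) > max_len)):
        raise ResultNotFound(f'No suitable string at {address:#x}')
    return raw.decode('ascii')


data = b'\x00\x00bootcmd\x00\x80\x81'
print(string_at(data, 0x1000, 0x1002))
```

Keeping the out-of-bounds case as an IndexError, separate from the "no string here" case, lets callers tell a bad pointer from a valid pointer to non-string data.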
Exceptions
- exception depthcharge.hunter.HunterResultNotFound
It may be the case that a Hunter cannot find or produce a result. HunterResultNotFound is raised to indicate this situation and provides more context within its message text.
In general, potential reasons this exception may be thrown include:
The requested item is not present in the provided data.
Provided parameters overconstrained the search. Relaxing constraints may be necessary to yield a result.