Artefact extraction¶
Extract stable, passive identifiers from firmware and related artefacts so that the Fingerprint Forge can later recognise devices on the internet without ever speaking to them.
This is not vulnerability research. This is not scanning. This is careful, clerical theft of facts the firmware has already volunteered.
Types of artefacts¶
Artefacts are static identifiers embedded in firmware or companion software. They must be:
Extractable offline
Observable without interaction
Stable across deployments of the same firmware
Banner strings and protocol constants¶
These include:
Service banners (HTTP, FTP, Telnet, SNMP)
Protocol identifiers and magic constants
Device or platform identifiers embedded in binaries
Moxa example
From the extracted filesystem:
grep -Rni "NPort" .
grep -Rni "MOXA" .
grep -Rni "5250" .
Common findings include:
Product names hardcoded into binaries
SNMP
sysDescrstringsProtocol identifiers used in Modbus or proprietary services
These strings are ideal fingerprints because they are:
Consistent
Boring
Forgotten by vendors
Which makes them reliable.
Web UI assets, HTTP headers, embedded frameworks¶
Most industrial gateways ship with a small web server. It is rarely original.
Artefacts include:
HTML titles and footers
JavaScript filenames and comments
Embedded web frameworks (Boa, GoAhead, lighttpd)
Static image assets (logos, icons, CSS)
Moxa example
find . -iname "*.html" -o -iname "*.js" -o -iname "*.css"
Then:
sha256sum web/images/*.png
sha256sum web/js/*.js
A single PNG hash reused across thousands of devices is a gift from the gods.
Headers (server strings, cookies) are noted here only if visible in firmware defaults, not tested live.
TLS certificates, serial numbers, cryptographic fingerprints¶
Common sins include:
Vendor‑signed certificates reused across devices
Predictable serial formats
Embedded CA certificates
Hardcoded private keys (yes, really)
Moxa example
find . -iname "*.crt" -o -iname "*.pem" -o -iname "*.key"
openssl x509 -in device.crt -noout -subject -issuer -dates -serial
Even when keys are unique, issuer patterns and validity quirks often are not. These are prime Forge material.
API endpoints, default configurations, companion software paths¶
Artefacts also live in configuration and glue code:
REST or proprietary API paths
Default config files
MQTT topics
Update endpoints
Cloud relay hostnames
Moxa example
grep -Rni "/api/" .
grep -Rni "http://" .
grep -Rni "https://" .
Paths and hostnames that never change across installs are fingerprints wearing false moustaches.
Extraction methods¶
Extraction is mechanical. Interpretation comes later.
Offline string scanning and regex search¶
The workhorse:
strings -a binaryfile | grep -Ei "moxa|nport|http|snmp"
Use regex sparingly. The goal is repeatability, not cleverness.
Anything found must be:
Locatable again
Extractable by another analyst
Explainable without hand‑waving
Checksum hashing¶
Used for:
Static web assets
Embedded binaries
Certificates
Example:
sha256sum web/js/app.js > app.js.sha256
Hashes are stored verbatim. Do not rename files to “interesting.js”. That way lies madness. I know.
Safe disassembly (read-only)¶
Disassembly is used to:
Confirm strings are actually referenced
Locate protocol constants
Identify embedded service logic
It is not used to execute code.
Ghidra, radare2, or objdump are acceptable.
If disassembly reveals behaviour that would require execution to confirm, stop and note it. That is a different page.
Emphasis on reproducibility¶
Every artefact must answer three questions:
Where did it come from?
How was it extracted?
Can someone else extract it again?
If the answer to any is “probably”, it does not leave the Desk.
Metadata tagging¶
Artefacts without context are just trivia.
Unsafe or sensitive artefacts¶
Some artefacts are dangerous to mishandle:
Credentials
Keys
Instructions affecting physical processes
These are:
Retained
Clearly marked
Never forwarded to the Forge unlabelled
Example tag:
SENSITIVE: embedded private key
DO NOT DEPLOY AS FINGERPRINT
The Forge does not need surprises.
Versioned artefact maps¶
Artefacts are recorded with:
Vendor
Device family
Exact firmware version
File path and offset (where applicable)
For example:
Vendor: Moxa
Device: NPort 5250AI-M12
Firmware: 2.0
Artefact: HTTP server banner string
Location: /usr/sbin/httpd (offset 0x01A3F2)
This allows correlation across versions later, when patterns emerge. They always do.
Correlation to device types and vendors¶
Artefacts are grouped so the Forge can later ask:
“Which devices share this string?”
“Which firmware introduced this header?”
“Which vendors reuse this framework?”
The Desk does not answer those questions. It merely ensures they can be asked.
Closing note (written in pencil, not ink)¶
If an artefact would only be visible after sending a packet, logging in, or pressing a button:
It does not belong here. Put it back on the shelf, label it “interesting”, and walk away. The city remains standing because clerks know when to stop.