Fuzzing Dictionary
A fuzzing dictionary provides domain-specific tokens to guide the fuzzer toward interesting inputs. Instead of purely random mutations, the fuzzer incorporates known keywords, magic numbers, protocol commands, and format-specific strings that are more likely to reach deeper code paths in parsers, protocol handlers, and file format processors.
Overview
Dictionaries are text files containing quoted strings that represent meaningful tokens for your target. They help fuzzers bypass early validation checks and explore code paths that would be difficult to reach through blind mutation alone.
Key Concepts
| Concept | Description |
|---------|-------------|
| **Dictionary Entry** | A quoted string (e.g., `"keyword"`) or key-value pair (e.g., `kw="value"`) |
| **Hex Escapes** | Byte sequences like `"\xF7\xF8"` for non-printable characters |
| **Token Injection** | Fuzzer inserts dictionary entries into generated inputs |
| **Cross-Fuzzer Format** | Dictionary files work with libFuzzer, AFL++, and cargo-fuzz |
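To make token injection concrete, here is a toy simulation rather than a real fuzzer. It assumes a hypothetical target that only accepts inputs starting with a 4-byte magic value, and it compares purely random inputs against inputs that have one dictionary token written at a random offset. The names `toy_target`, `blind_mutation`, and `dictionary_mutation` are made up for this illustration. Real fuzzers also use coverage feedback, which this sketch ignores.

```python
import random

MAGIC = b"\x89PNG"  # token the hypothetical parser requires at offset 0


def toy_target(data: bytes) -> bool:
    """Stand-in for a parser's early validation check."""
    return data.startswith(MAGIC)


def blind_mutation(rng: random.Random, size: int = 8) -> bytes:
    """Purely random input, with no knowledge of the format."""
    return bytes(rng.randrange(256) for _ in range(size))


def dictionary_mutation(rng: random.Random, tokens: list[bytes], size: int = 8) -> bytes:
    """Random input with one dictionary token written at a random offset."""
    data = bytearray(blind_mutation(rng, size))
    token = rng.choice(tokens)
    offset = rng.randrange(size - len(token) + 1)
    data[offset:offset + len(token)] = token
    return bytes(data)


def count_hits(trials: int = 2000, seed: int = 0) -> tuple[int, int]:
    """Return (blind hits, dictionary-guided hits) for the toy target."""
    rng = random.Random(seed)
    tokens = [MAGIC, b"IHDR", b"IDAT"]
    blind = sum(toy_target(blind_mutation(rng)) for _ in range(trials))
    guided = sum(toy_target(dictionary_mutation(rng, tokens)) for _ in range(trials))
    return blind, guided


if __name__ == "__main__":
    blind_hits, guided_hits = count_hits()
    print(f"blind: {blind_hits}, dictionary-guided: {guided_hits}")
```

A blind mutator has roughly a 1 in 2^32 chance per attempt of producing the 4-byte magic, while the guided one only needs to pick the right token and offset, which is why it passes the check far more often in this toy setup.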
When to Apply
**Apply this technique when:**
- Fuzzing parsers (JSON, XML, config files)
- Fuzzing protocol implementations (HTTP, DNS, custom protocols)
- Fuzzing file format handlers (PNG, PDF, media codecs)
- Coverage plateaus early without reaching deeper logic
- Target code checks for specific keywords or magic values
**Skip this technique when:**
- Fuzzing pure algorithms without format expectations
- Target has no keyword-based parsing
- Corpus already achieves high coverage
Quick Reference
| Task | Command/Pattern |
|------|-----------------|
| Use with libFuzzer | `./fuzz -dict=./dictionary.dict ...` |
| Use with AFL++ | `afl-fuzz -x ./dictionary.dict ...` |
| Use with cargo-fuzz | `cargo fuzz run fuzz_target -- -dict=./dictionary.dict` |
| Extract from header | `grep -o '".*"' header.h > header.dict` |
| Generate from binary | `strings ./binary \| sed 's/^/"&/; s/$/&"/' > strings.dict` |
Step-by-Step
Step 1: Create Dictionary File
Create a text file with one quoted string per line. Lines starting with `#` are comments; use them to document where each group of entries comes from.
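Writing entries by hand gets error-prone once tokens contain quotes, backslashes, or non-printable bytes. The sketch below is one way to render raw tokens as quoted entries, assuming the escaping rules that libFuzzer documents (`\\` for a backslash, `\"` for a quote, and `\xAB` for any other byte). `escape_token` and `render_dictionary` are hypothetical helper names, not part of any fuzzer.

```python
def escape_token(token: bytes) -> str:
    """Render raw bytes as one quoted dictionary entry."""
    out = []
    for byte in token:
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")            # backslash becomes \\
        elif ch == '"':
            out.append('\\"')             # quote becomes \"
        elif 0x20 <= byte <= 0x7E:
            out.append(ch)                # printable ASCII is kept as-is
        else:
            out.append(f"\\x{byte:02X}")  # everything else becomes \xAB
    return '"' + "".join(out) + '"'


def render_dictionary(tokens: list[bytes], comment: str = "") -> str:
    """Build dictionary text: optional comment line, then unique entries."""
    lines = [f"# {comment}"] if comment else []
    seen = set()
    for token in tokens:
        if token and token not in seen:   # skip empty and duplicate tokens
            seen.add(token)
            lines.append(escape_token(token))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dictionary([b"IHDR", b"\xF7\xF8", b"foo\nbar"], "sample tokens"), end="")
```

Keeping tokens as `bytes` until the last step avoids encoding surprises for binary magic values.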
**Example dictionary format:**
# Lines starting with '#' and empty lines are ignored.
# Adds "blah" (w/o quotes) to the dictionary.
kw1="blah"
# Use \\ for backslash and \" for quotes.
kw2="\"ac\\dc\""
# Use \xAB for hex values
kw3="\xF7\xF8"
# the name of the keyword followed by '=' may be omitted:
"foo\x0Abar"
Step 2: Generate Dictionary Content
Choose a generation method based on what's available:
**From LLM:** Prompt ChatGPT or Claude with:
A dictionary can be used to guide the fuzzer. Write me a dictionary file for fuzzing a <PNG parser>. Each line should be a quoted string or key-value pair like kw="value". Include magic bytes, chunk types, and common header values. Use hex escapes like "\xF7\xF8" for binary values.
**From header files:**
grep -o '".*"' header.h > header.dict
**From man pages (for CLI tools):**
man curl | grep -oP '^\s*(--|-)\K\S+' | sed 's/[,.]$//' | sed 's/^/"&/; s/$/&"/' | sort -u > man.dict
**From binary strings:**
strings ./binary | sed 's/^/"&/; s/$/&"/' > strings.dict
Step 3: Pass Dictionary to Fuzzer
Use the appropriate flag for your fuzzer (see Quick Reference above).
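Generated dictionaries are worth checking before a long run. The `sed` pipelines in Step 2 wrap each line in quotes without escaping anything, so a token containing a stray backslash (for example `\n` or a Windows path) ends up using an escape outside the set shown in Step 1. The sketch below is a conservative check of that format, not a reimplementation of any fuzzer's parser: it is deliberately stricter than some fuzzers and also flags unescaped inner quotes. `check_dictionary` and `decode_body` are hypothetical names.

```python
import re

# Optional name and '=', then a quoted body made only of printable ASCII
# (other than '"' and '\'), or the escapes \\ , \" and \xAB.
ENTRY_RE = re.compile(
    r'^\s*(?:[^\s"=]+\s*=\s*)?"((?:[ !#-\[\]-~]|\\\\|\\"|\\x[0-9A-Fa-f]{2})+)"\s*$'
)
ESCAPE_RE = re.compile(r'\\(\\|"|x[0-9A-Fa-f]{2})')


def decode_body(body: str) -> bytes:
    """Turn the text between the quotes into the raw token bytes."""
    def unescape(match):
        esc = match.group(1)
        if esc[0] == "x":
            return chr(int(esc[1:], 16))  # \xAB becomes one byte
        return esc                        # \\ or \" becomes the bare character
    return ESCAPE_RE.sub(unescape, body).encode("latin-1")


def check_dictionary(text: str) -> tuple[list[bytes], list[int]]:
    """Return (decoded tokens, 1-based numbers of lines that failed to parse)."""
    tokens, bad_lines = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        match = ENTRY_RE.match(line)
        if match:
            tokens.append(decode_body(match.group(1)))
        else:
            bad_lines.append(number)
    return tokens, bad_lines


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], encoding="utf-8") as handle:
        parsed, bad = check_dictionary(handle.read())
    print(f"{len(parsed)} entries parsed, problem lines: {bad}")
```

Saved under a file name of your choosing, the script takes the dictionary path as its only argument and reports which lines need attention.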
Common Patterns
Pattern: Protocol Keywords
**Use Case:** Fuzzing HTTP or custom protocol handlers
**Dictionary content:**
# HTTP methods
"GET"
"POST"
"PUT"
"DELETE"
"HEAD"
# Headers
"Content-Type"
"Authorization"
"Host"
# Protocol markers
"HTTP/1.1"
"HTTP/2.0"
Pattern: Magic Bytes and File Format Headers
**Use Case:** Fuzzing image parsers, media decoders, archive handlers
**Dictionary content:**
# PNG magic bytes and chunks
png_magic="\x89PNG\r\n\x
<!-- truncated -->