Audio steganography hides data inside sound files. Whether you are solving CTF challenges, creating puzzles, or hiding messages for fun, this cheatsheet covers the tools, techniques, and tricks you need.
Bookmark this page. You will come back to it.
What Is Audio Steganography?
Steganography is the practice of hiding information inside other media. Audio steganography specifically hides data inside sound files, where the hidden content is not audible to casual listeners.
The main approaches:
| Technique | How It Works | Detection Difficulty |
|---|---|---|
| Spectrogram encoding | Image mapped to audio frequencies | Easy with spectrogram viewer |
| LSB encoding | Data in least significant bits of audio samples | Requires analysis tools |
| Phase coding | Data in phase shifts between frequencies | Moderate |
| Echo hiding | Data encoded as micro-echoes | Hard |
| Spread spectrum | Data spread across frequency range | Very hard |
For CTF competitions and creative projects, spectrogram encoding is the most common and visually impressive technique. The most famous example is the face hidden in the spectrogram of a track on Aphex Twin's Windowlicker single, which inspired an entire subculture of hiding images in audio. See our step-by-step Aphex Twin tutorial for a full guide.
Spectrogram Steganography
How It Works
An image is converted to audio by mapping:
- Vertical position in the image to audio frequency (pitch)
- Horizontal position to time
- Brightness to amplitude (volume)
The resulting audio sounds like ambient noise. But open it in a spectrogram viewer and the image appears.
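The mapping above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how Img2Sound or Coagula work internally: the function names, the 500 to 3500 Hz band, and the 8 kHz sample rate are arbitrary choices for the example, and `pixels` is assumed to be a list of rows (top row first) with brightness values between 0 and 1.

```python
import math
import struct
import wave

def image_to_samples(pixels, duration=2.0, rate=8000, f_lo=500.0, f_hi=3500.0):
    """Turn a small grayscale image into audio samples in [-1, 1]."""
    rows, cols = len(pixels), len(pixels[0])
    n = int(duration * rate)
    # Vertical position -> frequency (top row gets the highest pitch).
    freqs = [f_hi - (f_hi - f_lo) * r / max(rows - 1, 1) for r in range(rows)]
    samples = []
    for i in range(n):
        col = min(i * cols // n, cols - 1)      # horizontal position -> time
        t = i / rate
        s = sum(pixels[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                for r in range(rows))           # brightness -> amplitude
        samples.append(s / rows)                # normalise so the sum cannot clip
    return samples

def write_wav(path, samples, rate=8000):
    """Write mono 16-bit PCM."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```

Pure Python is slow for large images, so treat this as a way to see the idea on a tiny bitmap; a NumPy version of the same loop would be the practical choice for real images.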
Creating Spectrogram Art
Img2Sound - The easiest way to hide images in audio files as spectrogram art. Upload any image, get audio with the image encoded in the spectrogram. Supports custom frequency ranges and durations.
Coagula - Free Windows tool for spectrogram synthesis. Manual but flexible.
Python with librosa/matplotlib - Programmatic approach for custom implementations.
Viewing Hidden Spectrograms
| Tool | Platform | Best For |
|---|---|---|
| Audacity | All | Free, full-featured, spectrogram view built in |
| Spek | All | Simple, fast, dedicated spectrogram viewer |
| Sonic Visualiser | All | Research-grade analysis |
| SoX | CLI | Command-line spectrogram generation |
| iZotope RX | Pro | Professional audio forensics |
Audacity quick setup:
1. Open the audio file
2. Click the track name dropdown and select "Spectrogram"
3. Right-click the track, open Spectrogram Settings, and set Max Frequency to 20000 Hz
4. Use logarithmic scale for best visibility
LSB (Least Significant Bit) Encoding
How It Works
Audio samples are typically 16 or 24 bits. LSB encoding replaces the least significant bit(s) of each sample with data bits. The change is inaudible because the least significant bit represents the smallest possible volume change.
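As a rough sketch of the idea, the functions below embed and recover a payload using one bit per sample. The function names and the MSB-first bit order are assumptions made for this example; real tools pick their own bit order, bits-per-sample count, and header layout.

```python
def embed_lsb(samples, payload):
    """Return a copy of integer PCM samples with payload bits in each LSB."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for cover audio")
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit    # clear the LSB, then set it to the data bit
    return out

def extract_lsb(samples, num_bytes):
    """Read num_bytes back out of the LSBs (MSB-first within each byte)."""
    bits = [s & 1 for s in samples[:num_bytes * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
```

Each sample changes by at most 1 out of 65,536 possible levels in 16-bit audio, which is why the modification is inaudible.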
Tools
steghide - Classic CLI tool for embedding data in audio (and images):
# Embed
steghide embed -cf audio.wav -ef secret.txt -sf stego_audio.wav
# Extract
steghide extract -sf stego_audio.wav
stegolsb - Python library for LSB audio steganography:
# Install (the PyPI package is named stego-lsb)
pip install stego-lsb
# Encode
stegolsb wavsteg -h -i cover.wav -s secret.txt -o stego.wav -n 2
# Decode (-b is the number of bytes to recover)
stegolsb wavsteg -r -i stego.wav -o output.txt -n 2 -b 1000
OpenStego - GUI tool for digital watermarking and data hiding.
Detection
Check for LSB manipulation with:
- Trial extraction with stegolsb at several -n values (zsteg, often suggested for this, only handles PNG and BMP images, not WAV)
- Statistical analysis of sample distribution
- Compare file size vs expected size for duration/bitrate
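For a quick manual check, a short script can pull the LSB plane straight out of a WAV file and test whether it looks like text. This is a sketch under assumptions: `wav_lsb_bytes` and `printable_ratio` are hypothetical helper names, the file is assumed to be 16-bit PCM, and bits are assumed to be packed MSB-first with one bit per sample. A challenge author may have used a different layout.

```python
import struct
import wave

def wav_lsb_bytes(source, num_bytes):
    """Pack the LSB of each 16-bit sample into bytes (MSB-first).

    source is a file path or a binary file object.
    """
    with wave.open(source, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = w.readframes(num_bytes * 8)
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    bits = [s & 1 for s in samples[:num_bytes * 8]]
    out = bytearray()
    for k in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[k:k + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

def printable_ratio(data):
    """Fraction of bytes that are printable ASCII."""
    if not data:
        return 0.0
    return sum(32 <= b < 127 for b in data) / len(data)
```

A printable ratio close to 1.0 over the first few dozen bytes is a strong hint that text was embedded; random-looking output does not rule out LSB data, since the payload may be encrypted or use another bit order.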
Metadata and Header Tricks
ID3 Tags (MP3)
MP3 files have ID3 metadata fields that can hide data:
# Check metadata
exiftool audio.mp3
mediainfo audio.mp3
Common hiding spots:
- Comment field
- Album art (embedded image)
- Custom frames
- Padding between frames
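To inspect the tag region by hand, it helps to know where the ID3v2 tag ends and the audio frames begin. The sketch below reads the 10-byte ID3v2 header and decodes its "synchsafe" size field (four bytes, seven useful bits each). The function name is an assumption for this example, and the sketch ignores the optional ID3v2.4 footer.

```python
def id3v2_tag_size(data):
    """Return the ID3v2 tag length in bytes (header included), or 0 if absent."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    size = 0
    for b in data[6:10]:
        size = (size << 7) | (b & 0x7F)   # synchsafe: top bit of each byte is unused
    return 10 + size
```

Everything between the last real frame and the offset this returns is padding, which is a convenient place to look for stray data in a hex editor.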
WAV Header
WAV files use RIFF chunks. Data can hide in:
- Extra chunks after the data section
- Padding bytes
- Custom chunk types
Check with: xxd audio.wav | head -50 or any hex editor.
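A hex dump works, but walking the chunk list programmatically makes unexpected chunks and trailing bytes easier to spot. The following is a minimal sketch, assuming a standard little-endian RIFF/WAVE layout; `list_riff_chunks` is a name made up for this example.

```python
import struct

def list_riff_chunks(data):
    """Return ([(chunk_id, offset, size), ...], trailing_byte_count)."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    riff_size = struct.unpack("<I", data[4:8])[0]
    end = min(8 + riff_size, len(data))
    chunks, pos = [], 12
    while pos + 8 <= end:
        cid, size = struct.unpack("<4sI", data[pos:pos + 8])
        chunks.append((cid.decode("ascii", "replace"), pos, size))
        pos += 8 + size + (size & 1)    # chunks are padded to an even length
    # Bytes after the declared RIFF size are ignored by most players.
    trailing = max(len(data) - (8 + riff_size), 0)
    return chunks, trailing
```

Usage: `chunks, trailing = list_riff_chunks(open("audio.wav", "rb").read())`. Anything other than the usual `fmt `, `data`, and `LIST` chunks, or a non-zero trailing count, deserves a closer look.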
Frequency Domain Techniques
Phase Coding
Data is encoded in the phase relationship between frequency components. It is harder to detect than LSB encoding because the magnitude of each frequency component stays the same; only its phase changes.
Spread Spectrum
Data is spread across many frequencies using a pseudo-random sequence. Only someone with the correct key can extract it. The technique is commonly used in digital watermarking.
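One simple way to picture this is direct-sequence spreading: each data bit is multiplied by a long pseudo-random sequence of +1/-1 "chips" seeded by the key, added to the audio at low strength, and recovered by correlating against the same sequence. The sketch below is an illustration under assumptions, not a production watermarking scheme: the function names, `chip_len`, and `strength` values are all made up for the example, and samples are assumed to be floats.

```python
import random

def _chips(key, n):
    """Key-seeded pseudo-random sequence of +1/-1 values."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def ss_embed(samples, bits, key, chip_len=1024, strength=0.01):
    """Add each bit, spread over chip_len samples, on top of the cover."""
    if len(samples) < chip_len * len(bits):
        raise ValueError("cover audio too short")
    seq = _chips(key, chip_len * len(bits))
    out = list(samples)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(i * chip_len, (i + 1) * chip_len):
            out[j] += strength * sign * seq[j]
    return out

def ss_extract(samples, num_bits, key, chip_len=1024):
    """Correlate each block with the key sequence; the sign gives the bit."""
    seq = _chips(key, chip_len * num_bits)
    bits = []
    for i in range(num_bits):
        block = range(i * chip_len, (i + 1) * chip_len)
        corr = sum(samples[j] * seq[j] for j in block)
        bits.append(1 if corr > 0 else 0)
    return bits
```

Without the key, the added signal looks like faint broadband noise. With a loud cover signal the correlation becomes noisy, so practical schemes use longer chip sequences, filtering, or error correction.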
Frequency Band Hiding
Hide data in specific frequency bands:
- Above 15 kHz (most adults cannot hear above this)
- In narrow bands masked by louder signals
- In the ultrasonic range (above 20 kHz) for non-audible hiding, which requires a sample rate above 40 kHz
CTF Challenge Patterns
Common CTF Audio Stego Patterns
- Check the spectrogram first - Most CTF audio challenges use spectrogram encoding
- Check metadata - exiftool, mediainfo, ffprobe
- Check for hidden files - binwalk audio.wav
- Check LSB - stegolsb (zsteg only works on PNG and BMP images)
- Check strings - strings audio.wav | grep -i flag
- Listen carefully - Morse code, DTMF tones, SSTV signals
- Check file format - Is it really a WAV? file audio.wav
- Compare with original - If given a cover file, diff the two
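The "hidden files" and "file format" checks both come down to looking for known magic bytes. When binwalk is not available, a rough equivalent can be sketched in a few lines. The signature table and function name here are assumptions for this example and cover only a handful of formats.

```python
# A few well-known file signatures (magic bytes).
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP archive",
    b"\xff\xd8\xff": "JPEG image",
    b"%PDF-": "PDF document",
    b"fLaC": "FLAC audio",
    b"OggS": "Ogg container",
}

def find_signatures(data):
    """Return sorted (offset, description) pairs for every signature hit."""
    hits = []
    for magic, name in SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, name))
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

Short signatures such as the three-byte JPEG marker can appear by chance inside raw audio data, so treat hits as leads to verify rather than proof.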
SSTV (Slow Scan Television)
Some CTF challenges encode images using SSTV, the same protocol ham radio operators use to send images over audio.
Decode with:
- QSSTV (Linux)
- RX-SSTV (Windows)
- Black Cat SSTV (iOS)
Morse Code
Audio Morse code is a classic CTF pattern:
- Listen for dots and dashes
- Use Audacity to visualize in waveform view
- Decode with online Morse decoders
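Once the dots and dashes are transcribed from the waveform, decoding offline takes only a lookup table. In this sketch the input convention is an assumption: letters are separated by spaces and words by `/`, which is what most online tools also accept.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
    "-----": "0", ".----": "1", "..---": "2", "...--": "3", "....-": "4",
    ".....": "5", "-....": "6", "--...": "7", "---..": "8", "----.": "9",
}

def decode_morse(message):
    """Decode space-separated symbols; '/' separates words, '?' marks unknowns."""
    words = message.strip().split("/")
    return " ".join(
        "".join(MORSE.get(symbol, "?") for symbol in word.split())
        for word in words
    )
```

For example, `decode_morse("..-. .-.. .- --.")` returns `"FLAG"`.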
DTMF Tones
Telephone keypad tones encode digits:
- multimon-ng decodes DTMF from audio
- Each digit is a combination of two frequencies
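The two-frequency scheme is easy to decode by hand with the Goertzel algorithm, which measures signal power at a single frequency. The sketch below uses the standard DTMF row and column frequencies; the function names are assumptions for this example, and it expects a list of samples containing exactly one key press (splitting a recording into individual tones is left out).

```python
import math

ROWS = (697, 770, 852, 941)        # low-group frequencies in Hz
COLS = (1209, 1336, 1477, 1633)    # high-group frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")

def goertzel_power(samples, rate, freq):
    """Signal power at one frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def detect_dtmf(samples, rate):
    """Return the key whose row and column tones are strongest."""
    row = max(range(4), key=lambda i: goertzel_power(samples, rate, ROWS[i]))
    col = max(range(4), key=lambda i: goertzel_power(samples, rate, COLS[i]))
    return KEYS[row][col]
```

A window of roughly 40 to 50 ms per tone is usually enough to separate the neighbouring DTMF frequencies at an 8 kHz sample rate.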
Quick Reference Commands
# Spectrogram
sox input.wav -n spectrogram -o spectrogram.png
# Metadata
exiftool audio.wav
mediainfo audio.wav
ffprobe -v quiet -show_format audio.wav
# Hidden files
binwalk audio.wav
foremost -i audio.wav
# Strings
strings audio.wav | grep -iE 'flag|ctf|key|password'
# LSB (zsteg only handles PNG/BMP images; -b is the number of bytes to recover)
stegolsb wavsteg -r -i audio.wav -o out.txt -n 2 -b 1000
# Hex dump
xxd audio.wav | head -100
# Convert formats
ffmpeg -i input.mp3 output.wav
sox input.wav -r 44100 -b 16 output.wav
Create Your Own Challenges
Building audio steganography challenges? Img2Sound lets you:
- Convert any image to a spectrogram audio file
- Hide QR codes, text, logos, or flags in audio
- Set custom frequency ranges to control where the image appears
- Create multi-layer puzzles (spectrogram reveals QR code, QR links to next clue)
The best CTF challenges combine multiple techniques. Hide a spectrogram image that contains a QR code that links to a page with whitespace-encoded data. Each layer requires a different skill to solve.