Text Steganography Extractor

Auto-detect and extract hidden messages from plain text using homoglyph, whitespace, and spacing methods. Instant anomaly detection. Watermark attribution for leak detection. SNOW-compatible. 100% client-side.

Paste Suspicious Text

Paste any text to scan for hidden steganographic data. Anomaly detection runs instantly as you type.

100% client-side. Your text never leaves your browser. Detects trailing whitespace (SNOW), inter-word spacing, Unicode space substitution, and homoglyphs.

How to Extract a Hidden Message from Text (5 steps)

1. Paste the suspicious text. Anomaly detection runs instantly as you type.
2. Review the Suspicion Score and anomaly breakdown to see which method was most likely used.
3. Select the method to try, or use Auto-Detect to try all four automatically.
4. Enter the password if the payload was encrypted, then click Extract.
5. Download the extracted payload, or use Watermark Attribution to identify the recipient.

Extraction Methods — What This Tool Detects

Method	Technique	Speed	Best For
Auto-Detect	Tries all 4 methods in priority order: Homoglyph → Trailing WS → Unicode WS → Inter-Word	Takes ~1 second	Best starting point — use when method is unknown
Unicode Homoglyph	Scans for Cyrillic/special characters in Latin text. Each glyph position encodes one bit	Instant scan	Enterprise watermarks, leak detection, CTF challenges
Trailing Whitespace (SNOW)	Reads trailing space/tab per line. Compatible with standard stegsnow tool	Instant scan	CTF challenges, SNOW-encoded files
Unicode Whitespace	Reads thin/hair/en/em space variants (U+2009/200A/2002/2003) for 2-bit encoding	Instant scan	High-capacity email/docs watermarks
Inter-Word Spacing	Counts single vs double spaces between words	Instant scan	Plain text files and basic watermarks
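
As an illustration of the Inter-Word Spacing row, here is a minimal Python sketch of how single and double spaces could be read back as bits. The bit mapping (single space = 0, double space = 1), the most-significant-bit-first packing, and the function names are assumptions made for this example; the tool's actual encoding may differ.

```python
import re

def decode_inter_word_bits(text: str) -> str:
    """Read gaps between words as bits (assumed: 1 space = '0', 2 spaces = '1')."""
    bits = []
    # Match runs of spaces that sit between two non-space characters,
    # so trailing spaces at the end of a line are not counted.
    for match in re.finditer(r"(?<=\S)( +)(?=\S)", text):
        run = len(match.group(1))
        if run == 1:
            bits.append("0")
        elif run == 2:
            bits.append("1")
        # Runs of three or more spaces are ignored in this sketch.
    return "".join(bits)

def bits_to_bytes(bits: str) -> bytes:
    """Pack a bit string into bytes, MSB first, dropping any incomplete final byte."""
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

# Gaps in this sample read 0,1,0,0,0,0,0,1, which is the byte 0x41 ("A").
sample = "a b  c d e f g h  i"
print(bits_to_bytes(decode_inter_word_bits(sample)))
```

Each gap carries one bit, so a sentence of nine words can hide a single byte under this scheme.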

Frequently Asked Questions

What does the Suspicion Score mean?

It's a 0–100 composite score estimating how likely the text is to contain hidden data. It sums four components: trailing whitespace density (0–25 pts), double-space density (0–20 pts), Unicode non-ASCII space count (0–25 pts), and homoglyph candidate count (0–30 pts). Text that scores above 50 is highly likely to contain steganographic data. The score appears immediately as you paste text.
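
To make the composition of the score concrete, here is a rough Python sketch. Only the four components and their point caps come from the description above; the way each measurement is scaled into points (for example, 5 points per Unicode space and 3 points per homoglyph candidate), the small lookalike set, and the function name are assumptions for illustration, not the tool's real formula.

```python
import re

# Assumed, deliberately small set of Cyrillic letters that resemble Latin ones.
HOMOGLYPHS = set("\u0430\u0435\u043e\u0440\u0441\u0443\u0445")
# Thin, hair, en, and em spaces.
UNICODE_SPACES = {"\u2009", "\u200a", "\u2002", "\u2003"}

def suspicion_score(text: str) -> int:
    """Return a 0-100 heuristic score; the scaling factors are illustrative guesses."""
    lines = text.splitlines() or [""]
    trailing = sum(1 for ln in lines if ln != ln.rstrip(" \t"))
    gaps = re.findall(r"(?<=\S) +(?=\S)", text)
    doubles = sum(1 for g in gaps if len(g) >= 2)
    unicode_spaces = sum(1 for ch in text if ch in UNICODE_SPACES)
    homoglyphs = sum(1 for ch in text if ch in HOMOGLYPHS)

    trailing_pts = round(25 * trailing / len(lines))             # 0-25
    double_pts = round(20 * doubles / len(gaps)) if gaps else 0  # 0-20
    unicode_pts = min(25, 5 * unicode_spaces)                    # 0-25
    homoglyph_pts = min(30, 3 * homoglyphs)                      # 0-30
    return trailing_pts + double_pts + unicode_pts + homoglyph_pts

print(suspicion_score("hello world"))   # clean text
print(suspicion_score("hi \nyo\t"))     # every line has trailing whitespace
```

Because the caps add up to 100, the total never leaves the 0–100 range regardless of how the individual components are scaled.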

Can I decode stegsnow-encoded files?

Yes. Select the Trailing Whitespace (SNOW) method. The decoder is compatible with stegsnow: files encoded by `stegsnow` on Linux are decoded correctly here. Note: stegsnow's ICE encryption (a cipher from 1997) is not supported, so stegsnow payloads encrypted with ICE cannot be decrypted here. AES-256-GCM encrypted payloads encoded by this tool are decrypted correctly.
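
Before decoding, it can help to look at the raw trailing whitespace yourself. The Python sketch below only lists the trailing space/tab run on each line in a visible form; it does not reimplement the stegsnow decoding algorithm, and the function name and the `S`/`T` notation are choices made for this example.

```python
def trailing_whitespace_runs(text: str) -> list[tuple[int, str]]:
    """Return (line number, run) pairs, with spaces shown as 'S' and tabs as 'T'."""
    runs = []
    for number, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")          # tolerate Windows line endings
        stripped = line.rstrip(" \t")
        tail = line[len(stripped):]       # the invisible part at the end of the line
        if tail:
            visible = tail.replace(" ", "S").replace("\t", "T")
            runs.append((number, visible))
    return runs

sample = "abc \t \nclean\nx\t\t"
for number, run in trailing_whitespace_runs(sample):
    print(number, run)
```

Lines that appear in the output carry trailing spaces or tabs and are candidates for the Trailing Whitespace (SNOW) method.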

What is Watermark Attribution?

If the text was encoded using batch watermarking (each recipient got a unique copy), the decoded payload contains `WM:{id}:{recipient}`. Paste your watermark-key.json file and the decoder will instantly identify which recipient's copy this text came from. Essential for enterprise leak detection.
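
The lookup can be pictured with a short Python sketch. The `WM:{id}:{recipient}` payload format is taken from the answer above; the layout of the key file (a flat JSON object mapping each id to a recipient) and the function name are hypothetical, since the real structure of watermark-key.json is not described here.

```python
import json

def attribute_watermark(payload: str, key_json: str):
    """Parse a WM:{id}:{recipient} payload and check it against an assumed key file."""
    parts = payload.split(":", 2)
    if len(parts) != 3 or parts[0] != "WM":
        return None  # not a watermark payload
    _, wm_id, recipient = parts
    # Assumed key-file shape: {"<id>": "<recipient>", ...}
    keys = json.loads(key_json)
    return {
        "id": wm_id,
        "recipient": recipient,
        "in_key_file": keys.get(wm_id) == recipient,
    }

key_file = '{"42": "alice@example.com", "43": "bob@example.com"}'
print(attribute_watermark("WM:42:alice@example.com", key_file))
```

Splitting with a maximum of two splits keeps any colons inside the recipient field intact.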

What does 'Encrypted' kind mean in the results?

It means the extracted payload has a Shannon entropy above 7.2 bits/byte, which indicates encrypted or compressed data. For payloads produced by this tool, that almost certainly means AES-256-GCM encryption. Enter the password to decrypt. Without the password, the payload is unreadable.
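
For reference, Shannon entropy per byte can be computed as in the Python sketch below. The 7.2 bits/byte threshold comes from the answer above; the function names are chosen for this example, and the tool's own classifier may apply additional checks.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_encrypted(data: bytes, threshold: float = 7.2) -> bool:
    """Apply the 7.2 bits/byte threshold to a payload."""
    return shannon_entropy(data) > threshold

print(shannon_entropy(b"aaaa"))            # a single repeated byte carries no information
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A payload of n bytes cannot exceed log2(n) bits/byte, so a payload needs at least 148 bytes before it can pass a 7.2 threshold computed this way.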

Can I detect homoglyphs without decoding?

Yes. The homoglyph visualizer highlights every substituted character with its Unicode code point in a tooltip. This is useful for security auditing — checking if a document you received has invisible identifiers embedded.
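
A simplified version of this kind of scan can be written in a few lines of Python. The lookalike table below is a small assumed sample of Cyrillic letters that resemble Latin ones, and the function name is hypothetical; a real scanner would cover many more characters.

```python
import unicodedata

# Assumed sample mapping: Cyrillic lookalike -> Latin letter it imitates.
LOOKALIKES = {
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0443": "y", "\u0445": "x",
}

def find_homoglyphs(text: str):
    """Return (index, code point, Unicode name, Latin equivalent) for each hit."""
    hits = []
    for index, ch in enumerate(text):
        if ch in LOOKALIKES:
            code_point = f"U+{ord(ch):04X}"
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((index, code_point, name, LOOKALIKES[ch]))
    return hits

# The second character is Cyrillic, although it renders like a Latin "a".
for hit in find_homoglyphs("p\u0430ssword"):
    print(hit)
```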

Does the Document Cleaner remove all hidden data?

The Document Cleaner removes the markers of every method this tool detects. It replaces homoglyphs with their standard Latin equivalents, strips trailing whitespace, and normalizes double spaces and Unicode space characters. The result is a clean version of the text with all known steganographic markers removed, safe to redistribute.
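
The cleaning steps described above could look roughly like this in Python. This is a sketch under stated assumptions: the lookalike table is a small sample, the function name is hypothetical, and collapsing every run of spaces also flattens deliberate indentation, which the real cleaner may handle differently.

```python
import re

# Assumed sample mapping: Cyrillic lookalike -> Latin equivalent.
LOOKALIKES = {
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0443": "y", "\u0445": "x",
}

def clean_text(text: str) -> str:
    """Remove the four kinds of markers described above."""
    # 1. Replace homoglyphs with their Latin equivalents.
    text = "".join(LOOKALIKES.get(ch, ch) for ch in text)
    # 2. Turn en, em, thin, and hair spaces into ordinary spaces.
    text = re.sub(r"[\u2002\u2003\u2009\u200a]", " ", text)
    # 3. Strip trailing spaces and tabs on every line (CRLF endings tolerated).
    text = re.sub(r"[ \t]+(?=\r?$)", "", text, flags=re.MULTILINE)
    # 4. Collapse runs of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    return text

dirty = "p\u0430ss  word\u2009here \t\nnext"
print(repr(clean_text(dirty)))
```

Unicode spaces are normalized before the other whitespace steps so that a converted space at the end of a line, or next to an ordinary space, is also caught.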