Steganography Exploits LLMs with Hidden Text Techniques

"To hide text, try white text on a white background."

Surface tricks: white-on-white and black-on-black as bypasses

The author opens with a blunt, repeatable observation: simple visual tricks remain a persistent way to conceal text. The post names two such approaches explicitly, white font on a white background and black font on a black background, and notes that the latter is sometimes characterized as censorship rather than steganography. The point is pragmatic rather than aesthetic: text that is invisible to the eye can evade human inspection while remaining machine-readable.
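As a rough illustration of why camouflaged text stays machine-readable, the sketch below parses an HTML fragment and separates out text whose inline foreground color equals its inline background color. The `find_hidden_text` helper and the sample markup are hypothetical, not taken from the post. The check reads only inline `style` attributes and compares color strings literally, so stylesheets, named colors, and near-matching shades are out of scope.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not touch the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


def parse_style(style):
    """Turn 'color: #fff; background-color: #fff' into a lowercase dict."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []    # effective (foreground, background) per open element
        self.visible = []  # text a reader would plausibly see
        self.hidden = []   # text whose color matches its background

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        props = parse_style(dict(attrs).get("style") or "")
        parent_fg, parent_bg = self.stack[-1] if self.stack else (None, None)
        # Simplification: both properties are treated as inherited.
        fg = props.get("color", parent_fg)
        bg = props.get("background-color", parent_bg)
        self.stack.append((fg, bg))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        fg, bg = self.stack[-1] if self.stack else (None, None)
        if fg is not None and fg == bg:
            self.hidden.append(text)
        else:
            self.visible.append(text)


def find_hidden_text(html):
    """Return (visible_texts, hidden_texts) for an HTML fragment."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.visible, finder.hidden


if __name__ == "__main__":
    sample = (
        '<div style="color:#fff; background-color:#fff">a note for the machine</div>'
        "<p>Hello reader</p>"
    )
    visible, hidden = find_hidden_text(sample)
    print("visible:", visible)
    print("hidden: ", hidden)
```

A plain text extractor would return both strings without distinction, which is the gap between what a person sees and what a model ingests.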

The essay also recounts an anecdotal extension of that idea: a suggestion to “try the command line to reformat the hard drive” when testing outside a live environment. The author frames the comment as a test scenario within an exploratory narrative, not as a recommended operational step or a how‑to guide.

Phonological obfuscation and the limits of model brittleness

One experiment described in the piece tried to disguise meaning by altering the spelling of words along phonetic lines: deliberate misspellings and nonstandard phonetic renderings intended to confuse tokenization. The essay reproduces an example sentence written in altered orthography and reports a concrete outcome: even “small 4 billion parameter models” decoded those changes “with ease.”

The author draws a conclusion from that test: low-level orthographic tricks may not reliably evade contemporary large language models. The post frames this as an empirical observation about tokenization and model robustness rather than a theoretical claim about future systems.
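To make "altered orthography" concrete, here is a toy respelling function. The substitution rules and the `respell` name are invented for illustration; they are not the author's rules, and the sample phrase is not the sentence from the post.

```python
# Hypothetical phonetic substitutions, applied in order to lowercase text.
RULES = [
    ("tion", "shun"),
    ("ph", "f"),
    ("ck", "k"),
    ("qu", "kw"),
]


def respell(text):
    """Apply each substitution in turn to produce a phonetic misspelling."""
    result = text.lower()
    for old, new in RULES:
        result = result.replace(old, new)
    return result


if __name__ == "__main__":
    print(respell("quick phone station"))
```

The output stays pronounceable, which is plausibly why both a human reading aloud and a model trained on noisy text can recover the original.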

Layering steganography: coherence versus readability

Another substantive point in the post is conceptual. Steganography can operate at different “layers” of language; the author argues that the higher the layer — effectively, the longer the token span used to encode information — the more coherent the resulting stego-text is word for word. But that coherence comes at a cost: texts encoded at higher layers tend to “read badly” because of jumps in context or similar discontinuities.

Put simply, the more you stretch a covert message across linguistic structure, the more likely the surface text will betray oddities in flow even if individual words appear plausible. That trade-off — coherence of discrete tokens versus overall readability — is central to the author’s critique of text-in-text steganography.
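The trade-off can be seen in a minimal sentence-layer encoder, sketched here under stated assumptions: the four-sentence bank, the two-bits-per-sentence scheme, and both function names are hypothetical and are not described in the post.

```python
# Hypothetical sentence bank: the index of each sentence encodes two bits.
SENTENCES = [
    "The train was late again this morning.",     # 00
    "My neighbour repainted the fence green.",    # 01
    "Coffee prices went up at the corner shop.",  # 10
    "The cat slept on the warm radiator.",        # 11
]


def encode_sentence_layer(bits):
    """Map each pair of bits to one whole sentence from the bank."""
    if len(bits) % 2 or set(bits) - {"0", "1"}:
        raise ValueError("bits must be an even-length string of 0s and 1s")
    chosen = [SENTENCES[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2)]
    return " ".join(chosen)


def decode_sentence_layer(text):
    """Recover the bit string by matching sentences from the bank in order."""
    bits = []
    rest = text.strip()
    while rest:
        for index, sentence in enumerate(SENTENCES):
            if rest.startswith(sentence):
                bits.append(format(index, "02b"))
                rest = rest[len(sentence):].lstrip()
                break
        else:
            raise ValueError("text contains a sentence outside the bank")
    return "".join(bits)


if __name__ == "__main__":
    stego = encode_sentence_layer("0111")
    print(stego)
    print(decode_sentence_layer(stego))
```

Every sentence is grammatical on its own, yet the sequence lurches from fences to cats: the word-for-word coherence and the context jumps the author describes show up together.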

TEMPEST, Zero Emission Pad, and historical lineage

The post situates these modern concerns alongside an older class of emissions-security (EmSec) work. It credits Markus G. Kuhn, working at the University of Cambridge Computer Laboratory under Prof. Ross J. Anderson, for the original “Soft Tempest” font research and points readers to the lab blog “lightbluetouchpaper.org,” where related conversations took place.

The author mentions a now-rare Windows program called “Zero Emission Pad,” described as an anti‑TEMPEST notepad that performed font smoothing for emissions protection. The post suggests the software vanished quickly from much of the web and even proposes that, if found, it could be uploaded to archive.org — a preservation recommendation rather than a technical endorsement.

SDRs, GNU Radio, Tempest for Eliza, and modern signal risk

The essay moves from fonts to radio: it notes that Software Defined Radios (SDRs), greater CPU power, and wider I/Q bandwidths have materially changed what hobbyists and researchers can do. The author points to GNU Radio as an enabler that lets users define radio chains and observes that these advances have improved both offensive and defensive capabilities in the EmSec space.

Against that backdrop, the post asserts that the original Soft Tempest Fonts “don’t give you much these days” and that spread-spectrum techniques can improve things only to a limited degree. As demonstration software, the author recommends the free, open-source program “Tempest for Eliza,” which the post says “works on modern day monitors and demonstrates just how insecure our devices are.” The post adds that no special hardware is needed and that a monitor alone can broadcast to a local AM/FM radio. For more advanced users, it points to TempestSDR as an SDR-oriented option.
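The general principle behind such monitor demonstrations can be sketched as follows: pixels are sent to the display one after another at the pixel clock, so a black-and-white pattern that toggles at a radio frequency, switched on and off at an audio rate, behaves like a crude amplitude-modulated signal. This is an illustration of that idea only, not code from Tempest for Eliza. The function name is hypothetical, and the timing figures in the usage example (a 65 MHz pixel clock and 1344 pixel periods per line) are assumptions that would have to match the actual video mode.

```python
import math


def am_test_pattern(width, height, h_total, pixel_clock_hz, carrier_hz, tone_hz):
    """Return rows of 0/255 pixel values approximating a tone-gated carrier.

    h_total is the number of pixel periods per scan line including blanking,
    so the moment pixel (x, y) is emitted is (y * h_total + x) / pixel_clock_hz.
    Vertical blanking is ignored; the pattern covers a single frame.
    """
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            t = (y * h_total + x) / pixel_clock_hz  # seconds since frame start
            carrier_high = math.cos(2 * math.pi * carrier_hz * t) >= 0
            tone_on = math.sin(2 * math.pi * tone_hz * t) >= 0
            row.append(255 if (carrier_high and tone_on) else 0)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Assumed timing values and frequencies, chosen only for illustration.
    pattern = am_test_pattern(
        width=64, height=4, h_total=1344,
        pixel_clock_hz=65_000_000, carrier_hz=1_000_000, tone_hz=440,
    )
    for row in pattern:
        print("".join("#" if value else "." for value in row))
```

With these numbers one carrier cycle spans 65 pixels, so each row shows alternating bright and dark bands roughly 32 pixels wide.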

What this means for technologists, defenders, and end users

Technologists and security teams: Expect that low-level orthographic tricks will not reliably fool contemporary LLMs; the author’s test with a 4 billion parameter model is explicit on that point. At the same time, advances in SDRs and signal-processing software have altered the EmSec threat model compared with earlier eras.
Defenders and procurement leaders: The post highlights older countermeasures (Soft Tempest Fonts, Zero Emission Pad) as increasingly insufficient in isolation; spread-spectrum techniques and hardware-aware defenses are discussed as partial mitigations but are described as limited.
End users and curious researchers: The author points to publicly available demonstrations — notably “Tempest for Eliza” — as accessible ways to understand monitor-based emissions risks, while cautioning implicitly about the limits of antique tools and the rapid evolution of SDR capabilities.

The overall thrust of the essay is clear: some everyday tricks to hide text work against human readers, but machine models and modern radio tooling blunt many historical approaches to steganography and emissions security. The author threads practical anecdotes and named programs through a single argument — that the layer at which you attempt to hide information matters, and that tools once thought protective are no longer panaceas.
