Illustrations by Carl De Torres
Their neighbors thought they were just ordinary U.S. residents, but secretly they were spies, sent by Russia’s Foreign Intelligence Service to gather information on U.S. policies and programs. For years they thwarted detection partly by hiding secret correspondence in seemingly innocent pictures posted on public websites. They encoded and decoded the dispatches using custom-made software.
But the scheme wasn’t as covert as the spies had assumed. Eventually investigators from the U.S. Department of Justice tracked down the altered images, which helped build a case against the Russians. In June 2010, federal agents arrested 10 of them; a few weeks later, they admitted to being secret agents.
The act of concealing data in plain sight is known as steganography. Since antiquity, clandestine couriers have used hundreds of steganographic techniques, including invisible ink, shrunken text, and strategically placed tattoos. Picture steganography—one of the Russian spies’ primary tactics—dates back to about the early 1990s. That they used such an old-school strategy is odd, particularly because doctored images can be detected and used as evidence.
A more modern approach, known as network steganography, leaves almost no trail [see “Vice Over IP,” IEEE Spectrum, February 2010]. Rather than embed confidential information in data files, such as JPEGs or MP3s, network steganography programs hide communication in seemingly innocent Internet traffic. And because these programs use short-lived delivery channels—a Voice over Internet Protocol (VoIP) connection, for example—the hidden exchanges are much harder to detect.
Network security experts have invented all of the dozens of publicly documented network steganography techniques. But this doesn’t mean that criminals, hackers, and spies—as well as persecuted citizens wanting to evade government censorship or journalists wanting to conceal sources—aren’t using these or similar tactics. They probably are, but nobody has tools that are effective enough to detect these techniques. In fact, had the Russian spies used newer steganography methods, they might not have been exposed so handily.
As members of the Network Security Group at Warsaw University of Technology, in Poland, we study new ways to disguise data in order to help security experts design better detection software for those cases when steganography is used for nefarious purposes. As communication technologies evolve, we and other steganographers must develop ever more advanced steganography techniques.
About a decade ago, state-of-the-art programs primarily manipulated the Internet Protocol. Today, however, the most sophisticated methods target specific Internet services, such as search tools, social networks, and file-transfer systems. To illustrate the range of things that are possible, we present four steganographic techniques we’ve recently developed, each of which exploits a common use of the Internet.
Silences in a telephone conversation can carry a great deal of meaning—and hidden messages.
Skype, Microsoft’s proprietary VoIP service, is particularly easy to exploit because of the way the software packages audio data. While a user—let’s call her Alice—is talking, Skype stuffs the data into transmission packets. But unlike many other VoIP apps, Skype continues to generate audio packets when Alice is silent. This improves the quality of the call and helps the data clear security firewalls, among other advantages.
But the outgoing silence packets also present an opportunity to smuggle secret information. These packets are easy to recognize because they’re much smaller—about half the number of bits—than the packets containing Alice’s voice.
We’ve developed a steganography program that allows Alice to identify the small-size packets and replace their contents with encrypted secret data. We call this program SkyDe, shorthand for Skype Hide. For a covert transaction to take place, the recipient of Alice’s call—let’s name him Bob—also needs to have SkyDe installed on his computer. The software intercepts Alice’s transmission, grabs some of the small packets while letting all of the big ones pass through, and then reassembles the secret message.
Meanwhile, Alice and Bob chat away as if nothing unusual were transpiring. Bob’s Skype application assumes the filched packets have simply been lost. Skype then fills the gap left by each lost packet, most likely by reconstructing its contents from those of the neighboring packets. (Because Skype is proprietary, we don’t know for sure.) As a result, the missing silence packets sound just like all the other silence packets surrounding them.
Our experiments show that up to 30 percent of Alice’s silence packets can transport clandestine cargo without causing a noticeable change in call quality. This means that Alice could send Bob up to about 2 kilobits per second of secret data—roughly 100 pages of text in 4 minutes—without arousing the suspicion of anyone monitoring their call.
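To make the idea concrete, here is a minimal Python sketch of size-based packet substitution. It is not the SkyDe code. The 40-byte silence threshold, the keyed-hash rule for choosing which silence packets carry data, and the function names are all assumptions made for illustration, and the encryption step and the removal of packets on Bob's side are left out.

```python
import hashlib

SILENCE_MAX_BYTES = 40   # assumed threshold: payloads this small count as silence
CARRIER_FRACTION = 0.30  # use roughly 30 percent of silence packets as carriers

def is_carrier(key, silence_index):
    """Hypothetical rule both parties can compute from a shared key."""
    digest = hashlib.sha256(key + silence_index.to_bytes(8, "big")).digest()
    return digest[0] < 256 * CARRIER_FRACTION

def embed(packets, secret, key):
    """Overwrite selected silence payloads with successive chunks of secret."""
    out, offset, silence_index = [], 0, 0
    for payload in packets:
        if len(payload) <= SILENCE_MAX_BYTES:
            if offset < len(secret) and is_carrier(key, silence_index):
                chunk = secret[offset:offset + len(payload)]
                offset += len(chunk)
                # Keep the packet the same size so it still looks like silence.
                payload = chunk.ljust(len(payload), b"\x00")
            silence_index += 1
        out.append(payload)
    return out, offset  # offset is the number of secret bytes embedded

def extract(packets, secret_len, key):
    """Collect payloads from the same carrier packets the sender used."""
    collected, silence_index = b"", 0
    for payload in packets:
        if len(payload) <= SILENCE_MAX_BYTES:
            if len(collected) < secret_len and is_carrier(key, silence_index):
                collected += payload
            silence_index += 1
    return collected[:secret_len]
```

In this sketch the voice packets pass through untouched, and the receiver needs only the shared key and the message length to find the carriers. A real tool would also have to cope with lost or reordered packets, which this example ignores.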
What better place to hide secrets than in one of the world’s most popular file-sharing systems? The peer-to-peer transfer protocol BitTorrent conveys hundreds of trillions of bits worldwide every second. Anyone sniffing for criminal correspondence on its networks would have better luck finding that proverbial needle in a haystack.
Our group developed StegTorrent for encoding classified information in BitTorrent transactions. This method takes advantage of the fact that a BitTorrent user often shares a data file (or pieces of the file) with many recipients at once.
So let’s say Alice wants to send a hidden message to Bob. First, Bob needs to have previously established control over a group of distributed computers that all run a BitTorrent application. These are most likely computers that Bob owns or, if he’s an especially savvy hacker, computers he has co-opted to do his bidding. Both he and Alice need to know how many computers are in this group and what their IP addresses are.
For simplicity’s sake, let’s say Bob controls a group of just two computers. To initiate a transaction, he commands the computers to each request a file from Alice. In a typical BitTorrent transfer, Alice’s program would transmit the data packets in random order, and Bob’s computers would stitch them back together based on the instructions they contain. Using StegTorrent, however, Alice can reorder the packets to encode a specific bit sequence.
For example, if she sends a packet to computer 1 and then to computer 2, that sequence might designate the binary number 1. But if she sends a packet to computer 2 first, Bob’s StegTorrent program would read the signal as binary number 0. To prevent scrambling due to packet losses or delays, StegTorrent modifies the time stamp on each packet so that Bob can decipher the exact order in which Alice sent them. Our experiments showed that using six IP addresses, Alice can relay up to 270 secret bits per second—enough bandwidth for a simple text conversation—without distorting the transfers or attracting suspicion.
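The two-computer ordering code described above can be sketched in a few lines of Python. This is an illustration only, not StegTorrent itself: the function names are hypothetical, and the time stamps are modeled as simple sequence numbers that Bob uses to restore the sending order.

```python
def encode_order(bits):
    """Turn each secret bit into a send order for two receiving computers.

    Sending to computer 1 and then computer 2 encodes a 1;
    sending to computer 2 and then computer 1 encodes a 0.
    """
    schedule = []
    for bit in bits:
        schedule.extend([1, 2] if bit == 1 else [2, 1])
    return schedule

def decode_order(stamped):
    """Recover bits from (timestamp, computer_id) pairs in any arrival order."""
    # Sorting by time stamp restores the order in which the packets were sent.
    ordered = [computer for _, computer in sorted(stamped)]
    bits = []
    for first, second in zip(ordered[0::2], ordered[1::2]):
        bits.append(1 if (first, second) == (1, 2) else 0)
    return bits

# Usage: stamp each packet at send time, then let the network scramble arrival.
message = [1, 0, 0, 1, 1]
stamped = list(enumerate(encode_order(message)))
arrived = list(reversed(stamped))          # worst-case reordering in transit
recovered = decode_order(arrived)          # equals message
```

With more than two receiving computers, each round of sends could be ordered in more ways and so carry more than one bit, though the exact coding used for the six-address experiment is not described here.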
Alice can also conceal her messages to Bob—and the fact that the two conspirators are communicating at all—simply by having him perform a series of innocent-looking Google searches. Our StegSuggest steganography program targets the feature Google Suggest, which lists the 10 most popular search phrases given a string of letters a user has entered in Google’s search box.
Here’s how it works: For Alice to send Bob a hidden note, she must first infect his computer with StegSuggest malware so that she can monitor the traffic exchanged between Google’s servers and Bob’s browser. This can be done using basic hacker tools. Then, when Bob types in a random search term, say, “Robots will…,” Alice intercepts the data traveling from Google to Bob. Using StegSuggest, she adds a unique word to the end of each of the 10 phrases Google suggests. The software chooses these additions from a list of 4096 common English words, so the new phrases aren’t likely to be too bizarre. For example, if Google suggests the phrase “Robots will take our jobs,” Alice might add “Robots will take our jobs tree.” Odd, yes, but probably not worthy of alarm.
Bob’s StegSuggest program then extracts each added word and converts it into a 10-bit sequence using a previously shared lookup table. (Each of the 1024 possible bit sequences corresponds to four different words, making the code more difficult to crack.) Alice can thus transmit 100 secret bits each time Bob types a new term into his Google search box.
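A rough Python sketch of this word-to-bits mapping follows. The word list here is a placeholder (generated tokens rather than 4096 real English words), the table layout is an assumption, and the function names are hypothetical. Only the arithmetic comes from the description above: 4096 words, 10 bits per word, four words per bit sequence, and 10 suggestions per search.

```python
import random

# Placeholder vocabulary; a real table would hold 4096 common English words.
WORDS = [f"word{i:04d}" for i in range(4096)]

# Assumed shared lookup table: each word maps to one of 1024 ten-bit values.
WORD_TO_VALUE = {word: i % 1024 for i, word in enumerate(WORDS)}
VALUE_TO_WORDS = {}
for word, value in WORD_TO_VALUE.items():
    VALUE_TO_WORDS.setdefault(value, []).append(word)  # four words per value

def embed(suggestions, bits, rng):
    """Append one code word per suggestion; bits is a string of '0' and '1'."""
    if len(bits) != 10 * len(suggestions):
        raise ValueError("need exactly 10 bits per suggestion")
    stego = []
    for i, phrase in enumerate(suggestions):
        value = int(bits[10 * i:10 * i + 10], 2)
        # Any of the four words for this value will decode to the same bits.
        stego.append(f"{phrase} {rng.choice(VALUE_TO_WORDS[value])}")
    return stego

def extract(stego):
    """Strip the trailing word from each phrase and look up its 10 bits."""
    originals, bits = [], ""
    for phrase in stego:
        original, word = phrase.rsplit(" ", 1)
        originals.append(original)
        bits += format(WORD_TO_VALUE[word], "010b")
    return originals, bits
```

Ten suggestions at 10 bits each give the 100 bits per search mentioned above. Choosing at random among the four equivalent words means the same bit sequence does not always produce the same visible word.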
To send data faster, Alice could hijack the searches of several innocent googlers in a crowded hot spot, such as an Internet café or a college dormitory. In this scenario, both she and Bob would intercept the googlers’ traffic. Alice would insert the coded words into Google’s suggested phrases, and Bob would extract and decode them. He would pass on only the original phrases to the googlers—who would never suspect they had just facilitated a secret exchange.
Now let’s say Alice wants to secretly send video in addition to documents or text messages. In this case, she might opt to smuggle the stream in a very average-looking wireless transmission.
But not just any wireless network will do. Alice must use a network that relies on the data-encoding technique known as orthogonal frequency-division multiplexing (OFDM). Wireless standards that employ this scheme are some of the most popular, including certain versions of IEEE 802.11, used in Wi-Fi networks.
To understand how to hide data in OFDM signals, you must first know something about how OFDM works. This transmission scheme divvies up a digital payload among several small-bandwidth carriers of different frequencies. These narrowband carriers are more resilient to atmospheric degradation than a single wideband wave, allowing data to pass to receivers with higher fidelity. OFDM carefully selects carriers and divides the bits up into groups of set length, known as symbols, to minimize interference.
In reality, though, a digital payload rarely divides perfectly into a collection of symbols; there will usually be some symbols left with too few bits. So OFDM transmitters add extra throwaway bits to these symbols until they conform to the standard size.
Because this “bit padding” is meaningless, Alice can replace it with secret data without compromising the original data transmission. We call this steganographic method Wireless Padding, or WiPad. Because bit padding is abundant in OFDM transmissions, Alice can send hidden data to Bob at a pretty good clip. A single connection on a typical Wi-Fi network in a school or coffee shop, for instance, could support up to 2 megabits per second—fast enough for Alice to secretly stream standard-definition video to Bob.
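The padding trick can be illustrated with a short Python sketch. It is a simplified model, not the WiPad implementation: bits are represented as text strings, the function names are hypothetical, and the symbol size of 216 bits in the usage example is an illustrative value (it matches the data bits per OFDM symbol at the 54-megabit-per-second rate of IEEE 802.11a/g).

```python
def pad_length(payload_bits, symbol_bits):
    """Number of throwaway bits needed to fill the last symbol."""
    return -payload_bits % symbol_bits

def embed(payload, symbol_bits, secret):
    """Pad payload to a whole number of symbols, hiding secret in the padding.

    payload and secret are strings of '0' and '1'. Returns the padded bit
    string and the number of secret bits that fit.
    """
    room = pad_length(len(payload), symbol_bits)
    hidden = secret[:room]
    padding = hidden + "0" * (room - len(hidden))  # zero-fill whatever is left
    return payload + padding, len(hidden)

def extract(padded, payload_len, hidden_len):
    """Read the hidden bits back out of the padding region."""
    return padded[payload_len:payload_len + hidden_len]

# Usage: a 310-bit payload with 216-bit symbols leaves 122 padding bits.
payload = "1" * 310
secret = "1011" * 10
padded, hidden_len = embed(payload, 216, secret)
recovered = extract(padded, len(payload), hidden_len)  # equals secret
```

An ordinary receiver discards the padding and sees only the original payload, while a receiver that knows where the payload ends can read the hidden bits. How much room each frame offers depends on the frame length and the symbol size, so the hidden data rate varies with the traffic.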
This article originally appeared online 23 September 2013. A version appeared in print in the November 2013 issue.
Wojciech Mazurczyk, Krzysztof Szczypiorski, and Józef Lubacz wrote “Vice Over IP” in the February 2010 issue of IEEE Spectrum. In 2002, as members of the Network Security Group at Warsaw University of Technology, in Poland, they founded the Stegano.net project to investigate new ways to smuggle data through networks and how to thwart such attempts. After many years spent anticipating evildoers and their machinations, Szczypiorski says his favorite saying comes from Indiana Jones: “Nothing shocks me. I’m a scientist.”