Statistics sniff out secrets

By Kimberly Patch, Technology Research News

As digitized pictures, audio and text proliferate, people are exploring ways to exploit these media by hiding messages within the information, which leads others to try to detect these hidden messages.

Although steganography -- the practice of hiding a secret message in written or audio information -- is hardly new, computers and the Web have added a new twist simply because the volume of information that makes up digitized media is so large. This provides for historically large haystacks that easily obscure needles.

A Dartmouth College researcher has found a method that makes it easier to detect hidden messages in digital images, which can contain a megabyte -- a string of roughly eight million ones and zeros -- of information, or more.

Digital images are made up of pixels, or dots of color. Especially with high-resolution digital images that have one million or more different shades of color, it's easy to hide a message by slightly altering these colors in ways that are imperceptible to the human eye.
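To make the idea of imperceptible color changes concrete, here is a minimal sketch of least-significant-bit embedding, one common hiding approach. The article does not say which hiding schemes were tested, so the technique, the function names embed_lsb and extract_lsb, and the toy pixel values are all illustrative assumptions.

```python
# Illustrative sketch only: least-significant-bit (LSB) hiding on a flat
# list of 8-bit pixel values. Names and data are hypothetical.

def embed_lsb(pixels, message_bits):
    """Return a copy of pixels whose lowest bits carry message_bits."""
    if len(message_bits) > len(pixels):
        raise ValueError("message is longer than the cover image")
    stego = list(pixels)
    for i, bit in enumerate(message_bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego

def extract_lsb(pixels, n_bits):
    """Read back the lowest bit of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]

cover = [200, 201, 54, 55, 128, 129, 17, 16]   # toy pixel intensities
bits = [1, 0, 1, 1, 0, 0, 1, 0]                # toy message
stego = embed_lsb(cover, bits)
recovered = extract_lsb(stego, len(bits))
largest_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Each pixel value changes by at most one intensity level, which is why the alteration is hard to see but can still shift the image's statistics.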

In an image that has not been tampered with, however, the information that makes up the image is not simply random. The key to the Dartmouth detection method is creating a statistical profile of the compressed data files that make up natural, or undisturbed images, then checking a given image against the profile, according to Hany Farid, an assistant professor of computer science at Dartmouth. "In order to detect hidden messages in an image we need to start by characterizing the statistics of natural images. The hope, then, is that when a message is hidden in an image, these statistics are disturbed," he said.

When images are compressed so they can be stored as smaller files, the digital information that indicates the color of each pixel is changed into wavelet information. Wavelets describe an image in terms of spatial position, orientation and scale. Wavelets allow for compression because the image can be reconstructed closely from only a portion of the wavelet information. An image is compressed by storing only the portion that is needed to reconstruct the whole.
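As a rough illustration of how a wavelet transform splits data into a coarse part and a detail part, the sketch below applies one level of the Haar transform, the simplest wavelet, to a short row of pixel values. The article does not name the wavelet used, so Haar and the function names haar_step and haar_inverse are assumptions made for illustration.

```python
# Illustrative sketch only: one level of a 1-D Haar wavelet transform.
# The input length is assumed to be even.

def haar_step(signal):
    """Split a signal into pairwise averages and pairwise differences."""
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / 2)   # coarse, low-resolution part
        detail.append((a - b) / 2)   # fine detail coefficient
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the original signal from averages and differences."""
    out = []
    for s, d in zip(approx, detail):
        out.extend([s + d, s - d])
    return out

row = [10, 12, 50, 54]               # toy row of pixel intensities
approx, detail = haar_step(row)
restored = haar_inverse(approx, detail)
```

Detail coefficients tend to be small in smooth regions, so a compressor can store them coarsely or drop them; those coefficients are also the numbers whose statistics the detection method examines.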

Farid collected two types of wavelet statistics: variations like mean, variance, skewness and kurtosis in the coefficients, or numbers that make up the wavelets, and information about the rate of errors that occur when reconstructing full wavelets from compressed information.

Variance shows how spread out the data is from the mean, or average; skewness shows how evenly distributed the data is on either side of the mean, and kurtosis shows how peaked the distribution of data is around the mean, said Farid.
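The four statistics can be computed directly from a list of coefficients. The sketch below uses population (divide-by-n) formulas; the exact estimators in Farid's paper may differ, and the function name moments and the sample data are hypothetical.

```python
# Illustrative sketch only: mean, variance, skewness and kurtosis of a
# list of numbers, using population (divide-by-n) formulas.
import math

def moments(values):
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        # All values identical: higher moments are undefined, report 0.
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(variance)
    skewness = sum(((v - mean) / std) ** 3 for v in values) / n
    kurtosis = sum(((v - mean) / std) ** 4 for v in values) / n
    return mean, variance, skewness, kurtosis

# Toy data: symmetric around 3, so skewness should come out as zero.
mean, variance, skewness, kurtosis = moments([1, 2, 3, 4, 5])
```

For comparison, a normal (bell-curve) distribution has a kurtosis of 3 under this definition; larger values indicate a sharper peak with heavier tails.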

He then combined the variation and error rate statistics into a vector -- a mathematical construction that is like a virtual sculpture with 70 to 100 dimensions rather than the usual three.

By comparing the statistical vector information with the same information in an individual image, Farid was able to tell if the image had been disturbed with a hidden message, he said.
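The comparison step can be sketched as checking how far an image's feature vector sits from a profile built from undisturbed images. The article does not describe the classifier, so the per-feature z-score test below is a simple stand-in rather than the method from the paper; the three-dimensional vectors, the threshold of 3 and all names are hypothetical.

```python
# Illustrative sketch only: flag a feature vector that deviates strongly
# from a profile of natural-image vectors. All data here is made up.
import math

def build_profile(vectors):
    """Per-feature mean and standard deviation over the training vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dims)]
    stds = [
        math.sqrt(sum((v[i] - means[i]) ** 2 for v in vectors) / n)
        for i in range(dims)
    ]
    return means, stds

def deviation_score(vector, profile):
    """Largest absolute z-score of any feature against the profile."""
    means, stds = profile
    return max(
        abs(x - m) / s if s > 0 else 0.0
        for x, m, s in zip(vector, means, stds)
    )

natural = [[1.0, 0.1, 3.0], [1.2, -0.1, 3.2], [0.8, 0.0, 2.8]]
profile = build_profile(natural)

THRESHOLD = 3.0                                   # assumed cutoff
clean_score = deviation_score([1.0, 0.05, 3.1], profile)
suspect_score = deviation_score([1.0, 0.9, 3.0], profile)
is_suspect = suspect_score > THRESHOLD
```

A real detector would use the full 70-to-100-dimensional vector and a classifier trained on both clean and message-carrying images; the thresholding here only conveys the idea of measuring departure from a natural-image profile.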

The practice of information hiding, or steganography, is related to, but different from, cryptography. In cryptography a message is encrypted and then transmitted. If you saw the transmission you wouldn't be able to decipher the message, but you would know the sender and receiver might be trading secrets. The goal of information hiding is to go a step further by camouflaging the transmission entirely, said Farid.

The statistical vector method only detects hidden messages, and cannot read or remove them, but may eventually be adapted to do so, said Farid. "This work cannot obviously be adapted to remove or decipher the hidden message. I do believe, however, that it is possible to do so," he said.

The technique is an extension of previous steganography detection schemes, said Neil F. Johnson, associate director of the Center for Secure Information Systems. "It is potentially useful if the techniques for detection are repeatable," he said.

In addition to determining if there's information embedded in a message, it is also useful for a detection method to identify the steganography technique used to hide the information, said Johnson. Another goal is to be able to extract the embedded information, he said. Steganography tools exist that can do this in at least some cases, he added.

Steganography has many applications, both good and bad, said Farid. "It can be used to protect copyrights in digital media, for unobtrusive military and intelligence communication, covert criminal communication, trafficking of illegal pornography, and for the protection of civilian speech against repressive governments."

An unfortunate side effect of research that reveals hidden messages is that "repressive governments could use this research to limit civilian speech," Farid said. Because of the possible unpleasant applications, "some will be very critical of this research, possibly with good reason," said Farid. "Nevertheless, I believe that the development of these techniques are inevitable and... will lead to better techniques for hiding information, which in turn will lead to better detection schemes and so on. My larger research vision is in authenticating digital media so that [neither] the 'good-guys' [nor] the 'bad-guys' will... be able to manipulate digital sound, image or video to suit their needs," he said.

The method could also eventually be applied to analyzing works of art to detect forgeries or to determine if more than one artist worked on a single painting, Farid said.

The method could be used practically in less than two years, said Farid.

Farid's research was funded by the National Science Foundation (NSF) and the Department of Justice (DOJ).

Timeline:   < 2 years
Funding:   Government
TRN Categories:  Cryptography and Security; Pattern Recognition
Story Type:   News
Related Elements:  Technical paper, "Detecting Steganographic Messages in Digital Images," posted at http://www.cs.dartmouth.edu/farid/publications/tr01.html




September 26, 2001

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.