Stegdetect Research

(Via Gwen)
Everything is on GitHub at https://github.com/linggan

AutoSteg: a framework that embeds with F5 and then tests detection on a directory of pics (with the output printed to a CSV file).
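As a rough idea of what that embed-then-detect loop looks like, here is a minimal Python sketch. It is not the AutoSteg code itself: `run_batch`, `embed`, `detect`, and the CSV column names are all hypothetical, and the two callables stand in for whatever F5 embedder and detector are actually being used.

```python
import csv
from pathlib import Path


def run_batch(image_dir, embed, detect, csv_path):
    """Embed into every .jpg in image_dir, run detection, and log results to a CSV.

    embed:  hypothetical callable, takes an image path, returns the stego image path
    detect: hypothetical callable, takes an image path, returns a short result string
    """
    rows = []
    for image in sorted(Path(image_dir).glob("*.jpg")):
        stego = embed(image)
        verdict = detect(stego)
        rows.append((image.name, verdict))

    # newline="" lets the csv module control line endings itself
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image", "detection_result"])
        writer.writerows(rows)
    return rows
```

Passing the embedder and detector in as callables keeps the loop independent of any particular tool, so the same CSV can be produced for different embedding settings and compared afterwards.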

C port of the F5 algorithm: This one is only about 70% done, but the hard math parts are all implemented (and commented pretty thoroughly, if you're curious about lining up the paper with the code and seeing what all the vague math explanations actually mean). What's left is basically fixing the bit manipulation errors (which I haven't been able to track down, unfortunately) and putting the code together with some libjpeg functions that will give the algorithm the right information from the input picture.

As for what my research has come up with regarding mitigating detection, there are two factors to consider:
-- embedding in lower frequency coefficients. While it limits capacity, it makes sense intuitively that the smaller the message is, the fewer changes there are in the picture to detect.
-- only embedding in luminance coefficients. To explain a bit more, JPEG compression involves transforming the pixels from an RGB representation to YCbCr (one luminance component and two chrominance components). Each chrominance coefficient tends to contain more information about the picture (I was told that changing one chrominance coefficient is about equivalent to changing four luminance coefficients), so embedding information in them would likely be more visibly detectable.
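
To make the two factors above a bit more concrete, here is a minimal Python sketch that embeds only in the low-frequency AC coefficients of a single luminance block. It is not F5: it uses plain LSB replacement on the coefficient magnitudes as a stand-in. The names `zigzag_positions`, `embed_bits`, `extract_bits`, and the `max_index` cutoff are made up for illustration, and it assumes the luminance (Y) block has already been read out of the JPEG as an 8x8 grid of quantized DCT coefficients.

```python
def zigzag_positions(n=8):
    """Return (row, col) pairs of an n x n block in JPEG zigzag order (low to high frequency)."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Cells on the same anti-diagonal share r + c; the scan direction alternates per diagonal.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def embed_bits(y_block, bits, max_index=10):
    """Write bits into the first max_index AC coefficients of one luminance block.

    Only coefficients with magnitude >= 2 are used, so an embedded value can never
    become 0 or 1 and the extractor selects exactly the same coefficients.
    Returns (new_block, number_of_bits_embedded).
    """
    out = [row[:] for row in y_block]
    used = 0
    for r, c in zigzag_positions()[1:max_index + 1]:  # index 0 is the DC coefficient
        if used == len(bits):
            break
        value = out[r][c]
        if abs(value) < 2:
            continue
        magnitude = (abs(value) & ~1) | bits[used]    # replace the LSB of the magnitude
        out[r][c] = magnitude if value > 0 else -magnitude
        used += 1
    return out, used


def extract_bits(y_block, count, max_index=10):
    """Read back up to count bits from the same low-frequency positions."""
    bits = []
    for r, c in zigzag_positions()[1:max_index + 1]:
        if len(bits) == count:
            break
        if abs(y_block[r][c]) >= 2:
            bits.append(abs(y_block[r][c]) & 1)
    return bits
```

Lowering `max_index` confines the changes to fewer, lower-frequency coefficients at the cost of capacity, which is the trade-off described in the first point. Leaving the Cb and Cr blocks untouched covers the second.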

And as for the question of making it possible to upload stego-images to Facebook and other social media, it's feasible (with even more limited capacity) with the error-correcting codes out there. However, the question hasn't really been explored (that is, the problem of recovering a message from a stego-image, assuming errors are introduced in transmission) other than that one JavaScript implementation of embedding 137 characters into a 900x700 pic. The follow-up question of mitigating detection is even more uncharted territory. The research and time needed to implement such a feature would be non-trivial, as the person working on it would not only have to understand the mathematical mind riddles that are error-correcting codes (in particular, wet paper codes, Reed-Solomon codes, and turbo codes) well enough to apply and modify them as appropriate, but also be able to actually implement them in code. So hey, if Guardian were to ever get into applied math/comp sci research, this would be one rabbit hole to go down!
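
To show why error correction costs capacity, here is a minimal Python sketch using a repetition code with majority voting. This is a deliberately simple stand-in and much weaker than the wet paper, Reed-Solomon, or turbo codes mentioned above; the function names and the default of 3 repeats are assumptions made for illustration.

```python
def encode_repetition(bits, n=3):
    """Repeat every message bit n times (n should be odd so the vote cannot tie)."""
    return [bit for bit in bits for _ in range(n)]


def decode_repetition(coded, n=3):
    """Recover the message by majority vote over each group of n received bits."""
    message = []
    for start in range(0, len(coded) - n + 1, n):
        group = coded[start:start + n]
        message.append(1 if sum(group) * 2 > n else 0)
    return message


# Usage: a 4-bit message takes 12 embedded bits, and survives one flipped bit per group.
message = [1, 0, 1, 1]
received = encode_repetition(message)
received[0] ^= 1  # simulate damage from recompression
received[5] ^= 1
recovered = decode_repetition(received)
```

With 3 repeats the payload shrinks to a third of the raw embedding capacity, and it only tolerates one error per group, which gives a feel for why stronger codes are needed when an image gets recompressed on upload.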
