29a.ch experiments by Jonas Wagner



Recent Articles

Noise Analysis for Image Forensics

Written by Jonas Wagner

People seem to like Forensically, and I enjoy hacking on it. So I upgraded it with a new tool: noise analysis. I also updated the help page and implemented histogram equalization for the magnifier/ELA.

Screenshot of the noise analysis tool
Open the noise analysis tool

Noise Analysis

The basic idea behind this tool is very simple. Natural images are full of noise. When they are modified, this often leaves visible traces in the noise of the image. But seeing the noise in an image can be hard. This is where the new tool in Forensically comes in. It takes a very simple noise reduction filter (a separable median filter) and reverses its result: rather than removing the noise, it removes the rest of the image.

One of the benefits of this tool is that it can recognize modifications like airbrushing, warps, deforms, and transformed clones that the clone detection and error level analysis might not catch.

Please be aware that this is still a work in progress.


Enough talk, let me show you an example. I gave myself a nose job with the warp tool in GIMP, just for you.

nose manipulation animation

As you can see the effect is relatively subtle. Not so the traces it leaves in the noise!

noise analysis of nose manipulation

The resampling done by the warp tool loses some of the high frequency noise, creating a black halo around the region.

Can you find any anomalies in the demo image using noise analysis?

A bit of code

I guess that many of the readers of this blog are fellow coders and hackers, so here is a cool hack for you. I found this in some old sorting code I wrote for a programming competition, but I don't remember where I got it from originally. In order to make the median filter fast we need a fast way to find the median of three variables. A stupidly slow way to go about this could look like this:

// super slow: sort the three values and take the middle one
// (a numeric comparator is needed, Array.prototype.sort compares lexicographically by default)
[a, b, c].sort((x, y) => x - y)[1]

Now the obvious way to optimize this would be to transform it into a bunch of ifs. That's going to be faster. But can we do even better? Can we do it without branching?

// fast
let max = Math.max(Math.max(a,b),c),
    min = Math.min(Math.min(a,b),c),
    // the max and the min value cancel out because of the xor, the median remains
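    // note: the xor trick relies on a, b and c being integers, since ^ operates on 32-bit integers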
    median = a^b^c^max^min;

Min and max can be computed without branching on most if not all CPU architectures. I didn't check how it's handled in JavaScript engines, but I can show you that the second approach is ~100x faster with this little benchmark.
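
If you want to reproduce the comparison yourself, a minimal micro-benchmark sketch along these lines should do; the iteration count and the console.time based timing are my choices, not the benchmark from the original post:

// micro-benchmark sketch: median of three via sorting vs. min/max/xor
const N = 1000000;
const data = new Int32Array(3 * N).map(() => (Math.random() * 256) | 0);

function medianSort(a, b, c) {
    return [a, b, c].sort((x, y) => x - y)[1];
}

function medianXor(a, b, c) {
    const max = Math.max(Math.max(a, b), c),
          min = Math.min(Math.min(a, b), c);
    return a ^ b ^ c ^ max ^ min;
}

// accumulate into sum so the JIT can't optimize the loops away
let sum = 0;
console.time('sort');
for (let i = 0; i < N; i++) sum += medianSort(data[3 * i], data[3 * i + 1], data[3 * i + 2]);
console.timeEnd('sort');
console.time('xor');
for (let i = 0; i < N; i++) sum += medianXor(data[3 * i], data[3 * i + 1], data[3 * i + 2]);
console.timeEnd('xor');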


Forensically, Photo Forensics for the Web

Written by Jonas Wagner

Back in 2012 I hacked together a little tool for performing Error Level Analysis on images. Despite being such a simple tool with, frankly, a bad UI, it has been used by over 250'000 people.

A few days ago I randomly stumbled across the paper Detection of Copy-Move Forgery in Digital Images by Jessica Fridrich, David Soukal, and Jan Lukáš. I wanted to see if I could do something similar and make it run in a browser. It took a good bit of tweaking but I ended up with something that works. I took a copy of my photo film emulator as a base for the UI, adapted it a bit, ported the old ELA code and added some new tools. The result is called Forensically.

Screenshot of Forensically
Open Forensically

How to use Forensically

If you want some guidance on how to use Forensically you get to pick your poison. On offer is a 12 minute monologue in the form of a tutorial video, or a whole bunch of cryptic text on the help page. I'm sorry that neither is very good.

How the Clone Detection works

I guess the most interesting feature of this new tool is the clone detection. So let me reveal to you how I made it work. I will try to keep the explanation simple. If there is interest in it I might still write a more technical description of the algorithm later.

The basic idea

Create a table
Move a window over the image; for each position of the window:
    Use all of the pixels in the window as a key
    If the key is already in the table
        We found a clone! Mark it.
    Else
        Add the key to the table

This does actually work, but it will only find perfect copies. We want the matching to be more fuzzy.
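
To make that concrete, here is a minimal JavaScript sketch of the exact-match version. The block size, the grayscale flat-array input and the Map-based table are my own simplifications, not the actual Forensically code:

// naive exact-match clone detection sketch (grayscale image as a flat array)
function findExactClones(pixels, width, height, blockSize = 8) {
    const table = new Map();   // key -> first position the block was seen at
    const clones = [];
    for (let y = 0; y + blockSize <= height; y++) {
        for (let x = 0; x + blockSize <= width; x++) {
            // use all of the pixels in the window as a key
            const block = [];
            for (let by = 0; by < blockSize; by++) {
                for (let bx = 0; bx < blockSize; bx++) {
                    block.push(pixels[(y + by) * width + (x + bx)]);
                }
            }
            const key = block.join(',');
            if (table.has(key)) {
                // we found a clone! record source and destination
                clones.push({from: table.get(key), to: {x, y}});
            } else {
                table.set(key, {x, y});
            }
        }
    }
    return clones;
}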


So the next key step is to make the matching more fuzzy. We do this by compressing the key to make it less unique. You can think of this step as converting each of the little blocks into a tiny JPEG and then using those pixels as a key. The actual implementation is using Haar wavelets for this step. You can see the compressed blocks that are used by clicking on Show Quantized Image in the Clone Detection Tool.
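
As a rough sketch of that compression step, one level of a 2D Haar transform followed by coarse quantization could be used to build the fuzzy key. The block size, quantization step and key format here are assumptions of mine, not the actual implementation:

// sketch: compress an 8x8 grayscale block into a fuzzy key
// using one level of 2D Haar averaging plus coarse quantization
function fuzzyBlockKey(block, size = 8, step = 16) {
    const half = size / 2;
    const lowPass = new Array(half * half);
    // one Haar level: each output value is the average of a 2x2 cell
    for (let y = 0; y < half; y++) {
        for (let x = 0; x < half; x++) {
            const i = (2 * y) * size + 2 * x;
            lowPass[y * half + x] =
                (block[i] + block[i + 1] + block[i + size] + block[i + size + 1]) / 4;
        }
    }
    // coarse quantization makes the key less unique, i.e. the matching fuzzier
    return lowPass.map(v => Math.round(v / step)).join(',');
}

Plugging fuzzyBlockKey into the exact-match sketch above, in place of joining the raw pixels, gives the fuzzy variant.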

This works too but now we have too many results!


So the next step is to filter all of the blocks and to throw away the boring ones. This is done by comparing the amount of detail in the high frequencies to a threshold. You can think of it as subtracting a blurred image of the block from the block and then looking at how much is left of the pixels. In practice the blurring is not required because the wavelet step has already done it for us. You can see the rejected blocks as black spots in the quantized image.
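
A sketch of that rejection step, under the same simplifying assumptions, could measure how much the pixels deviate from their local averages and compare that to a threshold:

// sketch: reject "boring" blocks that have too little high-frequency detail
function hasEnoughDetail(block, size = 8, threshold = 100) {
    const half = size / 2;
    let detail = 0;
    for (let y = 0; y < half; y++) {
        for (let x = 0; x < half; x++) {
            const i = (2 * y) * size + 2 * x;
            const avg = (block[i] + block[i + 1] + block[i + size] + block[i + size + 1]) / 4;
            // sum of squared deviations from the local average ~ high-frequency energy
            const d0 = block[i] - avg, d1 = block[i + 1] - avg,
                  d2 = block[i + size] - avg, d3 = block[i + size + 1] - avg;
            detail += d0 * d0 + d1 * d1 + d2 * d2 + d3 * d3;
        }
    }
    return detail > threshold;
}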

At this stage the algorithm works but it does still show a lot of uninteresting copies of blocks that just happen to look similar.


So now we take another look at all of the clones that we found. If the distance between the source and destination is too small we reject them. Next we look at clones that start from a similar place and are copied into a similar direction. If we find less than Minimal Cluster Size other clones that are similar we discard the clone as noise.
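
A sketch of this final filtering pass, with made-up parameter names (minDistance, minClusterSize) and a simplified grouping by copy offset only, could look like this:

// sketch: keep only clones whose copy offset is shared by enough other clones
function filterClones(clones, minDistance = 16, minClusterSize = 4, bucket = 8) {
    const groups = new Map();
    for (const c of clones) {
        const dx = c.to.x - c.from.x, dy = c.to.y - c.from.y;
        // reject clones where source and destination are too close together
        if (dx * dx + dy * dy < minDistance * minDistance) continue;
        // quantize the offset so clones copied into a similar direction land in the same group
        const key = Math.round(dx / bucket) + ',' + Math.round(dy / bucket);
        if (!groups.has(key)) groups.set(key, []);
        groups.get(key).push(c);
    }
    let kept = [];
    for (const group of groups.values()) {
        if (group.length >= minClusterSize) kept = kept.concat(group);
    }
    return kept;
}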

Source Code

I haven't figured out how I want to license the code and assets yet. But I do plan to release it in some form.


As always, feedback is appreciated both on the app and on the post. Would you like future posts to be more in depth and technical or do you like the current format?


Light Leaks in the Film Emulator

Written by Jonas Wagner

Inspired by the feature in the recent release of G'MIC, I added a new feature to my Film Emulator: light leaks.

Photo with light leak

G'MIC seems to use predefined images for the light leaks. I decided to go another route and created a procedural version. The benefits are clear: no big images to download, and an infinite variety of light leaks.

The implementation is also rather straightforward; in fact it shares most of its code with the existing grain code. It uses simplex noise at a fairly low frequency to create a colored plasma. Three octaves of simplex noise are used for the luminance, and a single octave is used to randomize each color channel. This was just my first approximation, but in my opinion it works better than it has any right to, so I'll stick with it for now.
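
A very rough sketch of the idea, assuming a noise2D(x, y) simplex noise function returning values in [-1, 1]; the frequencies, octave weights and blending below are my guesses, not the actual Film Emulator code:

// sketch: procedural light leak, added on top of an RGBA image buffer
function addLightLeak(data, width, height, noise2D, strength = 0.5) {
    const f = 4 / Math.max(width, height); // fairly low base frequency
    for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
            // three octaves of noise for the luminance of the leak
            let lum = 0.5 * noise2D(x * f, y * f) +
                      0.3 * noise2D(x * f * 2, y * f * 2) +
                      0.2 * noise2D(x * f * 4, y * f * 4);
            lum = Math.max(0, lum); // only let bright areas leak
            const i = (y * width + x) * 4;
            // a single, offset octave per channel to tint the leak
            for (let c = 0; c < 3; c++) {
                const tint = 0.5 + 0.5 * noise2D(x * f + (c + 1) * 1000, y * f + (c + 1) * 1000);
                data[i + c] = Math.min(255, data[i + c] + 255 * strength * lum * tint);
            }
        }
    }
}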


Javascript Film Emulation

Written by Jonas Wagner

I hacked together a little analog film emulation tool in Javascript. It's based on the awesome work of Pat David. I wrote it mainly to play with some new tech but I liked the result enough to share it with you. You can try it here:

example image
View the Film Emulator

It also works on Android phones running Chrome. Give it a try!

How the Film Emulation works

I guess the most interesting part for most people is the actual film emulation code. It uses color lookup tables (CLUTs).

So in simplistic terms:

For every pixel in the image
    Take its color values r, g, b
    Look up its new color values in the lookup table
    r', g', b' = colorLookupTable[r, g, b]
    Set the pixel to the color values (r', g', b')

In practice there are a few more considerations. Most CLUTs don't contain values for all 16 777 216 (2²⁴) colors in the RGB space. A simplistic solution to this problem would be to always just use the closest color (nearest-neighbor interpolation). This is fast but results in very ugly banding artifacts.
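
To make the simplistic variant concrete, a nearest-neighbor lookup in a 3D table with size entries per channel could look roughly like this (the flat table layout is an assumption of mine):

// sketch: nearest-neighbor color lookup in a 3D CLUT
// clut: flat array with size*size*size RGB triplets, indexed as ((b*size+g)*size+r)*3
function applyClutNearest(data, clut, size) {
    const scale = (size - 1) / 255;
    for (let i = 0; i < data.length; i += 4) {
        // quantize each channel to the nearest grid point (this is what causes the banding)
        const r = Math.round(data[i] * scale),
              g = Math.round(data[i + 1] * scale),
              b = Math.round(data[i + 2] * scale);
        const j = ((b * size + g) * size + r) * 3;
        data[i]     = clut[j];
        data[i + 1] = clut[j + 1];
        data[i + 2] = clut[j + 2];
    }
}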

So to keep things fast I use random dithering for the previews and trilinear filtering for the final output. The random dithering is probably a suboptimal choice, but it was easy to implement.

You can find more details about how the lookup tables were created on Pat David's website.


As stated at the beginning I wrote this application to play with new technology, so there is a lot going on in this little application.

The entire code is written in JavaScript (ES6 to be precise), which is then converted to more mainstream JavaScript using babel.js.

It uses the canvas API for accessing the pixel data of images and then processes it in web workers for parallelism, using transferable objects to avoid copies.
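
As a quick sketch of that pattern (the worker file name and message shape are made up for this example, they are not the Film Emulator's actual code):

// sketch: send pixel data to a worker without copying it
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

const worker = new Worker('filter-worker.js');
// the underlying ArrayBuffer is transferred, not copied
worker.postMessage({
    width: imageData.width,
    height: imageData.height,
    buffer: imageData.data.buffer
}, [imageData.data.buffer]);

worker.onmessage = function (e) {
    // the worker transfers the processed buffer back the same way
    const pixels = new Uint8ClampedArray(e.data.buffer);
    ctx.putImageData(new ImageData(pixels, e.data.width, e.data.height), 0, 0);
};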

WebGL would obviously also be suitable for this task; I might even write an implementation in the future.

The css makes heavy use of flexible boxes and is written in scss. The icon font was generated using fontello.

The whole thing is built using grunt and browserify.

Of course these are just a few of the bits of tech that I played with to make this app. If you want to know even more, just look at the source.

Source Code

You can find the source code of this tool on github. The code is not licensed under an open source license and does not come with all the data files, in order to prevent lazy people from just copying everything and pretending it is their own work. You are of course free to study the code and take bits and pieces; I consider this fair use. Just attribute them to me properly. If you have grander plans for it and the lack of a license prevents you from following up on them, feel free to contact me.


Full-text search example using lunr.js

Written by Jonas Wagner

I did a little experiment today. I added full-text search to this website using lunr.js. Lunr is a simple full-text search engine that can run inside of a web browser using Javascript.

Lunr is "a bit like Solr, but much smaller and not as bright", as the author Oliver beautifully puts it.

With it I was able to add full text search to this site in less than an hour. That's pretty cool if you ask me. :)

You can try out the search function I built on the articles page of this website.

I also enabled source maps so you can see how I hacked together the search interface. But let me give you a rough overview.


The indexing is performed when I build the static site. It's pretty simple.

var lunr = require('lunr'),
    cheerio = require('cheerio'),
    fs = require('fs');

// create the index
var index = lunr(function(){
    // boost increases the importance of words found in this field
    this.field('title', {boost: 10});
    this.field('abstract', {boost: 2});
    this.field('content');
    // the id used to reference the documents in the search results
    this.ref('href');
});

// this is a store with some document meta data to display
// in the search results.
var store = {};

// 'entries' is the list of articles produced by the static site generator
entries.forEach(function(entry){
    index.add({
        href: entry.href,
        title: entry.title,
        abstract: entry.abstract,
        // hacky way to strip html, you should do better than that ;)
        content: cheerio.load(entry.content.replace(/<[^>]*>/g, ' ')).root().text()
    });
    store[entry.href] = {title: entry.title, abstract: entry.abstract};
});

fs.writeFileSync('public/searchIndex.json', JSON.stringify({
    index: index.toJSON(),
    store: store
}));

The resulting index is 1.3 MB; gzipping brings it down to a more reasonable 198 KB.

Search Interface

The other part of the equation is the search interface. I went for some simple jQuery hackery.

jQuery(function($) {
    var index, store,
        // searchIndexUrl points to the JSON index generated at build time
        data = $.getJSON(searchIndexUrl);

    data.then(function(data){
        store = data.store;
        // create the index from the serialized data
        index = lunr.Index.load(data.index);
    });

    $('.search-field').keyup(function() {
        var query = $(this).val();
        if(query === ''){
            // empty query, clear the results again
            // (.search-results is the container the results are rendered into)
            $('.search-results').empty();
        }
        else {
            data.then(function() {
                // perform search
                var results = index.search(query);
                var elements = results.length ?
                    results.map(function(result) {
                        var entry = store[result.ref];
                        var el = $('<p>').append($('<a>')
                                .attr('href', result.ref)
                                .text(entry.title));
                        return el;
                    }) :
                    [$('<p><strong>No results found</strong></p>')];
                $('.search-results').empty().append(elements);
            });
        }
    });
});

Learn More

If you want to learn more about how lunr works I recommend reading this article by the author.

If you still want to learn more about search, then I can recommend this great free book on the subject called Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze.


Source Maps with Grunt, Browserify and Mocha

Written by Jonas Wagner

If you have been using source maps in your projects you have probably also noticed that they do not work with all tools. For instance, if I run my browserified tests with mocha I get something like this:

ReferenceError: dyanmic0 is not defined
    at Context.<anonymous> (
    at callFn (
    at timeslice (

That's not exactly very helpful. After some searching I found a node module to solve this problem: node-source-map-support. It's easy to use and magically makes things work.


npm install --save-dev source-map-support
and then add this somewhere in your initialization code:
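
// register source map support so stack traces map back to the original sources
require('source-map-support').install();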

I place it in a file called dev.js that I include in all development builds.

Now you get nice stack traces in mocha, jasmine, q and most other tools:

ReferenceError: dyanmic0 is not defined
    at Context.<anonymous> (src/physics-tests.js:44:1)

Nicely enough this also works together with Q's long stack traces:

var Q = require('q');
Q.longStackSupport = true;
Q.onerror = function (e) {
    console.error(e && e.stack);
};

function theDepthsOfMyProgram() {
  Q.fcall(function () {
    // ... some asynchronous work ...
  }).done(function explode() {
    throw new Error("boo!");
  });
}

theDepthsOfMyProgram();

Will result in:

Error: boo!
    at explode (src/dev.js:12:1)
From previous event:
    at theDepthsOfMyProgram (src/dev.js:11:1)
    at Object./home/jonas/dev/sandbox/atomic-action/src/dev.js.q (src/dev.js:16:1)

That's more helpful. :) Thank you Evan!


Desktop rdiff-backup Script

Written by Jonas Wagner

I have recently revamped the way I back up my desktop. In this post I document the thoughts that went into this. This is mostly for myself, but you might still find it interesting.

To encrypt or not to encrypt

I do daily incremental backups of my desktop to an external harddrive. This drive is unencrypted.

Encrypting your backups has obvious benefits - it protects your data from falling into the wrong hands. But at the same time it also makes your backups much more fragile. A single corrupted bit can spell disaster for anything from a single block to your entire backup history. You also need to find a safe place to store a strong key - no easy task.

Most of my data I'd rather have stolen than lost. A lot of it is open source anyway. :)

The data that I'd rather lose than have fall into the wrong hands (mostly keys) is stored and backed up in encrypted form only. For this I use the gpg agent and ecryptfs.

Encrypting only the sensitive data rather than the whole disk increases the risk of it being leaked. Recovering those leaked keys would however require a fairly powerful adversary, who would have other ways of getting their hands on that data anyway, so I consider this strategy to be a good tradeoff.

As a last line of defense I have an encrypted disk stored away offsite. I manually update it a few times a year to reduce the chance of losing all of my data in case of a break-in, fire or another catastrophic event.

Before showing you the actual backup script I'd like to explain why I'm back to using rdiff-backup for my backups.

Duplicity vs rdiff-backup vs rsync and hardlinks

Duplicity and rdiff-backup are some of the most popular options for doing incremental backups on Linux (ignoring the more enterprisey stuff like Bacula). Using rsnapshot, which is based on rsync and hardlinks, is another one.

The main drawback of using rsync and hardlinks is that it stores a full copy of every file when it changes. This can be a good tradeoff, especially when fast random access to historic backups is needed. This combined with snapshots is what I would most likely use for backing up production servers, where getting back some (or all) files of a specific historic version as fast as possible is usually what is needed. For my desktop however, incremental backups are more of a backup of a backup. Fast access is not needed, but I want to have the history around just in case I get the order of the -iname and -delete arguments to find wrong again without noticing.

Duplicity backs up your data by producing compressed (and optionally encrypted) tars that contain diffs against a full backup. This allows it to work with dumb storage (like S3) and makes encrypted backups relatively easy. However, if even just a few bits get corrupted, any backups after the corruption can be unreadable. This can be somewhat mitigated by doing frequent full backups, but that takes up space and increases the time needed to transfer backups.

rdiff-backup works the other way around. It always stores the most recent version of your data as a full mirror, so you can just cp that one file you need in a pinch. Increments are stored as 'reverse diffs' against the most current version. So if a diff is corrupted, only historic data is affected, and corruption of a file will only corrupt that file, which is what I prefer.

The Script


Most backup scripts you find on the net are written for backing up servers or headless machines. For backing up Desktop Linux machines the most popular solution seems to be deja-dup which is a frontend for duplicity.

As I want to use rdiff-backup I hacked together my own script. Here is roughly what it does:

  • Mounts backup device by label via udisks
  • Communicates start of backup via desktop notifications using notify-send
  • Runs backup via rdiff-backup
  • Deletes old increments after 8 weeks
  • Communicates errors or success via desktop notifications.

#!/bin/bash
# adjust these to your own setup (example values, not part of the original script)
BACKUP_DEV=/dev/disk/by-label/backup0
BACKUP_DEST=/media/backup0/backup
BACKUP_LOG=$HOME/.backup.log
BACKUP_LOG_ERROR=$HOME/.backup-error.log

# delay backup a bit after the login
sleep 3600
# unmount if already mounted, ensures it's always properly mounted in /media
udisks --unmount $BACKUP_DEV
# Mounting disks via udisks, this doesn't require root
udisks --mount $BACKUP_DEV
notify-send -i document-save "Backup Started"
rdiff-backup --print-statistics --exclude /home/jonas/Private --exclude MY_OTHER_EXCLUDES $HOME $BACKUP_DEST 2>> $BACKUP_LOG_ERROR >> $BACKUP_LOG
if [ $? != 0 ]; then
    echo "BACKUP FAILED!"
    # notification
    MSG=$(tail -n 5 $BACKUP_LOG_ERROR)
    notify-send -u critical -i error "Backup Failed" "$MSG"
    # persistent dialog-style notification
    notify-send -u critical -t 0 -i error "Backup Failed" "$MSG"
    exit 1
fi
rdiff-backup --remove-older-than 8W $BACKUP_DEST
udisks --unmount $BACKUP_DEV
STATS=$(cat $BACKUP_LOG | grep '^Errors\|^ElapsedTime\|^TotalDestinationSizeChange')
notify-send -t 1000 -i document-save "Backup Complete" "$STATS"

This script runs whenever I log in. I added it via the Startup Applications settings in Ubuntu.

The backup ignores the ecryptfs Private folder but does include the encrypted .Private folder thereby only backing up the cipher texts of sensitive files.

I like using disk labels for my drives. The disk label can easily be set using e2label:

e2label /dev/sdc backup0

I do the offsite backup by manually mounting the LUKS-encrypted disk and running a simple rsync script. I might migrate this to Amazon Glacier at some point.

I hope this post is useful to someone including future me. ;)


smartcrop.js ken burns effect

Written by Jonas Wagner

This is an experiment that multiple people have suggested to me after I showed them smartcrop.js. The idea is to let smartcrop pick the start and end viewports for the Ken Burns effect. This could be useful to automatically create slide shows from a bunch of photos. Given that smartcrop.js is designed for a different task, it works quite well. But see for yourself.

I'm sure it could be much improved by actually trying to zoom in on the center of interest rather than just having it in frame. The actual animation was implemented using CSS transforms and transitions. If you want to have a look you can find the source code on github.
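
For a rough idea of how such an animation can be driven, here is a minimal sketch that moves an image between two viewports using a CSS transform and transition. The element setup, duration and viewport format are my own assumptions, not the code from the experiment:

// sketch: animate an <img> inside a fixed-size container between two viewports
// a viewport is {x, y, width, height} in image pixel coordinates
function kenBurns(img, container, from, to, duration = 6000) {
    // the container is expected to have overflow: hidden and the viewport's aspect ratio
    const transformFor = view => {
        const scale = container.clientWidth / view.width;
        return 'scale(' + scale + ') translate(' + (-view.x) + 'px, ' + (-view.y) + 'px)';
    };
    img.style.transformOrigin = '0 0';
    img.style.transition = 'none';
    img.style.transform = transformFor(from);
    // force a reflow so the start state is applied before the transition kicks in
    void img.offsetWidth;
    img.style.transition = 'transform ' + duration + 'ms ease-in-out';
    img.style.transform = transformFor(to);
}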

