Building images from binary data

Simple, yet fun

2019-09-11 python images

During a computer science class today, we were talking about embedding code and metadata in jpg and bmp files. @exvacuum was showing off a program he wrote that watched a directory for new image files, and would display them on a canvas. He then showed us a special image. In this image, he had injected some metadata into the last few pixels, which were not rendered, but told his program where to position the image on the canvas, and it’s size.

This demo got @hyperliskdev and I thinking about what else we can do with image data. After some talk, the idea of converting application binaries to images came up. I had seen a blog post about visually decoding OOK data by converting an IQ capture to an image. With a little adaptation, I did the same for a few binaries on my laptop.

Program design

Like all ideas I have, I wrote some code to test this idea out. Above is a small sample of the interesting designs found in the gimp binary. The goals for this script were to:

  • Accept any file of any type or size
  • Allow the user to select the file dimensions
  • Generate an image
  • Write the data in a common image format

If you would like to see how the code works, read “check out the script”.

A note on data wrapping

By using a generator, and the range function’s 3rd argument, any list can be easily split into a 2d list at a set interval.

# Assuming l is a list of data, and n is an int that denotes the desired split location
for i in range(0, len(l), n):
    yield l[i:i + n]

Binaries have a habit of not being rectangular

Unlike photos, binaries are not generated from rectangular image sensors, but instead from compilers and assemblers (and sometimes hand-written binary). These do not generate perfect rectangles. Due to this, my script simply removes the last line from the image to “reshape” it. I may end up adding a small piece of code to pad the final line instead of stripping it in the future.

Other file types

I also looked at other file types. Binaries are very interesting because they follow very strict ordering rules. I was hoping that a wav file would do something similar, but that does not appear to be the case. This is the most interesting pattern I could find in a wav file:

Back to executable data, these are small segments of a dll file:

Segment 1

Segment 2

Check out the script

This script is hosted on my GitHub account as a standalone file. Any version of python3 should work, but the following libraries are needed:

  • Pillow
  • Numpy

Thank you for reading this post. If you enjoyed the content, and want to let me know, or want to ask any questions, please contact me via one of the methods listed here. If you would like to be notified about future posts, feel free to load my rss feed into your favorite feed reader, or follow me on Twitter for notifications about my work and future posts.

If you have the time to read some more, I recommend checking out one of the following posts:

Tunneling a printer from a home network to a VPN
I use a self-hosted VPN to access all my devices at all times, and to deal with my school's aggressive firewall. This post explains the process I use for exposing my home printer to the VPN.
2020 Wrap-Up
2020 has been my most productive year so far in terms of software development. This post looks back at the year
How I have tweaked my Minecraft client to be 'just right'
Over the past 10 years, I have been building the perfect Minecraft experience for myself. This post shares the collection of mods I run, and why I use them.

Made with ♥ by Evan Pratten | RSS | API Status