Dealing with program recordings

[I will be at GUADEC from tomorrow evening. See you in Manchester!]

In the previous post I talked about the technology behind UndoDB. In this last post of the series, I will talk about replaying recorded programs.

Debugging programs with the ability to go backwards is really useful, but what about automated tests? What about bugs that happen only for a specific user and that you cannot reproduce?

Saving recordings helps here. Using Live Recorder you can save the complete state and execution history of a process (including debug symbols) and debug it using UndoDB on another machine with the same architecture.
Live Recorder can be used as a standalone program (live-record -o recording.undo my-program) or as a library.
Using it as a library allows the programmer to start and stop recording when they want, for instance recording only the execution of a test but not the test setup.

The API is quite simple to use. For instance, to record a program until its execution ends, you can just do:

// Start recording.
undolr_start(NULL);
// Automatically save when the program exits.
// This also saves if the program crashes or terminates
// due to an uncaught signal.
undolr_save_on_termination("/foo/bar/recording.undo");

This is particularly useful for tests, especially those which fail due to a rarely occurring bug.
You can record your test execution and, if the test fails, save the recording for later debugging. Otherwise, you can just discard the recording.
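
The save-on-failure flow described above can be sketched as follows. This is only an illustration under stated assumptions: `recorder_ops`, `run_recorded_test` and the `fake_*` functions are hypothetical names invented for this sketch, not part of the Live Recorder API. In real code the three callbacks would wrap whatever start, save and discard calls the Live Recorder documentation provides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical wrapper around a recorder: these callbacks are NOT the
 * Live Recorder API, just placeholders for its start/save/discard calls. */
typedef struct {
    void (*start)(void);
    void (*save)(const char *path);
    void (*discard)(void);
} recorder_ops;

typedef bool (*test_fn)(void);

/* Record only the test body; keep the recording only if the test fails. */
bool run_recorded_test(const recorder_ops *rec, test_fn test, const char *path)
{
    rec->start();
    bool passed = test();
    if (passed)
        rec->discard();   /* nothing to debug, throw the recording away */
    else
        rec->save(path);  /* keep the recording for later debugging */
    return passed;
}

/* Fake recorder used only to demonstrate the flow: it counts calls. */
int fake_saved = 0;
int fake_discarded = 0;
void fake_start(void) {}
void fake_save(const char *path) { (void)path; fake_saved++; }
void fake_discard(void) { fake_discarded++; }

/* Two trivial example tests. */
bool passing_test(void) { return 2 + 2 == 4; }
bool failing_test(void) { return 2 + 2 == 5; }
```

Running `run_recorded_test` with `passing_test` discards the fake recording, while running it with `failing_test` saves it. The test setup stays outside the recorded region because recording starts only inside the helper.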

See the Undo website if you want to try UndoDB and Live Recorder.

How UndoDB works

In the previous post I described what UndoDB is. Now I will describe how the technology works.

The naïve approach to record the execution of a program is to record everything that happens, that is the effects of every single machine instruction. This is what gdb does to offer reversible debugging.
Unfortunately this is so slow that it’s unusable even for trivial programs (this is why most people don’t know gdb already has reversible debugging).

UndoDB takes a different approach. It distinguishes which operations are deterministic and which aren’t. For instance, “2+2” will always produce “4”, so there’s no need to save the result of this instruction.
On the other hand, a small proportion of what a program does is non-deterministic, so the effect of these operations must be saved in memory in what we call the event log.
Some non-deterministic operations are:

  • System calls. For instance, for a read we need to save what was read from the file, whereas for a write we only need to save the return code, as the content of the buffer is already in the program’s memory.
  • Signals.
  • Thread switches.
  • Access to shared memory.
  • Non-deterministic assembly instructions. For instance, on x86, RDTSC returns the CPU’s time stamp counter which counts the number of cycles since reset.
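
To make the idea of an event log concrete, here is a toy sketch in C. It is not how UndoDB is implemented. `event_log`, `logged_read` and `run_program` are hypothetical names, and `rand()` merely stands in for a non-deterministic operation such as a system call. While recording, the result of each non-deterministic operation is appended to the log. While replaying, the operation is not performed at all and its result is taken from the log, so the deterministic arithmetic around it produces the same values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LOG_CAPACITY 64

typedef enum { MODE_RECORD, MODE_REPLAY } run_mode;

/* Toy event log: results of non-deterministic operations, in order. */
typedef struct {
    int    events[LOG_CAPACITY];
    size_t count;   /* number of events recorded so far */
    size_t cursor;  /* next event to hand out during replay */
} event_log;

/* Stand-in for a non-deterministic operation (e.g. a read or RDTSC). */
int nondeterministic_source(void)
{
    return rand();
}

/* Record mode: perform the operation and log its result.
 * Replay mode: do not perform it, return the logged result instead. */
int logged_read(event_log *log, run_mode mode)
{
    if (mode == MODE_RECORD) {
        int value = nondeterministic_source();
        assert(log->count < LOG_CAPACITY);
        log->events[log->count++] = value;
        return value;
    }
    assert(log->cursor < log->count);
    return log->events[log->cursor++];
}

/* The "program": deterministic arithmetic fed by non-deterministic input.
 * Only the inputs are logged; "2 + 2" is simply recomputed on replay. */
unsigned long run_program(event_log *log, run_mode mode)
{
    unsigned long total = 0;
    for (int i = 0; i < 10; i++) {
        int input = logged_read(log, mode);
        total = total * 31u + (unsigned long)(input % 100) + (2 + 2);
    }
    return total;
}
```

Calling `run_program` once in `MODE_RECORD` and then again in `MODE_REPLAY` with the same zero-initialised log yields the same result both times, even though the log holds only ten integers.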

Snapshots
Snapshots of a program at different times in execution history.

When you run a program under UndoDB, the deterministic program instructions are executed as normal, while non-deterministic ones are executed but their result is also saved in the event log.
Periodically, snapshots of the program are taken. These are complete copies of the current state of the program, but since they are created using the Linux copy-on-write mechanism the impact on the system resources is minimal.

Searching
Moving backwards in execution history.

Later, if you need to go back in time, you start replaying the application from a previous snapshot, re-executing only the deterministic operations. The results of non-deterministic operations are synthesised based on what is stored in the event log.
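
The combination of snapshots and replay can be sketched with a toy model in C. This is an illustration only, not UndoDB's implementation: all names (`program_state`, `recording`, `go_to_step` and so on) are hypothetical, and a snapshot here is a plain struct copy rather than a copy-on-write copy of a process. To reach an earlier point, the sketch restores the nearest snapshot at or before the target and re-executes forward, reading the results of non-deterministic operations from the recorded inputs.

```c
#include <assert.h>
#include <stddef.h>

#define TOTAL_STEPS       20
#define SNAPSHOT_INTERVAL 5

/* Toy program state: a step counter and one value. */
typedef struct {
    size_t        step;   /* how many steps have been executed */
    unsigned long value;
} program_state;

/* Toy recording: logged inputs (one per step) plus periodic snapshots. */
typedef struct {
    int           inputs[TOTAL_STEPS];
    program_state snapshots[TOTAL_STEPS / SNAPSHOT_INTERVAL + 1];
    size_t        snapshot_count;
} recording;

/* One step: deterministic arithmetic on an input taken from the log. */
void execute_step(program_state *state, const recording *rec)
{
    state->value = state->value * 31u + (unsigned long)rec->inputs[state->step];
    state->step++;
}

/* Run from the start, taking a snapshot every SNAPSHOT_INTERVAL steps. */
program_state record_run(recording *rec)
{
    program_state state = {0, 0};
    rec->snapshot_count = 0;
    while (state.step < TOTAL_STEPS) {
        if (state.step % SNAPSHOT_INTERVAL == 0)
            rec->snapshots[rec->snapshot_count++] = state;
        execute_step(&state, rec);
    }
    return state;
}

/* "Go back in time": restore the nearest snapshot at or before the
 * target step, then replay forward until the target is reached. */
program_state go_to_step(const recording *rec, size_t target)
{
    assert(target <= TOTAL_STEPS);
    size_t index = target / SNAPSHOT_INTERVAL;
    if (index >= rec->snapshot_count)
        index = rec->snapshot_count - 1;
    program_state state = rec->snapshots[index];
    while (state.step < target)
        execute_step(&state, rec);
    return state;
}
```

With these constants, going to step 12 restores the snapshot taken at step 10 and replays only two steps. More frequent snapshots make jumping around faster at the cost of more memory, which is the trade-off that copy-on-write makes cheap for real processes.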

In the next post I will talk about saving program recordings to replay them later.

By the way, we are hiring software engineers in Cambridge (UK). If you are interested, contact me.

What I do at Undo

In October, I started working for Undo and, now that I understand our technology better, it’s time to explain what I do.

Undo produces a (closed source) technology which allows you to record, rewind and replay Linux programs (on x86 and ARM).
One of our products using this technology is UndoDB, a debugger built on top of gdb which allows you to do everything you do with gdb, but also to go back in time.

Example of reverse commands in UndoDB

Before joining Undo, I mainly used printfs or similar to debug my code. The main reason is that, when you read logs, it’s easy to jump between different parts of the log and proceed backwards from the point where the bug became apparent to the point where the bug was caused.
On the other hand, with standard gdb, once the bug happens it’s not possible to know what was going on earlier.

UndoDB fixes this problem by allowing the user to go backwards. Every command which moves the program forward has an equivalent reverse command. For instance, next has reverse-next (or rn for short) which moves to the previous line of code, continue has reverse-continue (or rc) which executes backwards until a breakpoint is hit or the start of the program is reached, and so on.

I should point out that UndoDB is not just some kind of fancy logging. You can really jump around in execution history, explore variables and registers at different points in time, bookmark interesting points in history, etc. What you cannot do is change history.

Finally, UndoDB is also useful when gdb wouldn’t be able to show any information. Have you ever seen anything like this in gdb?

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(udb) backtrace
#0  0x0000000000000000 in ?? ()
#1  0x0000000000000000 in ?? ()

Not very useful, is it?
With UndoDB you can step backwards until you reach a point before your program messed up its state:

(udb) backtrace
#0  foo () at program.c:75
#1  0x0000000000400557 in bar (n=42) at program.c:120
#2  0x000000000040056a in main () at program.c:420

In the next post I will give some details on how the technology actually works.

reTrumplation, a Twitter bot experiment

A few years ago, somebody introduced me to Translation Party, a website which automatically translates a sentence back and forth until further translations produce the same English text. The results are mostly funny nonsense.

Recently at work we were talking about automatic translations, so I thought it could be fun to use the same principle for a Twitter bot which works on Donald Trump’s many tweets. The result is @reTrumplation.

@reTrumplation, first example

@reTrumplation, second example

Karton – running Linux programs on macOS, a different Linux distro, or a different architecture

At work I use Linux, but my personal laptop is a Mac (due to my previous job developing for macOS).

A few months ago, I decided I wanted to be able to do some work from home without carrying my work laptop home every day.
I considered using a VM, but I don’t like the experience of mixing two operating systems. On Mac I want to use the native key bindings and applications, not a confusing mix of Linux and Mac UI applications.

In the end, I wrote Karton, a program which uses Docker to manage semi-persistent containers, with easy-to-use automatic folder sharing and lots of small details which make the experience smooth. You shouldn’t notice you are using command line programs from a different OS.

Karton logo

After defining which distro and packages you need (this is called an “image”), you can just execute Linux programs by prefixing them with karton run IMAGE-NAME LINUX-COMMAND. For example:

$ uname -a # Running on macOS.
Darwin my-hostname 16.4.0 Darwin Kernel Version 16.4.0 [...]

$ # Run the compiler in the Ubuntu image we use for work
$ # (which we called "ubuntu-work"):
$ karton run ubuntu-work gcc -o test_linux test.c

$ # Verify that the program is actually a Linux one.
$ # The files are shared and available both on your
$ # system and in the image:
$ file test_linux
test_linux: ELF 64-bit LSB executable, x86-64, [...]

Karton runs on Linux as well, so you can do development targeting a different distro or a different architecture (for instance ARMv7 while using an x86_64 computer).

For more examples, installation instructions, etc. see the Karton website.

Markoshiki

Lately, I’ve been working on a web app to learn more about JavaScript, jQuery and other technologies that web developers use. This app is available as a web app, on iPhones/iPads and on Android.

Markoshiki is a logic puzzle game, similar to Sudoku, Futoshiki, etc. The user needs to fill in the numbers missing from a board which is split into four quadrants and already has some numbers in it.
The rules are simple:

  • Numbers grow in a clockwise direction following the arrows.
  • Consecutive numbers are in the same row or column as the previous number, but in different quadrants.
  • The numbers that are already in the board when you start the game cannot be modified.

[Screenshots: Markoshiki with an empty board, and Markoshiki with a partially filled board]

For the next version I will focus on making the iOS and Android apps look more native, improving the flow of inserting notes (it’s a bit cumbersome now) and making better use of the available screen space (including support for landscape mode).

Please play online or install the apps.
If you have any feedback, please let me know at markoshiki@markoshiki.com.