How UndoDB works

In the previous post I described what UndoDB is; in this one I will describe how the technology works.

The naïve approach to recording the execution of a program is to record everything that happens, that is, the effects of every single machine instruction. This is what gdb does to offer reversible debugging.
Unfortunately, this is so slow that it’s unusable even for trivial programs (which is why most people don’t know that gdb already has reversible debugging).

UndoDB takes a different approach. It distinguishes which operations are deterministic and which aren’t. For instance, “2+2” will always produce “4”, so there’s no need to save the result of this instruction.
On the other hand, a small proportion of what a program does is non-deterministic, so the effect of these operations must be saved in memory in what we call the event log.
Some non-deterministic operations are:

  • System calls. For instance, for a read we need to save what was read from the file; for a write we only need to save the return code, as the content of the buffer is already in the program’s memory.
  • Signals.
  • Thread switches.
  • Access to shared memory.
  • Non-deterministic assembly instructions. For instance, on x86, RDTSC returns the CPU’s time-stamp counter, which counts the number of cycles since reset.
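
To make the idea of an event log concrete, here is a minimal sketch in Python. It is not UndoDB’s implementation: the class EventLog and its methods are hypothetical names invented for illustration, and a high-resolution clock stands in for a non-deterministic instruction such as RDTSC.

```python
import time


class EventLog:
    """Hypothetical event log: results of non-deterministic operations, in order."""

    def __init__(self):
        self.events = []        # recorded results, in execution order
        self.cursor = 0         # next event to hand out during replay
        self.replaying = False

    def nondeterministic(self, operation):
        """Run `operation` while recording; return the logged result while replaying."""
        if self.replaying:
            result = self.events[self.cursor]
            self.cursor += 1
            return result
        result = operation()
        self.events.append(result)
        return result

    def start_replay(self):
        """Switch to replay mode and rewind to the first logged event."""
        self.replaying = True
        self.cursor = 0


def program(log):
    # Deterministic work: nothing is logged, it is simply executed again on replay.
    total = 2 + 2
    # Non-deterministic work: the result goes through the event log.
    stamp = log.nondeterministic(time.perf_counter_ns)
    return total, stamp


log = EventLog()
recorded = program(log)   # first run: the clock is really read and its value logged
log.start_replay()
replayed = program(log)   # second run: the clock value comes from the log
```

Only one value ends up in the log, however much deterministic work the program does, and the replayed run sees exactly the clock value the recorded run saw.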

Snapshots of a program at different times in execution history.

When you run a program under UndoDB, the deterministic instructions are executed as normal, while the non-deterministic ones are executed and their results are also saved in the event log.
Periodically, snapshots of the program are taken. These are complete copies of the current state of the program, but since they are created using the Linux copy-on-write mechanism, the impact on system resources is minimal.
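
One common way to get a copy-on-write copy of a process on Linux is fork(): the child shares the parent’s memory pages until one of them writes to a page. The sketch below assumes that mechanism purely for illustration (the post does not say which call UndoDB uses), and take_snapshot and read_snapshot are hypothetical helper names.

```python
import os
import pickle


def take_snapshot(state):
    """Fork a child that keeps a copy-on-write copy of `state`.

    Returns (child_pid, request_fd, reply_fd).
    """
    request_r, request_w = os.pipe()
    reply_r, reply_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this process is the snapshot. It sleeps until asked for its state.
        try:
            os.close(request_w)
            os.close(reply_r)
            os.read(request_r, 1)
            with os.fdopen(reply_w, "wb") as out:
                out.write(pickle.dumps(state))
        finally:
            os._exit(0)
    # Parent: carries on executing and modifying its own copy of the state.
    os.close(request_r)
    os.close(reply_w)
    return pid, request_w, reply_r


def read_snapshot(snapshot):
    """Wake the snapshot child and return the state it preserved."""
    pid, request_fd, reply_fd = snapshot
    os.write(request_fd, b"x")
    os.close(request_fd)
    with os.fdopen(reply_fd, "rb") as source:
        state = pickle.loads(source.read())
    os.waitpid(pid, 0)
    return state


state = {"counter": 0}
snap = take_snapshot(state)
state["counter"] = 42             # the parent moves on after the snapshot
preserved = read_snapshot(snap)   # the child still holds the old value
```

The snapshot costs almost nothing while it sleeps: no memory is copied at fork time, only the pages that the running program later modifies.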

Moving backwards in execution history.

Later, if you need to go back in time, you start replaying the application from a previous snapshot, re-executing only the deterministic operations. The results of non-deterministic operations are synthesised based on what is stored in the event log.
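
Putting the two pieces together, going back in time can be sketched as "restore the nearest earlier snapshot, then run forward using the event log". The toy model below is an assumption-laden illustration, not UndoDB’s code: the Recording class is hypothetical, the program is reduced to a single deterministic step function, copy.deepcopy stands in for copy-on-write snapshots, and a random number stands in for a non-deterministic input.

```python
import copy
import random


class Recording:
    """Hypothetical recording: periodic snapshots plus an event log."""

    def __init__(self, snapshot_interval=4):
        self.snapshot_interval = snapshot_interval
        self.event_log = []    # one logged non-deterministic input per step
        self.snapshots = {}    # step number -> copy of the state at that step

    def step(self, state, event):
        # Deterministic transition: same state and same event give the same result.
        state["total"] = state["total"] * 3 + event
        state["steps"] += 1

    def record(self, state, steps):
        """Run forward from a state whose step counter starts at 0."""
        for _ in range(steps):
            if state["steps"] % self.snapshot_interval == 0:
                self.snapshots[state["steps"]] = copy.deepcopy(state)
            event = random.randrange(100)    # non-deterministic input
            self.event_log.append(event)     # saved so that replay can reuse it
            self.step(state, event)

    def go_to(self, target_step):
        """Rebuild the state at `target_step` from the nearest earlier snapshot."""
        start = max(s for s in self.snapshots if s <= target_step)
        state = copy.deepcopy(self.snapshots[start])
        # Re-execute deterministically, feeding in logged events instead of new ones.
        for event in self.event_log[start:target_step]:
            self.step(state, event)
        return state


live = {"total": 1, "steps": 0}
recording = Recording()
recording.record(live, 10)
earlier = recording.go_to(6)   # restores the snapshot at step 4, replays steps 4 and 5
```

The snapshot interval is the trade-off to tune: more frequent snapshots mean less re-execution when stepping backwards, at the cost of more copies being kept.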

In the next post I will talk about saving program recordings to replay them later.

By the way, we are hiring software engineers in Cambridge (UK). If you are interested, contact me.

