The results back then were a bit of a mystery and a surprise to me, and latency is not something that can be easily debugged. It took many small steps to better understand what was happening. The first of them was a closer look at the impact the environment has on the test.
One of the things I tried back then was running benchmarks in VirtualBox on an idle MacBook Pro, my primary home machine. I was also using an ageing ThinkPad laptop with a 32-bit CPU, but its results were hardly comparable to those from 64-bit architectures, so I left it out.
Anyway, VirtualBox on the MacBook Pro turned out to be a fairly bad idea, because those extra layers add a lot of noise, and a latency test is exactly the moment when you don't want any noise.
Running the tests in the cloud (DigitalOcean, in my case) worked much better, because each test could run in a freshly provisioned OS dedicated to that test, without any random services or processes. The performance of DigitalOcean droplets turned out to be very stable over the last few weeks I was playing with them.
As an example, I measured how long it takes NodeJS to fire a setTimeout callback. setTimeout is used to invoke a callback after a specified delay expressed in milliseconds. In each case, both setTimeout(0) and setTimeout(50) were measured.
On Ubuntu in VirtualBox on the MacBook:
90 percent of calls took up to 5ms and 60ms (for setTimeout(0) and setTimeout(50) respectively).
90 percent of calls took up to 3ms and 56ms (for setTimeout(0) and setTimeout(50) respectively).
On a clean Ubuntu in the smallest DigitalOcean droplet (512 MB):
90 percent of calls took up to 2ms and 52ms (for setTimeout(0) and setTimeout(50) respectively).
Full details can also be seen here.
The seemingly intuitive idea that a simple and lightweight call (such as setTimeout) would always behave the same turned out to be wrong. There's more happening than one may think, even on an idle machine.
Eventually, I improved the DigitalOcean setup to the point where it was easy to run different tests on many droplets of different sizes at the same time, which makes it very easy to compare exactly how memory or CPU affects specific benchmarks. That is something I want to continue with in the next post.