Shedding some light on UUID version 4 in Ruby and Rails

Ruby's standard library and Rails's PostgreSQL adapter both default to version 4 UUIDs. In Rails migrations the version can be changed via the

default: 'uuid_generate_v1()'

option, whereas Ruby's stdlib only supports version 4.

Another interesting difference is in the implementations:

In Ruby's stdlib the UUID method is found in SecureRandom:
  # SecureRandom.uuid generates a random v4 UUID (Universally Unique IDentifier).
  # The version 4 UUID is purely random (except the version).
  # It doesn't contain meaningful information such as MAC addresses, timestamps, etc.
  # See RFC 4122 for details of UUID.
  def uuid
    ary = random_bytes(16).unpack("NnnnnN")
    ary[2] = (ary[2] & 0x0fff) | 0x4000
    ary[3] = (ary[3] & 0x3fff) | 0x8000
    "%08x-%04x-%04x-%04x-%04x%08x" % ary
  end

source-code redirect
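
We can sanity-check the fixed bits that this masking produces with a quick snippet - the 13th hex digit is always the version (4) and the 17th encodes the variant:

```ruby
require 'securerandom'

# Inspect the fixed bits that SecureRandom.uuid sets, mirroring the
# masking in the stdlib source above.
uuid = SecureRandom.uuid
groups = uuid.split('-')

version_nibble = groups[2][0]  # always "4" for a version 4 UUID
variant_nibble = groups[3][0]  # "8", "9", "a" or "b" (the 10xx variant)

puts uuid
puts version_nibble                             # => "4"
puts %w[8 9 a b].include?(variant_nibble)       # => true
```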

The Rails version uses the uuid-ossp Postgres extension:
  # By default, this will use the +uuid_generate_v4()+ function from the
  # +uuid-ossp+ extension, which MUST be enabled on your database.
  def primary_key(name, type = :primary_key, options = {})
    return super unless type == :uuid
    options[:default] = options.fetch(:default, 'uuid_generate_v4()')
    options[:primary_key] = true
    column name, type, options
  end

source-code redirect
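
The option-merging above can be exercised in isolation. This is a standalone sketch (not the Rails API itself, just the fetch-with-default pattern it uses) showing how a caller-supplied :default - say uuid_generate_v1() - overrides the uuid_generate_v4() default:

```ruby
# Standalone sketch of the default-injection logic from the adapter's
# primary_key method above: a caller-supplied :default wins, otherwise
# 'uuid_generate_v4()' is used.
def uuid_primary_key_options(options = {})
  options[:default] = options.fetch(:default, 'uuid_generate_v4()')
  options[:primary_key] = true
  options
end

puts uuid_primary_key_options.inspect
puts uuid_primary_key_options(default: 'uuid_generate_v1()').inspect
```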

The main difference is how the actual random numbers are generated in each library - this is quite interesting to investigate - let's start with the Ruby stdlib.

Ruby standard library implementation
module SecureRandom
  if defined? OpenSSL::Random
    def self.gen_random(n)
      @pid = 0 unless defined?(@pid)
      pid = $$
      unless @pid == pid
        now = Process.clock_gettime(Process::CLOCK_REALTIME, :nanosecond)
        ary = [now, @pid, pid]
        OpenSSL::Random.random_add(ary.join("").to_s, 0.0)
        @pid = pid
      end
      return OpenSSL::Random.random_bytes(n)
    end
  else
    def self.gen_random(n)
      ret = Random.raw_seed(n)
      unless ret
        raise NotImplementedError, "No random device"
      end
      unless ret.length == n
        raise NotImplementedError, "Unexpected partial read from random device: only #{ret.length} for #{n} bytes"
      end
      ret
    end
  end
end

source-code redirect

It starts with OpenSSL::Random, which can be described as:

OpenSSL cannot generate truly random numbers directly. The choices are to use a cryptographically secure PRNG with a good random seed (i.e. with OS harvested data from effectively random hardware events); or use a real hardware RNG. [1]

If OpenSSL is not present, it falls back to Ruby's pseudo-random number generator implemented via Random.raw_seed, which uses a modified Mersenne Twister with a period of 2**19937-1.
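
The determinism of a seeded Mersenne Twister is easy to demonstrate - and it is the reason a plain PRNG needs a good OS-provided seed before it is useful for anything security-related:

```ruby
# Ruby's Random (a modified Mersenne Twister) is fully deterministic:
# two generators with the same seed produce identical byte streams.
# A CSPRNG seeded from OS entropy avoids this predictability.
a = Random.new(1234)
b = Random.new(1234)

puts a.bytes(16) == b.bytes(16)   # => true (identical streams)
```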

uuid-ossp extension implementation

Looking at the uuid-ossp source code we find:

  return uuid_generate_internal(UUID_MAKE_V4, NULL, NULL, 0);

and the relevant part of the uuid_generate_internal function:

case 4:         /* random uuid */
  #ifdef HAVE_UUID_E2FS
    uuid_t      uu;

    uuid_generate_random(uu);
    uuid_unparse(uu, strbuf);
  #endif

As we can see there's a call to uuid_generate_random - what can this be?

uuid_generate_random(3) - Linux man page

Its description:

The uuid_generate_random function forces the use of the all-random UUID format, even if a high-quality random number generator (i.e., /dev/urandom) is not available, in which case a pseudo-random generator will be substituted. Note that the use of a pseudo-random generator may compromise the uniqueness of UUIDs generated in this fashion.

It is interesting to note that if /dev/urandom is not available it again falls back to a PRNG. There seems to be a nuance or even a contradiction here, as /dev/urandom itself uses a CSPRNG (cryptographically secure pseudorandom number generator) [3]


This started as a simple curiosity about which versions of UUID I could use - note that this is not a critique of truly-random vs PRNGs, as the RFC clearly states:

The version 4 UUID is meant for generating UUIDs from truly-random or pseudo-random numbers. [0]

which makes both implementations correct.

It makes sense for the UUID method to be in SecureRandom in Ruby since it's only the version 4 implementation and by default it uses OpenSSL::Random. The only minor gripe I have with this is that there's no UUID lib for general use in Ruby supporting all the versions [2].

Regarding their implementations, it should be noted that they differ, and if the quality of randomness is important one should investigate further (e.g. with OpenSSL missing, Ruby will use its internal PRNG, which at first look appears quite solid since it is seeded from /dev/urandom [6], but it might not be in the same class as a CSPRNG)

For UUIDs, both implementations should yield usable random UUIDs even if the library falls back to a non-CSPRNG algorithm. In the end there's a small gotcha:

Distributed applications generating UUIDs at a variety of hosts must be willing to rely on the random number source at all hosts. If this is not feasible, the namespace variant should be used.

In order to avoid UUID collisions [4].
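
Since Ruby's stdlib offers no name-based UUIDs, here is a minimal sketch of the namespace variant the RFC recommends - a version 5 (namespace + SHA-1) UUID. The DNS namespace constant comes from RFC 4122; the function itself is illustrative, not production code:

```ruby
require 'digest/sha1'

# RFC 4122 pre-defined namespace for DNS names.
DNS_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

# Minimal sketch of a name-based (version 5) UUID per RFC 4122:
# SHA-1 of namespace bytes + name, then set the version and variant bits.
def uuid_v5(namespace, name)
  ns_bytes = [namespace.delete('-')].pack('H*')
  bytes = Digest::SHA1.digest(ns_bytes + name)[0, 16].bytes
  bytes[6] = (bytes[6] & 0x0f) | 0x50  # version 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80  # variant 10xx
  hex = bytes.pack('C*').unpack('H*').first
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join('-')
end

# Deterministic: the same namespace and name always yield the same UUID,
# so no host-local random source is involved at all.
puts uuid_v5(DNS_NAMESPACE, "example.com")
```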

[0] - RFC-4122 redirect
[1] - Why OpenSSL can't use /dev/random directly? question redirect
[2] - I usually use uuidtools redirect
[3] - Myths about /dev/urandom redirect
[4] - uuid-collisions redirect
[5] - bonus read GoodPracticeRNG redirect
[6] - source-code redirect

The half-life of a programmer

Half-life (t1/2) is the amount of time required for the amount of something to fall to half its initial value. The term is very commonly used in nuclear physics to describe how quickly unstable atoms undergo radioactive decay, but it is also used more generally for discussing any type of exponential decay.

I had this almost random idea of when a programmer hits his or her career half-life.


First, a few points that need explaining: "almost random" is in the sense that this time next year I'll be hitting thirty (a Faith No More track redirect is already playing in my head).

The next point is how one should understand the word "career" in the context of the field I'm working in. The word itself makes me think of a reductionist, fifties-like view of how one works and/or deals with work.

This begets the question "What does 'career' mean for a programmer?" - a tricky question and an interesting tangent: methinks it refers to the period of time from when one starts to work professionally (getting paid) to the point when one stops doing that.

Let's jump on another tangent: "Why is the end of a programming career when one stops writing code?" - clearly one could easily jump into a management position and still call oneself a programmer - but that, for me, is not programming any more; maybe I'm a purist, and yes, I'm not fond of labels like "team leader", "CTO", "lead developer", "senior developer" and all that paraphernalia that in the end is just utter nonsense.

Relation to age

If one looks around, the "average" age - or more likely the anecdotal age (yes, this is not even proper empirical data) - of developers in shiny start-ups is around twenty-seven, I reckon. And it makes sense - they're programmers relatively fresh out of university with decent experience in that hi(y)ppie new programming language or paradigm.

I don't want to turn this into a rant; methinks I was also in that situation, and it's a sane approach as long as one doesn't go to extremes (like using MongoDB, OrientDB, InfluxDB, etc. for a problem domain where a relational DB is the eye-meltingly clear choice).

In any case, years pass, one gains experience and, more importantly, experience at a higher level, i.e. the point when one can much more easily understand the inner workings of complex systems. The problem that stems from this is that at some point work gets repetitive (if you let it) and one can easily get entrenched in a tool-chain.

The half-life of a programmer

I'm starting to feel this is the crux of the problem: a programmer hits half-life when he or she realizes that most of the work is repetitive and gets entrenched in a tool-chain. Clearly, if one cannot overcome these slight issues then we have a problem that might lead to a non-programming career or an illusory one: "team leader", "CTO", "lead developer", "senior developer", "consultant" etc.

The issue with those terms is that some programmers start to take them way too seriously whilst completely ignoring what I like to call referential humility.

Most of my heroes - the likes of Jim Weirich redirect, Sandi Metz redirect and more - seem to be in it for life (the career) and I find that very reassuring.

Referential humility

What does that mean in the end? It's just a play on words on referential integrity:

Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table).

In other words: there's always something "new" or something we still don't have a grasp of - hence the end of the road in a programmer's career is virtually non-existent.


Once you hit this half-life, choose wisely: work, in the end, is just another distraction from death, and it's in our best interest to make that distraction an authentic one.

Vaguely related to the distraction from death tangent:

Ruby 2.2.0 Preview 1 quick Rails benchmarks

Great news everyone! Ruby 2.2 preview 1 has been released redirect! I'm really curious about the Incremental GC redirect and Symbol GC redirect, so let's run some quick Rails benchmarks on a normal Rails API app.
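
Symbol GC can be observed with a quick (and admittedly hand-wavy) snippet - assuming Ruby 2.2+, dynamically created symbols that are no longer referenced become collectable instead of accumulating forever:

```ruby
# On Ruby 2.2+ dynamically created symbols that are no longer referenced
# can be garbage collected; before 2.2 every one of these would leak.
before = Symbol.all_symbols.size
10_000.times { |i| "dyn_sym_#{i}".to_sym }
GC.start
after = Symbol.all_symbols.size

puts "surviving dynamic symbols: #{after - before} of 10000"
```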

First off let's install the preview via RVM:

rvm install ruby-2.2.0-preview1

After fiddling around for about five minutes trying to find a part of the application that doesn't fail under the Preview, I stopped at the simple /profiles endpoint that just renders a JSON of all profiles - quite simple indeed. Using the trusty wrk redirect I fired up a quick bench:

wrk -t10 -c10 -d20s http://localhost:8080/profiles

The results are as follows:

Ruby 2.1.2p95

Running 20s test @ http://localhost:8080/profiles
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   255.02ms   25.10ms 372.80ms   67.61%
    Req/Sec     3.21      0.70     5.00     71.13%
  771 requests in 20.01s, 4.40MB read
Requests/sec:     38.53
Transfer/sec:    225.31KB
50%,252 ms
90%,285 ms
99%,328 ms
99.999%,372 ms

Ruby 2.2.0preview1

Running 20s test @ http://localhost:8080/profiles
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   253.27ms   30.64ms 344.75ms   64.21%
    Req/Sec     3.34      0.70     5.00     89.63%
  786 requests in 20.02s, 4.49MB read
Requests/sec:     39.26
Transfer/sec:    229.60KB
50%,251 ms
90%,291 ms
99%,329 ms
99.999%,344 ms

I'm not really sure how I should interpret them yet; it seems that under the Preview we have a slight improvement, but within the margin of error. At this point I don't think this is the best benchmark for the Preview, as we don't use views, thus Rails won't bloat up the memory with Strings.

On the memory usage side we have 65M vs 75M (Preview vs. 2.1) so in this scenario we clearly have a winner.

note: this was measured using OS X's Activity Monitor after wrk finished the benchmark and it's the average of the unicorn workers' sizes.


Bundler and all the gems installed without issue, but in some cases I got silent failures at runtime. The benchmarks were run on an actual working/production Rails 4.0.x app with around 25 gems. Nonetheless I could boot up Rails with unicorn and benchmark the simpler endpoints, which is great.


TBD - this is a work in progress; I will update it with more information.