JavaScript engine internals: NaN-boxing

In researching code plugin sandboxing with WebAssembly, my thoughts naturally turned to how to let people actually program for that environment without forcing use of low-level C++ or such… so naturally, now I’m writing a small JavaScript-like runtime engine and compiler that targets WebAssembly. ;)

My explorations so far are themselves written in C++, with tiny JS samples hand-translated into C++ code that calls the runtime’s classes. If I continue the project, I may switch to directly emitting Wasm source from the translator, with an eye towards future Wasm improvements like native garbage-collected reference types. (Right now I have to implement garbage collection myself, which will be the subject of another blog post!)

The first step of implementing a JavaScript engine is to implement a representation for values, which is tricky because JS values can be any of several distinct types:

  • undefined
  • null
  • boolean
  • number (float64)
  • object reference (string, Symbol, Object, etc)

Many other dynamic languages like PHP and Python similarly allow a value to hold multiple competing types like this. There are two main ways to implement a value type like this.

Tagged unions

The first is to use a “tagged union” struct, something like this:

enum TypeTag {
  TAG_UNDEFINED,
  TAG_INT32,
  TAG_DOUBLE,
  ...
  TAG_OBJECT
};

struct Value {
  TypeTag tag;
  union {
    int32_t int32_val;
    double double_val;
    Object* object_val;
  };
};

Then add accessor/mutator functions which check the tag and perform appropriate accesses or conversions. This is simple, but has the downside of requiring 16 bytes per value (to maintain 8-byte alignment on the double-precision float or a 64-bit pointer or int64). That means most reads require two loads, writes require two stores, and you can’t pass a value in a register reliably. (WebAssembly’s local variables and function arguments are sort of like CPU registers in how they’re addressed, so would be the equivalent.)

To get around these, a common trick in high-performance dynamic language runtimes like LuaJIT and all major JavaScript engines is known as NaN boxing.

NaN boxing the SpiderMonkey way

Floating point numbers are a complex beast! A 64-bit double-precision float can be broken down into several sets of bits:

  • 1 bit for sign
  • 11 bits for exponent part
  • 52 bits for binary fractional part

If the exponent bits are all 1, then you have a special value:

  • if the fractional bits are all 0, then it represents infinity (positive or negative, depending on the sign bit)
  • if any fractional bit is 1, then it represents “not a number” (NaN), used as a special signal that an operation was invalid or a non-value is being represented.
  • and oh by the way, regular operations only ever produce a canonical NaN value with just the topmost fractional bit set.

The interesting part is that means you have an entire number space within the range of representable NaN values that will never be produced by legit math operations on legit numbers!

  • 1 bit for sign (set to 1 for a signalling NaN that won’t be produced by math ops)
  • 11 bits for exponent (all set to 1)
  • 1 bit to force it to NaN rather than Infinity if rest is 0s
  • 51 bits for tag and payload

Firefox’s SpiderMonkey engine divides this up into 4 bits of tag (16 types) and 47 bits of address, which is enough for x86_64 (where bit 48 will always be 0 in user space and bits 49-64 aren’t needed … yet) but apparently causes problems on AArch64 and Sparc64. (They work around this by deliberately mapping memory in the lower 47 bits of address space with mmap.)

I’ve chosen to experiment with a similar technique, but only using 3 bits of tag (8 types) to leave a clean 48 bits of address. Though when building to WebAssembly, you only need 32 bits of address so it doesn’t make much difference. ;)

When reading a value of as-yet-unknown type, you mask off the top X bits and compare against some constant values:

  • each type other than double-precision its has a particular tag value, so you just == for it
  • actual double-precision values can be detected quickly by forcing the sign bit on and then doing a uint64 compare against a cutoff value: any legit double including canonical NaNs will be <= 0b1111111111111000’0000000000000000’0000000000000000’0000000000000000 once sign/signalling bit is set.
  • if there are multiple type tags for pointer types, you can put them at the high end of the tag range and do a uint64 compare against a cutoff value

One downside is that every object dereference has to be masked from the stored value, and typical JavaScript code does a lot of objects. It’s a bit-op so it’s cheap, though… and in WebAssembly you can actually wrap the 64-bit value down to 32-bits without masking, since pointers are 32 bits. :)

JavaScriptCore style

WebKit’s JavaScriptCore uses a different technique on 64-bit native architecture.

When storing a double value, the bit pattern is treated as a uint64 and has a constant added to it to “rotate” the NaN patterns such that the top 1 bits in a range of the signaling NaNs turn into 0s.

  • object references have 0x0000 in the top 16 bits, meaning a 48-bit pointer can be read directly with a 64-bit read — no bit-masking penalty on dereference!
  • int32 values have 0xffff in the top 16 bits, making common integer operations easy with a mask
  • true/false/undefined/null are stored as special fake pointer values in the object pointer space
  • most double-precision float values sit in the 0x0001-0xfffe space, and have to subtract the constant to get back to their original bit pattern.
  • -Infinity has to be specially stored in the object space because the shift puts it in the wrong place

Sounds clever!

I haven’t checked what Chrome’s V8 and Edge’s ChakraCore do, but I believe they are similar to one or the other of the above.

Security considerations

The downside of all this magic bit-fiddling is that you have to be careful at API boundaries between the JS world and the external world it’s connected to to canonicalize all incoming NaN values in doubles, or it would be possible to create arbitrary object pointer references from within the scripting language — a huge security hole.

So, you know, be careful with that. ;)