• BSON >
  • BSON 4.0.0 Tutorial

BSON 4.0.0 Tutorial

This tutorial discusses using the core Ruby BSON gem.


The BSON gem is hosted on Rubygems and can be installed manually or with bundler.

To install the gem manually:

gem install bson -v '~> 4.0'

To install the gem with bundler, include the following in your Gemfile:

gem 'bson', '~> 4.0'

The BSON gem is compatible with MRI >= 1.9.3, JRuby >= 9.1, and Rubinius 2.5.x.

Use With ActiveSupport

Serialization for ActiveSupport-defined classes, such as TimeWithZone, is not loaded by default to avoid a hard dependency of BSON on ActiveSupport. When using BSON in an application that also uses ActiveSupport, the ActiveSupport-related code must be explicitly required:

require 'bson'
require 'bson/active_support'

BSON Serialization

Getting a Ruby object’s raw BSON representation is done by calling to_bson on the Ruby object, which will return a BSON::ByteBuffer. For example:

"Shall I compare thee to a summer's day".to_bson

Generating an object from BSON is done via calling from_bson on the class you wish to instantiate and passing it a BSON::ByteBuffer instance.


Byte Buffers

In 4.0, BSON introduced the use of native byte buffers in both MRI and JRuby instead of using StringIO. This was done for performance improvements.


A BSON::ByteBuffer can be instantiated with nothing (for write mode) or with a string of raw bytes (for read mode).

buffer = # a write mode buffer.
buffer = # a read mode buffer.

Writing to the buffer is done via the following API:

buffer.put_byte(value) # Appends a single byte.
buffer.put_cstring(value) # Appends a null-terminated string.
buffer.put_double(value) # Appends a 64-bit floating point.
buffer.put_int32(value) # Appends a 32-bit integer (4 bytes).
buffer.put_int64(value) # Appends a 64-bit integer (8 bytes).
buffer.put_string(value) # Appends a UTF-8 string.

Reading from the buffer is done via the following API:

buffer.get_byte # Pulls a single byte from the buffer.
buffer.get_bytes(value) # Pulls n number of bytes from the buffer.
buffer.get_cstring # Pulls a null-terminated string from the buffer.
buffer.get_double # Pulls a 64-bit floating point from the buffer.
buffer.get_int32 # Pulls a 32-bit integer (4 bytes) from the buffer.
buffer.get_int64 # Pulls a 64-bit integer (8 bytes) from the buffer.
buffer.get_string # Pulls a UTF-8 string from the buffer.

Converting a buffer to its raw bytes (for example, for sending over a socket) is done by simply calling to_s on the buffer.

buffer =

Supported Objects

Core Ruby objects that have representations in the BSON specification and will have a to_bson method defined for them are: Object, Array, FalseClass, Float, Hash, Integer, NilClass, Regexp, String, Symbol (deprecated), Time, TrueClass.

In addition to the core Ruby objects, BSON also provides some special types specific to the specification:


This is a representation of binary data. The raw data and a subtype must be provided when constructing., :md5)

Valid subtypes are: :generic, :function, :old, :uuid_old, :uuid, :md5, :user.


Represents a string of JavaScript code."this.value = 5;")


Represents a string of JavaScript code with a hash of values."this.value = age;", age: 5)


This is a subclass of Hash that stores all keys as strings, but allows access to them with symbol keys.

BSON::Document[:key, "value"]


Represents a value in BSON that will always compare higher to another value.


Represents a value in BSON that will always compare lower to another value.


Represents a 12 byte unique identifier for an object on a given machine.


Represents a special time with a start and increment value., 30)


Represents a placeholder for a value that was not provided.


Represents a 128-bit decimal-based floating-point value capable of emulating decimal rounding with exact precision.

# Instantiate with a String"1.28")

# Instantiate with a BigDecimal
d =, 3)

JSON Serialization

Some BSON types have special representations in JSON. These are as follows and will be automatically serialized in the form when calling to_json on them.

Object JSON
BSON::Binary { "$binary" : "\x01", "$type" : "md5" }
BSON::Code { "$code" : "this.v = 5 }
BSON::CodeWithScope { "$code" : "this.v = value", "$scope" : { v => 5 }}
BSON::MaxKey { "$maxKey" : 1 }
BSON::MinKey { "$minKey" : 1 }
BSON::ObjectId { "$oid" : "4e4d66343b39b68407000001" }
BSON::Timestamp { "t" : 5, "i" : 30 }
Regexp { "$regex" : "[abc]", "$options" : "i" }

Special Ruby Date Classes

Ruby’s Date and DateTime are able to be serialized, but when they are deserialized, they will always be returned as a Time since the BSON specification only has a Time type and knows nothing about Ruby.

Using Regexes

Ruby regular expressions always have BSON regular expression’s equivalent of ‘m’ on. In order for behavior to be preserved between the two, the ‘m’ option is always added when a Ruby regular expression is serialized to BSON.

There is a class provided by the bson gem, Regexp::Raw, to allow Ruby users to get around this. You can simply create a regular expression like this:"^b403158")

This code example illustrates the difference between serializing a core Ruby Regexp versus a Regexp::Raw object:

regexp_ruby = /^b403158/
# => /^b403158/
# => #<BSON::ByteBuffer:0x007fcf20ab8028>
# => "^b403158\x00m\x00"
regexp_raw ="^b403158")
# => #<BSON::Regexp::Raw:0x007fcf21808f98 @pattern="^b403158", @options="">
# => #<BSON::ByteBuffer:0x007fcf213622f0>
# => "^b403158\x00\x00"

Please use the Regexp::Raw class to instantiate your BSON regular expressions to get the exact pattern and options you want.

When regular expressions are deserialized, they return a wrapper that holds the raw regex string, but do not compile it. In order to get the Ruby Regexp object, one must call compile on the returned object.

regex = Regexp.from_bson(byte_buffer)
regex.pattern #=> Returns the pattern as a string.
regex.options #=> Returns the raw options as a String.
regex.compile #=> Returns the compiled Ruby Regexp object.
←   BSON BSON 3.0.0 Tutorial  →