Tiered Hardware for Varying SLA or SLO¶
MongoDB Tag Aware Sharding allows administrators to define ranges of the shard key and tag them to one or more shards. Using this feature, data can be routed to one or more shards based on the tag. This gives administrators control over data distribution in a sharded cluster.
This tutorial uses Tag Aware Sharding to route documents based on creation date either to shards tagged for supporting recent documents, or those tagged for supporting archived documents.
The following are some example use cases for segmenting data based on Service Level Agreement (SLA) or Service Level Objective (SLO):
- An application requires low-latency access to recently inserted or updated documents.
- An application requires prioritized low-latency access to a range or subset of documents.
- An application benefits from storing specific ranges or subsets of data on servers whose hardware suits the SLAs for accessing that data.
The following diagram illustrates a sharded cluster that uses hardware-based shard tags to satisfy data access SLAs or SLOs.
Scenario¶
A photo sharing application requires fast access to photos uploaded within the last 6 months. The application stores the location of each photo along with its metadata in the photoshare database under the data collection.
The following documents represent photos uploaded by a single user:
Note that only the document with _id : 10003012 was uploaded within the past year (as of June 2016).
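As an illustration of the document shape described above, the sketch below uses hypothetical documents. Only `_id : 10003012` and the `creation_date` field name come from the text; the other field names and all other values are assumptions. Plain `Date` objects stand in for the shell's `ISODate()` so the snippet also runs outside the mongo shell.

```javascript
// Hypothetical photo documents for a single user. Every value except
// _id 10003012 is assumed for illustration, as are the userid and
// photo_location field names.
var photos = [
  { _id: 10003010, creation_date: new Date("2012-12-19T06:01:17.171Z"),
    userid: 123, photo_location: "example.net/storage/usr/photo_1.jpg" },
  { _id: 10003011, creation_date: new Date("2013-12-19T06:01:17.171Z"),
    userid: 123, photo_location: "example.net/storage/usr/photo_2.jpg" },
  { _id: 10003012, creation_date: new Date("2016-01-19T06:01:17.171Z"),
    userid: 123, photo_location: "example.net/storage/usr/photo_3.jpg" }
];

// Return the _id of every document created on or after the cutoff date.
function uploadedSince(docs, cutoff) {
  return docs
    .filter(function (doc) { return doc.creation_date >= cutoff; })
    .map(function (doc) { return doc._id; });
}
```

With a cutoff one year before June 2016, `uploadedSince(photos, new Date("2015-06-01T00:00:00Z"))` selects only the document with `_id : 10003012`.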
Shard Key¶
The photo collection uses the { creation_date : 1 } index as the shard key.
The creation_date field in each document allows for creating a tag range on the creation date.
Tags¶
The application requires tagging each shard in the cluster based on its hardware tier. Each hardware tier represents a specific hardware configuration designed to satisfy a given SLA or SLO.
- Fast Tier (“recent”)
  These are the fastest performing machines, with large amounts of RAM, fast SSD disks, and powerful CPUs.
  The tag requires a range with:
  - a lower bound of { creation_date : ISODate(YYYY-mm-dd) }, where the Year, Month, and Date specified by YYYY-mm-dd is within the last 6 months.
  - an upper bound of { creation_date : MaxKey }.
- Archival Tier (“archive”)
  These machines use less RAM, slower disks, and more basic CPUs. However, they have a greater amount of storage per server.
  The tag requires a range with:
  - a lower bound of { creation_date : MinKey }.
  - an upper bound of { creation_date : ISODate(YYYY-mm-dd) }, where the Year, Month, and Date match the values used for the recent tier’s lower bound.
Note
The MinKey and MaxKey values are reserved special values for comparisons.
As performance needs increase, adding additional shards and tagging them based on their hardware tier allows for the cluster to scale horizontally.
When defining tag ranges based on time spans, weigh the benefit of updating the tag ranges infrequently against the amount of data that must be migrated on each update. For example, treating data as ‘recent’ for 1 year likely covers more data than treating it as ‘recent’ for 1 month. Rotating on a 1 month scale requires more frequent migrations, but each rotation migrates fewer documents than a rotation on a 1 year scale.
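One way to compute the shared boundary date for a rolling window is sketched below. The helper names `recentCutoff` and `tierRanges` are hypothetical, and aligning the boundary to the first day of a month (UTC) is an assumption made for simplicity, not a requirement of tag ranges. The `minKey` and `maxKey` parameters stand for the shell's `MinKey` and `MaxKey` values.

```javascript
// Midnight UTC on the first day of the month that lies `months`
// calendar months before `now`. Date.UTC rolls negative month
// indexes back into the previous year.
function recentCutoff(now, months) {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() - months, 1));
}

// Build the two tag ranges, which share the cutoff as their boundary:
// "recent" is [cutoff, maxKey) and "archive" is [minKey, cutoff).
function tierRanges(cutoff, minKey, maxKey) {
  return {
    recent:  { min: { creation_date: cutoff }, max: { creation_date: maxKey } },
    archive: { min: { creation_date: minKey }, max: { creation_date: cutoff } }
  };
}
```

In the mongo shell this could be called as `tierRanges(recentCutoff(new Date(), 6), MinKey, MaxKey)`. A smaller `months` value moves the boundary more often but shifts less data each time.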
Write Operations¶
With tag-aware sharding, if an inserted or updated document matches a configured tag range, it can only be written to a shard with the related tag.
MongoDB can write documents that do not match a configured tag range to any shard in the cluster.
Note
The behavior described above requires the cluster to be in a steady state with no chunks violating a configured tag range. See the following section on the balancer for more information.
Read Operations¶
MongoDB can route queries to a specific shard if the query includes the shard key.
For example, MongoDB can attempt a targeted read operation on the following query because it includes creation_date in the query document:
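A minimal sketch of such a query, assuming the mongo shell is connected to a mongos. The function name `findByCreationDate` is hypothetical, and the shell's `db` object is passed in as a parameter so the snippet stays self-contained.

```javascript
// Query the photoshare.data collection on the shard key field.
// `db` is the mongo shell's database handle; `date` is a Date or
// ISODate value to match against creation_date.
function findByCreationDate(db, date) {
  return db.getSiblingDB("photoshare")
           .getCollection("data")
           .find({ creation_date: date });
}
```

In the shell, `findByCreationDate(db, ISODate("2016-01-01"))` would issue the query.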
If the requested document falls within the tag range assigned to the recent storage tier, MongoDB would route this query to the tagged shards, ensuring a faster read compared to a cluster-wide broadcast read operation.
Balancer¶
The balancer migrates the tagged chunks to the appropriate shard. Until the migration completes, shards may contain chunks that violate configured tag ranges and tags. Once balancing completes, each shard should only contain chunks whose ranges do not violate its assigned tags and tag ranges.
Adding or removing tags can result in chunk migrations. Depending on the size of your data set and the chunks a tag affects, these migrations may impact cluster performance. Run your balancer in specific scheduled windows. See Schedule the Balancing Window for a tutorial on how to set a scheduling window.
Security¶
For sharded clusters running with Role-Based Access Control, authenticate as a user with at least the clusterManager role on the admin database.
Procedure¶
You must be connected to a mongos to create tags and tag ranges. You cannot create tags by connecting directly to a shard.
Disable the Balancer¶
The balancer must be disabled on the collection to ensure no migrations take place while configuring the new tags.
Use sh.disableBalancing(), specifying the namespace of the collection, to stop the balancer.
Use sh.isBalancerRunning() to check if the balancer process is currently running. Wait until any current balancing rounds have completed before proceeding.
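The two calls above can be sketched as follows for the `photoshare.data` namespace from the scenario. The wrapper name `pauseBalancing` is hypothetical, and the shell's `sh` helper is passed in as a parameter so the snippet stays self-contained.

```javascript
// Stop balancing for photoshare.data, then report whether a balancing
// round is still in progress. `sh` is the mongo shell's sharding helper.
function pauseBalancing(sh) {
  sh.disableBalancing("photoshare.data");
  // Assumption: isBalancerRunning() returns a boolean, as in the
  // legacy mongo shell.
  return sh.isBalancerRunning();
}
```

In the shell, call `pauseBalancing(sh)` and keep checking `sh.isBalancerRunning()` until it reports that no round is running.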
Tag each shard¶
Tag shard0000 with the recent tag.
Tag shard0001 with the recent tag.
Tag shard0002 with the archive tag.
You can review the tags assigned to any given shard by running sh.status().
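The tagging steps above can be sketched with `sh.addShardTag()`, which takes a shard name and a tag. The wrapper name `tagShardsByTier` is hypothetical, and `sh` is passed in so the snippet stays self-contained.

```javascript
// Assign a hardware-tier tag to each shard named in this tutorial.
// `sh` is the mongo shell's sharding helper.
function tagShardsByTier(sh) {
  sh.addShardTag("shard0000", "recent");
  sh.addShardTag("shard0001", "recent");
  sh.addShardTag("shard0002", "archive");
}
```

After running `tagShardsByTier(sh)` in the shell, `sh.status()` should list the tags next to each shard.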
Define ranges for each tag¶
Define the range for recent photos and associate it with the recent tag using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the tag.
Define the range for older photos and associate it with the archive tag using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the tag.
MinKey and MaxKey are reserved special values for comparisons.
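A sketch of both range definitions, passing the four arguments in the order listed above. The wrapper name `addTierRanges` is hypothetical; `sh`, `MinKey`, and `MaxKey` are supplied by the caller so the snippet stays self-contained.

```javascript
// Define the "recent" and "archive" ranges on photoshare.data.
// The cutoff is the inclusive lower bound of "recent" and the
// exclusive upper bound of "archive", so the ranges do not overlap.
function addTierRanges(sh, cutoff, minKey, maxKey) {
  var ns = "photoshare.data";
  sh.addTagRange(ns, { creation_date: cutoff }, { creation_date: maxKey }, "recent");
  sh.addTagRange(ns, { creation_date: minKey }, { creation_date: cutoff }, "archive");
}
```

In the shell, `addTierRanges(sh, ISODate("2016-01-01"), MinKey, MaxKey)` would set the boundary used in the rest of this tutorial.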
Enable the Balancer¶
Re-enable the balancer to rebalance the cluster.
Use sh.enableBalancing(), specifying the namespace of the collection, to start the balancer.
Use sh.isBalancerRunning() to check if the balancer process is currently running.
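Re-enabling can be sketched the same way for the `photoshare.data` namespace. The wrapper name `resumeBalancing` is hypothetical, and `sh` is passed in so the snippet stays self-contained.

```javascript
// Allow balancing of photoshare.data again and report whether a
// balancing round is currently in progress.
function resumeBalancing(sh) {
  sh.enableBalancing("photoshare.data");
  // Assumption: isBalancerRunning() returns a boolean, as in the
  // legacy mongo shell.
  return sh.isBalancerRunning();
}
```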
Review the changes¶
The next time the balancer runs, it splits and migrates chunks across the shards respecting the tag ranges and tags.
Once balancing finishes, the shards tagged as recent should only contain documents with creation_date greater than or equal to ISODate("2016-01-01"), while shards tagged as archive should only contain documents with creation_date less than ISODate("2016-01-01").
You can confirm the chunk distribution by running sh.status().
Updating Tag Ranges¶
To update the shard ranges, perform the following operations as a part of a cron job or other scheduled procedure:
Disable the Balancer¶
The balancer must be disabled on the collection to ensure no migrations take place while configuring the new tags.
Use sh.disableBalancing(), specifying the namespace of the collection, to stop the balancer.
Use sh.isBalancerRunning() to check if the balancer process is currently running. Wait until any current balancing rounds have completed before proceeding.
Remove the old shard tag ranges¶
Remove the old recent shard tag range using the sh.removeTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the tag.
Remove the old archive shard tag range using the sh.removeTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the tag.
MinKey and MaxKey are reserved special values for comparisons.
Add the new shard tag range for each tag¶
Define the range for recent photos and associate it with the recent tag using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the tag.
Define the range for older photos and associate it with the archive tag using the sh.addTagRange() method. This method requires:
- the full namespace of the target collection.
- the inclusive lower bound of the range.
- the exclusive upper bound of the range.
- the tag.
MinKey and MaxKey are reserved special values for comparisons.
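The removal and re-creation steps can be combined into one scheduled routine, sketched below. The function name `rotateTierRanges` is hypothetical; it assumes the balancer is already disabled for the collection and that the old ranges were created with exactly the bounds passed as `oldCutoff`.

```javascript
// Replace the old tag ranges on photoshare.data with ranges built
// around a new cutoff. `sh`, `minKey`, and `maxKey` are the mongo
// shell's sh helper, MinKey, and MaxKey, supplied by the caller.
function rotateTierRanges(sh, oldCutoff, newCutoff, minKey, maxKey) {
  var ns = "photoshare.data";
  // Remove both old ranges first so the new ranges cannot overlap them.
  sh.removeTagRange(ns, { creation_date: oldCutoff }, { creation_date: maxKey }, "recent");
  sh.removeTagRange(ns, { creation_date: minKey }, { creation_date: oldCutoff }, "archive");
  // Add the new ranges around the new cutoff.
  sh.addTagRange(ns, { creation_date: newCutoff }, { creation_date: maxKey }, "recent");
  sh.addTagRange(ns, { creation_date: minKey }, { creation_date: newCutoff }, "archive");
}
```

In the shell, `rotateTierRanges(sh, ISODate("2016-01-01"), ISODate("2016-06-01"), MinKey, MaxKey)` would move the boundary from January to June 2016.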
Enable the Balancer¶
Re-enable the balancer to rebalance the cluster.
Use sh.enableBalancing(), specifying the namespace of the collection, to start the balancer.
Use sh.isBalancerRunning() to check if the balancer process is currently running.
Review the changes¶
The next time the balancer runs, it splits chunks where necessary and migrates chunks across the shards respecting the tag ranges and tags.
Before balancing, the shards tagged as recent only contained documents with creation_date greater than or equal to ISODate("2016-01-01"), while shards tagged as archive only contained documents with creation_date less than ISODate("2016-01-01").
Once balancing finishes, the shards tagged as recent should only contain documents with creation_date greater than or equal to ISODate("2016-06-01"), while shards tagged as archive should only contain documents with creation_date less than ISODate("2016-06-01").
You can confirm the chunk distribution by running sh.status().