Hi,
Many of you have asked me about the real meaning of Cosmos DB partitions, the partition key and how to choose a partition key if needed. This post is all about this.
1- Cosmos DB partitions, what are they?
The official Microsoft article explains partitions in CosmosDB well, but to simplify the picture:
- When you create a container in a CosmosDB database (a Collection in the case of the SQL API), CosmosDB provisions capacity for that container
- If the container capacity is more than 10GB, CosmosDB requires an additional piece of information to create it. Why?
When CosmosDB provisions a container, it reserves capacity on its compute and storage resources. These Storage and Compute resources are called Physical Partitions.
Within the Physical Partitions, CosmosDB uses Logical Partitions. The maximum size of a Logical Partition is 10GB.
You get it now: when the size of a container exceeds (or can exceed) 10GB, CosmosDB needs to spread the data over multiple Logical Partitions.
The following picture shows 2 collections (containers):
- Collection 1 : The Size is 10 GB, so CosmosDB can place all the documents within the same Logical Partition (Logical Partition 1)
- Collection 2 : The size is unlimited (greater than 10 GB), so CosmosDB has to spread the documents across multiple Logical Partitions
“Spreading the documents across multiple Logical Partitions is called partitioning”
NB1: CosmosDB may distribute documents across Logical Partitions that live in different Physical Partitions. Logical Partitions of the same container do not necessarily belong to the same Physical Partition, but this is managed by CosmosDB
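As a rough illustration of the rule above, the following Python sketch computes the smallest number of Logical Partitions a container of a given size would need. The 10GB figure is the limit quoted in this post, and the function name is hypothetical: this is plain arithmetic, not a CosmosDB API.

```python
import math

# Limit per Logical Partition as quoted in this post (an assumption of the sketch).
LOGICAL_PARTITION_LIMIT_GB = 10


def min_logical_partitions(container_size_gb: float) -> int:
    """Smallest number of Logical Partitions able to hold the given size."""
    return max(1, math.ceil(container_size_gb / LOGICAL_PARTITION_LIMIT_GB))


print(min_logical_partitions(10))  # 1
print(min_logical_partitions(16))  # 2
print(min_logical_partitions(35))  # 4
```

This is only a lower bound: the real number of Logical Partitions depends on how the documents are distributed, which is what the rest of this post is about.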
NB2: Partitioning is mandatory if you select Unlimited storage for your container, and supported if you choose 1000RU/s and more
NB3: Switching between partitioned and un-partitioned containers is not supported. You need to migrate your data
2- Why does partitioning matter?
This is the first question I asked myself: why do I need to know about this partitioning stuff? The service is managed, so why should I care if CosmosDB automatically distributes my documents across partitions?
The answer is :
- CosmosDB does not decide the Logical Partitioning for you
- Partitioning has an impact on Performance and is tied to the Logical Partition Size limit
2.1- Performance
When you query a Container, CosmosDB looks into the documents to get the required results (to keep it simple, because the real mechanism is more elaborate). When a request spans multiple Logical Partitions, it consumes more Request Units, so the Request Charge per query is greater
–> HINT : With this constraint, it’s better that queries don’t span multiple Logical Partitions, so it’s better that the documents related to the same query stay within the same Logical Partition
2.2- Size
When you request the creation of a new document, CosmosDB places it within a Logical Partition. The question is how CosmosDB distributes the documents between the Logical Partitions. A Logical Partition can’t exceed 10GB, so the documents must be distributed intelligently. This looks easy: a mechanism like round robin should be enough. But this is not true! With round robin, your documents are spread over N Logical Partitions, and we have seen that queries over multiple Logical Partitions consume a lot of RUs, so this is not optimal
We now have the Performance-Size dilemma : How can CosmosDB deal with these two factors? How can we find the best configuration to :
- Keep ‘documents of the same query’ under the same Logical Partition
- Avoid reaching the 10GB limit easily
–> The answer is : CosmosDB can’t deal with this, you have to deal with it by choosing a Partition Key
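To make the dilemma concrete, here is a small Python sketch that compares a round-robin placement with a placement driven by a key. It is a toy model under stated assumptions, not how CosmosDB works internally: the sample rooms, the partition count and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (hotelid, room id) pairs.
rooms = [("2222", 1), ("2222", 2), ("2222", 3),
         ("3333", 1), ("3333", 2), ("4444", 1)]


def place_round_robin(docs, n_partitions):
    """Assign documents to partitions in turn, ignoring their content."""
    placement = defaultdict(list)
    for i, doc in enumerate(docs):
        placement[i % n_partitions].append(doc)
    return placement


def place_by_key(docs):
    """Keep all documents sharing the same hotelid in the same partition."""
    placement = defaultdict(list)
    for doc in docs:
        placement[doc[0]].append(doc)
    return placement


def partitions_touched(placement, hotelid):
    """Count the partitions holding at least one document of the given hotel."""
    return sum(
        1 for docs in placement.values()
        if any(h == hotelid for h, _ in docs)
    )


print(partitions_touched(place_round_robin(rooms, 3), "2222"))  # 3
print(partitions_touched(place_by_key(rooms), "2222"))          # 1
```

In this model, a query for all rooms of hotel 2222 has to visit every partition with round robin, but only one partition when placement follows the hotel id.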
3- The Partition Key
My definition: The Partition Key is a HINT that tells CosmosDB where to place a document, and whether two documents should be stored within the same Logical Partition. The partition key is a value within the JSON document
NB : The Partition Key value must be supplied with a query to CosmosDB (otherwise the query has to run as a cross-partition query)
Let me explain this by an example:
Suppose we have a multi-tenant Hotel application that manages hotel rooms (reservations, for example). Each room is represented by a document that holds all the room’s information. The Hotel is identified by a hotelid and the room by an id
The document structure is like the following:
{
  "hotelid" : "",
  "name" : "",
  "room" : {
    "id" : "",
    "info" : {
      "info1" : "",
      "info2" : ""
    }
  }
}
The following is the graphical view of the JSON document:
Suppose we have documents of 6 rooms:
3.1- No Partition Key
If you create a container with a size of 10GB, the container will not be partitioned, and all the documents will be created within the same Logical Partition. So the total size of your documents must not exceed 10GB.
3.2- Partition Key : Case 1
Partition Key = /hotelid
In this case, when CosmosDB creates the 6 documents, it spreads them over 3 Logical Partitions based on /hotelid, because there are 3 distinct /hotelid values.
- Logical Partition 1 : /hotelid = 2222
  - 3 documents
- Logical Partition 2 : /hotelid = 3333
  - 2 documents
- Logical Partition 3 : /hotelid = 4444
  - 1 document
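The grouping above can be reproduced with a short Python sketch. It resolves a partition key path such as /hotelid inside each document and counts the documents per distinct value. The six sample documents only mirror the counts listed above: the room ids are hypothetical, and the helper names are not part of any CosmosDB SDK.

```python
from collections import Counter


def make_room(hotelid, room_id):
    """Build a document with the structure used in this post."""
    return {
        "hotelid": hotelid,
        "name": "",
        "room": {"id": room_id, "info": {"info1": "", "info2": ""}},
    }


# Hypothetical sample: 3 rooms for hotel 2222, 2 for 3333, 1 for 4444.
documents = [
    make_room("2222", "1"), make_room("2222", "2"), make_room("2222", "3"),
    make_room("3333", "1"), make_room("3333", "2"),
    make_room("4444", "1"),
]


def partition_key_value(document, path):
    """Resolve a path such as '/hotelid' or '/room/id' inside a document."""
    value = document
    for part in path.strip("/").split("/"):
        value = value[part]
    return value


# One Logical Partition per distinct partition key value.
counts = Counter(partition_key_value(d, "/hotelid") for d in documents)
print(dict(counts))  # {'2222': 3, '3333': 2, '4444': 1}
```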
What are the Pros and Limits of this partitioning scheme:
- Pros
  - Documents of each hotel are placed in their own Logical Partition
  - Each Hotel can have up to 10GB of documents
  - Queries within the same hotel will perform well since they will not span multiple Logical Partitions
- Limits
  - All rooms of the same hotel will be placed within the same Logical Partition
  - The 10GB limit may be reached when the room count grows
3.3- Partition Key : Case 2
NB : CosmosDB supports only 1 JSON property for the Partition Key, so in my case I will create a new property called PartitionKey
Suppose that after making a calculation, we figured out that each Hotel will generate 16GB of documents. This means that I need the documents of a Hotel to be spread over two Logical Partitions. How can I achieve this?
- The /hotelid has only 1 distinct value per Hotel, so it’s not a good partition key here
- I need to find a property that has at least 2 distinct values per Hotel
- I know that each Hotel has multiple rooms, and therefore multiple room ids
The idea is to create a new JSON property called PartitionKey, which can have two values :
- hotelid-1 if the roomid is odd
- hotelid-2 if the roomid is even
This way:
- When you create a new document (which contains a room), you have to look at the roomid: if it’s even then PartitionKey = hotelid-2, if it’s odd then PartitionKey = hotelid-1
- This way, CosmosDB will place even rooms within one Logical Partition, and odd rooms within another Logical Partition
–> Result : The hotel documents will span two Logical Partitions, so up to 20 GB of storage
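The odd/even rule can be sketched in Python as follows. This is a minimal illustration, assuming room ids are numeric: the function names are hypothetical, and the PartitionKey property is the extra property introduced above.

```python
def synthetic_partition_key(hotelid: str, room_id: int) -> str:
    """Return 'hotelid-1' for odd room ids and 'hotelid-2' for even ones."""
    suffix = 1 if room_id % 2 == 1 else 2
    return f"{hotelid}-{suffix}"


def with_partition_key(document: dict) -> dict:
    """Return a copy of the room document with the PartitionKey property added."""
    room_id = int(document["room"]["id"])  # assumes the room id is numeric
    key = synthetic_partition_key(document["hotelid"], room_id)
    return {**document, "PartitionKey": key}


doc = {"hotelid": "2222", "name": "", "room": {"id": "101", "info": {}}}
print(with_partition_key(doc)["PartitionKey"])  # 2222-1
```

The property has to be computed by the application before the document is written. A query for a whole hotel then needs to cover both values, 2222-1 and 2222-2.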
What are the Pros and Limits of this partitioning scheme:
- Pros
  - Documents related to the same hotel will be placed on 2 Logical Partitions
- Limits
  - Queries related to the same hotel will be spread across 2 Logical Partitions, which results in an additional request charge
4- How to choose the Partition Key?
This is the most difficult exercise when designing your future CosmosDB data structure. Here are some recommendations to guide you through it:
- What is the expected size per document? This tells you how much partitioning you will need. Think about the examples above: if each document is 100KB max, then you can have up to 105k documents per Logical Partition, which means 105k rooms per hotel (more than enough), so /hotelid is a good partition key with respect to the Size constraint
- If you are faced with several candidate partition keys and are unable to decide, do the following:
  - Do not use a partition key that will hit the Size constraint quickly : reaching the Size limit makes the application unusable
  - Choose the Partition Key that will consume the least Request Charge. How do you predict that? Determine the most used queries across your application, and choose the best Partition Key according to them.
- Add new properties to your JSON document (PartitionKey), even if they are not really useful otherwise, just to achieve good Partitioning
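The size estimate in the first recommendation is simple arithmetic, sketched here in Python. The 10GB limit is the one used throughout this post, and the function name is hypothetical.

```python
# Limit per Logical Partition as quoted in this post (an assumption of the sketch).
LOGICAL_PARTITION_LIMIT_GB = 10


def max_documents_per_partition(doc_size_kb: int) -> int:
    """How many documents of the given size fit in one Logical Partition."""
    limit_kb = LOGICAL_PARTITION_LIMIT_GB * 1024 * 1024
    return limit_kb // doc_size_kb


print(max_documents_per_partition(100))  # 104857, roughly the 105k quoted above
```

Running the same calculation with your own maximum document size gives a quick check of whether a candidate key such as /hotelid stays under the Size constraint.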
5- I have determined a good Partition Key, but I am afraid of hitting the 10 GB limit per Logical Partition
This is the most asked question after choosing the Partition Key : What if all the documents with the same Partition Key value hit the 10GB limit?
As in the example above, try to find a mandatory value that gives you the X factor you want. The idea is to ask: can I find an additional property that I can use in my partition key?
NB : The Request Charge will be multiplied by X, but at least I can predict it
This was simple in my case, but if you need a factor X, you can use a Bucket calculator function. Here’s a blog about this : you just provide how many Logical Partitions you want to spread your documents across. A good blog post here about the subject.
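A bucket function of this kind could look like the Python sketch below. It is not the function from the blog posts mentioned above: the names are hypothetical, and hashing the room id with SHA-256 is just one stable way to pick a bucket.

```python
import hashlib


def bucket_partition_key(hotelid: str, room_id: str, buckets: int) -> str:
    """Map a room to one of `buckets` partition key values for its hotel."""
    # A stable hash gives the same bucket for the same room id on every run.
    digest = hashlib.sha256(room_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets + 1
    return f"{hotelid}-{bucket}"


def all_partition_keys(hotelid: str, buckets: int) -> list:
    """Every partition key value a query for the whole hotel has to cover."""
    return [f"{hotelid}-{b}" for b in range(1, buckets + 1)]


key = bucket_partition_key("2222", "101", buckets=4)
print(key in all_partition_keys("2222", 4))  # True
print(all_partition_keys("2222", 4))  # ['2222-1', '2222-2', '2222-3', '2222-4']
```

Python’s built-in hash() is avoided here because it is randomized per process for strings, so the same room could land in a different bucket after a restart.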
Hope that this article helps.
Cheers