This is a quick post about something I ran into while trying to understand data disk performance in Azure virtual machines.
If you read the official TechNet article that lists the different Azure virtual machine series and details their configuration, you may (like me) be confused by how data disk performance is described for the A-Series, D-Series and G-Series versus the DS-Series.
In the A-Series, D-Series and G-Series virtual machines, data disk performance is described using the metric: Max IOPS
In the DS-Series virtual machines, data disk performance is described using the metric: Max. disk IOPS and bandwidth
The questions I was asking myself were:
- What does IOPS mean, and how much data does each I/O carry?
- Why is there an additional bandwidth metric in the DS-Series?
Here are my answers, based on the information I found:
What does IOPS mean, and how much data does each I/O carry?
IOPS means Input/Output Operations Per Second, and it does not refer to bandwidth. There is a tight relation between bandwidth and IOPS, but we need one more parameter to connect them: the I/O size (bandwidth = IOPS x I/O size). What does that mean?
Imagine a game you might have played as a kid. You fill a bucket with water, cross the road, empty the bucket, cross back, and keep doing that for 1 minute. Who is the winner? The one who collects the most water. Yes, the most water, not the most trips. So it depends on how full the bucket is. If my bucket is completely full and I make 5 trips, and my friend's bucket is half full and he makes 5 trips, I win! The logic is exactly the same here: you first have to know the size of the I/O unit being used, and then you can work out the rest.
So in the case of the A, D and G Series, using an I/O size of 8 KB with the 500 IOPS per disk results in approximately 8 * 500 = 4000 KB/s = 3.9 MB/s.
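As a rough sketch of that arithmetic (the function name is mine, and I assume 1 MB = 1024 KB as in the figure above), here is how the same 500 IOPS limit translates into very different throughput depending on the I/O size:

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s for a given IOPS and I/O size in KB (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024


# Same 500 IOPS limit, different I/O sizes: the "bucket size" decides the result.
# The 4 KB and 64 KB sizes are only illustrative; 8 KB is the size used above.
for io_size_kb in (4, 8, 64):
    mb_per_s = throughput_mb_per_s(500, io_size_kb)
    print(f"{io_size_kb} KB x 500 IOPS = {mb_per_s:.1f} MB/s")
```

The point is only that an IOPS figure tells you nothing about bandwidth until an I/O size is fixed.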
This guy ran a nice test to verify that, thanks to him: LINK HERE
In fact, his tests found that a 4-disk stripe achieved only 1130 IOPS (we would expect 4 * 500 = 2000). So either it is truly best effort and storage performance is not guaranteed, or the test is not relevant, or (and I guess this is the most likely) the total throughput is throttled.
In addition, I wonder why Microsoft did not publish the maximum bandwidth. Based on my calculation, the maximum bandwidth in MB/s for an 8 KB I/O size, with a maximum disk size of 1 TB, is: (Number of disks * 500 * 8) / 1024. Based on this, I could achieve 62 MB/s on an A11 VM and 250 MB/s on a G5 VM. Why did Microsoft not make things clearer? Because this can't be true! There is certainly something I missed.
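To make the formula concrete, here is a minimal sketch of it. The data disk counts (16 for the A11, 64 for the G5) are my assumption, back-derived from the 62 and 250 MB/s figures; check them against the VM size table before relying on them.

```python
IOPS_PER_DISK = 500  # per-disk limit quoted for the A, D and G series
IO_SIZE_KB = 8       # assumed I/O size, as in the calculation above


def max_vm_bandwidth_mb_per_s(disk_count: int) -> float:
    """Theoretical aggregate bandwidth: (disks * 500 * 8) / 1024, in MB/s."""
    return disk_count * IOPS_PER_DISK * IO_SIZE_KB / 1024


# Assumed data disk counts, back-derived from the figures in the text.
vm_disk_counts = {"A11": 16, "G5": 64}
for vm_name, disk_count in vm_disk_counts.items():
    print(f"{vm_name}: {max_vm_bandwidth_mb_per_s(disk_count):.1f} MB/s")
```

This is a theoretical ceiling only; as the 4-disk test shows, the measured figure can be much lower.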
Why is there an additional bandwidth metric in the DS-Series?
DS-Series VMs can use Azure Premium Storage, a new storage service that offers high throughput, low latency and maximum performance. Premium Storage is based on SSD disks. To use it, you need a Premium Storage account.
To highlight the high performance of its Premium Storage service, Microsoft added a metric that describes the storage throughput.
But what does this mean?
It's a little complicated (just a little), but I will explain it here the easiest way I can (my explanations are based on this Microsoft article: http://azure.microsoft.com/en-us/documentation/articles/storage-premium-storage-preview-portal/)
- With Azure Premium Storage you can achieve up to 50,000 IOPS and 32 TB of storage. The throughput is not mentioned here, because it has to be calculated
- Premium Storage is based on three Azure Storage disk types: P10, P20 and P30. That means your VHDs will be stored on disks of one of these types. You need to know that the IOPS and throughput depend on the disk size (and therefore on the VHD size)
The P10, P20 and P30 specifications are:
A P10 disk is 128 GB; it can achieve 500 IOPS and up to 100 MB/s.
A P20 disk is 512 GB; it can achieve 2300 IOPS and up to 150 MB/s.
A P30 disk is 1024 GB; it can achieve 5000 IOPS and up to 200 MB/s.
You can see that the specifications are not linear: capacity, IOPS and throughput do not scale together. For example, the P20 disk is 4 times larger than the P10 disk, its IOPS is a little more than 4 times higher (4.6x), and its throughput is only 1.5 times higher. So we need to be sharp when we create our VHDs.
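One way to see the non-linearity is to divide each figure by the disk capacity. This is only an illustrative sketch using the three disk specifications listed above; the helper name is mine.

```python
# (name, size_gb, iops, mb_per_s), taken from the specifications above
DISK_TYPES = [
    ("P10", 128, 500, 100),
    ("P20", 512, 2300, 150),
    ("P30", 1024, 5000, 200),
]


def per_gb(size_gb: int, iops: int, mb_per_s: int) -> tuple:
    """Return (IOPS per GB, MB/s per GB) for one disk type."""
    return iops / size_gb, mb_per_s / size_gb


for name, size_gb, iops, mb_per_s in DISK_TYPES:
    iops_gb, mb_gb = per_gb(size_gb, iops, mb_per_s)
    print(f"{name}: {iops_gb:.2f} IOPS/GB, {mb_gb:.2f} MB/s per GB")
```

Throughput per GB drops sharply as the disk gets larger while IOPS per GB rises slightly, which is why several small disks can beat one big disk on throughput.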
Let's look at some examples to better assimilate these facts
Example 1: I want to create a 200 GB VHD (Option 1)
Azure will round up your choice to the next disk type. Here that is a P20, because P10 disks are only 128 GB and the VHD will not fit. You will then benefit from 2300 IOPS and up to 150 MB/s
Example 2: I want to create a 200 GB VHD (Option 2)
You create 2 VHDs of 100 GB each. Azure will place these two VHDs on P10 disks (they are smaller than 128 GB). Then you use Windows (or Linux) to create a stripe (Storage Spaces for example) across these 2 VHDs. The result is a 200 GB volume with 1000 IOPS and 200 MB/s
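The two options can be compared with a small sketch. It assumes that striping adds up IOPS and throughput linearly and it ignores the VM-level limits; the function names are mine, not an Azure API.

```python
# disk type -> (size_gb, iops, mb_per_s), smallest first
DISK_TYPES = {
    "P10": (128, 500, 100),
    "P20": (512, 2300, 150),
    "P30": (1024, 5000, 200),
}


def disk_type_for(vhd_size_gb: int) -> str:
    """Round the VHD size up to the smallest disk type that can hold it."""
    for name, (size_gb, _, _) in DISK_TYPES.items():  # dicts keep insertion order
        if vhd_size_gb <= size_gb:
            return name
    raise ValueError(f"{vhd_size_gb} GB exceeds the largest disk type")


def striped_volume(vhd_size_gb: int, count: int) -> dict:
    """Aggregate figures for `count` identical VHDs striped together."""
    name = disk_type_for(vhd_size_gb)
    _, iops, mb_per_s = DISK_TYPES[name]
    return {
        "disk_type": name,
        "size_gb": vhd_size_gb * count,
        "iops": iops * count,
        "mb_per_s": mb_per_s * count,
    }


print(striped_volume(200, 1))  # Option 1: a single 200 GB VHD
print(striped_volume(100, 2))  # Option 2: two 100 GB VHDs, striped
```

Option 1 wins on IOPS and Option 2 wins on throughput, so the right choice depends on which metric your workload needs.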
Example 3: I want to create a VHD with 600 GB and 400 MB/s of throughput
You will not obtain such throughput if you just create a 600 GB VHD, because Azure will place it on a P30 disk, and then you will have only 200 MB/s.
To achieve that, you should use striping, and there are different ways to do it:
Way 1: You create two 600 GB VHDs. Azure will place them on P30 disks. Then you use your striping tool (Storage Spaces) to create a 1200 GB volume. This volume will allow 400 MB/s and 10000 IOPS. But in this case you will have 600 GB that you don't need
Way 2: You create 3 VHDs of 200 GB each. Azure will place them on P20 disks (as in Example 1, Option 1). Then you use your striping tool (Storage Spaces) to create a 600 GB volume. This volume will allow 450 MB/s (150 MB/s * 3) and 6900 IOPS (2300 IOPS * 3).
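A hypothetical helper can search for layouts like Way 2 automatically. This sketch splits the capacity evenly across the VHDs, assumes striping scales linearly, and ignores VM limits; because it never over-provisions capacity, it will not propose Way 1. The `max_disks` bound is arbitrary.

```python
import math

# (name, size_gb, iops, mb_per_s), smallest first
DISK_TYPES = [
    ("P10", 128, 500, 100),
    ("P20", 512, 2300, 150),
    ("P30", 1024, 5000, 200),
]


def smallest_fit(vhd_gb: int):
    """Return the smallest disk type that holds the VHD, or None if none does."""
    for disk in DISK_TYPES:
        if vhd_gb <= disk[1]:
            return disk
    return None


def stripe_options(total_gb: int, target_mb_per_s: int, max_disks: int = 8) -> list:
    """List (count, vhd_gb, disk_type, total_iops, total_mb_per_s) layouts
    that reach the throughput target when the capacity is split evenly."""
    options = []
    for count in range(1, max_disks + 1):
        vhd_gb = math.ceil(total_gb / count)
        disk = smallest_fit(vhd_gb)
        if disk is None:
            continue
        name, _, iops, mb_per_s = disk
        if mb_per_s * count >= target_mb_per_s:
            options.append((count, vhd_gb, name, iops * count, mb_per_s * count))
    return options


for option in stripe_options(600, 400):
    print(option)
```

The first layout it finds for 600 GB and 400 MB/s is three 200 GB VHDs on P20 disks, which is Way 2; layouts with more, smaller disks trade IOPS for throughput.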
Example 4: I want to create a VHD with 600 GB and 600 MB/s of throughput
Unfortunately, we can't just dream and ask Azure to do it, at least not today. In fact, the maximum possible throughput is 512 MB/s; we can't do better.
- The total data storage, the IOPS and the throughput are limited by the VM series and size. Each Azure virtual machine size has a maximum number of data disks (and therefore a total storage size), a maximum IOPS and a maximum throughput. For example, you can achieve 400 MB/s (Example 3) only on a Standard_DS14 VM. All the other VM sizes will throttle your IOPS or throughput when you reach their threshold. The following picture (Microsoft credit) shows the DS-Series maximum storage performance
Follow this link for all the Azure virtual machine types, sizes and specifications: https://msdn.microsoft.com/en-us/library/azure/dn197896.aspx
- You should know that Azure will throttle storage whenever either of the two thresholds is reached: IOPS or throughput. Let's take a DS2 VM: if you run a 2000 IOPS workload with a 100 KB I/O size, you will reach 195 MB/s of throughput. This value exceeds the 64 MB/s threshold (the DS2 threshold), so Azure will throttle your disk access. This results in a huge I/O queue, and you will suffer from degraded performance.
- When you choose your storage, you choose your VM too, and when you choose your VM, you choose your storage too. So do a reasonable calculation before spending money
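The DS2 throttling example above can be sketched as a simple "whichever cap is hit first" model. This is my simplification, not how Azure documents its throttling. The 64 MB/s cap comes from the text, while the 6400 IOPS cap is an assumption for illustration only.

```python
def effective_rates(requested_iops: float, io_size_kb: float,
                    vm_max_iops: float, vm_max_mb_per_s: float) -> tuple:
    """Return the (IOPS, MB/s) sustainable when both VM caps apply."""
    # The bandwidth cap implies its own IOPS ceiling for this I/O size.
    iops_ceiling_from_bandwidth = vm_max_mb_per_s * 1024 / io_size_kb
    iops = min(requested_iops, vm_max_iops, iops_ceiling_from_bandwidth)
    return iops, iops * io_size_kb / 1024


DS2_MAX_MB_PER_S = 64  # threshold quoted above
DS2_MAX_IOPS = 6400    # assumed for illustration; check the VM size table

requested = 2000 * 100 / 1024  # what the workload asks for, in MB/s
iops, mb_per_s = effective_rates(2000, 100, DS2_MAX_IOPS, DS2_MAX_MB_PER_S)
print(f"requested {requested:.0f} MB/s, sustained {mb_per_s:.0f} MB/s at {iops:.0f} IOPS")
```

In this model the workload is bandwidth-bound long before it reaches any IOPS limit, which is the situation described in the DS2 example.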
Azure is relatively new; new features and terms arrive every day, and the lack of information and the ambiguity are a real headache, especially for the people who have to make a decision. Storage for virtual machines in Azure is a bit tricky, and Microsoft does not provide any tool where we can enter our needs and get a list of possible configurations. Until that day, I invite you to proceed as follows:
- Determine your storage needs: 1- How many data disks 2- How much IOPS and throughput for each disk
- Work out the options when converting to Azure storage (P10, P20, P30), as in Examples 1, 2, 3 and 4 discussed above
- Match the results to the Azure virtual machine series to find the most suitable size: for each option found in step 2, find the VM sizes that fit
- Create a table with all this information, covering all the scenarios
- Calculate the cost for each scenario
- Choose the VM with the best spec/cost ratio