AWS storage for Kubernetes performance
Do you want some form of persistence on Kubernetes? What are your options, and how do they perform? In this post I go over the options on AWS, but the reasoning can be applied to other platforms as well.
Step 1: Don’t use volumes if you can
I think I have to say this. Try not to get into the state (pun intended) of having to worry about stateful applications on Kubernetes. It's just annoying, and even though there are options, figuring them out is not much fun. Storage is hard. Storage for a cloud-native orchestration tool running diverse workloads is even harder.
Best-case, you work with stateless, cloud-native applications based on the 12-factor principles (https://12factor.net/).
So if your use-case is some wacko old-school application that needs to share a volume between 10 pods and hammers it with reads and writes... then seriously, stop and don't use Kubernetes in the first place. Not using Kubernetes does not solve the actual problem, but at least you don't have to figure out your NFS problems on top of another complex system.
However, I do believe there are use-cases.
Identifying your use-case
I believe there are 4 options:
- Storage for “applications”
- Storage for “media” and cloud-native solutions
- “I have legacy-stuff and I’m just doing something or I have no idea what I’m doing - storage”.
- The “I actually have a decent use-case and I’m using this - storage”
So basically, these are also the available products:
- EBS, just good storage in both IOPS and throughput. It can be mounted to a single system. In fact, you can just see this as a regular HDD (or better yet SSD).
- Object storage, S3. Great to host “media” on it. Either you use it as CDN or do something smart in your application with it.
- EFS, an NFS file share on steroids. Use-cases 3 and 4 from the list above end up here.
Let’s just recap this a bit. We have storage that is basically the standard disk you have in your PC. Then we have two options for sharing files between multiple systems. It’s that simple. So why is storage difficult?
Expecting and wanting too much
Now we have this application and “I have to share these files between 100 systems”. Best-case you can use S3 to give your systems access to the files. The applications load this data or serve it (or even as CDN). Yet what happens is that often applications start to do the processing. It’s going to read the file, write others, change files and do a gazillion other actions. The question is “how”.
There is nothing wrong with “downloading” the data from S3, doing some processing, and uploading it back. It does not “hurt” other systems, and the processing can take place on your local storage (or EBS in that case).
Yet if you have an EFS/NFS mount, applications tend to do that processing ON that share, continuously doing those read/writes. It's just not made for that. NFS can work if you periodically read a file. It just does not work if you write the logs of 100 systems to 1 single file on a share. For that, you just ship the logs into a logging solution that can run cloud-native and you are set.
Difference between EFS and S3?
It might look like EFS and S3 are somewhat similar. I do think they share the same use-cases, but the implementation is different. We could go into much more detail, but I'll keep it high-level here: with EFS we have a file system, with S3 we have a RESTful API. We mount the EFS volume, while with S3 we programmatically get our objects.
This is also why EFS is a thing. Not all applications are made for an object-store; legacy tends to want a simple file system. This is “fine”. However, it's more prone to performance issues, because it's easy to do things on a file share that should not be done on one.
What is a valid use-case for EFS or NFS-like systems?
- Probably not your use-case.
- Not for applications
- Unlikely for Kubernetes workloads
The problem is that people tend to go for EFS because it's “the” solution that can mount a single volume to many systems. So instead of making the application cloud-native, 12-factor style, they just mount a share and “yolo”.
Yes, it would work. Yes, it might be “good” for you now. However, there are so many pitfalls and issues that can and will happen. If you do something wrong in your application or the workload increases, you have a good chance of completely failing on your storage. It's then too slow, it has hiccups, it locks up - you will cry.
Using the right tool for the job
So we can define a somewhat solid use-case for the various storage solutions. Let’s say I’m running Prometheus. It needs storage but I’m also taking a more cloud-native approach by implementing either Cortex or Thanos.
What happens is that Prometheus will process data and store it, but only for a limited time. Every 2 hours my data gets uploaded to an object-store. That's it. The rest of the chain of 'logic' uses the object-store data.
In this case, I will use EBS volumes for Prometheus and let Cortex/Thanos do the rest on S3. I can even make this setup highly available without needing a file share between the Prometheus instances. The data just gets shipped twice to the object store and de-duplicated later on.
Neat.
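To make that concrete: in the Thanos case, the object store is just a small config file handed to the sidecar via --objstore.config-file. A minimal sketch, assuming a hypothetical bucket and credentials coming from the instance/pod IAM role:
type: S3
config:
  bucket: "thanos-metrics"                # placeholder bucket name
  endpoint: "s3.eu-west-1.amazonaws.com"  # assumed region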
The single flaw in EBS
However, there is one thing that EBS cannot do: it's not able to mount cross-zone. If I have a machine in zone A and I want EBS, the volume will get created in zone A and it will be locked there. I simply cannot detach that volume and attach it to an instance in zone B.
I could “solve” this by running, for example, Prometheus in two different zones. If there is a zone outage, at least one instance is still up. The other instance will not be able to reschedule in Kubernetes because it simply cannot find a node that is in the correct zone for the volume.
Multi-zone clusters and EBS
The neat thing about Kubernetes is that you provide a workload (i.e. pods) and it gets scheduled over the nodes you have. When we have a cluster that uses 3 AZs (availability zones), we do however have a little problem.
If we use EBS, that pod is locked to a specific set of nodes in a specific AZ. If the zone goes down, Kubernetes won't be able to schedule that pod. It will search for a node that can mount the volume, yet there are no such nodes, because the AZ is down.
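What you can do is make sure a new volume is only created in the zone where the pod actually lands, by using a StorageClass with volumeBindingMode: WaitForFirstConsumer. It does not remove the zone lock, it just avoids provisioning a volume in a zone the pod will never run in. A sketch, where the class name is mine and the provisioner depends on whether the EBS CSI driver is installed:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3                  # hypothetical name
provisioner: ebs.csi.aws.com     # or kubernetes.io/aws-ebs on older in-tree setups
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer   # create the volume where the pod gets scheduled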
If you use EBS and you want to be HA (highly available), you will need a minimum of 2 replicas with zone anti-affinity, i.e. replicas get scheduled in different zones. If a zone goes down, 1 replica is down, yet the other keeps working.
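On the pod template, that zone anti-affinity could look roughly like this (the app label is a placeholder):
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: prometheus                       # placeholder label
        topologyKey: topology.kubernetes.io/zone  # the standard zone label on the nodes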
The trade-off with EBS and multi-zone Kubernetes
If you want to be HA, you will have to force a minimum of 2 replicas on your workloads that require a volume like EBS. This has multiple downsides:
- You will need twice the compute power (CPU/mem)
- You will need a big overcommitment to be able to schedule everything
Let's say I run 10 pods that have a volume: 4 live in zone A, 4 in zone B, and 2 in zone C. Things are fine, but now I have to scale my cluster because my workload increases and I need more memory. So I add extra nodes. If I want to support the 4 pods that live in zone A, I need to add n*3 nodes, because new nodes get spread over the AZs.
I.e. if I want to add 2 nodes in zone A, I need to add 2*3 = 6 nodes: 2 in A, 2 in B, and 2 in zone C. Unless I'm going to specifically add nodes to a certain zone, but that defeats the purpose of Kubernetes and perhaps poses more problems later on when a zone outage happens.
But why not use EFS then?
So EFS can mount in every AZ. Why should I not use it, then? Again: use the right tool for the job. Yes, it can mount everywhere, but it's not made for this kind of workload. I know AWS advertises IOPS and throughput numbers for EFS, but frankly, it's not a substitute for EBS-like volumes for application workloads. Don't take my word for it, let's test it.
Testing it
So I’ve created an EC2 instance for the sake of testing. It’s a bit easier than setting up various storage controllers/drivers.
The instance I used was a 2 CPU, 4 GB memory instance. I then mounted 3 volumes:
- 1 EBS volume of 30GB
- 1 EFS General Purpose volume
- 1 EFS Max I/O volume with a Provisioned Throughput of 4MB/s
I used the tool fio with the following test-case (job file):
[global]
name=fio-rand-RW
filename=fio-rand-RW
rw=randrw
rwmixread=60
rwmixwrite=40
bs=4K
direct=0
numjobs=4
time_based
runtime=900
[file1]
size=10G
ioengine=libaio
iodepth=16
I believe this pushes the system roughly towards the behaviour you can expect from an application that is trying quite hard.
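For clarity: you would run fio with this job file from inside each mount point (e.g. /efs for the EFS volume), since the relative filename means the 10G test file gets laid out on the volume under test.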
EBS
file1: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.1
Starting 4 processes
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 4 (f=4): [m(4)][100.0%][r=18.3MiB/s,w=12.2MiB/s][r=4677,w=3122 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=20470: Sat Feb 6 20:53:59 2021
read: IOPS=562, BW=2251KiB/s (2305kB/s)(1979MiB/900001msec)
slat (usec): min=2, max=178266, avg=1088.46, stdev=1565.14
clat (usec): min=28, max=262712, avg=16074.01, stdev=15871.56
lat (usec): min=398, max=269008, avg=17163.72, stdev=16870.85
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 24], 90.00th=[ 41], 95.00th=[ 55],
| 99.00th=[ 73], 99.50th=[ 80], 99.90th=[ 91], 99.95th=[ 96],
| 99.99th=[ 106]
bw ( KiB/s): min= 496, max= 5648, per=24.95%, avg=2251.02, stdev=1615.05, samples=1800
iops : min= 124, max= 1412, avg=562.72, stdev=403.76, samples=1800
write: IOPS=375, BW=1503KiB/s (1539kB/s)(1321MiB/900001msec)
slat (usec): min=3, max=96415, avg=1013.27, stdev=1747.81
clat (usec): min=5, max=258974, avg=15844.44, stdev=15836.98
lat (usec): min=23, max=262640, avg=16858.86, stdev=16776.30
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 23], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 80], 99.90th=[ 91], 99.95th=[ 95],
| 99.99th=[ 106]
bw ( KiB/s): min= 288, max= 3624, per=24.94%, avg=1503.06, stdev=1083.55, samples=1800
iops : min= 72, max= 906, avg=375.73, stdev=270.89, samples=1800
lat (usec) : 10=0.01%, 50=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.07%, 4=2.11%, 10=52.53%, 20=23.86%, 50=15.10%
lat (msec) : 100=6.29%, 250=0.02%, 500=0.01%
cpu : usr=0.57%, sys=3.36%, ctx=954242, majf=1, minf=11
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=506556,338245,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20471: Sat Feb 6 20:53:59 2021
read: IOPS=564, BW=2259KiB/s (2314kB/s)(1986MiB/900001msec)
slat (usec): min=2, max=96180, avg=1087.73, stdev=1547.74
clat (usec): min=27, max=250965, avg=16000.04, stdev=15779.10
lat (usec): min=50, max=257360, avg=17089.02, stdev=16775.86
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 23], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 80], 99.90th=[ 91], 99.95th=[ 96],
| 99.99th=[ 107]
bw ( KiB/s): min= 496, max= 5458, per=25.06%, avg=2261.26, stdev=1622.41, samples=1800
iops : min= 124, max= 1364, avg=565.18, stdev=405.62, samples=1800
write: IOPS=377, BW=1512KiB/s (1548kB/s)(1329MiB/900001msec)
slat (usec): min=3, max=174374, avg=1002.89, stdev=1748.38
clat (usec): min=5, max=246378, avg=15780.95, stdev=15731.99
lat (usec): min=19, max=252531, avg=16784.97, stdev=16661.97
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 23], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 80], 99.90th=[ 91], 99.95th=[ 96],
| 99.99th=[ 106]
bw ( KiB/s): min= 336, max= 3912, per=25.10%, avg=1512.91, stdev=1090.43, samples=1800
iops : min= 84, max= 978, avg=378.11, stdev=272.58, samples=1800
lat (usec) : 10=0.01%, 50=0.01%, 100=0.01%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.08%, 4=2.20%, 10=52.85%, 20=23.41%, 50=15.30%
lat (msec) : 100=6.13%, 250=0.03%, 500=0.01%
cpu : usr=0.53%, sys=3.40%, ctx=955398, majf=0, minf=13
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=508376,340134,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20472: Sat Feb 6 20:53:59 2021
read: IOPS=563, BW=2255KiB/s (2309kB/s)(1982MiB/900001msec)
slat (usec): min=2, max=36556, avg=1085.50, stdev=1544.38
clat (usec): min=142, max=239522, avg=16009.40, stdev=15792.14
lat (usec): min=147, max=242281, avg=17096.17, stdev=16787.27
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 23], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 81], 99.90th=[ 93], 99.95th=[ 97],
| 99.99th=[ 110]
bw ( KiB/s): min= 424, max= 5680, per=24.99%, avg=2255.00, stdev=1619.12, samples=1800
iops : min= 106, max= 1420, avg=563.72, stdev=404.78, samples=1800
write: IOPS=376, BW=1507KiB/s (1543kB/s)(1325MiB/900001msec)
slat (usec): min=3, max=180505, avg=1012.37, stdev=1765.62
clat (usec): min=5, max=242263, avg=15859.64, stdev=15842.96
lat (usec): min=135, max=243047, avg=16873.17, stdev=16782.22
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 23], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 80], 99.90th=[ 93], 99.95th=[ 97],
| 99.99th=[ 109]
bw ( KiB/s): min= 328, max= 3608, per=25.00%, avg=1506.78, stdev=1080.48, samples=1800
iops : min= 82, max= 902, avg=376.66, stdev=270.12, samples=1800
lat (usec) : 10=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.07%, 4=2.12%, 10=52.70%, 20=23.70%, 50=15.21%
lat (msec) : 100=6.16%, 250=0.04%
cpu : usr=0.67%, sys=3.24%, ctx=956670, majf=1, minf=11
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=507450,339079,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20473: Sat Feb 6 20:53:59 2021
read: IOPS=564, BW=2256KiB/s (2310kB/s)(1983MiB/900001msec)
slat (usec): min=2, max=174351, avg=1086.38, stdev=1564.66
clat (usec): min=5, max=224279, avg=16001.65, stdev=15727.92
lat (usec): min=392, max=227137, avg=17089.26, stdev=16724.56
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 23], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 80], 99.90th=[ 91], 99.95th=[ 96],
| 99.99th=[ 107]
bw ( KiB/s): min= 480, max= 5434, per=25.04%, avg=2259.20, stdev=1624.18, samples=1800
iops : min= 120, max= 1358, avg=564.59, stdev=406.06, samples=1800
write: IOPS=376, BW=1506KiB/s (1542kB/s)(1323MiB/900001msec)
slat (usec): min=3, max=100681, avg=1011.28, stdev=1735.27
clat (usec): min=395, max=224278, avg=15876.79, stdev=15810.02
lat (usec): min=400, max=224290, avg=16889.20, stdev=16740.12
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 8], 40.00th=[ 9], 50.00th=[ 10], 60.00th=[ 11],
| 70.00th=[ 13], 80.00th=[ 24], 90.00th=[ 41], 95.00th=[ 54],
| 99.00th=[ 73], 99.50th=[ 79], 99.90th=[ 91], 99.95th=[ 95],
| 99.99th=[ 107]
bw ( KiB/s): min= 248, max= 3632, per=25.01%, avg=1507.46, stdev=1079.29, samples=1800
iops : min= 62, max= 908, avg=376.69, stdev=269.78, samples=1800
lat (usec) : 10=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.08%, 4=2.10%, 10=52.72%, 20=23.61%, 50=15.29%
lat (msec) : 100=6.17%, 250=0.02%
cpu : usr=0.62%, sys=3.29%, ctx=956696, majf=0, minf=13
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=507663,338748,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=9022KiB/s (9239kB/s), 2251KiB/s-2259KiB/s (2305kB/s-2314kB/s), io=7930MiB (8315MB), run=900001-900001msec
WRITE: bw=6028KiB/s (6172kB/s), 1503KiB/s-1512KiB/s (1539kB/s-1548kB/s), io=5298MiB (5555MB), run=900001-900001msec
Disk stats (read/write):
xvdb: ios=1511378/1245594, merge=0/22, ticks=1525247/4155320, in_queue=2205008, util=99.50%
EFS General Purpose
file1: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.1
Starting 4 processes
file1: Laying out IO file (1 file / 10240MiB)
fio: native_fallocate call failed: Operation not supported
Jobs: 4 (f=4): [f(4)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=20508: Sat Feb 6 21:24:36 2021
read: IOPS=73, BW=294KiB/s (301kB/s)(258MiB/900031msec)
slat (usec): min=3, max=11172k, avg=8297.13, stdev=72921.08
clat (msec): min=10, max=13528, avg=120.65, stdev=397.06
lat (msec): min=14, max=13568, avg=128.95, stdev=415.57
clat percentiles (msec):
| 1.00th=[ 36], 5.00th=[ 44], 10.00th=[ 48], 20.00th=[ 54],
| 30.00th=[ 58], 40.00th=[ 62], 50.00th=[ 66], 60.00th=[ 70],
| 70.00th=[ 75], 80.00th=[ 82], 90.00th=[ 94], 95.00th=[ 115],
| 99.00th=[ 1452], 99.50th=[ 1670], 99.90th=[ 4111], 99.95th=[ 9597],
| 99.99th=[13087]
bw ( KiB/s): min= 7, max= 760, per=27.72%, avg=325.15, stdev=256.13, samples=1626
iops : min= 1, max= 190, avg=81.21, stdev=64.04, samples=1626
write: IOPS=48, BW=195KiB/s (200kB/s)(172MiB/900031msec)
slat (usec): min=4, max=12026k, avg=7973.68, stdev=97935.35
clat (usec): min=9, max=13646k, avg=125701.44, stdev=456961.73
lat (msec): min=12, max=13864, avg=133.68, stdev=478.26
clat percentiles (msec):
| 1.00th=[ 37], 5.00th=[ 44], 10.00th=[ 49], 20.00th=[ 55],
| 30.00th=[ 59], 40.00th=[ 63], 50.00th=[ 66], 60.00th=[ 70],
| 70.00th=[ 75], 80.00th=[ 83], 90.00th=[ 95], 95.00th=[ 117],
| 99.00th=[ 1485], 99.50th=[ 1687], 99.90th=[ 9194], 99.95th=[12416],
| 99.99th=[13489]
bw ( KiB/s): min= 7, max= 598, per=28.13%, avg=220.00, stdev=170.06, samples=1596
iops : min= 1, max= 149, avg=54.91, stdev=42.52, samples=1596
lat (usec) : 10=0.01%
lat (msec) : 20=0.02%, 50=12.95%, 100=79.19%, 250=3.90%, 500=0.19%
lat (msec) : 750=0.25%, 1000=0.68%, 2000=2.61%, >=2000=0.21%
cpu : usr=0.12%, sys=0.67%, ctx=146755, majf=0, minf=8
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=66133,43920,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20509: Sat Feb 6 21:24:36 2021
read: IOPS=73, BW=293KiB/s (300kB/s)(257MiB/900030msec)
slat (usec): min=3, max=12061k, avg=8374.49, stdev=82376.05
clat (usec): min=12, max=13553k, avg=122864.34, stdev=420015.71
lat (msec): min=18, max=13627, avg=131.24, stdev=439.43
clat percentiles (msec):
| 1.00th=[ 36], 5.00th=[ 44], 10.00th=[ 49], 20.00th=[ 54],
| 30.00th=[ 58], 40.00th=[ 62], 50.00th=[ 66], 60.00th=[ 70],
| 70.00th=[ 75], 80.00th=[ 82], 90.00th=[ 94], 95.00th=[ 117],
| 99.00th=[ 1469], 99.50th=[ 1670], 99.90th=[ 6812], 99.95th=[ 9866],
| 99.99th=[13355]
bw ( KiB/s): min= 7, max= 808, per=27.44%, avg=321.84, stdev=255.40, samples=1638
iops : min= 1, max= 202, avg=80.42, stdev=63.81, samples=1638
write: IOPS=48, BW=195KiB/s (200kB/s)(171MiB/900030msec)
slat (usec): min=4, max=9339.9k, avg=7911.09, stdev=87069.71
clat (msec): min=17, max=13737, avg=123.18, stdev=431.11
lat (msec): min=17, max=13817, avg=131.09, stdev=451.21
clat percentiles (msec):
| 1.00th=[ 37], 5.00th=[ 45], 10.00th=[ 49], 20.00th=[ 55],
| 30.00th=[ 59], 40.00th=[ 63], 50.00th=[ 67], 60.00th=[ 71],
| 70.00th=[ 77], 80.00th=[ 83], 90.00th=[ 95], 95.00th=[ 118],
| 99.00th=[ 1485], 99.50th=[ 1703], 99.90th=[ 7148], 99.95th=[10134],
| 99.99th=[13489]
bw ( KiB/s): min= 8, max= 536, per=28.49%, avg=222.78, stdev=169.51, samples=1576
iops : min= 2, max= 134, avg=55.69, stdev=42.38, samples=1576
lat (usec) : 20=0.01%
lat (msec) : 20=0.01%, 50=12.47%, 100=79.68%, 250=3.92%, 500=0.17%
lat (msec) : 750=0.28%, 1000=0.65%, 2000=2.58%, >=2000=0.26%
cpu : usr=0.13%, sys=0.66%, ctx=146681, majf=0, minf=8
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=65869,43900,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20510: Sat Feb 6 21:24:36 2021
read: IOPS=73, BW=293KiB/s (300kB/s)(258MiB/900016msec)
slat (usec): min=3, max=8465.3k, avg=7962.96, stdev=39572.21
clat (usec): min=12, max=13775k, avg=123922.26, stdev=437645.82
lat (msec): min=18, max=13883, avg=131.89, stdev=449.52
clat percentiles (msec):
| 1.00th=[ 36], 5.00th=[ 44], 10.00th=[ 48], 20.00th=[ 54],
| 30.00th=[ 58], 40.00th=[ 62], 50.00th=[ 66], 60.00th=[ 70],
| 70.00th=[ 75], 80.00th=[ 82], 90.00th=[ 94], 95.00th=[ 117],
| 99.00th=[ 1435], 99.50th=[ 1636], 99.90th=[ 8490], 99.95th=[ 9866],
| 99.99th=[13489]
bw ( KiB/s): min= 8, max= 784, per=27.57%, avg=323.41, stdev=255.67, samples=1633
iops : min= 2, max= 196, avg=80.80, stdev=63.86, samples=1633
write: IOPS=49, BW=196KiB/s (201kB/s)(172MiB/900016msec)
slat (usec): min=4, max=11945k, avg=8458.35, stdev=123591.78
clat (msec): min=11, max=13759, avg=120.56, stdev=395.14
lat (msec): min=14, max=13775, avg=129.02, stdev=426.79
clat percentiles (msec):
| 1.00th=[ 36], 5.00th=[ 45], 10.00th=[ 49], 20.00th=[ 54],
| 30.00th=[ 58], 40.00th=[ 63], 50.00th=[ 66], 60.00th=[ 71],
| 70.00th=[ 75], 80.00th=[ 83], 90.00th=[ 95], 95.00th=[ 117],
| 99.00th=[ 1435], 99.50th=[ 1603], 99.90th=[ 3809], 99.95th=[ 9463],
| 99.99th=[13489]
bw ( KiB/s): min= 8, max= 609, per=28.59%, avg=223.54, stdev=170.40, samples=1580
iops : min= 2, max= 152, avg=55.88, stdev=42.60, samples=1580
lat (usec) : 20=0.01%
lat (msec) : 20=0.01%, 50=12.88%, 100=79.25%, 250=3.86%, 500=0.19%
lat (msec) : 750=0.26%, 1000=0.65%, 2000=2.68%, >=2000=0.21%
cpu : usr=0.13%, sys=0.64%, ctx=146791, majf=0, minf=9
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=65983,44155,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20511: Sat Feb 6 21:24:36 2021
read: IOPS=73, BW=293KiB/s (300kB/s)(258MiB/900016msec)
slat (usec): min=3, max=11536k, avg=8309.89, stdev=78193.81
clat (msec): min=13, max=13882, avg=121.87, stdev=424.10
lat (msec): min=13, max=13887, avg=130.18, stdev=442.50
clat percentiles (msec):
| 1.00th=[ 36], 5.00th=[ 44], 10.00th=[ 48], 20.00th=[ 54],
| 30.00th=[ 58], 40.00th=[ 62], 50.00th=[ 66], 60.00th=[ 70],
| 70.00th=[ 75], 80.00th=[ 82], 90.00th=[ 94], 95.00th=[ 117],
| 99.00th=[ 1469], 99.50th=[ 1720], 99.90th=[ 6812], 99.95th=[10268],
| 99.99th=[13355]
bw ( KiB/s): min= 8, max= 753, per=27.64%, avg=324.17, stdev=255.93, samples=1630
iops : min= 2, max= 188, avg=80.94, stdev=63.88, samples=1630
write: IOPS=48, BW=196KiB/s (201kB/s)(172MiB/900016msec)
slat (usec): min=4, max=12026k, avg=7952.12, stdev=92106.64
clat (usec): min=50, max=13783k, avg=123812.27, stdev=421857.04
lat (msec): min=14, max=13892, avg=131.77, stdev=443.19
clat percentiles (msec):
| 1.00th=[ 36], 5.00th=[ 44], 10.00th=[ 48], 20.00th=[ 54],
| 30.00th=[ 58], 40.00th=[ 63], 50.00th=[ 67], 60.00th=[ 71],
| 70.00th=[ 75], 80.00th=[ 83], 90.00th=[ 95], 95.00th=[ 118],
| 99.00th=[ 1502], 99.50th=[ 1687], 99.90th=[ 6477], 99.95th=[ 9731],
| 99.99th=[13355]
bw ( KiB/s): min= 8, max= 569, per=28.35%, avg=221.71, stdev=171.23, samples=1591
iops : min= 2, max= 142, avg=55.42, stdev=42.80, samples=1591
lat (usec) : 100=0.01%
lat (msec) : 20=0.01%, 50=13.29%, 100=78.87%, 250=3.89%, 500=0.17%
lat (msec) : 750=0.24%, 1000=0.69%, 2000=2.61%, >=2000=0.22%
cpu : usr=0.12%, sys=0.65%, ctx=146572, majf=0, minf=11
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=65972,44098,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=1173KiB/s (1201kB/s), 293KiB/s-294KiB/s (300kB/s-301kB/s), io=1031MiB (1081MB), run=900016-900031msec
WRITE: bw=783KiB/s (801kB/s), 195KiB/s-196KiB/s (200kB/s-201kB/s), io=688MiB (721MB), run=900016-900031msec
EFS Max I/O
file1: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.1
Starting 4 processes
file1: Laying out IO file (1 file / 10240MiB)
fio: native_fallocate call failed: Operation not supported
Jobs: 4 (f=4): [f(4)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=20617: Sat Feb 6 22:33:00 2021
read: IOPS=44, BW=177KiB/s (181kB/s)(155MiB/900008msec)
slat (usec): min=3, max=1025.5k, avg=14004.16, stdev=55094.99
clat (msec): min=11, max=15599, avg=203.57, stdev=723.39
lat (msec): min=14, max=16181, avg=217.57, stdev=767.70
clat percentiles (msec):
| 1.00th=[ 55], 5.00th=[ 66], 10.00th=[ 73], 20.00th=[ 82],
| 30.00th=[ 88], 40.00th=[ 93], 50.00th=[ 99], 60.00th=[ 104],
| 70.00th=[ 110], 80.00th=[ 118], 90.00th=[ 131], 95.00th=[ 144],
| 99.00th=[ 4933], 99.50th=[ 5470], 99.90th=[ 6275], 99.95th=[ 6745],
| 99.99th=[15368]
bw ( KiB/s): min= 7, max= 544, per=30.98%, avg=218.10, stdev=177.46, samples=1458
iops : min= 1, max= 136, avg=54.44, stdev=44.36, samples=1458
write: IOPS=29, BW=118KiB/s (121kB/s)(104MiB/900008msec)
slat (usec): min=5, max=11845k, avg=12893.30, stdev=92058.77
clat (usec): min=6, max=15754k, avg=203521.09, stdev=710834.42
lat (msec): min=11, max=15754, avg=216.42, stdev=755.18
clat percentiles (msec):
| 1.00th=[ 55], 5.00th=[ 67], 10.00th=[ 74], 20.00th=[ 83],
| 30.00th=[ 88], 40.00th=[ 94], 50.00th=[ 100], 60.00th=[ 105],
| 70.00th=[ 111], 80.00th=[ 120], 90.00th=[ 132], 95.00th=[ 146],
| 99.00th=[ 4866], 99.50th=[ 5403], 99.90th=[ 6477], 99.95th=[ 6745],
| 99.99th=[13624]
bw ( KiB/s): min= 7, max= 392, per=34.45%, avg=161.90, stdev=115.63, samples=1310
iops : min= 1, max= 98, avg=40.38, stdev=28.91, samples=1310
lat (usec) : 10=0.01%
lat (msec) : 20=0.01%, 50=0.46%, 100=52.90%, 250=44.20%, 500=0.05%
lat (msec) : 750=0.03%, 1000=0.04%, 2000=0.14%, >=2000=2.17%
cpu : usr=0.06%, sys=0.93%, ctx=94550, majf=0, minf=10
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=39782,26537,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20618: Sat Feb 6 22:33:00 2021
read: IOPS=43, BW=175KiB/s (180kB/s)(154MiB/900004msec)
slat (usec): min=3, max=1121.0k, avg=14131.26, stdev=57098.32
clat (msec): min=8, max=17004, avg=201.05, stdev=724.28
lat (msec): min=10, max=17341, avg=215.18, stdev=770.32
clat percentiles (msec):
| 1.00th=[ 55], 5.00th=[ 67], 10.00th=[ 73], 20.00th=[ 82],
| 30.00th=[ 88], 40.00th=[ 93], 50.00th=[ 99], 60.00th=[ 104],
| 70.00th=[ 110], 80.00th=[ 118], 90.00th=[ 131], 95.00th=[ 146],
| 99.00th=[ 5000], 99.50th=[ 5604], 99.90th=[ 6745], 99.95th=[ 7282],
| 99.99th=[15368]
bw ( KiB/s): min= 8, max= 521, per=31.47%, avg=221.56, stdev=175.76, samples=1425
iops : min= 2, max= 130, avg=55.39, stdev=43.94, samples=1425
write: IOPS=29, BW=117KiB/s (120kB/s)(103MiB/900004msec)
slat (usec): min=5, max=11488k, avg=12939.67, stdev=90499.57
clat (usec): min=7, max=16864k, avg=210838.70, stdev=765745.75
lat (msec): min=8, max=17197, avg=223.78, stdev=810.75
clat percentiles (msec):
| 1.00th=[ 56], 5.00th=[ 68], 10.00th=[ 74], 20.00th=[ 83],
| 30.00th=[ 89], 40.00th=[ 94], 50.00th=[ 100], 60.00th=[ 106],
| 70.00th=[ 112], 80.00th=[ 121], 90.00th=[ 133], 95.00th=[ 148],
| 99.00th=[ 5134], 99.50th=[ 5738], 99.90th=[ 7215], 99.95th=[ 7752],
| 99.99th=[16442]
bw ( KiB/s): min= 8, max= 376, per=34.02%, avg=159.88, stdev=114.79, samples=1320
iops : min= 2, max= 94, avg=39.97, stdev=28.70, samples=1320
lat (usec) : 10=0.01%
lat (msec) : 10=0.01%, 20=0.01%, 50=0.47%, 100=52.07%, 250=45.09%
lat (msec) : 500=0.05%, 750=0.03%, 1000=0.03%, 2000=0.15%, >=2000=2.10%
cpu : usr=0.06%, sys=0.90%, ctx=94572, majf=0, minf=11
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=39474,26387,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20619: Sat Feb 6 22:33:00 2021
read: IOPS=44, BW=176KiB/s (181kB/s)(155MiB/900007msec)
slat (usec): min=3, max=11683k, avg=14867.35, stdev=82362.34
clat (msec): min=11, max=16858, avg=204.65, stdev=715.21
lat (msec): min=11, max=17187, avg=219.52, stdev=765.14
clat percentiles (msec):
| 1.00th=[ 54], 5.00th=[ 67], 10.00th=[ 73], 20.00th=[ 82],
| 30.00th=[ 88], 40.00th=[ 93], 50.00th=[ 99], 60.00th=[ 104],
| 70.00th=[ 111], 80.00th=[ 118], 90.00th=[ 132], 95.00th=[ 146],
| 99.00th=[ 4732], 99.50th=[ 5537], 99.90th=[ 6611], 99.95th=[ 6879],
| 99.99th=[14429]
bw ( KiB/s): min= 8, max= 520, per=30.48%, avg=214.60, stdev=177.32, samples=1479
iops : min= 2, max= 130, avg=53.65, stdev=44.33, samples=1479
write: IOPS=29, BW=117KiB/s (120kB/s)(103MiB/900007msec)
slat (usec): min=4, max=1052.3k, avg=11700.04, stdev=51144.23
clat (usec): min=7, max=16668k, avg=203458.32, stdev=725566.69
lat (msec): min=10, max=17344, avg=215.16, stdev=761.82
clat percentiles (msec):
| 1.00th=[ 56], 5.00th=[ 67], 10.00th=[ 73], 20.00th=[ 83],
| 30.00th=[ 88], 40.00th=[ 94], 50.00th=[ 100], 60.00th=[ 105],
| 70.00th=[ 111], 80.00th=[ 120], 90.00th=[ 132], 95.00th=[ 146],
| 99.00th=[ 4866], 99.50th=[ 5537], 99.90th=[ 6477], 99.95th=[ 6745],
| 99.99th=[16576]
bw ( KiB/s): min= 8, max= 400, per=33.97%, avg=159.68, stdev=115.75, samples=1324
iops : min= 2, max= 100, avg=39.92, stdev=28.94, samples=1324
lat (usec) : 10=0.01%
lat (msec) : 20=0.01%, 50=0.54%, 100=52.27%, 250=44.71%, 500=0.05%
lat (msec) : 750=0.04%, 1000=0.05%, 2000=0.15%, >=2000=2.20%
cpu : usr=0.06%, sys=1.03%, ctx=94387, majf=0, minf=11
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=39683,26434,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
file1: (groupid=0, jobs=1): err= 0: pid=20620: Sat Feb 6 22:33:00 2021
read: IOPS=43, BW=176KiB/s (180kB/s)(154MiB/900003msec)
slat (usec): min=3, max=11508k, avg=14369.45, stdev=81491.16
clat (usec): min=7, max=17024k, avg=200537.92, stdev=722782.28
lat (msec): min=8, max=17349, avg=214.91, stdev=771.35
clat percentiles (msec):
| 1.00th=[ 55], 5.00th=[ 67], 10.00th=[ 73], 20.00th=[ 82],
| 30.00th=[ 88], 40.00th=[ 93], 50.00th=[ 99], 60.00th=[ 104],
| 70.00th=[ 110], 80.00th=[ 118], 90.00th=[ 131], 95.00th=[ 144],
| 99.00th=[ 4866], 99.50th=[ 5470], 99.90th=[ 6946], 99.95th=[ 7550],
| 99.99th=[16040]
bw ( KiB/s): min= 8, max= 561, per=30.93%, avg=217.76, stdev=176.59, samples=1452
iops : min= 2, max= 140, avg=54.44, stdev=44.15, samples=1452
write: IOPS=29, BW=118KiB/s (121kB/s)(104MiB/900003msec)
slat (usec): min=4, max=1116.2k, avg=12454.71, stdev=55514.46
clat (msec): min=19, max=16182, avg=209.52, stdev=735.52
lat (msec): min=19, max=16427, avg=221.97, stdev=775.08
clat percentiles (msec):
| 1.00th=[ 54], 5.00th=[ 67], 10.00th=[ 74], 20.00th=[ 82],
| 30.00th=[ 89], 40.00th=[ 94], 50.00th=[ 100], 60.00th=[ 105],
| 70.00th=[ 112], 80.00th=[ 120], 90.00th=[ 133], 95.00th=[ 148],
| 99.00th=[ 4933], 99.50th=[ 5604], 99.90th=[ 6812], 99.95th=[ 7483],
| 99.99th=[15234]
bw ( KiB/s): min= 8, max= 424, per=34.20%, avg=160.76, stdev=116.03, samples=1323
iops : min= 2, max= 106, avg=40.19, stdev=29.01, samples=1323
lat (usec) : 10=0.01%
lat (msec) : 10=0.01%, 20=0.01%, 50=0.45%, 100=52.41%, 250=44.69%
lat (msec) : 500=0.06%, 750=0.04%, 1000=0.05%, 2000=0.14%, >=2000=2.15%
cpu : usr=0.05%, sys=1.07%, ctx=94392, majf=0, minf=13
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=39534,26590,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=704KiB/s (721kB/s), 175KiB/s-177KiB/s (180kB/s-181kB/s), io=619MiB (649MB), run=900003-900008msec
WRITE: bw=471KiB/s (482kB/s), 117KiB/s-118KiB/s (120kB/s-121kB/s), io=414MiB (434MB), run=900003-900008msec
Results
The workload of EFS looks like this, where the first part of the timeline is the General Purpose and the second one is the Max I/O.

Now if we get the IOPS from both EFS types:
GP:
Reads:
iops : min= 1, max= 190, avg=81.21, stdev=64.04, samples=1626
iops : min= 1, max= 202, avg=80.42, stdev=63.81, samples=1638
iops : min= 2, max= 196, avg=80.80, stdev=63.86, samples=1633
iops : min= 2, max= 188, avg=80.94, stdev=63.88, samples=1630
Writes:
iops : min= 1, max= 149, avg=54.91, stdev=42.52, samples=1596
iops : min= 2, max= 134, avg=55.69, stdev=42.38, samples=1576
iops : min= 2, max= 152, avg=55.88, stdev=42.60, samples=1580
iops : min= 2, max= 142, avg=55.42, stdev=42.80, samples=1591
Max I/O:
Reads:
iops : min= 1, max= 136, avg=54.44, stdev=44.36, samples=1458
iops : min= 2, max= 130, avg=55.39, stdev=43.94, samples=1425
iops : min= 2, max= 130, avg=53.65, stdev=44.33, samples=1479
iops : min= 2, max= 140, avg=54.44, stdev=44.15, samples=1452
Writes:
iops : min= 1, max= 98, avg=40.38, stdev=28.91, samples=1310
iops : min= 2, max= 94, avg=39.97, stdev=28.70, samples=1320
iops : min= 2, max= 100, avg=39.92, stdev=28.94, samples=1324
iops : min= 2, max= 106, avg=40.19, stdev=29.01, samples=1323
I provisioned 4MB/s specifically because 10GB stored with 4MB/s provisioned costs about $30 p/m (10 GB x $0.30 + 4 MB/s x $6 ≈ $27). Going higher in MB/s gets expensive quickly. I felt this was a “sane” value. Funny thing though: my more expensive setup was worse than my first run with just the General Purpose one.
Now, I did invest some time in understanding the EFS solution: hard & soft limits, pricing, the various performance types, and throughput. Yet honestly, I have no idea why this happened. My best bet is that the GP volume used a freebie on throughput (burst credits).
If you take a look again at the previous screenshot, you can see that the GP was quite fast at uploading the 10GB file, while the Max I/O took a while (in line with 4MB/s).
Anyhow, if we look at EBS:
Reads:
iops : min= 124, max= 1412, avg=562.72, stdev=403.76, samples=1800
iops : min= 124, max= 1364, avg=565.18, stdev=405.62, samples=1800
iops : min= 106, max= 1420, avg=563.72, stdev=404.78, samples=1800
iops : min= 120, max= 1358, avg=564.59, stdev=406.06, samples=1800
Writes:
iops : min= 72, max= 906, avg=375.73, stdev=270.89, samples=1800
iops : min= 84, max= 978, avg=378.11, stdev=272.58, samples=1800
iops : min= 82, max= 902, avg=376.66, stdev=270.12, samples=1800
iops : min= 62, max= 908, avg=376.69, stdev=269.78, samples=1800
These are just sane values we can work with for most use-cases. Perhaps we have to buff EBS a bit more (which is possible) for higher performance, but the baseline is solid.
Stability
What the tests did not show is how stable the performance is and how the system deals with its storage. To give you an example: while the read/write test was running, trying to do anything else on the file system either locked up or was terribly slow.
root@<test-instance>:/efs# time ls -lah
total 11G
drwxr-xr-x 2 root root 6.0K Feb 6 21:07 .
drwxr-xr-x 25 root root 4.0K Feb 6 20:15 ..
-rw-r--r-- 1 root root 10G Feb 6 21:19 fio-rand-RW
real 0m11.318s
user 0m0.000s
sys 0m0.036s
Yes, that took 11.3 seconds.
Also, a funny thing: when I placed the 10GB file and then listed the folder, the list command only completed after the upload of the file was done (about 30-40 minutes).
Obviously, I'm “abusing” EFS here, but I really wanted to show you why it's not suitable for applications.
Use-cases for EFS
So it does have use-cases, for instance a CMS that stores documents and other files. I guess various SAP solutions can use it, and it can be really good at that. It scales with how much data you have on it, and you can provision throughput based on what you need. It allows many hosts to mount the volume and has quite some read/write capacity: roughly 35,000 file operations per second.
Yet this is just not great for workloads you expect on Kubernetes. If EBS is not an option and S3 is not supported, I assume you can go for EFS if your workload does not go mental on it. I.e. if you store artifacts on it, fine. If you use it to process data ON EFS, I would say: nope.
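For completeness: if you do go that route on Kubernetes, EFS is usually mounted through the EFS CSI driver with a statically provisioned PersistentVolume. A rough sketch, where the name and file system ID are placeholders:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-artifacts            # hypothetical name
spec:
  capacity:
    storage: 5Gi                 # required by the API, not enforced by EFS
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678    # placeholder file system ID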
EFS is a can of worms though.
When I was fiddling around, I felt I had to address this. I'm just going to post some quotes from the AWS website:
Amazon EFS offers a Standard and an Infrequent Access storage class. The Standard storage class is designed for active file system workloads and you pay only for the file system storage you use per month.
Enable Lifecycle Management when your file system contains files that are not accessed every day to reduce your storage costs
Files smaller than 128 KiB are not eligible for Lifecycle Management and will always be stored on Amazon EFS Standard storage class.
When reading from or writing to Amazon EFS IA, your first-byte latency is higher than that of Amazon EFS Standard.
Throughput of bursting mode file systems scales linearly with the amount of data stored. If you need more throughput than you can achieve with your amount of data stored, you can configure Provisioned Throughput.
EFS supports one to thousands of Amazon EC2 instances connecting to a file system concurrently
Amazon EFS’s distributed design avoids the bottlenecks and constraints inherent to traditional file servers
This distributed data storage design means that multi-threaded applications, and applications that concurrently access data from multiple Amazon EC2 instances can drive substantial levels of aggregate throughput and IOPS
Due to this per-operation latency, overall throughput generally increases as the average I/O size increases, since the overhead is amortized over a larger amount of data
“Max I/O” performance mode is optimized for applications where tens, hundreds, or thousands of EC2 instances are accessing the file system
With bursting mode, the default throughput mode for Amazon EFS file systems, the throughput available to a file system scales as a file system grows.
Also, because many workloads are read-heavy, read operations are metered at a 1:3 ratio to other NFS operations (like write).
All file systems deliver a consistent baseline performance of 50 MB/s per TB of Standard class storage
All file systems (regardless of size) can burst to 100 MB/s,
File systems with more than 1TB of Standard class storage can burst to 100 MB/s per TB
Since read operations are metered at a 1:3 ratio, you can drive up to 300 MiBs/s per TiB of read throughput.
Provisioned Throughput also includes 50 KB/s per GB (or 1 MB/s per 20 GB) of throughput in the price of Standard storage.
and there are more…

Look, I get it. Storage is hard. AWS tries to give users options, but also charges $$$ depending on what your requirements are. The problem, however, is that there are too many variables. It gets hard to understand what is happening, why it's happening, and how much it will cost you.
So don’t get me wrong, I do think AWS made something really awesome with EFS (if you use it correctly) but its setup and billing model is an abomination.
Furthermore, I see quite some statements regarding IOPS and throughput, but the reality differs depending on your use-case. The test I did with random read/writes gave about 50 IOPS on average for reads and writes each. So, say 100 IOPS to make it easier. AWS states:
“In General Purpose mode, there is a limit of 35,000 file operations per second. Operations that read data or metadata consume one file operation, operations that write data or update metadata consume five file operations.”
Talking about that can of worms again, but let’s continue:
“This means that a file system can support 35,000 read operations per second, or 7,000 write operations, or some combination of the two. For example, 20,000 read operations and 3,000 write operations (20,000 reads x 1 file operation per read + 3,000 writes x 5 file operations per write = 35,000 file operations).”
Well, that was not really in line with my test. So after searching and searching I could not find exact numbers on IOPS and how this was calculated. Then I found this slide:

So I believe these “35,000” file operations are based on concurrency: multiple hosts asking for “some file” at the same time. Often this is not the case. Therefore I believe the baseline IOPS for EFS, from a single client, is about 100.
What if you want good storage for multi-AZ?
Now that we know EFS is not a solid option (unless we change the behavior of our application), we can think about a solution for having EBS-like storage that works across multiple AZs.
The fact is, there is no AWS solution for this. There is no storage option that skips the RWX type of “file sharing” and just gives you plain cool EBS that can be mounted in every AZ.
If you do not want the trade-off that I discussed before, there is only one other solution and that is creating your own storage stack :)
My two favorites are:
Longhorn: https://github.com/longhorn/longhorn and Rook/Ceph: https://github.com/rook/rook
Both are enterprise-grade, cloud-native distributed storage with no single point of failure, and the volumes behave like EBS but without the zone limitation.
Now, this does add complexity and something extra you have to manage. Yet if you want to run your clusters across 3 AZs, have workloads that can use simple block storage for processing, and/or don't want to run inefficiently on Kubernetes: this might be just it.
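To give an idea of what consuming it looks like: a Longhorn install ships a default StorageClass (named longhorn), and a workload simply claims a replicated volume from it. A minimal sketch with a hypothetical claim name:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn     # default class created by the Longhorn install
  resources:
    requests:
      storage: 50Gi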
Perhaps in a follow-up, I’ll do a deeper dive into these solutions.
Recap
- EFS is good as a product but only suitable for certain workloads and requires A LOT of thought on how you want to use it
- S3 is not a file system, but an excellent way to store “media”, backups, data-lake, etc.
- EBS is solid storage: boot volumes, whatever database you run (I'm not saying you should run databases on k8s though)
If you have certain requirements, be sure that the storage you pick can support them. Also, make sure you keep your requirements in line - don't abuse storage to meet them. If you need RWX volumes and EFS is your only option, perhaps consider rewriting your application to modern standards.
Multi-zone mounts are not supported by EBS, which is really unfortunate.
My take on EBS multi-zone support
I believe AWS makes a lot of money on EFS. Legacy applications need to use it if they are pushed into the cloud (I'm watching you, SAP). EFS does have legit use-cases, but I strongly believe it's often used because there is no alternative and it's the only way to make something work. Therefore, the incentive to make something like EBS for multiple zones is gone.
$0.30 per GB stored and $6 per MB/s provisioned per month is also no incentive to create something else.
For reference, EBS (gp3) costs $0.08/GB-month, with 125 MB/s of throughput included for free and $0.04 per provisioned MB/s-month above that. It also includes the standard 3,000 IOPS.
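To put that side by side with the EFS prices above: storing 100 GB costs roughly 100 x $0.30 = $30 per month on EFS Standard versus 100 x $0.08 = $8 per month on EBS, and the EBS price already includes a baseline of IOPS and throughput that EFS makes you think about (and potentially pay extra for) separately.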