Performance Tuning Linux Instances on EC2

Brendan Gregg's Blog

03 Mar 2015

At the last AWS re:Invent, I gave a talk on "Performance Tuning EC2
Instances", where I showed how my team (Performance and Reliability
Engineering) tunes Linux EC2 instances at Netflix. This includes instance
selection, EC2 features, Linux kernel tuning, and the use of observability.

This is the most comprehensive tuning talk I've given, and summarizes the
different ways we tune at the instance level. It should be useful for anyone
running Linux in the cloud, not just in EC2.

The slides are on slideshare:

It was also videoed, and the video is on YouTube:

I often share my work on performance observability, but not tuning. Observability is where the bigger
wins are, as you can discover and then eliminate unnecessary work. It can also help show that tuning
is required. But I've also been meaning to share some examples of tuning, and had my chance at AWS
re:Invent.

In the 3rd section of the talk, I included the tunables we are using on Ubuntu Trusty, to show examples
of what is possible. I've included them below for easy browsing. Please watch the video for context.

WARNING: These tunables were developed in late 2014, for Ubuntu Trusty instances on EC2.


CPU

schedtool -B PID

Virtual Memory

vm.swappiness = 0 # from 60

Huge Pages

# echo never > /sys/kernel/mm/transparent_hugepage/enabled # from madvise
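Before changing the huge pages setting, it can help to record what the instance is currently using. A minimal read-only sketch, assuming the usual sysfs format where the active mode is the bracketed word; `THP_FILE` and `thp_mode` are illustrative names, not from the talk:

```shell
#!/bin/sh
# Read-only sketch: report the active transparent huge pages mode.
# THP_FILE and thp_mode are hypothetical names used for illustration.
THP_FILE="${THP_FILE:-/sys/kernel/mm/transparent_hugepage/enabled}"

thp_mode() {
    [ -r "$THP_FILE" ] || { echo unknown; return 0; }
    # The active mode is the bracketed word, e.g. "always [madvise] never".
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$THP_FILE"
}

echo "THP mode: $(thp_mode)"
```

Running it before and after the `echo never` line above shows whether the change took effect.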

File System

vm.dirty_ratio = 80 # from 40
vm.dirty_background_ratio = 5 # from 10
vm.dirty_expire_centisecs = 12000 # from 3000
mount -o defaults,noatime,discard,nobarrier
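The vm.* values above can be compared against a running instance before anything is changed. This is a read-only sketch under stated assumptions: `PROC_SYS` and `check_sysctl` are illustrative names, not from the talk, and the targets are simply the numbers listed above.

```shell
#!/bin/sh
# Read-only sketch: compare live sysctl values with the targets listed above.
# PROC_SYS and check_sysctl are hypothetical names used for illustration.
PROC_SYS="${PROC_SYS:-/proc/sys}"

check_sysctl() {
    key="$1"; want="$2"
    # vm.dirty_ratio lives at /proc/sys/vm/dirty_ratio, and so on.
    path="$PROC_SYS/$(printf '%s' "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "missing $key"
        return 0
    fi
    # Squeeze whitespace so multi-value keys compare as "a b c".
    have="$(tr -s '[:space:]' ' ' < "$path" | sed 's/ $//')"
    if [ "$have" = "$want" ]; then
        echo "ok $key = $have"
    else
        echo "differs $key: have $have, want $want"
    fi
}

check_sysctl vm.swappiness 0
check_sysctl vm.dirty_ratio 80
check_sysctl vm.dirty_background_ratio 5
check_sysctl vm.dirty_expire_centisecs 12000
```

The function only reads, so it is safe to run on a production instance; a "differs" line is a prompt to investigate, not an instruction to change the value.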

Storage I/O

/sys/block/*/queue/rq_affinity 2
/sys/block/*/queue/scheduler noop
/sys/block/*/queue/nr_requests 256
/sys/block/*/queue/read_ahead_kb 256
mdadm chunk=64 ...
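The /sys/block settings above are per-device files, so applying them means looping over devices. The sketch below is a dry run that only prints the writes it would make. `SYS_BLOCK` and `apply_queue_settings` are illustrative names, not from the talk. It also skips the scheduler write when the kernel does not list `noop` for a device, which is the case on newer multi-queue kernels.

```shell
#!/bin/sh
# Dry-run sketch: print the writes that would apply the Storage I/O settings.
# SYS_BLOCK and apply_queue_settings are hypothetical names for illustration.
SYS_BLOCK="${SYS_BLOCK:-/sys/block}"

apply_queue_settings() {
    for q in "$SYS_BLOCK"/*/queue; do
        [ -d "$q" ] || continue
        for kv in rq_affinity=2 scheduler=noop nr_requests=256 read_ahead_kb=256; do
            name="${kv%%=*}"; value="${kv#*=}"
            [ -e "$q/$name" ] || continue
            # Only offer a scheduler that the kernel lists for this device.
            if [ "$name" = scheduler ] && ! grep -qw "$value" "$q/$name"; then
                echo "# skip $q/$name: $value not offered"
                continue
            fi
            echo "echo $value > $q/$name"
        done
    done
}

apply_queue_settings
```

Printing the commands rather than running them makes it easy to review the list per device before piping it to a root shell.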


Networking

net.core.somaxconn = 1000
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
net.ipv4.tcp_abort_on_overflow = 1 # maybe
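Values set with `sysctl -w` are lost on reboot, so one way to keep the net.* settings above is a drop-in file. The following is a sketch, not a description of how they are deployed at Netflix: the file name, the `OUT` variable, and `write_net_conf` are all illustrative, and the entry marked "maybe" is left commented out.

```shell
#!/bin/sh
# Sketch: write the net.* settings above to a sysctl drop-in file.
# OUT, its default path, and write_net_conf are hypothetical names.
# Nothing is applied to the running kernel here.
OUT="${OUT:-${TMPDIR:-/tmp}/90-ec2-net-tuning.conf}"

write_net_conf() {
    cat > "$1" <<'EOF'
# Developed in late 2014 for Ubuntu Trusty on EC2; review before use.
net.core.somaxconn = 1000
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
# maybe: net.ipv4.tcp_abort_on_overflow = 1
EOF
}

write_net_conf "$OUT"
echo "wrote $(grep -c '^net\.' "$OUT") settings to $OUT"
```

After review, the file could be loaded as root with `sysctl -p FILE`, or copied under /etc/sysctl.d/ so that it is read at boot.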

Hypervisor (Xen)

echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource

Setting the clocksource came from a performance regression we found when moving to Ubuntu
Trusty, which can be fixed by switching clocksource to TSC. Best case example (so far): CPU usage
reduced by 30%, and average app latency reduced by 43%. Beware of clock drift, as in the (distant)
past TSC has been unreliable.
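Before switching, it is worth checking which clocksource is active and whether the kernel offers tsc at all. A read-only sketch, assuming the standard sysfs layout next to the `current_clocksource` file used above; `CS_DIR` and `clocksource_status` are illustrative names, not from the talk:

```shell
#!/bin/sh
# Read-only sketch: show the current clocksource and whether tsc is offered.
# CS_DIR and clocksource_status are hypothetical names for illustration.
CS_DIR="${CS_DIR:-/sys/devices/system/clocksource/clocksource0}"

clocksource_status() {
    cur="$(cat "$CS_DIR/current_clocksource" 2>/dev/null || true)"
    avail="$(cat "$CS_DIR/available_clocksource" 2>/dev/null || true)"
    echo "current: ${cur:-unknown}"
    echo "available: ${avail:-unknown}"
    # Pad with spaces so "tsc" matches only as a whole word.
    case " $avail " in
        *" tsc "*) echo "tsc: offered" ;;
        *)         echo "tsc: not offered" ;;
    esac
}

clocksource_status
```

Recording the "current" line before the change also gives a value to switch back to if clock drift shows up.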

In the talk I described these tunables as our medicine cabinet, and to "consider these best before
2015". Tuning is a process, not a product. Copy-n-pasting these tunables is a little like taking someone
else's medication; doing so years later is like taking someone else's expired medication.

As an update: slide 62 shows "Broken Java stacks" in a flame graph, which we now have a workaround
for (an OpenJDK patch I wrote). See my Linux Profiling at Netflix post, where I have an example flame
graph with working Java stacks.

AWS re:Invent was a massive event, and there were many talks I missed. Fortunately they were
recorded, and Adrian Cockcroft published a list of interesting talks which are worth checking out.

You can comment here, but I can't guarantee your comment will remain here forever: I might switch
comment systems at some point (eg, if Disqus adds advertisements).

Copyright 2016 Brendan Gregg.
