A testimony to Linux resilience

A client released a new version of their website onto 20 AWS m1.medium instances (current site at peak load runs approximately 60 m1.medium webservers).
It was clearly an unsuccessful release, but what was surprising was the system did not actually crash, it was effectively a meltdown, but servers were still operational with load averages > 100. I was impressed with the ability for Linux to still (just) function.

parallel-ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 -i -h <servers>  uptime
 18:01:00 up 18:44,  0 users,  load average: 104.26, 110.03, 113.12
 18:01:00 up 18:56,  1 user,  load average: 62.33, 87.75, 90.40
 18:01:03 up 18:44,  0 users,  load average: 105.28, 115.33, 115.61
 18:01:03 up 18:44,  0 users,  load average: 149.35, 155.74, 133.68
 18:01:03 up 18:51,  0 users,  load average: 124.63, 121.31, 115.91
 18:01:03 up 18:44,  0 users,  load average: 118.99, 109.92, 110.60
 18:01:04 up 18:44,  0 users,  load average: 121.73, 118.40, 113.50
 18:01:04 up 18:44,  0 users,  load average: 113.89, 120.56, 114.64
 18:01:05 up 18:44,  0 users,  load average: 119.30, 119.71, 115.65
 18:01:05 up 18:44,  0 users,  load average: 126.33, 120.33, 119.02
 18:01:05 up 18:44,  0 users,  load average: 117.47, 113.01, 112.84
 18:01:05 up 18:44,  0 users,  load average: 172.21, 158.62, 135.19
 18:01:05 up 18:44,  0 users,  load average: 115.81, 114.96, 116.18
 18:01:05 up 18:44,  0 users,  load average: 122.25, 115.32, 115.27
 18:01:05 up 18:44,  0 users,  load average: 164.13, 168.04, 153.03
 18:01:05 up 18:44,  0 users,  load average: 123.80, 114.94, 110.29
 18:01:06 up 18:44,  0 users,  load average: 173.64, 173.80, 158.76
 18:01:06 up 18:44,  0 users,  load average: 132.52, 140.94, 135.43
 18:01:06 up 18:44,  0 users,  load average: 166.17, 151.68, 135.23
 18:01:06 up 18:44,  0 users,  load average: 170.14, 164.03, 145.31

The AWS m1.medium is a single CPU instance.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 45
model name	: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping	: 7
cpu MHz		: 1800.000
cache size	: 20480 KB
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm up rep_good aperfmperf unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm arat epb xsaveopt pln pts dts
bogomips	: 3600.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:
Tagged with: Uncategorized

Related Posts

An unexplained connection experience

The “Too many connections” problem is a common issue with applications using excessive permissions (and those that grant said global permissions). MySQL will always grant a user with SUPER privileges access to a DB to investigate the problem with a SHOW PROCESSLIST and where you can check the limits.

Read more

Additional DB objects in AWS RDS

To expand on Jervin’s Default RDS Account Privileges , RDS for MySQL provides a number of routines and triggers defined the the ‘mysql’ meta schema. These help in various tasks because the SUPER privilege is not provided.

Read more

Cloning MySQL 5.6 instances

A tip for all those cloud users that like cloning database servers (as reported in my book Effective MySQL – Replication Techniques in Depth ). Starting with MySQL 5.6, MySQL instances have a UUID .

Read more