MySQL very high CPU usage (and other processes)

Hi everyone

TL;DR (or Summary) – The leap second added on 30 June 2015 caused MySQL (and other processes) to have very high CPU usage.

This is just a quick post on an issue that I faced today (01/07/2015). I arrived at work and found a warning from our monitoring tool that we had high CPU usage beginning at 01:00 AM. In top we saw several processes using a very high amount of CPU, namely:

  • mysql
  • jenkins
  • tomcat
  • ruby (god)
  • ksoftirqd (several of these)

On Tomcat we can have legitimate high CPU usage, but Jenkins was not building anything and MySQL (version 5.5.31) was not being used for anything important. We went through the logs and analytics to check whether we had abnormally high visitor traffic and found nothing, nada, zero. We logged into MySQL, issued the SHOW PROCESSLIST command, and nothing was happening.

ksoftirqd was particularly weird, because it “is a per-cpu kernel thread that runs when the machine is under heavy soft-interrupt load”, which didn’t make any sense to us.

What the…. ? (X-Files theme song playing in the background)

After some time and more attempts (like trying to use jstack on Tomcat to see if it showed anything unusual) we decided to restart mysql, tomcat, jenkins and god, but everything went straight back to high CPU usage. We even restarted the server, with no better luck.

We then read this Stack Overflow question more carefully and noticed the “leap second issue”, and someone said “hey, I remember seeing something about a leap second not too long ago”… I went and checked Wikipedia, and sure enough, a leap second was added on June 30th 2015.
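For anyone wanting to confirm the same suspicion on their own machine, one option is to search the kernel log for a leap second message. This is only a minimal sketch: the helper name mentions_leap_second is made up for illustration, the exact wording of the kernel message varies between versions (so the match is deliberately loose), and the message may already be gone from the ring buffer after a reboot.

```shell
# Hypothetical helper: reads log text on stdin and succeeds (exit 0)
# if any line mentions a leap second, case-insensitively.
mentions_leap_second() {
    grep -qi 'leap second'
}

# Check the kernel ring buffer. dmesg may need root on some systems,
# so errors are discarded and treated as "nothing found".
if dmesg 2>/dev/null | mentions_leap_second; then
    echo "kernel log mentions a leap second"
else
    echo "no leap second message found in the kernel ring buffer"
fi
```

The same helper can be fed a persisted log instead, for example the system log file, if your distribution keeps kernel messages there.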

We tried the following:

$ /etc/init.d/ntpd stop
$ date -s "`date`"
$ /etc/init.d/ntpd start

Some people said that restarting ntp was not required, but… we did it anyway 🙂 And it worked. Magically, everything went back to normal. Apparently we weren’t the only ones affected. I think the next time I see a leap second announcement I’m going to put it on my calendar, with a warning set for the day before 🙂

Mozilla’s blog also documents the issue.

I know this issue is already documented, but I wanted to have one more post out there to help people find this stuff on the internet. It could save someone a couple of hours next time.

Happy coding and stay safe from leap seconds!

P.S – Reddit comments over here, Hacker News comments over here