
Elastic integration with big memory usage? Keep the host accessible!

In some environments, Elastic Agent integrations can unexpectedly consume excessive memory. This can happen for various reasons—misbehaving integrations, memory leaks, or simply under-provisioned hosts. When it does, the Linux kernel may invoke the OOM (Out of Memory) killer, terminating the Elastic Agent service and usually disrupting data ingestion.

How to Detect the Issue

system logs

If your Elastic Agent is being killed unexpectedly, check the system logs:

# grep -i out.of.memory /var/log/messages{-*,} |tail
...
/var/log/messages-20250605:Jun  5 14:21:11 HOSTNAME kernel: Memory cgroup out of memory: Killed process 3481 (agentbeat) total-vm:35608848kB, anon-rss:28517004kB, file-rss:69696kB, shmem-rss:0kB, UID:0 pgtables:62396kB oom_score_adj:0

This is a clear sign that the agent exceeded available memory and was forcibly terminated.
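On hosts that log to journald rather than to /var/log/messages, the same kernel OOM-kill messages can be pulled from the journal; a quick sketch (assuming the unit name elastic-agent.service used later in this post):

# kernel messages (this is where the OOM killer logs), most recent matches last
journalctl -k | grep -i "out of memory" | tail

# messages of the agent unit itself over the last week
journalctl -u elastic-agent.service --since "7 days ago" | tail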

system load

Choose any method to monitor the system load, for example the CPU load check of NetEye Monitoring. For a quick ad-hoc look you can also use the standard command-line tools shown below.
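A minimal check directly on the host:

# load averages over 1, 5 and 15 minutes
uptime

# processes currently using the most memory
ps aux --sort=-%mem | head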

How to prevent an unresponsive host

The host can become inaccessible due to huge load values. To safeguard your system, use systemd's MemoryMax directive to cap the Elastic Agent's memory usage. A good rule of thumb is to reserve 3 GB for the OS and other services, and allocate the rest to the agent.

# create directory for elastic agent service custom values
mkdir -p /etc/systemd/system/elastic-agent.service.d

# for a system with 16 GB reserve maximum 13 GB for elastic agent
echo -e "[Service]\nMemoryMax=13G" > /etc/systemd/system/elastic-agent.service.d/memorymax.conf

# tell systemd to accept new config
systemctl daemon-reexec
systemctl daemon-reload
systemctl restart elastic-agent.service
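You can then verify that systemd picked up the drop-in and applies the limit:

# show the unit file together with all active drop-ins
systemctl cat elastic-agent.service

# show the effective limit (printed in bytes)
systemctl show elastic-agent.service -p MemoryMax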

This way the agent stays within safe memory limits, improving system stability and reliability.
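If you manage hosts with different amounts of RAM, you don't have to hard-code the value; a minimal sketch that derives it from /proc/meminfo, keeping the 3 GB reserve mentioned above (note that MemTotal is reported in kB and is slightly below the physical RAM size):

# derive MemoryMax from the installed RAM minus a 3 GB reserve for the OS
RESERVE_GB=3
TOTAL_GB=$(( $(awk '/^MemTotal/ {print $2}' /proc/meminfo) / 1024 / 1024 ))
echo -e "[Service]\nMemoryMax=$(( TOTAL_GB - RESERVE_GB ))G" \
  > /etc/systemd/system/elastic-agent.service.d/memorymax.conf
systemctl daemon-reload && systemctl restart elastic-agent.service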

Next steps

The above only allows you to keep accessing the machine that hosts the Elastic Agent. You will still have to investigate the reason for the high memory usage.
If you want a quick solution: try to "throw money at it", or in this case, throw memory at it: give the machine 64 GB more than it used to have and limit the service to 64 GB. In my experience, 64 GB is usually enough for every process.
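To narrow down which part of the agent is eating the memory, you can look at the individual subprocesses and at the agent's cgroup; a quick sketch (process names may differ between agent versions, agentbeat is the one from the log line above):

# per-process memory of the agent and its beats, biggest first (RSS in kB)
ps -o pid,ppid,rss,comm --sort=-rss -C elastic-agent,agentbeat

# live per-cgroup view, ordered by memory usage
systemd-cgtop -m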

real world example

I have found this behaviour (using a lot of RAM) with the Elastic MISP integration. In the end, I gave the Elastic Agent service 45 GB; the MISP integration subprocess uses around 32 GB at most.


Author

Reinhold Trocker

IT professional, IT security, (ISC)2 CISSP, technical consultant
