[LINUX] [JVM] Knowledge you need for OOM (Out Of Memory) troubleshooting

Introduction

When I had to deal with OOM failures, I jumped from site to site and kept cramming in information. At the time, I had a hard time building up that knowledge in any systematic way. So I wrote this article, hoping it helps people who are struggling in the same way.

If you already know the JVM and just want to know how to handle OOM trouble, please see this article: [JVM] Let's challenge OOM (Out Of Memory) failure response. **It was added later, so please check it out.**

Target audience

・ People who think "JVM? Never heard of it!"
・ Engineers struggling with the same kind of problem, such as OOM
・ "Gods of destruction" who start a process without configuring JVM logging and the like

What you can get from this article

・ You will be able to set the minimum boot options needed to prepare for OOM failure response
・ You will be able to identify the cause of an OOM and take action to resolve it
・ You will know the bare minimum you need to know about Linux

I was told to deal with OOM failures

I had spent about a month modifying a certain feature. Testing was done, release preparation was done, and we were finally ready to release! That was when **the service deployed in the verification environment started going down about once a week.** I knew the cause was Out Of Memory (OOM), but I didn't know why it was happening. On top of that, nobody around me was familiar with the JVM, and there was no data such as heap dumps (more on those later). It felt like being stranded on a desert island with only my laptop.

**I had heard words like JVM, GC, and heap, but I didn't understand what they meant. So in this article I summarize what I learned while solving this OOM problem.**

Input

Let's start with the input part. You can't investigate anything without the necessary knowledge. Here we focus on learning how the JVM and GC work.

What is a JVM?

In a nutshell, it's the software needed to run Java programs. Scala, and Ruby via JRuby, also run on the JVM. A feature of the JVM is that it lets programs run in a platform-independent way. **In other words, programs can run regardless of the OS (Windows, Mac, Linux, etc.).**

JVM configuration

Memory from the OS is allocated to the JVM at startup. As explained later, the size can be set with a startup option (e.g. java -Xms100m).

The JVM has the following structure.

(Figure: JVM.png)
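To see what the JVM actually received at startup, you can ask it from inside a program. The following is only a minimal sketch: the class name HeapLimits is made up for illustration, and the values reported by Runtime are approximations of the -Xms / -Xmx settings, not exact copies of them.

```java
// Sketch: print the heap limits the running JVM is working with.
// HeapLimits is a hypothetical name used only for this example.
final class HeapLimits {
    /** Maximum heap the JVM will try to use (roughly the -Xmx value), in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently reserved by the JVM (starts near the -Xms value), in bytes. */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap       : " + maxHeapBytes() / mb + " MB");
        System.out.println("committed heap : " + committedHeapBytes() / mb + " MB");
        System.out.println("free (within committed): " + Runtime.getRuntime().freeMemory() / mb + " MB");
    }
}
```

Running it once with java -Xms100m and once without any option should make the effect of the option visible.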

These areas will be described in detail below.

Java heap

**In a nutshell, it is the memory area used by user-written programs.** In other words, it is the memory space where the objects and arrays used in a Java program are stored. When a program instantiates an object, memory for it is allocated here.

Strictly speaking, the Java heap is divided into the New area and the Old area, each with its own role.

New area

**The area where newly created objects are stored first.** The New area is itself divided into Eden and Survivor. Newly created objects are first stored in Eden. When Eden becomes full, objects that are still in use are moved to Survivor (strictly speaking, the "From area" and the "To area").

I'll explain the details later, but please keep this in the back of your mind: Scavenge GC is performed on the New area.

Old area

**The area that holds the long-lived objects among those stored in the New area.** A long-lived object is one that stays in use for a long time while the process runs. Conversely, an object that is used only inside a single method, as in the example here, is a short-lived object.

public void hoge() {
    // fuga is referenced only inside this method, so it is short-lived
    Fuga fuga = new Fuga();
}

I'll explain the details later, but please keep this in the back of your mind: **Full GC is performed on the Old area.**
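To make the difference between short-lived and long-lived objects a bit more concrete, here is a small sketch. The class ObjectLifetimes and its CACHE field are hypothetical names I made up; the point is only that an object held from a static field stays reachable, while an object held only by a local variable does not.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: one short-lived and one long-lived allocation pattern.
final class ObjectLifetimes {
    // Reachable from a static field for as long as the class is loaded: long-lived.
    private static final List<byte[]> CACHE = new ArrayList<>();

    /** The buffer is referenced only by a local variable, so it becomes garbage on return. */
    static int shortLived() {
        byte[] buffer = new byte[1024];
        return buffer.length;
    }

    /** The buffer stays referenced from CACHE, so it survives GC and may end up in the Old area. */
    static int longLived() {
        CACHE.add(new byte[1024]);
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            shortLived();
            longLived();
        }
        System.out.println("objects still referenced from CACHE: " + CACHE.size());
    }
}
```

A collection like CACHE that only ever grows is also the typical shape of a memory leak, which is why it matters for OOM investigation.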

C heap

The memory space the JVM uses to run native libraries. It is allocated outside the Java heap.

Permanent area

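A memory space where class and method information is stored. Because this information is mostly loaded when classes are first loaded, its size usually does not change much. Note that since Java 8 this area has been replaced by Metaspace, which lives in native memory.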

Thread stack

The stack area for Java threads. Each thread gets its own stack.
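The areas above can also be listed from a running JVM through the standard java.lang.management API. This is a rough sketch, not an official mapping: MemoryAreas is a name I made up, and the pool names that come back (for example something containing "Eden" or "Metaspace") depend on the JVM version and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Sketch: list the memory pools the running JVM reports.
final class MemoryAreas {
    /** One line per pool: type (HEAP or NON_HEAP), pool name, and kilobytes currently used. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(pool.getType().name() + "  " + pool.getName()
                    + "  used=" + usage.getUsed() / 1024 + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

The HEAP lines correspond to the Java heap (New and Old areas), and the NON_HEAP lines to areas such as the Permanent area or its successor.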

How GC works

A mechanism that automatically deletes objects that are no longer in use.

If that is hard to picture, imagine a rice field. Seedlings are planted in the field and, once watered, grow into rice (this is creating an object). Now suppose some seedlings fail along the way and never become rice (these are objects that are no longer referenced). The GC's role is to automatically remove those failed seedlings. That frees up space to plant new seedlings, which is a welcome mechanism for the owner of the field.

For an object to be deleted by the GC, the following condition must be met.
・ **The object is not referenced from anywhere.**

GC is roughly divided into two types: Scavenge GC (also called copy GC or minor GC) and Full GC (also called major GC).
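The "not referenced from anywhere" condition can be observed with a weak reference, which does not keep its target alive by itself. This is only a sketch: ReachabilityDemo is a hypothetical name, and System.gc() is just a request, so the second line printed may differ from run to run.

```java
import java.lang.ref.WeakReference;

// Sketch: an object becomes collectable once no strong reference points to it.
final class ReachabilityDemo {
    /** True while the object behind the weak reference has not been collected. */
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) throws InterruptedException {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.out.println("still referenced: alive=" + isAlive(weak));

        strong = null;     // drop the only strong reference
        System.gc();       // only a request; the JVM is free to ignore it
        Thread.sleep(100); // give the collector a moment
        System.out.println("no longer referenced, after GC request: alive=" + isAlive(weak));
    }
}
```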

Scavenge GC

Newly created objects are allocated in the New area. **When the New area is full, GC is performed on the New area.** This is called Scavenge GC. Objects that are no longer needed are deleted by Scavenge GC, while objects that are still in use and have lived long enough are moved to the Old area. In this way, space is freed in the New area so that it is ready to accept new objects. Think of the New area as a short-term homestay.

Full GC

**When the Old area or the Permanent area is full, GC is performed on those areas.** This is Full GC.
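How often each kind of GC has run can be read from the standard GarbageCollectorMXBean. The following is a sketch under some assumptions: GcCounters is a made-up name, and the collector names printed depend on the GC in use (with the Parallel collector, for example, I would expect names like "PS Scavenge" and "PS MarkSweep"; with G1, "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read per-collector GC counts from the running JVM.
final class GcCounters {
    /** Sum of the collection counts of all collectors that report one. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Create a lot of short-lived garbage; this usually triggers GCs of the New area.
        long sink = 0;
        for (int i = 0; i < 100_000; i++) {
            byte[] garbage = new byte[10_000];
            sink += garbage.length;
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("allocated bytes: " + sink + ", total collections: " + totalCollections());
    }
}
```

If the old-generation collector's count keeps climbing while the service is idle, that is a hint that the Old area is filling up.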

What is OOM?

Roughly speaking, if any of the areas introduced above runs out of memory, the JVM throws an OutOfMemoryError.

The important thing here is to determine where the OOM occurred. **Identifying the affected area should be the very first step in troubleshooting an OOM!**

Java boot options (even if you read only this part, setting these options will save you)

Here are only the bare-minimum options you absolutely should know.

・ Specify the size of the Java heap
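One practical way to identify the area is the message attached to the OutOfMemoryError in the log. The sketch below maps some messages I know from HotSpot to the areas used in this article. OomClassifier and the returned labels are my own simplification, and the list of messages is not exhaustive.

```java
// Sketch: guess the exhausted area from an OutOfMemoryError message.
// The mapping is a simplification made for this article, not an official table.
final class OomClassifier {
    static String classify(String message) {
        if (message == null) {
            return "unknown";
        }
        if (message.contains("Java heap space") || message.contains("GC overhead limit exceeded")) {
            return "Java heap";
        }
        if (message.contains("PermGen space") || message.contains("Metaspace")) {
            return "Permanent area (Metaspace on Java 8+)";
        }
        if (message.contains("native thread") || message.contains("Direct buffer memory")) {
            return "native memory (C heap / thread stacks)";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String[] samples = {
            "Java heap space",
            "Metaspace",
            "unable to create new native thread",
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```

In real use you would pass the result of getMessage() on the caught error, or simply search the application log for these strings.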

-Xms -Xmx

-Xms is the initial Java heap size. It is commonly set to the same value as -Xmx.
-Xmx is the maximum Java heap size.
You can check which options a Java process was started with using the following command.

ps aux | grep java
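The same information is also available from inside the process via the standard RuntimeMXBean, which is handy when you cannot log in to the server. A minimal sketch, where JvmOptions and heapOptions are names I made up for illustration:

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: read the JVM's own startup options and pick out the heap-size ones.
final class JvmOptions {
    /** Keeps only the -Xms / -Xmx entries of the given JVM argument list. */
    static List<String> heapOptions(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith("-Xms") || arg.startsWith("-Xmx"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("all JVM options : " + jvmArgs);
        System.out.println("heap options    : " + heapOptions(jvmArgs));
    }
}
```

If the heap options list comes back empty, the process is running with default heap sizes, which is worth knowing before an OOM investigation.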

・ Check GC occurrence status and memory leak status

-verbose:gc (or -Xloggc:<path to the log file>), -XX:+PrintGCTimeStamps, -XX:+PrintGCDetails

With these set, detailed information is written to the GC log. (These are the flags for Java 8 and earlier; from Java 9 they are replaced by unified logging, -Xlog:gc*.) If you want to check the GC status of a Java process that is already running, use the following commands.

# Find the Java process
jcmd -l

# Pass the <pid> found above to print GC statistics (every 10000 ms, header every 10 rows)
jstat -gcutil -h10 <pid> 10000

・ Get a heap dump

-XX:+HeapDumpOnOutOfMemoryError

With this set, a heap dump is taken when the JVM stops with an OOM. **A heap dump is like a snapshot of the state of the heap.** Because it captures the heap at a specific point in time, you can compare dumps taken, say, right after startup and one day later to see how memory usage changes, which makes it easier to find the cause.

If you want to take one at an arbitrary time, run the following commands.

# Find the Java process
jcmd -l

# Pass the <pid> found above to write a heap dump to the given file
jmap -dump:format=b,file=filename <pid>

For details, refer to this link. http://d.hatena.ne.jp/learn/touch/20090218/p1
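A heap dump can also be triggered from inside the application. The sketch below assumes a HotSpot-based JVM, because it uses com.sun.management.HotSpotDiagnosticMXBean, which is not part of the standard java.* API. HeapDumpSketch and the file name manual.hprof are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: take a heap dump programmatically (HotSpot-based JVMs only).
final class HeapDumpSketch {
    /** Writes a heap dump into a fresh temporary directory and returns its path. */
    static Path dumpToTempFile() {
        try {
            Path dir = Files.createTempDirectory("heapdump");
            Path file = dir.resolve("manual.hprof"); // the target file must not exist yet
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(file.toString(), true); // true = dump only live (reachable) objects
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = dumpToTempFile();
        System.out.println("heap dump written to " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file is in the same binary format that jmap writes, so it can be opened with the same analysis tools.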

Summary so far

That concludes the explanation of how the JVM and GC work. I tried to write this article so that you can understand the JVM systematically, as a base for digging into the details on other sites.

Output

This has gotten a little long, so I'll cover the output part in a separate article.

[JVM] Let's challenge OOM (Out Of Memory) failure response
