[LINUX] What to do when you get "I can't see the site !!!!"

Just reading the title makes my heart clench... This is a memo based on my experience so far, written about half for my own reference.

Indicators that can be used to investigate the cause

HTTP status code

See https://developer.mozilla.org/ja/docs/Web/HTTP/Status for the full list. Of these, the ones that indicate an error are the 400 series and the 500 series. Roughly, they differ as follows:

- 400 series: the access itself fails (for example, name resolution is not possible)
- 500 series: the access reaches the server, but the server fails to process it

What to do depends on which one it is, so tell them apart first.
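As a rough sketch of this first split, the status code can also be fetched from the command line and sorted into the two groups. The function name classify_status and the URL in the comment are made up for illustration; the curl options used (-s, -o, -w with %{http_code}) are standard, and curl prints 000 when it gets no HTTP response at all.

```shell
# Sort an HTTP status code into the rough groups described above.
classify_status() {
    case "$1" in
        000) echo "no HTTP response" ;;          # curl could not connect or resolve the name
        4[0-9][0-9]) echo "400 series" ;;
        5[0-9][0-9]) echo "500 series" ;;
        [1-3][0-9][0-9]) echo "not an error" ;;
        *) echo "unknown" ;;
    esac
}

# Example (hypothetical URL): fetch only the status code, then classify it.
# code=$(curl -s -o /dev/null -w '%{http_code}' https://example.com/)
# classify_status "$code"
```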

Log files

If you haven't changed the output paths, on Linux you will find most logs under /var/log/.

apache: /var/log/httpd/
nginx : /var/log/nginx/
php-fpm: /var/log/php-fpm/ (check this for PHP errors when running nginx + php-fpm)

To see what is currently running, use ps aux. It prints a large amount of information, so combine it with grep to narrow it down to the process you are looking for.
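A minimal sketch of how these checks might look in practice. The helper count_5xx is hypothetical, and it assumes the access log uses the default "combined" format, where the status code is the 9th whitespace-separated field; adjust the field number if your log format differs.

```shell
# Count the requests in an access log that ended with a 500-series status.
# Assumes the default "combined" log format (status code in field 9).
count_5xx() {
    awk '$9 ~ /^5[0-9][0-9]$/ { n++ } END { print n + 0 }' "$1"
}

# Examples:
# count_5xx /var/log/nginx/access.log
# tail -n 50 /var/log/nginx/error.log     # latest error log entries
# ps aux | grep '[n]ginx'                 # the [n] keeps grep itself out of the result
```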

AWS monitoring

With AWS, you can check a lot of information from the console

What to do

1. Calm down

This matters more than it sounds. Panicking never leads to anything good... Calm down by sorting out the current situation or by talking to someone more senior.

2. Check the status code

As mentioned above, there are many possible reasons why the site cannot be accessed, so start isolating the cause from here. Most browsers show the status code on the error screen.

3-1. 400 series

These are easy to deal with because the codes are finely divided by cause. A case I often see is covered in the examples below.

3-2. 500 series

Start by checking the error log. What to do next depends on what the log says, so I will not go into it here; cases I have run into are covered in the examples below.

From here on, some examples

Dealing with 404 Not Found

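There are various causes, so I am giving this its own section. The likely ones are:

  1. The server is down
  2. Name resolution is not possible

so these are what I check first. (Strictly speaking, in both cases the browser shows a connection or DNS error rather than a real 404 response, but the checks below still apply.)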

Try ping

Test connectivity with the following command (localhost is used below as a dummy):

ping {IP/hostname}

$ ping localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=6.893 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.115 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.076 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.117 ms
^C
--- localhost ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.076/1.800/6.893/2.940 ms

If the server is alive, replies come back like this. Plain ping cannot target a specific port number, so use another method for that. I use nping, which is convenient: https://qiita.com/Yu-s/items/4b4f683fda374c8ddcc9
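If nping is not at hand, a port check can be sketched with nc (netcat) instead. This is a substitute technique, not the nping approach itself, and it assumes an nc that supports -z (connect without sending data) and -w (timeout in seconds), as the common variants do. The function name port_open is made up for illustration.

```shell
# Succeeds only if a TCP connection to host $1 on port $2 can be opened
# within 3 seconds. Assumes nc (netcat) is installed.
port_open() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# Example:
# port_open localhost 80 && echo "port 80 is open" || echo "port 80 is closed"
```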

Try to log in

In most cases you should be able to log in with ssh or similar. If a login command that used to work no longer does, the server is probably down.

Check from the console on AWS

You can check this from EC2 Dashboard > Instances > Instance Status. If the instance is shown as stopped, it is down. (Unless aws-cli or autoscaling is in use, it should not stop by itself, so someone may have stopped it intentionally...) You can also see whether the status checks have failed; if they have, the server is down as well.

Note, however, that an instance that keeps restarting automatically may still be shown as running even though it is effectively down.
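The same information can be pulled with aws-cli. A minimal sketch, assuming aws-cli is installed and configured with credentials that may call describe-instance-status; the wrapper name instance_status and the instance ID in the comment are placeholders.

```shell
# Print the instance state, the system status check and the instance status
# check for instance ID $1, tab-separated on one line.
# --include-all-instances is needed so that stopped instances are listed too.
instance_status() {
    aws ec2 describe-instance-status \
        --instance-ids "$1" \
        --include-all-instances \
        --query 'InstanceStatuses[0].[InstanceState.Name,SystemStatus.Status,InstanceStatus.Status]' \
        --output text
}

# Example (placeholder instance ID):
# instance_status i-0123456789abcdef0
```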

If you have confirmed so far that the server is alive but you still cannot reach it by its domain name, name resolution is probably failing.

Try dig

dig gives a quick check: https://www.atmarkit.co.jp/ait/articles/1711/09/news020.html

You can also do it with nslookup: https://www.atmarkit.co.jp/ait/articles/1710/27/news021.html

$ dig www.google.com

; <<>> DiG 9.10.6 <<>> www.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3344
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.google.com.			IN	A

;; ANSWER SECTION:
www.google.com.		89	IN	A	172.217.24.132

;; Query time: 13 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Sep 11 13:51:58 JST 2020
;; MSG SIZE  rcvd: 59

If the output has no ;; ANSWER SECTION:, the name could not be resolved.
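To avoid reading the whole output by eye, the check can be sketched as a small filter over dig's output. The function name has_answer is made up for illustration; it only looks for the ANSWER SECTION header line shown above.

```shell
# Reads dig output on stdin; succeeds if it contains an ANSWER SECTION.
has_answer() {
    grep -q '^;; ANSWER SECTION:'
}

# Examples:
# dig www.google.com | has_answer && echo "resolved" || echo "not resolved"
# dig +short www.google.com    # prints only the answers; empty when there are none
```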

Dealing with 500 Internal Server Error

The error may have occurred in either of the following:

- The server software
- The framework that runs the code

Since there are two candidates, start by looking at both kinds of logs. How to deal with it depends on the content of the error; the following are errors I have actually run into.

Forgot to set up log rotation

The following appeared in the error log of nginx:

write() to "/var/log/nginx/access.log" was incomplete: 83 of 314 while logging request

In other words, "I couldn't write the access log." When I looked, only that day's access log was broken, so I guessed that the storage was full. Deleting the offending file should have fixed it for the time being, but deleting the access log did not help... (there were probably other large files as well). I did not want to spend time hunting for them, and the server was under autoscaling, so as a stopgap I launched a new server and swapped it in.

When I then investigated the root cause, I noticed that the logs were not being rotated... I had updated nginx a few days earlier, and it seems I forgot to restore the log rotation configuration file at that time. More generally, the site becomes inaccessible whenever storage or memory is exhausted, so it is worth noting down the commands to check them.

$ df -h     # Storage check
$ free -m   # Memory check
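To spot a full disk at a glance, the storage check can be sketched as a filter over df output. The function name full_filesystems and the 90% threshold are arbitrary choices for illustration; it assumes the POSIX output format of df -P (usage in column 5, mount point in column 6) and mount points without spaces.

```shell
# Reads `df -P` output on stdin and prints the mount point and usage of
# every filesystem whose usage is at or above the percentage given in $1.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # "83%" -> "83"
        if (use + 0 >= limit) print $6, $5
    }'
}

# Example:
# df -P | full_filesystems 90
```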

Forgot to reflect a configuration change in the launch configuration (autoscaling)

One day the site suddenly went down with a 500 error, so I checked the error log. In the cakephp2 log I found an error related to a new feature I had built a few days earlier! Hadn't that been fixed...? I soon realized that I had forgotten to reflect the change in the launch configuration... Apparently the flow was: traffic goes up => autoscaling launches a server that reproduces the error => the site becomes inaccessible.

The error was related to cakephp2's cache and could be fixed by deleting the cache files, so I dealt with it as follows: clear the cache again => create an AMI in that state => specify it in the launch configuration.

Summary

These situations make me panic like crazy, but the important things are:

- Calm down
- Isolate the cause
- Consult someone more senior
