Python template for lightning-fast log analysis

Overview

At work, I often analyze free-format logs emitted by development machines, where the logging was added by someone other than me. I used to write the analysis code from scratch to fit each log, but after doing this many times I settled on a template and a few techniques that speed things up, so I am sharing them here.

Template code

#!/usr/bin/env python3

import sys,re

for line in sys.stdin: #Read from standard input
    line = line.strip() #Remove spaces and line breaks at the beginning and end of lines
    print(line)

That's all!

The code above is written in Python 3. The Python 2 version is the same except for the shebang line and print, which is a statement there rather than a function.
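As a hedged variation on the template, the loop can be wrapped in a function so that it can be tried on a plain list of strings instead of piping a file in. The name stripped_lines is hypothetical and not part of the original template; this is only a sketch of one way to arrange it.

```python
import sys

# Note: on Python 2.6+, putting "from __future__ import print_function"
# as the first statement of the script lets the print() calls below run unchanged.


def stripped_lines(stream):
    """Yield each line of `stream` with surrounding whitespace removed.

    `stream` can be sys.stdin, an open file, or any iterable of strings.
    """
    for line in stream:
        yield line.strip()


# Demo on an in-memory list; in a real script, pass sys.stdin instead.
sample = ["  hoge1\n", "hoge2  \n"]
for line in stripped_lines(sample):
    print(line)
```

Keeping the loop behind a function that accepts any iterable makes it easy to check the parsing logic on a few hand-written lines before running it on a real log.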

Use cases and techniques

Let's use the template to parse and aggregate lines.

#!/usr/bin/env python3

import sys, re  # sys for standard input, re for regular expressions

# State that spans multiple lines is kept in variables declared here.
hoge_count = 0

# Process the input one line at a time
for line in sys.stdin:
    line = line.strip()
    # print(line)  # For debugging; uncomment when needed

    # For lines that start with a fixed string, .startswith() is handier than a regular expression
    if line.startswith('hoge'):
        hoge_count += 1

    if line.startswith('fuga'):
        # To look ahead to the next line, call .readline()
        next_line = sys.stdin.readline().strip()
        print("next to fuga =", next_line)

    # For space- or comma-delimited fields, use .split()
    if line.startswith('moge'):
        moge_cols = line.split(' ')
        print("moge line cols =", moge_cols)

    # Use regular expressions for complex matches
    m = re.match(r'(.*)_muga_(.*)', line)
    if m:
        print("muga line left: ", m.group(1), "right:", m.group(2))

# Print the aggregated result
print("hoge_count =", hoge_count)

Feed it the following input.

input.txt


hoge1
hoge2
fuga
next fuga
moge 1 2 3
left_muga_right

Run


$ python3 analyze.py < input.txt

This produces the following output.

output


next to fuga = next fuga
moge line cols = ['moge', '1', '2', '3']
muga line left:  left right: right
hoge_count = 2
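
The moge and muga steps in the script assume exactly one space between columns and positional regex groups. As a hedged variation (the sample strings below are made up for illustration and are not from the input file), split() with no argument collapses runs of whitespace, split(',') handles comma-delimited fields, and named groups make the regex result easier to read:

```python
import re

# Hypothetical sample lines, not taken from input.txt.
spaced = "moge   1  2 3"
comma = "moge,1,2,3"
muga = "left_muga_right"

# split(' ') keeps empty strings between repeated spaces; split() does not.
print(spaced.split(' '))
print(spaced.split())
print(comma.split(','))

# Compile the pattern once, outside any per-line loop, and name the groups.
MUGA_RE = re.compile(r'(?P<left>.*)_muga_(?P<right>.*)')
m = MUGA_RE.match(muga)
if m:
    print(m.group('left'), m.group('right'))
```

Which split form fits depends on the log: if empty columns are meaningful, the explicit delimiter is the safer choice.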

Equivalent code can also be written in Python 2.
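
For the look-ahead step, one alternative to calling readline() inside the for loop is to take the following item from the same iterator with the built-in next(). The sketch below assumes the lines come from any iterable; the function name lines_after is hypothetical. Both iter() and next() with a default value are available in Python 2.6+ and Python 3.

```python
def lines_after(stream, prefix):
    """Return the line that follows each line starting with `prefix`."""
    lines = iter(stream)  # one shared iterator for the loop and the look-ahead
    found = []
    for line in lines:
        if line.strip().startswith(prefix):
            # next() consumes the following line; '' is used at end of input
            found.append(next(lines, '').strip())
    return found


sample = ["hoge1\n", "fuga\n", "next fuga\n", "moge 1 2 3\n"]
print(lines_after(sample, "fuga"))
```

As with readline(), the line that was looked ahead is consumed, so the loop does not see it again.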

Summary

This post introduced a code template and a few techniques for getting a log analysis script running in 5 minutes. The code is written in Python 3, but the same approach also works in Python 2.
