[LINUX] Get the sum of each of multiple columns with awk

1. Check the field (item)

Before running awk, check the field (item) names. They are usually on the first line of the file, so use less or head to look at it.

head -1 [file name]

Field names are usually written on a single line, separated by commas or pipes. Rather than counting by eye to find which field number the target column is, it is safer and more efficient to replace the delimiter with a newline in Sakura Editor. Copy the header line that was output.

2. Align with Sakura Editor

In Sakura Editor, press Ctrl + R (Replace), check "Regular expression" in the dialog box, and replace the delimiter with "\r\n" (line break).

Each field is now on its own line, so the line number tells you the column number.
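If you would rather stay on the command line, the same numbered list can be sketched with tr and nl. This is only a minimal sketch: the header string below is a hypothetical sample, and in practice it would come from head -1 [file name].

```shell
# Hypothetical sample header; in practice use: head -1 [file name]
header='id,name,price,qty'

# Replace each comma with a newline, then number the resulting lines.
# The number next to a field name is its column number in awk ($1, $2, ...).
numbered=$(printf '%s\n' "$header" | tr ',' '\n' | nl)
printf '%s\n' "$numbered"
```

For a pipe-delimited file, tr '|' '\n' would be used instead.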

3. (Align in Excel)

For those who do not use Sakura Editor, here is how to replace the delimiter in Excel. (* Windows Notepad does not support regular expressions, so it cannot replace a delimiter with a line break.)

  1. Paste the character string into Excel.
  2. Press Ctrl + H, or choose Replace from the Home tab > Editing group, to display the dialog.
  3. In "Find what", enter the delimiter. In "Replace with", press **Ctrl + J** (a symbol like a dot is displayed, but this means a line break).
  4. Click Replace All.
  5. The result is in a single cell, so enter edit mode with F2 or a double click, select all, and copy.
  6. Exit edit mode and paste.
  7. In this state, go to Home tab > Editing group > Find & Select > Go To Special, select Blanks in the dialog, and click OK.
  8. Right-click a selected blank cell > Delete > select Entire row and click OK.

The data is now arranged vertically.

4. awk command

**If you want to get the sums of $40 and $50 separately**

Pipe the output of cat to awk. ▼

cat [file name] | awk -F "," '{ print a += $40; b += $50 } END { print a, b }' |tail -1


  1. Get all the data in the file with cat (for gz files, remember to use zcat instead).
  2. Pipe it to the awk command and use the -F option to tell awk which character delimits the fields.
  3. The addition assignment operator "+=" adds the right operand to the left one and assigns the result to the left side, so "a += $40" means "a = a + $40". That is, "a + $40" is assigned to a, and this repeats for every line up to the last one. The semicolon ends that statement so that b can be processed the same way. The END block runs once, after all lines have been read.
  4. Print a and b in the END block to output the totals.
  5. Because of the print in the main block, the running total of a is output for every line, so pass the result to tail -1 to keep only the last line, which holds the totals. (Without that print, nothing is output before END and tail -1 is not needed.)
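To try the steps above on something small, here is a minimal sketch that uses a hypothetical three-column CSV, with $2 and $3 standing in for $40 and $50. It leaves print out of the main block, so tail -1 is not required. The file contents and the variable names sample and totals are assumptions made for this illustration only.

```shell
# Hypothetical 3-column CSV with a header line, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'name,price,qty' 'apple,100,2' 'pear,250,5' 'plum,50,1' > "$sample"

# NR > 1 skips the header line. The main block only accumulates,
# so nothing is printed until the END block runs.
totals=$(awk -F ',' 'NR > 1 { a += $2; b += $3 } END { print a, b }' "$sample")
echo "$totals"

# Remove the temporary file.
rm -f "$sample"
```

Skipping the header with NR > 1 is optional here, since awk treats non-numeric text such as a field name as 0 when adding, but it makes the intent explicit.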

