Oracle's sqlldr utility is used when loading a file into a table sqlldr outputs the following progress to the standard output, so you can check the execution status. However, if the number of loads is large, it is not possible to know how long it will take to complete the load.
SQL*Loader: Release 12.1.0.2.0 -Production on Thu January 23 14:36:14 2020
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
Use path:Conventional type
You have reached the commit point.-Number of logical records 64
You have reached the commit point.-Number of logical records 128
You have reached the commit point.-Number of logical records 192
You have reached the commit point.-Number of logical records 256
You have reached the commit point.-Number of logical records 320
You have reached the commit point.-Number of logical records 384
You have reached the commit point.-Number of logical records 448
You have reached the commit point.-Number of logical records 512
You have reached the commit point.-Number of logical records 576
So I wondered if I could use the Linux pv command to monitor the execution status.
Pv command is available on Linux The pv command is a handy tool that runs between piped commands and displays the current status and estimated time to completion from the total amount of data and the amount of data that has passed through the pipe.
For example, if you want to compress a large file with the gzip command
$ gzip bigfile
Will create a compressed bigfile.gz However, it doesn't output any message during execution, so I don't know when it will end.
So, using the pv command,
$ pv bigfile | gzip -c > bigfile.gz
When you execute
$ pv bigfile | gzip -c > bigfile.gz
129MB 0:00:15 [92.8MB/s] [===========>                       ] 36% ETA 0:00:31
The total amount, elapsed time, passing amount per second, current execution status (gauge and%), remaining time are displayed as
As you can see, pv is very convenient, but it monitors the amount of data that passes through the pipe, so it can only support piped command formats.
As mentioned earlier, sqlldr is currently writing to standard output how many loads it has loaded. If only this number of cases can be taken out and piped to pv, it is quite so.
The number of records is sufficient, not the number of bytes to load, as long as you know the execution status. So, let's create a shell script that executes a simple awk, read the standard output of sqlldr, and output the number of characters equivalent to the number of cases.
loadcount
#!/bin/bash
awk -F'number' '
BEGIN { SV = 0; }
NF==2{
    for (i = 0; i < $2 - SV; i++) {
        printf("1");
    }
    SV = $2;
}'
This script needs to be executable, so give it execute permission after saving it in the editor
$ chmod +x loadcount
The processing contents are as follows -When the number of fields (NF) is 2 (= when the number of cases is yyy) with the character string "number of cases" as a delimiter, the "number of cases" Extract the trailing string ($ 2) -The 1-byte character "1" is output for the number of differences from the number of items retrieved last time. In other words, when sqlldr loads 1000 records, 1000 "1" s are output.
If the input is a pipe, the pv command does not know the total amount, in which case you can specify the total amount with the -s option. In this case, the total amount is the number of records in the file to be loaded. The number of records (number of lines) can be obtained using the wc command.
$ wc -l File name
The final form is as follows
$ sqlldr USER/PASS@SID control=t1.ctl | ./loadcount | pv -p -t -e -s `cat t1.csv|wc -l` > /dev/null
0:00:20 [========>                                            ] 17% ETA 0:01:33
-USER / PASS @ SID is a descriptor for connecting to Oracle -T1.ctl is an input control file for sqlldr -T1.csv is the data file to be loaded
I was able to successfully monitor the execution status of sqlldr
Recommended Posts