This is a rather narrow topic and maybe a thin one, but I wrote it up anyway.
PostgreSQL's COPY command loads a file into a table at high speed. A CSV can be loaded as it is, but if you want to fill in extra values (system columns, for example), the file has to be processed first.
Doing those steps one after the other means writing to storage twice (once to a file, once to the DB), so I think it is better to fold them into a single pass.
Written simply, a plain COPY from a file looks like this:
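import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.sql.Connection;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public static long copy(Connection conn, String filePath, String tableName) throws Exception {
    CopyManager copyManager = new CopyManager((BaseConnection) conn);
    String sql = "COPY " + tableName + " FROM STDIN WITH DELIMITER ','";
    // try-with-resources closes the reader even if copyIn throws.
    try (Reader reader = new BufferedReader(new InputStreamReader(new FileInputStream(filePath), "UTF-8"))) {
        return copyManager.copyIn(sql, reader); // number of rows copied
    }
}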
Pass the connection to CopyManager, create a Reader over the CSV file, and pass it to copyIn to stream the rows in. This assumes the CSV has exactly the same column layout as the table.
So I decided to write my own Reader that adds the system columns as each line is read. The Reader class implements the read method as follows.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayDeque;
import java.util.Queue;

public class CsvFileWithSysColReader extends Reader {
    /** Queue of characters from lines that already have the system columns appended. */
    private final Queue<Character> csvBuffer = new ArrayDeque<>();
    /** Reader over the CSV file, set from the constructor argument. */
    private final BufferedReader reader;
    // ~~ omitted ~~
    /**
     * Constructor.
     *
     * @param reader BufferedReader to wrap
     * @throws IOException
     */
    public CsvFileWithSysColReader(final BufferedReader reader)
            throws IOException {
        // A boolean argument such as skipHeader could be added here to provide header skipping.
        this.reader = reader;
    }
    @Override
    public int read(final char[] cbuf, final int off, final int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int readCount = len;
        // If the buffer holds fewer characters than COPY requested, load more lines from the file.
        while (this.csvBuffer.size() < len) {
            // At end of file, stop and return only the characters still buffered.
            if (loadLine() == 0) {
                readCount = this.csvBuffer.size();
                break;
            }
        }
        // If anything was read, copy it out and return the character count.
        if (readCount != 0) {
            // Honor the offset: fill cbuf[off] through cbuf[off + readCount - 1].
            for (int i = 0; i < readCount; i++) {
                cbuf[off + i] = this.csvBuffer.poll();
            }
            return readCount;
        } else {
            // Nothing left: signal end of stream with -1.
            return -1;
        }
    }
    /**
     * Loads one line into the buffer.
     *
     * @return number of characters loaded, or 0 at end of file
     * @throws IOException
     */
    private int loadLine() throws IOException {
        final String line = this.reader.readLine();
        if (line == null) {
            // No more lines: return 0 and stop.
            return 0;
        } else {
            // addSysCol: appends the system columns; edit as you like.
            // Note: readLine() strips the line terminator, so addSysCol needs to
            // put one back ("\n") for COPY to see the row boundaries.
            // Validation could also go here, for example to add an error column.
            final String lineWithSysCol = addSysCol(line);
            for (int i = 0; i < lineWithSysCol.length(); i++) {
                this.csvBuffer.add(lineWithSysCol.charAt(i));
            }
            return lineWithSysCol.length();
        }
    }
    // ~~ omitted ~~
}
Then run it like this:
public static long copy(final Connection conn, final String filePath, final String tableName) throws Exception {
    final CopyManager copyManager = new CopyManager((BaseConnection) conn);
    final String sql = "COPY " + tableName + " FROM STDIN WITH DELIMITER ','";
    // Wrap the file reader so the system columns are added while COPY pulls the data.
    try (Reader reader = new CsvFileWithSysColReader(
            new BufferedReader(new InputStreamReader(new FileInputStream(filePath), "UTF-8")))) {
        return copyManager.copyIn(sql, reader);
    }
}
Depending on how CsvFileWithSysColReader.addSysCol(String line) above is implemented, the range of things this can do widens.
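As one illustration, here is a minimal sketch of what an addSysCol might look like. The class name SysColAppender, the two appended columns (a timestamp and a user name) and their order are all assumptions for the example, not the implementation that was omitted above. It also assumes the appended values never contain the delimiter.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: appends two audit columns to one CSV line. */
final class SysColAppender {
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private final String user;
    private final String timestamp;

    SysColAppender(String user, LocalDateTime now) {
        this.user = user;
        // Format once so every row of the load gets the same timestamp.
        this.timestamp = TS.format(now);
    }

    /** Returns the line followed by ",timestamp,user" and a newline. */
    String addSysCol(String line) {
        // BufferedReader.readLine() drops the line terminator, so add one back.
        return line + "," + timestamp + "," + user + "\n";
    }
}

class SysColAppenderDemo {
    public static void main(String[] args) {
        SysColAppender appender =
                new SysColAppender("batch", LocalDateTime.of(2024, 1, 2, 3, 4, 5));
        // prints: 1,apple,2024-01-02 03:04:05,batch
        System.out.print(appender.addSysCol("1,apple"));
    }
}
```

The target table would then need the two extra columns at the end, in the same order.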
In application development the I/O is often fixed, so there is a lot you can do by writing a wrapper class behind the interface and absorbing the processing there.
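The wrapper idea can be sketched in a more general form. LineTransformingReader, its skipHeader flag and the UnaryOperator parameter are hypothetical names for this example. It also swaps the Queue<Character> buffer for a String plus an index, which is a different technique from the class in this article. It may return fewer characters than requested, which the Reader contract allows.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.UnaryOperator;

/** Hypothetical generic wrapper: applies a function to each line as it is read. */
final class LineTransformingReader extends Reader {
    private final BufferedReader source;
    private final UnaryOperator<String> transform;
    private String pending = ""; // transformed line not yet fully handed out
    private int pos = 0;         // next unread index in pending

    LineTransformingReader(BufferedReader source, boolean skipHeader,
                           UnaryOperator<String> transform) throws IOException {
        this.source = source;
        this.transform = transform;
        if (skipHeader) {
            source.readLine(); // discard the first line
        }
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        // Refill when the current line is used up; -1 once the source is exhausted.
        while (pos >= pending.length()) {
            String line = source.readLine();
            if (line == null) {
                return -1;
            }
            pending = transform.apply(line) + "\n"; // restore the terminator
            pos = 0;
        }
        int n = Math.min(len, pending.length() - pos);
        pending.getChars(pos, pos + n, cbuf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        source.close();
    }
}

class LineTransformingReaderDemo {
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4]; // deliberately small to exercise partial reads
        for (int n; (n = r.read(buf, 0, buf.length)) != -1; ) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,apple\n2,banana\n";
        try (Reader r = new LineTransformingReader(
                new BufferedReader(new StringReader(csv)), true, line -> line + ",batch")) {
            System.out.print(readAll(r));
        }
    }
}
```

Because the transformation is passed in as a function, the same wrapper can serve for adding columns, filtering values or marking bad rows without touching the code that calls copyIn.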
By the way, the real implementation had various other features. I stripped them out and wrote down only the main points, so I apologize if anything is missing.