Insert Eijiro's dictionary data into an Oracle table

At the beginning

The procedure when inserting Eijiro's dictionary data into an Oracle table is described. This time, I will describe the access from Java of Oracle DB from the extraction of dictionary data. The matter of inputting data from dictionary data to Oracle DB will be described in the next article.

Convert Eijiro's dictionary data into text

Extraction of dictionary data

I created a text file by referring to How to make SVL problem collection from Eijiro. By the way, I set it like ↓ image.png image.png

I set it as above --Word level is not displayed when outputting in PDIC format --When outputting in CSV format, if there is a line break in the middle of the line, the program processing becomes complicated. --If you output in user-defined format, you can output in HTML format (XML format), which makes text file processing much easier.

By the way, I output in HTML file format and use Jsoup to parse the text file

Character code conversion of extracted data

When you open the extracted text file

image.png

The characters are garbled like. At this rate, Jsoup can't handle it well (no, it's true, but it's troublesome), so I made a program to convert the character code in Java.

FileReadTest2.java


package file;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class FileReadTest2 {

	private static String FILE_PATH ="C:/Users/***/Desktop/puisan_101.txt";
	private static String FILE_OUT_PATH ="C:/Users/***/Desktop/puisan_102.txt";
	
	public static void main(String args[]) throws Exception{
		BufferedReader reader=new BufferedReader(new InputStreamReader(new FileInputStream(FILE_PATH),"UTF-8"));
		BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(FILE_OUT_PATH),"UTF-8"));
		reader.lines().forEach(s->fileWriter(s,writer));
		reader.close();
		writer.close();
	}
	
	public static void fileWriter(String s,BufferedWriter writer){
		try{
			writer.write(s);
			writer.newLine();
		}catch(IOException e){
			e.printStackTrace();
			throw new RuntimeException("There was an error in file I / O");
		}
	}
}

When you run the above program image.png HTML file is created without garbled characters like

DB table definition

Create a table called WORD_DICTIONARY in orace12c and define the following attributes

name Mold Description
WORD VARCHAR2(200) word
MEANING CLOB Japanese translation of words
SHORT_VER VARCHAR2(2000) Part of MEANING(2000 bytes)To store
WORD_LEVEL VARCHAR2(20) Word level
WORD_INDEX UMBER(10) A number uniquely assigned to each word
SEARCH_WORD VARCHAR2(200) Lowercase word for search

Program that operates DB

We standardize the processing around Connection, implement low-level insert statements, and extend the batch method of QueryRunner, which is not easy to use.

OperateDB.java


package word;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;
import org.apache.commons.dbutils.QueryRunner;

public class OperateDB {
    public static ThreadLocal<Connection> THREAD_LOCAL = new ThreadLocal<Connection>();

	//From oracle12c, it is necessary to connect from the service name instead of the SID, so it is necessary to write the entire tns definition
	private static final String URL ="jdbc:oracle:thin:"		
			+ "@(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)"
			+ "(HOST = localhost)(PORT = 1521))"
			+ "(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME "
			+ "= triple_pb.co.jp)))";
	private static final String UID = "admin"; 
	private static final String PASS_WORD = "admin"; 
    
    
    public static void createConnection(){
    	try{
    		if(THREAD_LOCAL.get()==null){
    			Connection con = DriverManager.getConnection(URL, UID, PASS_WORD);
    			con.setAutoCommit(false);
    			THREAD_LOCAL.set(con);
    		}
    	}catch(Exception e){
    		e.printStackTrace();
    	}
    }
    
    public static void closeConnection(){
    	try{
    		if(THREAD_LOCAL.get()!=null){
    			THREAD_LOCAL.get().close();
    		}
    	}catch(Exception e){
    		e.printStackTrace();
    	}
    }
    
    public static void insert(List<Object[]> list,String sql){
    	try{
            new MyQueryRunner()			
            .batch(THREAD_LOCAL.get(), //Get a connection
            		sql, //Query to execute
            		list  //Parameters to pass to the query
      	     );    			
            THREAD_LOCAL.get().commit();
    	}catch(Exception e){
    		e.printStackTrace();
    		try{
           		THREAD_LOCAL.get().rollback();
    		}catch(SQLException ex){
    			ex.printStackTrace();
    		}
    	}
    }
    
     public static class MyQueryRunner extends QueryRunner{
    	 public int[] batch(Connection con,String sql,List<Object[]> list)throws SQLException{
    	    Object[][]params = new Object[list.size()][];
    	    for(int i=0; i<list.size();i++){
    	    	params[i] = list.get(i);
    	    }
    	    return super.batch(con,sql,params);
    	  }
      }
}



Recommended Posts

Insert Eijiro's dictionary data into an Oracle table
Insert Eijiro's dictionary data into Oracle DB part2
Insert data into DB using Yaml file