I wanted to summarize what I have learned about Spring Batch, so I am writing this article. I set up the Spring Batch environment based on the following article.
-Try to create a simple batch service with Spring Boot + Spring Batch
When I ran the application in the environment I had built by following various sites, it failed with an error saying that tables such as "BATCH_JOB_EXECUTION" do not exist. Spring Batch needs a set of dedicated metadata tables in order to run. See below for the required tables.
-JobRepository Metadata Schema
However, creating these tables by hand is tedious, so Spring Batch ships SQL scripts for various database platforms. Searching for "spring batch schema-〇〇 (platform name).sql" will turn them up. I was using PostgreSQL, so I used "schema-postgresql.sql". I referred to the following.
-[schema-postgresql.sql](https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/resources/org/springframework/batch/core/schema-postgresql.sql)
-[schema-drop-postgresql.sql](https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/resources/org/springframework/batch/core/schema-drop-postgresql.sql)
While checking the behavior of the application you will start it many times, and recreating the data by hand each time is a chore. I therefore initialize the tables on every start. Spring Boot provides a mechanism for initializing the database when the application starts. I referred to the following sites.
-SpringBoot DB initialization method
-Spring Boot + PostgreSQL setting method
Spring Boot runs the script automatically if you place "schema-〇〇.sql" under "src/main/resources", the directory created along with the Spring Boot project. I was using PostgreSQL, so I configured it as follows:
application.properties
spring.datasource.driver-class-name=org.postgresql.Driver
spring.datasource.url=jdbc:postgresql://localhost:5432/testdb
#spring.datasource.username=postgres
#spring.datasource.password=postgres
spring.datasource.initialization-mode=always
Spring Boot defines the DataSource bean automatically, so the developer only needs to write the DB settings in application.properties.
Combining the people table used by the application with the schema confirmed in "[I get angry if there is no table for Spring Batch](https://qiita.com/kyabetsuda/items/f011533621cff7f53c63# I get angry if there is no table for springbatch)", I prepared the following SQL.
schema-all.sql
DROP TABLE IF EXISTS people;
CREATE TABLE people (
person_id SERIAL NOT NULL PRIMARY KEY,
first_name VARCHAR(20),
last_name VARCHAR(20)
);
-- Autogenerated: do not edit this file
DROP TABLE IF EXISTS BATCH_STEP_EXECUTION_CONTEXT;
DROP TABLE IF EXISTS BATCH_JOB_EXECUTION_CONTEXT;
DROP TABLE IF EXISTS BATCH_STEP_EXECUTION;
DROP TABLE IF EXISTS BATCH_JOB_EXECUTION_PARAMS;
DROP TABLE IF EXISTS BATCH_JOB_EXECUTION;
DROP TABLE IF EXISTS BATCH_JOB_INSTANCE;
DROP SEQUENCE IF EXISTS BATCH_STEP_EXECUTION_SEQ ;
DROP SEQUENCE IF EXISTS BATCH_JOB_EXECUTION_SEQ ;
DROP SEQUENCE IF EXISTS BATCH_JOB_SEQ ;
CREATE TABLE BATCH_JOB_INSTANCE (
JOB_INSTANCE_ID BIGINT NOT NULL PRIMARY KEY ,
VERSION BIGINT ,
JOB_NAME VARCHAR(100) NOT NULL,
JOB_KEY VARCHAR(32) NOT NULL,
constraint JOB_INST_UN unique (JOB_NAME, JOB_KEY)
) ;
CREATE TABLE BATCH_JOB_EXECUTION (
JOB_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY ,
VERSION BIGINT ,
JOB_INSTANCE_ID BIGINT NOT NULL,
CREATE_TIME TIMESTAMP NOT NULL,
START_TIME TIMESTAMP DEFAULT NULL ,
END_TIME TIMESTAMP DEFAULT NULL ,
STATUS VARCHAR(10) ,
EXIT_CODE VARCHAR(2500) ,
EXIT_MESSAGE VARCHAR(2500) ,
LAST_UPDATED TIMESTAMP,
JOB_CONFIGURATION_LOCATION VARCHAR(2500) NULL,
constraint JOB_INST_EXEC_FK foreign key (JOB_INSTANCE_ID)
references BATCH_JOB_INSTANCE(JOB_INSTANCE_ID)
) ;
CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (
JOB_EXECUTION_ID BIGINT NOT NULL ,
TYPE_CD VARCHAR(6) NOT NULL ,
KEY_NAME VARCHAR(100) NOT NULL ,
STRING_VAL VARCHAR(250) ,
DATE_VAL TIMESTAMP DEFAULT NULL ,
LONG_VAL BIGINT ,
DOUBLE_VAL DOUBLE PRECISION ,
IDENTIFYING CHAR(1) NOT NULL ,
constraint JOB_EXEC_PARAMS_FK foreign key (JOB_EXECUTION_ID)
references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
) ;
CREATE TABLE BATCH_STEP_EXECUTION (
STEP_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY ,
VERSION BIGINT NOT NULL,
STEP_NAME VARCHAR(100) NOT NULL,
JOB_EXECUTION_ID BIGINT NOT NULL,
START_TIME TIMESTAMP NOT NULL ,
END_TIME TIMESTAMP DEFAULT NULL ,
STATUS VARCHAR(10) ,
COMMIT_COUNT BIGINT ,
READ_COUNT BIGINT ,
FILTER_COUNT BIGINT ,
WRITE_COUNT BIGINT ,
READ_SKIP_COUNT BIGINT ,
WRITE_SKIP_COUNT BIGINT ,
PROCESS_SKIP_COUNT BIGINT ,
ROLLBACK_COUNT BIGINT ,
EXIT_CODE VARCHAR(2500) ,
EXIT_MESSAGE VARCHAR(2500) ,
LAST_UPDATED TIMESTAMP,
constraint JOB_EXEC_STEP_FK foreign key (JOB_EXECUTION_ID)
references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
) ;
CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (
STEP_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY,
SHORT_CONTEXT VARCHAR(2500) NOT NULL,
SERIALIZED_CONTEXT TEXT ,
constraint STEP_EXEC_CTX_FK foreign key (STEP_EXECUTION_ID)
references BATCH_STEP_EXECUTION(STEP_EXECUTION_ID)
) ;
CREATE TABLE BATCH_JOB_EXECUTION_CONTEXT (
JOB_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY,
SHORT_CONTEXT VARCHAR(2500) NOT NULL,
SERIALIZED_CONTEXT TEXT ,
constraint JOB_EXEC_CTX_FK foreign key (JOB_EXECUTION_ID)
references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
) ;
CREATE SEQUENCE BATCH_STEP_EXECUTION_SEQ MAXVALUE 9223372036854775807 NO CYCLE;
CREATE SEQUENCE BATCH_JOB_EXECUTION_SEQ MAXVALUE 9223372036854775807 NO CYCLE;
CREATE SEQUENCE BATCH_JOB_SEQ MAXVALUE 9223372036854775807 NO CYCLE;
With this in place, the tables required for execution are re-created every time the application starts.
In Spring Batch, processing is basically built from Reader, Processor, and Writer classes, but when I defined these myself the job fell into an infinite loop. It took me a while to figure out, and the following site was helpful.
-Spring Batch chunk managed series processing
Spring Batch keeps calling the ItemReader until read() returns null. An ItemReader therefore has to be implemented so that it returns null once all the data has been handed out. The following site implements it that way.
-Implementation of Spring Batch original ItemReader
The following is implemented with reference to the above site.
PersonItemReader.java
import java.util.List;

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;

public class PersonItemReader implements ItemReader<Person> {

    private List<Person> people = null;
    private int nextIndex;
    private final PersonService service;

    public PersonItemReader(PersonService service) {
        this.service = service;
    }

    @Override
    public Person read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        // Load all rows once, on the first call.
        if (people == null) {
            people = service.selectAll();
            nextIndex = 0;
        }
        // Hand out one Person per call; null tells Spring Batch the input is exhausted.
        Person person = null;
        if (nextIndex < people.size()) {
            person = people.get(nextIndex);
            nextIndex++;
        }
        return person;
    }
}
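The null contract can be tried out without Spring. The following is a minimal sketch in plain Java: SimpleReader is a hypothetical stand-in for Spring Batch's ItemReader, and drain() is a simplified imitation of the loop a step performs, not the framework's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Spring Batch's ItemReader<T>, so the sketch runs without Spring.
interface SimpleReader<T> {
    T read(); // returns null when the input is exhausted
}

// Hands out one element per call and null once the list is used up.
class ListBackedReader<T> implements SimpleReader<T> {
    private final List<T> items;
    private int nextIndex = 0;

    ListBackedReader(List<T> items) {
        this.items = items;
    }

    @Override
    public T read() {
        if (nextIndex < items.size()) {
            return items.get(nextIndex++);
        }
        return null; // end-of-data signal
    }
}

class ReaderLoopDemo {
    // Simplified imitation of a step: keep reading until null comes back.
    static <T> List<T> drain(SimpleReader<T> reader) {
        List<T> seen = new ArrayList<>();
        T item;
        while ((item = reader.read()) != null) {
            seen.add(item);
        }
        return seen;
    }

    public static void main(String[] args) {
        SimpleReader<String> reader = new ListBackedReader<>(Arrays.asList("Taro", "Hanako"));
        System.out.println(drain(reader)); // prints [Taro, Hanako]
    }
}
```

A reader whose read() never returns null would keep this loop running forever, which is the infinite loop described above. Spring Batch also ships ListItemReader (in org.springframework.batch.item.support), which wraps a List in much the same way.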
The ItemReader defined above is then registered as a bean.
BatchConfiguration.java
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    ...(omitted)...

    @Autowired
    PersonService personService;

    @Bean
    public PersonItemReader reader() {
        return new PersonItemReader(personService);
    }

    ...(omitted)...
}
Spring Batch jobs are usually built from a Reader, a Processor, and a Writer, and there is a conventional way of implementing each of them, something like a design pattern. Below is how I first implemented them. First is the Reader class.
PersonItemReader.java
import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;

public class PersonItemReader implements ItemReader<List<Person>> {

    private final PersonService service;
    private final PersonCheckService checkService;

    public PersonItemReader(PersonService service, PersonCheckService checkService) {
        this.service = service;
        this.checkService = checkService;
    }

    @Override
    public List<Person> read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        List<Person> people = service.selectAll();
        // ret stays null when no person passes the check; returning null ends the step.
        List<Person> ret = null;
        for (Person person : people) {
            if (checkService.check(person)) {
                if (ret == null) {
                    ret = new ArrayList<Person>();
                }
                ret.add(person);
            }
        }
        return ret;
    }
}
Everything fetched from the DB is passed to the Processor as a single List. Next is the Processor class.
PersonItemProcessor.java
import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.item.ItemProcessor;

public class PersonItemProcessor implements ItemProcessor<List<Person>, List<Person>> {

    @Override
    public List<Person> process(final List<Person> people) throws Exception {
        // Build a new list in which every name is upper-cased.
        List<Person> transformedPeople = new ArrayList<Person>();
        for (Person person : people) {
            final String firstName = person.getFirstName().toUpperCase();
            final String lastName = person.getLastName().toUpperCase();
            final Person transformedPerson = new Person(firstName, lastName);
            transformedPeople.add(transformedPerson);
        }
        return transformedPeople;
    }
}
It processes the List passed from the Reader and returns a new List. Next is the Writer class.
PersonItemWriter.java
import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class PersonItemWriter implements ItemWriter<Object> {

    private final PersonService service;

    public PersonItemWriter(PersonService service) {
        this.service = service;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void write(List<? extends Object> items) throws Exception {
        // Each element of items is itself a List<Person> returned by the Processor;
        // only the first one is used here.
        List<Person> people = (List<Person>) items.get(0);
        for (Person person : people) {
            service.updatePerson(person);
        }
    }
}
The Writer class saves the List passed from the Processor to the DB. The line to pay attention to in the Writer class is the following:
List<Person> people = (List<Person>) items.get(0);
The call to items.get(0) is needed to obtain the List. The reason is explained on the following site.
-Spring Batch - Using an ItemWriter with List of Lists
The following is written on the above site.
Typically, the design pattern is:
Reader -> reads something, returns ReadItem
Processor -> ingests ReadItem, returns ProcessedItem
Writer -> ingests List<ProcessedItem>
If your processor is returning List<Object>, then you need your Writer to expect List<List<Object>>.
In other words, the usual design is that the Reader and the Processor each return a single item, and the Writer handles a List of items. Because this is batch processing, I had assumed it was natural to pass the multiple rows fetched from the DB to the Processor as one List. However, the Writer receives the objects returned by the Processor collected into a List. So if the Processor returns a single object, the items.get(0) call shown above is no longer necessary. The Reader implementation that returns a single object, introduced in "[Job loops infinitely](https://qiita.com/kyabetsuda/items/f011533621cff7f53c63#Job loops infinitely)", also appears to be the common approach.
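The division of labor between the three roles can be sketched in plain Java. This is a simplified imitation under stated assumptions, not Spring Batch's implementation: runStep is a hypothetical helper, and Supplier, Function, and Consumer stand in for ItemReader, ItemProcessor, and ItemWriter. It reads up to chunkSize single items, processes them one at a time, and hands the whole chunk to the writer as one List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class ChunkPatternDemo {
    // Returns the number of times the writer was called.
    static <I, O> int runStep(Supplier<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int chunkSize) {
        int writes = 0;
        while (true) {
            // Read single items until the chunk is full or the reader returns null.
            List<I> inputs = new ArrayList<>();
            I item;
            while (inputs.size() < chunkSize && (item = reader.get()) != null) {
                inputs.add(item);
            }
            if (inputs.isEmpty()) {
                return writes; // nothing left to read
            }
            // Process one item in, one item out.
            List<O> outputs = new ArrayList<>();
            for (I input : inputs) {
                outputs.add(processor.apply(input));
            }
            // The writer receives the processed items of the chunk as one List.
            writer.accept(outputs);
            writes++;
        }
    }

    public static void main(String[] args) {
        Iterator<String> names = Arrays.asList("taro", "hanako", "jiro").iterator();
        List<List<String>> written = new ArrayList<>();
        int writes = ChunkPatternDemo.<String, String>runStep(
                () -> names.hasNext() ? names.next() : null,
                name -> name.toUpperCase(Locale.ROOT),
                written::add,
                2);
        System.out.println(writes + " " + written); // prints 2 [[TARO, HANAKO], [JIRO]]
    }
}
```

With this shape the writer already gets a flat List of processed items, so there is no nested List to unwrap with get(0).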
Spring Batch provides a mechanism for testing jobs and steps: JobLauncherTestUtils. I referred to the following site.
First, define a bean to use JobLauncherTestUtils.
BatchConfigurationForTest.java
import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

@Configuration
@EnableBatchProcessing
public class BatchConfigurationForTest {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    JobRepository jobRepository;

    @Bean
    public DataSource dataSource() {
        DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("org.postgresql.Driver");
        dataSource.setUrl("jdbc:postgresql://localhost:5432/ec");
        // dataSource.setUsername(username);
        // dataSource.setPassword(password);
        return dataSource;
    }

    @Bean
    PersonService personService() {
        return new PersonService();
    }

    @Bean
    public PersonItemReader reader() {
        return new PersonItemReader(personService());
    }

    @Bean
    public PersonItemProcessor processor() {
        return new PersonItemProcessor();
    }

    @Bean
    public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Person>()
                .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
                .sql("INSERT INTO people (first_name, last_name) VALUES (:firstName, :lastName)")
                .dataSource(dataSource)
                .build();
    }

    @Bean
    public NoWorkFoundStepExecutionListener noWorkFoundStepExecutionListener() {
        return new NoWorkFoundStepExecutionListener();
    }

    @Bean
    public Job importUserJob(Step step1) {
        return jobBuilderFactory.get("importUserJob")
                .incrementer(new RunIdIncrementer())
                .flow(step1)
                .end()
                .build();
    }

    @Bean
    public Step step1(NoWorkFoundStepExecutionListener listener, JdbcBatchItemWriter<Person> writer) {
        return stepBuilderFactory.get("step1")
                .<Person, Person> chunk(1)
                .reader(reader())
                .processor(processor())
                .writer(writer)
                .listener(listener)
                .build();
    }

    // Test helper: fixes which job launchJob() and launchStep() will run.
    @Bean
    public JobLauncherTestUtils jobLauncherTestUtils() {
        JobLauncherTestUtils utils = new JobLauncherTestUtils();
        utils.setJob(importUserJob(step1(noWorkFoundStepExecutionListener(), writer(dataSource()))));
        utils.setJobLauncher(jobLauncher);
        utils.setJobRepository(jobRepository);
        return utils;
    }
}
JobLauncherTestUtils is defined as a bean at the bottom. The class above is a separate Configuration class written for testing; its content is almost the same as the regular Configuration class. In Spring Batch you can attach a "listener" to a step to run some processing after the step finishes, and since the listener can be tested as well, it is also defined as a bean here (the listener class is described on the reference site mentioned above). Next are the test classes. First is the class that tests the job.
JobTest.java
import static org.hamcrest.CoreMatchers.*;

import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(classes=BatchConfigurationForTest.class)
public class JobTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Test
    public void testJob() throws Exception {
        JobExecution jobExecution = jobLauncherTestUtils.launchJob();
        // assertThat takes the actual value first, then the matcher for the expected value.
        Assert.assertThat(jobExecution.getExitStatus().getExitCode(), is("COMPLETED"));
    }
}
In the test class, the JobLauncherTestUtils bean is injected with @Autowired. launchJob() runs the job; which job to run was specified when the bean was defined.
Next is the class that tests the steps.
StepTest.java
import static org.hamcrest.CoreMatchers.*;
import static org.junit.Assert.*;

import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.batch.test.MetaDataInstanceFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(classes=BatchConfigurationForTest.class)
public class StepTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Autowired
    NoWorkFoundStepExecutionListener tested;

    @Test
    public void testStep() {
        JobExecution jobExecution = jobLauncherTestUtils.launchStep("step1");
        Assert.assertThat(jobExecution.getExitStatus().getExitCode(), is("COMPLETED"));
    }

    @Test
    public void testAfterStep() {
        // Simulate a step that completed without reading anything.
        StepExecution stepExecution = MetaDataInstanceFactory.createStepExecution();
        stepExecution.setExitStatus(ExitStatus.COMPLETED);
        stepExecution.setReadCount(0);
        ExitStatus exitStatus = tested.afterStep(stepExecution);
        assertThat(exitStatus.getExitCode(), is(ExitStatus.FAILED.getExitCode()));
    }
}
To test a step, pass the name of the step to run as the argument of launchStep(). testAfterStep() tests the listener defined as a bean. setReadCount() sets the number of items read by the Reader class. The NoWorkFoundStepExecutionListener described on the reference site is implemented to return ExitStatus.FAILED when getReadCount() == 0.
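The listener's source is not reproduced in this article, so the following is only a sketch of the behavior described above, written in plain Java. SketchExitStatus and SketchStepExecution are hypothetical stand-ins for Spring Batch's ExitStatus and StepExecution; a real listener would work with the framework's own types instead.

```java
// Hypothetical stand-ins for Spring Batch's ExitStatus and StepExecution,
// so that the sketch compiles without Spring Batch on the classpath.
enum SketchExitStatus { COMPLETED, FAILED }

class SketchStepExecution {
    private final int readCount;

    SketchStepExecution(int readCount) {
        this.readCount = readCount;
    }

    int getReadCount() {
        return readCount;
    }
}

class NoWorkFoundListenerSketch {
    // FAILED when the Reader produced nothing; null means "no opinion",
    // which in Spring Batch leaves the step's own exit status unchanged.
    SketchExitStatus afterStep(SketchStepExecution stepExecution) {
        if (stepExecution.getReadCount() == 0) {
            return SketchExitStatus.FAILED;
        }
        return null;
    }

    public static void main(String[] args) {
        NoWorkFoundListenerSketch listener = new NoWorkFoundListenerSketch();
        System.out.println(listener.afterStep(new SketchStepExecution(0))); // prints FAILED
        System.out.println(listener.afterStep(new SketchStepExecution(5))); // prints null
    }
}
```

This mirrors what testAfterStep() checks: a read count of 0 turns an otherwise completed step into a failed one.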
That concludes this summary of Spring Batch.