Things to consider when running a specified job using Spring Batch

Overview

I built the project and ran Spring Batch from IntelliJ IDEA by following the tutorial on the official page. In addition to everything a normal Spring Boot application offers, it records job execution results, supports retries, and so on, and I felt it was a high-quality batch framework. However, when I ran the application's main class from IntelliJ, every defined job was executed, and I did not know how to run only a specified job, so I investigated that in particular.

Application requirements

As a requirement for batch applications, I wanted to achieve at least the following.

Realization method

The existing application is written as transaction scripts and has become complicated, which is why we are considering introducing Spring Batch. Because the way the existing application is launched resembles the interface of CommandLineJobRunner (described later), I first considered implementing it that way, but that approach could not meet the requirements. Starting the jobs from JobLauncherCommandLineRunner, however, did meet them. The files are laid out in the following directory structure.

├── build.gradle
└── src
    └── main
        ├── java
        │   └── hello
        │       ├── Application.java
        │       ├── BatchConfiguration.java
        │       ├── JobCompletionNotificationListener.java
        │       ├── Person.java
        │       └── PersonItemProcessor.java
        └── resources
            ├── application-development.properties
            ├── application-production.properties
            ├── application.properties
            ├── log4jdbc.log4j2.properties
            ├── sample-data.csv
            └── schema-all.sql

build.gradle

The following dependencies have been added to connect to MySQL (log4jdbc is included for SQL logging).

    runtime("mysql:mysql-connector-java")
    compile "org.lazyluke:log4jdbc-remix:0.2.7"
    compile("org.bgee.log4jdbc-log4j2:log4jdbc-log4j2-jdbc4.1:1.16")

application.properties

When you use Spring Boot, the application-${profile}.properties file associated with the active profile is loaded in addition to the base application.properties, and the settings are stored in the [Environment](https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/core/env/Environment.html) object. The settings needed by the DataSource, [spring.datasource.*](https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-sql.html#boot-features-connect-to-production-database), can be placed there.

application-production.properties


spring.datasource.username=production_user
spring.datasource.password=production_password
spring.datasource.url=jdbc:log4jdbc:mysql://kasakaid.production.jp:3306/kasakaidDB?useSSL=true
spring.datasource.driver-class-name=net.sf.log4jdbc.DriverSpy

application-development.properties


spring.datasource.username=root
spring.datasource.password=mysql
spring.datasource.url=jdbc:log4jdbc:mysql://127.0.0.1:3306/kasakaidDB?useSSL=false
spring.datasource.driver-class-name=net.sf.log4jdbc.DriverSpy

As shown above, if you prepare an application properties file for each profile, you can switch the database to connect to with an environment variable or a command-line argument.
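The override behaviour described above can be pictured with a small plain-Java model. This is only a sketch of the idea that profile-specific values win over the base file; it is not how Spring Boot implements it, and the class name ProfileOverlaySketch and the base user name "sa" are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of profile overlay; not Spring Boot's actual implementation. */
final class ProfileOverlaySketch {

    /** Returns the base settings with the profile-specific settings layered on top. */
    static Map<String, String> resolve(Map<String, String> base,
                                       Map<String, String> profileSpecific) {
        Map<String, String> merged = new LinkedHashMap<>(base);
        merged.putAll(profileSpecific); // profile-specific values win on key conflicts
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for application.properties (the "sa" default is hypothetical)
        Map<String, String> base = new LinkedHashMap<>();
        base.put("spring.batch.initialize-schema", "always");
        base.put("spring.datasource.username", "sa");

        // Stand-in for application-development.properties
        Map<String, String> development = new LinkedHashMap<>();
        development.put("spring.datasource.username", "root");
        development.put("spring.datasource.password", "mysql");

        System.out.println(resolve(base, development));
    }
}
```

Keys that appear only in the base map survive unchanged, which is why shared settings can stay in application.properties.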

Spring Batch database schema

2019/09/07 Addendum

Spring Batch writes the results of each job run, for example whether a job succeeded or failed, to a database schema of its own. Multiple DDL scripts for creating the schema are provided to support databases such as MySQL and PostgreSQL. These appear to be generated at build time by the Spring Batch project's build.gradle (spring-projects/spring-batch/blob/547533fab072289c062916e51c589de36ea3dfe2/spring-batch-core/build.gradle), using spring-batch-core/src/main/sql/schema.sql.vpp from the same commit as a template. How the schema gets created is described on the official page:

If you use Spring Batch, it comes pre-packaged with SQL initialization scripts for most popular database platforms. Spring Boot can detect your database type and execute those scripts on startup. If you use an embedded database, this happens by default. You can also enable it for any database type, as shown in the following example:

spring.batch.initialize-schema=always

As you can see, you can make the schema creation run every time Spring Batch starts. However, in an environment where the application runs repeatedly, including production, it is better not to create the schema on every startup. The batch process itself succeeds, but because the schema already exists, the schema creation step reports an error every time. The current version does not seem to have a setting that means "do nothing if the schema already exists". Since this error is just noise and cannot be avoided otherwise, create the Spring Batch schema in the target environment in advance. The DDL can be found under org.springframework.batch.core in the spring-batch-core jar (org.springframework.batch:spring-batch-core), for example by browsing it in IntelliJ. Pick the DDL for the database used in the target environment and run it there.


Once the schema exists, there is no need to create it when Spring Batch starts, so set none in the production environment to disable schema creation.

application-production.properties


spring.batch.initialize-schema=none

On the other hand, tests usually run against a volatile in-memory database such as H2, so make sure the schema is created during testing. (If you only test from a Service class and do not use the Spring Batch machinery, you do not need to create the schema.)

application-test.properties


spring.batch.initialize-schema=always

log4jdbc.log4j2.properties

This is the configuration file required to use log4jdbc. It has nothing to do with Spring Batch itself, but it is included here because it is needed to meet the requirements.

log4jdbc.log4j2.properties


log4jdbc.spylogdelegator.name=net.sf.log4jdbc.log.slf4j.Slf4jSpyLogDelegator

Specifying system properties

After making the above configuration, specify two system properties when running the Spring Batch application.

No  System property         Description
1   spring.profiles.active  Profile to activate
2   spring.batch.job.names  The name of the job to run

The first, spring.profiles.active, is the profile to enable. The second, [spring.batch.job.names](https://docs.spring.io/spring-boot/docs/current/reference/html/howto-batch-applications.html), specifies the name of the job to run.
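To make the effect of spring.batch.job.names concrete, here is a minimal plain-Java sketch of the selection logic: with no value every defined job runs, and with a comma-separated value only the named jobs run. It is an illustration under assumptions, not Spring Boot's code. The class name JobNameFilterSketch and the second job name exportUserJob are hypothetical, and the sketch only does exact name matching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative model of job selection by name; not Spring Boot's implementation. */
final class JobNameFilterSketch {

    /**
     * @param jobNamesProperty value of spring.batch.job.names, or null if unset
     * @param definedJobs      names of all jobs defined in the application
     * @return the names of the jobs that would be launched
     */
    static List<String> jobsToRun(String jobNamesProperty, List<String> definedJobs) {
        if (jobNamesProperty == null || jobNamesProperty.trim().isEmpty()) {
            return new ArrayList<>(definedJobs); // no filter: every defined job runs
        }
        List<String> requested = new ArrayList<>();
        for (String name : jobNamesProperty.split(",")) {
            requested.add(name.trim());
        }
        List<String> selected = new ArrayList<>();
        for (String job : definedJobs) {
            if (requested.contains(job)) {
                selected.add(job); // keep only jobs that were asked for
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // exportUserJob is a hypothetical second job, added only for illustration
        List<String> defined = Arrays.asList("importUserJob", "exportUserJob");
        System.out.println(jobsToRun(System.getProperty("spring.batch.job.names"), defined));
    }
}
```

Running the main method with -Dspring.batch.job.names=importUserJob selects only that job, which mirrors the behaviour I wanted from the real application.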

Execution and results

Run the batch with the system properties. They can be passed as run-time arguments or set as environment variables. The example below passes the two system properties as run-time arguments.

java -Dspring.profiles.active=development -Dspring.batch.job.names=importUserJob -jar gs-batch-processing-0.1.0.jar

You can access MySQL by specifying the development profile! Of course, Spring Boot ASCII art is also displayed.


  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.0.5.RELEASE)

2019-01-20 23:16:58.028  INFO 86837 --- [           main] hello.Application                        : Starting Application on sakaidaikki-no-MacBook-Air.local with PID 86837 (/Users/kasakaid/dev/java/gs-batch-processing/complete/build/libs/gs-batch-processing-0.1.0.jar started by kasakaid in /Users/kasakaid/dev/java/gs-batch-processing/complete/build/libs)
2019-01-20 23:16:58.034  INFO 86837 --- [           main] hello.Application                        : The following profiles are active: development
2019-01-20 23:17:03.570  INFO 86837 --- [           main] org.hibernate.dialect.Dialect            : HHH000400: Using dialect: org.hibernate.dialect.MySQL5Dialect
2019-01-20 23:24:25.447  INFO 87366 --- [           main] o.s.b.c.r.s.JobRepositoryFactoryBean     : No database type set, using meta data indicating: MYSQL
2019-01-20 23:24:25.906  INFO 87366 --- [           main] o.s.b.c.l.support.SimpleJobLauncher      : No TaskExecutor has been set, defaulting to synchronous executor.
2019-01-20 23:24:25.934  INFO 87366 --- [           main] jdbc.audit                               : 1. Connection.getMetaData() returned com.mysql.jdbc.JDBC4DatabaseMetaData@14cd1699
2019-01-20 23:24:25.935  INFO 87366 --- [           main] jdbc.audit                               : 1. Connection.clearWarnings() returned 
2019-01-20 23:24:25.942  INFO 87366 --- [           main] o.s.jdbc.datasource.init.ScriptUtils     : Executing SQL script from class path resource [org/springframework/batch/core/schema-mysql.sql]

Implementation with CommandLineJobRunner

According to the official page, if you use the CommandLineJobRunner class, you can specify the job with the following arguments.

No  Argument  Description
1   jobPath   The location of the XML file that will be used to create an ApplicationContext. This file should contain everything needed to run the complete Job
2   jobName   The name of the job to be run.

bash$ java CommandLineJobRunner io.spring.EndOfDayJobConfiguration endOfDay schedule.date(date)=2007/05/05
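The third argument in the command above follows a name(type)=value shape. As a rough illustration of how such an argument breaks down, here is a plain-Java sketch that splits it into its three parts. The class JobParameterArgSketch is hypothetical, and treating an untyped argument as a string is an assumption of this sketch rather than something shown in the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Splits a "name(type)=value" style argument; an illustrative sketch only. */
final class JobParameterArgSketch {

    // name, optional "(type)", then "=value"
    private static final Pattern ARG =
            Pattern.compile("^([^=()]+)(?:\\(([a-z]+)\\))?=(.*)$");

    /** Returns {name, type, value}; the type falls back to "string" (assumed). */
    static String[] parse(String arg) {
        Matcher m = ARG.matcher(arg);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a name(type)=value argument: " + arg);
        }
        String type = m.group(2) != null ? m.group(2) : "string";
        return new String[] { m.group(1), type, m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("schedule.date(date)=2007/05/05");
        System.out.println(String.join(" | ", parts));
    }
}
```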

If the batch is implemented this way, the class whose main method runs first must be CommandLineJobRunner. So, modify the bootJar task section of build.gradle to change Start-Class from the default hello.Application to CommandLineJobRunner.

bootJar {
    baseName = 'gs-batch-processing'
    version =  '0.1.0'
    manifest {
        attributes 'Start-Class': 'org.springframework.batch.core.launch.support.CommandLineJobRunner'
    }
}

If you execute the bootJar task in this state, a jar is generated under build/libs. If you unzip the jar and check META-INF/MANIFEST.MF, as described in the jar file specification, you can see that the changes in build.gradle are reflected. The Main-Class (https://docs.oracle.com/javase/tutorial/deployment/jar/appman.html) still specifies org.springframework.boot.loader.JarLauncher, while Start-Class, a Spring Boot-specific attribute that defaults to hello.Application, has changed to CommandLineJobRunner.

MANIFEST.MF


Manifest-Version: 1.0
Start-Class: org.springframework.batch.core.launch.support.CommandLineJobRunner
Main-Class: org.springframework.boot.loader.JarLauncher
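Instead of unzipping the jar by hand, the same attributes can be read with the JDK's java.util.jar.Manifest. The sketch below parses manifest text held in a string so that it runs on its own; the class name ManifestStartClassSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

/** Reads main attributes such as Start-Class from manifest text. */
final class ManifestStartClassSketch {

    /** Returns the value of a main attribute, or null if it is absent. */
    static String mainAttribute(String manifestText, String name) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same content as the MANIFEST.MF shown above
        String text = "Manifest-Version: 1.0\n"
                + "Start-Class: org.springframework.batch.core.launch.support.CommandLineJobRunner\n"
                + "Main-Class: org.springframework.boot.loader.JarLauncher\n";
        System.out.println("Start-Class = " + mainAttribute(text, "Start-Class"));
        System.out.println("Main-Class  = " + mainAttribute(text, "Main-Class"));
    }
}
```

For a real jar, java.util.jar.JarFile#getManifest() returns the same Manifest type, so the lookup itself stays identical.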

Execute with arguments

You have successfully set the class with the main function in the jar file. Now, when you run this jar, the CommandLineJobRunner main function will be launched first. Execute by specifying the generated jar in the argument of -jar.

java -Dspring.profiles.active=development -jar gs-batch-processing-0.1.0.jar hello.BatchConfiguration importUserJob

The profile is still passed as a system property, but instead of spring.batch.job.names the job is now specified with positional arguments. The first argument, jobPath, is the configuration class hello.BatchConfiguration, and the second is the job name importUserJob, which is defined as a @Bean. I expected it to finish normally, but when I ran the application I got the following error.

21:12:52.267 [main] ERROR org.springframework.batch.core.launch.support.CommandLineJobRunner - Job Terminated in error: Error creating bean with name 'writer' defined in hello.BatchConfiguration: Unsatisfied dependency expressed through method 'writer' parameter 0; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No qualifying bean of type 'javax.sql.DataSource' available: expected at least 1 bean which qualifies as autowire candidate. Dependency annotations: {}
org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'writer' defined in hello.BatchConfiguration: Unsatisfied dependency expressed through method 'writer' parameter 0; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No qualifying bean of type 'javax.sql.DataSource' available: expected at least 1 bean which qualifies as autowire candidate. Dependency annotations: {}
	at org.springframework.beans.factory.support.ConstructorResolver.createArgumentArray(ConstructorResolver.java:732)

The writer definition in the sample takes a single DataSource as its argument. "parameter 0" in the message refers to that first parameter: it could not be resolved because no DataSource bean exists in the context.

BatchConfiguration.java


    @Bean
    public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Person>()
            .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
            .sql("INSERT INTO people (first_name, last_name) VALUES (:firstName, :lastName)")
            .dataSource(dataSource)
            .build();
    }

Specify @SpringBootApplication

I wondered what had happened, but the BatchConfiguration class carries only two annotations, @Configuration and @EnableBatchProcessing.

Neither of them sets anything up for Spring Boot, so I changed @Configuration to [@SpringBootApplication](https://docs.spring.io/spring-boot/docs/current/reference/html/using-boot-using-springbootapplication-annotation.html). This works because @SpringBootApplication also includes @Configuration.

BatchConfiguration.java


@SpringBootApplication
@EnableBatchProcessing
public class BatchConfiguration {

This will complete normally.

Profile is not reflected

However, when I looked at the database after running the batch, there was no data at all, not even in the people table. The execution log shows that the application was accessing HSQL, an in-memory database. The Spring Boot ASCII art did not appear either.

21:30:30.720 [HikariPool-1 connection adder] DEBUG com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Added connection org.hsqldb.jdbc.JDBCConnection@2f8bc07b
21:30:30.724 [main] DEBUG org.springframework.beans.factory.support.DefaultListableBeanFactory - Creating shared instance of singleton bean 'hikariPoolDataSourceMetadataProvider'
21:30:30.724 [main] DEBUG org.springframework.beans.factory.support.DefaultListableBeanFactory - Creating instance of bean 'hikariPoolDataSourceMetadataProvider'
21:30:32.546 [main] INFO org.springframework.batch.core.repository.support.JobRepositoryFactoryBean - No database type set, using meta data indicating: HSQL
21:30:33.032 [main] INFO org.springframework.jdbc.datasource.init.ScriptUtils - Executing SQL script from class path resource [org/springframework/batch/core/schema-hsqldb.sql]
21:30:33.044 [main] INFO org.springframework.jdbc.datasource.init.ScriptUtils - Executed SQL script from class path resource [org/springframework/batch/core/schema-hsqldb.sql] in 12 ms.
21:30:33.753 [HikariPool-1 connection closer] DEBUG com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Closing connection org.hsqldb.jdbc.JDBCConnection@17776a8: (connection evicted)

Wondering why, I compared the CommandLineJobRunner class with the SpringApplication class. What I found is that CommandLineJobRunner simply creates a Spring context with AnnotationConfigApplicationContext. The beans are registered in the container, but nothing else that Spring Boot normally does can be expected.

CommandLineJobRunner.java


	int start(String jobPath, String jobIdentifier, String[] parameters, Set<String> opts) {

		ConfigurableApplicationContext context = null;

		try {
			try {
				context = new AnnotationConfigApplicationContext(Class.forName(jobPath));
			} catch (ClassNotFoundException cnfe) {
				context = new ClassPathXmlApplicationContext(jobPath);
			}
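The try/catch in the excerpt above decides how jobPath is interpreted: as a configuration class if a class with that name can be loaded, otherwise as an XML resource. The fallback can be tried in isolation with the plain-Java sketch below. The class name JobPathResolutionSketch, the returned labels, and the file name launch-context.xml are hypothetical, and no Spring context is created here.

```java
/** Mirrors the class-or-XML decision from the excerpt, without creating a context. */
final class JobPathResolutionSketch {

    /** Returns "annotation-config" if jobPath names a loadable class, else "xml". */
    static String contextKindFor(String jobPath) {
        try {
            Class.forName(jobPath); // succeeds only for a class on the classpath
            return "annotation-config";
        } catch (ClassNotFoundException e) {
            return "xml"; // anything else is treated as an XML resource path
        }
    }

    public static void main(String[] args) {
        System.out.println(contextKindFor("java.lang.String"));
        System.out.println(contextKindFor("launch-context.xml"));
    }
}
```

Either branch only builds a plain application context, which is why no Spring Boot startup processing takes place.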

SpringApplication, on the other hand, initializes the Environment-related features. The Spring Boot ASCII art also appears to be printed by the printBanner method, which takes the initialized Environment instance.

SpringApplication.java


	public ConfigurableApplicationContext run(String... args) {
		StopWatch stopWatch = new StopWatch();
		stopWatch.start();
		ConfigurableApplicationContext context = null;
		Collection<SpringBootExceptionReporter> exceptionReporters = new ArrayList<>();
		configureHeadlessProperty();
		SpringApplicationRunListeners listeners = getRunListeners(args);
		listeners.starting();
		try {
			ApplicationArguments applicationArguments = new DefaultApplicationArguments(
					args);
			ConfigurableEnvironment environment = prepareEnvironment(listeners,
					applicationArguments); // Environment properties are set here. Values from application.properties are bound to DataSourceProperties and then applied to the DataSource.
			configureIgnoreBeanInfo(environment); 
			Banner printedBanner = printBanner(environment);

In SpringApplication, the job is finally executed from the JobLauncherCommandLineRunner class after all this initialization has completed. Because CommandLineJobRunner performs none of the SpringApplication initialization, the profile-related features cannot be used. This investigation focused on profiles because of the requirements, but other features may be unavailable as well.

Consider @PropertySource

It turns out that even application.properties, which should be loaded by default, is not read, let alone the profile-specific files. So I considered specifying the property file explicitly with @PropertySource.

BatchConfiguration.java


@SpringBootApplication
@EnableBatchProcessing
@PropertySource("application.properties")
public class BatchConfiguration {
}

Then append the environment name to the end of each key in application.properties.

application.properties


spring.batch.initialize-schema=ALWAYS
spring.datasource.username.development=root
spring.datasource.password.development=mysql
spring.datasource.url.development=jdbc:log4jdbc:mysql://127.0.0.1:3306/kasakaidDB?useSSL=false
spring.datasource.driver-class-name.development=net.sf.log4jdbc.DriverSpy
spring.datasource.username.production=production_user
spring.datasource.password.production=production_password
spring.datasource.url.production=jdbc:log4jdbc:mysql://production.kasakaid.io:3306/kasakaidDB?useSSL=false
spring.datasource.driver-class-name.production=net.sf.log4jdbc.DriverSpy

The active profile should be readable with System.getProperty("spring.profiles.active"), since it is passed as a -D system property. That makes it possible to tell the environments apart, but I think this approach is very cumbersome.

The keys become complicated

A large number of similar keys are lined up and difficult to distinguish

You have to determine the environment yourself

With Spring Boot's own mechanism, Spring determines the current environment from the profile and automatically picks up the appropriate application.properties. With this approach, however, the implementer must always be aware of which environment the code is running in, and much of the benefit of using the framework is lost. The implementation looks like this.

SpringUtils.java


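import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.env.Environment;
import org.springframework.stereotype.Component;

@Component
public class SpringUtils {
    @Autowired
    Environment env;

    // Appends the active profile to the key, e.g. "spring.datasource.url" + ".development".
    public String property(String key) {
        // The profile is passed with -Dspring.profiles.active, so read it as a system property.
        return env.getProperty(key + "." + System.getProperty("spring.profiles.active"));
    }
}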
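The suffix lookup can also be tried without Spring. The sketch below swaps Spring's Environment for a plain java.util.Properties object, so it only illustrates the key construction and the use of System.getProperty for a value passed with -D; the class name SuffixedPropertyLookupSketch is hypothetical.

```java
import java.util.Properties;

/** Plain-Java illustration of the profile-suffixed key lookup. */
final class SuffixedPropertyLookupSketch {

    /** Looks up key + "." + active profile, reading the profile from a -D system property. */
    static String property(Properties props, String key) {
        String profile = System.getProperty("spring.profiles.active");
        return props.getProperty(key + "." + profile);
    }

    public static void main(String[] args) {
        // Stand-in for passing -Dspring.profiles.active=development on the command line
        System.setProperty("spring.profiles.active", "development");

        Properties props = new Properties();
        props.setProperty("spring.datasource.username.development", "root");
        props.setProperty("spring.datasource.username.production", "production_user");

        System.out.println(property(props, "spring.datasource.username"));
    }
}
```

Note that System.getenv reads operating system environment variables, not -D arguments, so it would not see a profile passed this way.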

You have to do what Spring does automatically

What stands out here is the DataSource configuration. In Spring Boot, the spring.datasource.* values in application.properties are applied to the DataSource automatically. With this approach, you have to set the DataSource values explicitly.

Configuration.java


    @Autowired
    SpringUtils springUtils;
    @Bean
    public DataSource dataSource() {
        HikariDataSource ds = new HikariDataSource();
        ds.setDriverClassName(springUtils.property("spring.datasource.driver-class-name"));
        ds.setUsername(springUtils.property("spring.datasource.username"));
        ds.setPassword(springUtils.property("spring.datasource.password"));
        ds.setJdbcUrl(springUtils.property("spring.datasource.url"));
        return ds;
    }

CommandLineJobRunner Conclusion

The above lists the disadvantages, some of them anticipated rather than observed. There are three items, and what they have in common is that the implementer has to think about things that should not need thought. As development progresses, more inconveniences will appear, and there is no guarantee that these extra concerns will not collide and cause a fatal problem. For this reason, I concluded that it is better not to run Spring Batch with CommandLineJobRunner.
