Issue
I have a Spring Batch application like the one below (the table name and query have been edited to generic names). The source table in Oracle contains 50 million records, which I want to copy to a NoSQL database. When I execute this program, it reads only 7,500 records, i.e. three times the chunk size, and never reads the remaining records.
@EnableBatchProcessing
@SpringBootApplication
@EnableAutoConfiguration
public class MultiThreadPagingApp extends DefaultBatchConfigurer {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;

    @Bean
    public DataSource dataSource() {
        final DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("oracle.jdbc.OracleDriver");
        dataSource.setUrl("jdbc:oracle:thin:@***********");
        dataSource.setUsername("user");
        dataSource.setPassword("password");
        return dataSource;
    }

    @Override
    public void setDataSource(DataSource dataSource) {}

    @Bean
    @StepScope
    ItemReader<UserModel> dbReader() throws Exception {
        JdbcPagingItemReader<UserModel> reader = new JdbcPagingItemReader<>();
        final SqlPagingQueryProviderFactoryBean sqlPagingQueryProviderFactoryBean =
                new SqlPagingQueryProviderFactoryBean();
        sqlPagingQueryProviderFactoryBean.setDataSource(dataSource);
        sqlPagingQueryProviderFactoryBean.setSelectClause("select * ");
        sqlPagingQueryProviderFactoryBean.setFromClause("from user");
        sqlPagingQueryProviderFactoryBean.setWhereClause("where id>0");
        sqlPagingQueryProviderFactoryBean.setSortKey("name");
        reader.setQueryProvider(sqlPagingQueryProviderFactoryBean.getObject());
        reader.setDataSource(dataSource);
        reader.setPageSize(2500);
        reader.setRowMapper(new BeanPropertyRowMapper<>(UserModel.class));
        reader.afterPropertiesSet();
        reader.setSaveState(true);
        System.out.println("Reading users anonymized in chunks of " + 2500);
        return reader;
    }

    @Bean
    public Dbwriter writer() {
        return new Dbwriter(); // I have another class for this
    }

    @Bean
    public Step step1() throws Exception {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.afterPropertiesSet();
        return this.stepBuilderFactory.get("step1")
                .<UserModel, UserModel>chunk(2500)
                .reader(dbReader())
                .writer(writer())
                .taskExecutor(taskExecutor)
                .build();
    }

    @Bean
    public Job multithreadedJob() throws Exception {
        return this.jobBuilderFactory.get("multithreadedJob")
                .start(step1())
                .build();
    }

    @Bean
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public JobRepository getJobRepo() throws Exception {
        return new MapJobRepositoryFactoryBean(getTransactionManager()).getObject();
    }

    public static void main(String[] args) {
        SpringApplication.run(MultiThreadPagingApp.class, args);
    }
}
Can you help me efficiently read all the records using Spring Batch, or suggest another approach to handle this? I tried the approach mentioned here: http://techdive.in/java/jdbc-handling-huge-resultset. It took 120 minutes to read and save all the records with a single-threaded application. Since Spring Batch seems like a good fit for this, I assume this scenario can be handled in much less time.
Solution
You are setting the saveState flag to true on a JdbcPagingItemReader (by the way, it should be set before calling afterPropertiesSet) and using this reader in a multi-threaded step. However, the documentation says to set this flag to false in a multi-threaded context: the reader's position is saved in the step's execution context, and that shared state is not reliable when several threads update it concurrently, which is the likely reason the reader stops after only a few chunks.
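For reference, here is a minimal sketch of the corrected reader bean; everything elided is unchanged from the question (query provider, data source, page size, row mapper):

    @Bean
    @StepScope
    ItemReader<UserModel> dbReader() throws Exception {
        JdbcPagingItemReader<UserModel> reader = new JdbcPagingItemReader<>();
        // ... same query provider, data source, page size and row mapper as in the question ...
        reader.setSaveState(false);  // documented requirement for multi-threaded steps
        reader.afterPropertiesSet(); // called last, once all properties are set
        return reader;
    }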
Multi-threading with database readers is usually not the best option; I would recommend using partitioning in your case, as sketched below.
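To give an idea of what that could look like (a sketch under assumptions, not code from the original answer): partition the id range so that each worker thread reads a disjoint slice of the table. The ColumnRangePartitioner below is a hypothetical helper modeled on the one in the Spring Batch samples; the id bounds (1 to 50,000,000) and the grid size of 4 are assumptions.

    import java.util.HashMap;
    import java.util.Map;
    import org.springframework.batch.core.partition.support.Partitioner;
    import org.springframework.batch.item.ExecutionContext;

    // Hypothetical partitioner: splits the id range [min, max] into gridSize slices.
    public class ColumnRangePartitioner implements Partitioner {

        private final long min;
        private final long max;

        public ColumnRangePartitioner(long min, long max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public Map<String, ExecutionContext> partition(int gridSize) {
            Map<String, ExecutionContext> partitions = new HashMap<>();
            long targetSize = (max - min) / gridSize + 1;
            long start = min;
            for (int i = 0; i < gridSize; i++) {
                ExecutionContext context = new ExecutionContext();
                context.putLong("minId", start);
                context.putLong("maxId", Math.min(start + targetSize - 1, max));
                partitions.put("partition" + i, context);
                start += targetSize;
            }
            return partitions;
        }
    }

In the configuration class from the question, a manager step would then fan out to a worker step, with each partition's reader bound to its id range through step-scoped late binding:

    @Bean
    public Step managerStep() throws Exception {
        return stepBuilderFactory.get("managerStep")
                .partitioner("workerStep", new ColumnRangePartitioner(1, 50_000_000))
                .step(workerStep())
                .gridSize(4) // one partition per worker thread; tune to your hardware
                .taskExecutor(new SimpleAsyncTaskExecutor())
                .build();
    }

    @Bean
    public Step workerStep() throws Exception {
        return stepBuilderFactory.get("workerStep")
                .<UserModel, UserModel>chunk(2500)
                .reader(partitionedReader(null, null))
                .writer(writer())
                .build();
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<UserModel> partitionedReader(
            @Value("#{stepExecutionContext['minId']}") Long minId,
            @Value("#{stepExecutionContext['maxId']}") Long maxId) throws Exception {
        SqlPagingQueryProviderFactoryBean provider = new SqlPagingQueryProviderFactoryBean();
        provider.setDataSource(dataSource);
        provider.setSelectClause("select *");
        provider.setFromClause("from user");
        provider.setWhereClause("where id >= :minId and id <= :maxId");
        provider.setSortKey("id"); // a unique sort key keeps paging deterministic

        JdbcPagingItemReader<UserModel> reader = new JdbcPagingItemReader<>();
        reader.setQueryProvider(provider.getObject());
        reader.setDataSource(dataSource);
        reader.setPageSize(2500);
        reader.setRowMapper(new BeanPropertyRowMapper<>(UserModel.class));
        Map<String, Object> params = new HashMap<>();
        params.put("minId", minId);
        params.put("maxId", maxId);
        reader.setParameterValues(params);
        reader.afterPropertiesSet();
        return reader;
    }

With this setup, each partition gets its own step execution and its own step-scoped reader instance, so the threads never contend over shared reader state, and progress is tracked per partition.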
Answered By - Mahmoud Ben Hassine
Answer Checked By - Clifford M. (JavaFixing Volunteer)