Spring Batch Job Restart: What should be the correct values for chunk and readerPageSize with respect to grid size.  #4589

@PSHREYASHOLLA

This is posted here as I did not get a response on https://stackoverflow.com/questions/78199004/restarting-a-failed-job-is-not-processing-the-failed-chunk-data-again-continua

This is my step definition for the Spring Batch partitioning approach:

// manager step
@Bean
public Step step1() {
    return new StepBuilder("step1", jobRepository)
            .partitioner(slaveStep().getName(), partitioner())
            .step(slaveStep())
            .gridSize(febpEmployeeTaxCalculationBatchProperties.getTaxCalculationStep1PartitionGridSize())
            .taskExecutor(actStmntTaskExecutor())
            .build();
}

// slave step
@Bean
public Step slaveStep() 
{
    
    try {
        return new StepBuilder("slaveStep", jobRepository)
                .<EmployeeDetail, EmployeeTaxDetail>chunk(febpEmployeeTaxCalculationBatchProperties.getTaxCalculationStep1ReaderPageSize(),transactionManager)
                .reader(pagingItemReader(null,null))
                .processor(processor())
                .writer(customerItemWriter())
                .taskExecutor(actStmntTaskExecutor())
                .build();
    } catch (Exception ex) {
        // Pass the original exception as the cause so its stack trace is not lost
        throw new RuntimeException("Error creating slave step: " + ex.getMessage(), ex);
    }
}

/**
 * Act stmnt task executor.
 *
 * @return the simple async task executor
 */
public SimpleAsyncTaskExecutor actStmntTaskExecutor() {
    SimpleAsyncTaskExecutor acctStmtTaskExecuter = new SimpleAsyncTaskExecutor();
    acctStmtTaskExecuter.setConcurrencyLimit(febpEmployeeTaxCalculationBatchProperties.getTaxCalculationStep1TaskExecuterThreadConcurrencyLimit());
    acctStmtTaskExecuter.setThreadPriority(febpEmployeeTaxCalculationBatchProperties.getTaxCalculationStep1TaskExecuterThreadPriority());
    acctStmtTaskExecuter.setThreadNamePrefix("FEBP_TAX_CALCULATION_GEN");
    return acctStmtTaskExecuter;
}

So here for restart to work,

1) The recommendation for restartability is not to use a TaskExecutor. Should we therefore remove the TaskExecutor shown above from step1() and slaveStep()? If so, how do we achieve concurrency?

2) Or should only SimpleAsyncTaskExecutor be used? If yes, should it be set on step1() or on slaveStep()?

3) How should the chunk size, gridSize, and the reader's pageSize and fetchSize be calculated? From what I see, when the chunk size is set to a low value, a restart does not rewrite the failed data.
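
To make question 3 concrete, here is a minimal plain-Java sketch of the restart arithmetic as I understand it. It is not Spring Batch code: the class and method names are hypothetical, and it assumes a single-threaded slave step whose paging reader saves its item count at every chunk commit and, on restart, resumes from that count by computing a page number and an offset within the page.

```java
public class RestartOffsetSketch {

    /**
     * Number of items in chunks that were fully committed before the chunk
     * containing the failing item (zero-based index) rolled back.
     */
    static int committedBeforeFailure(int failingItemIndex, int chunkSize) {
        return (failingItemIndex / chunkSize) * chunkSize;
    }

    /**
     * Page number and in-page offset a paging reader would resume from,
     * given the saved item count (assumed behaviour, see the text above).
     */
    static int[] resumePosition(int committedItemCount, int pageSize) {
        return new int[] { committedItemCount / pageSize, committedItemCount % pageSize };
    }

    public static void main(String[] args) {
        int chunkSize = 10;
        int failingItemIndex = 25; // the item at index 25 fails, so chunk 20..29 rolls back

        int committed = committedBeforeFailure(failingItemIndex, chunkSize);

        // Page size equal to chunk size: restart begins exactly on a page boundary.
        int[] aligned = resumePosition(committed, 10);
        System.out.println("committed=" + committed
                + " page=" + aligned[0] + " offset=" + aligned[1]);

        // Page size different from chunk size: restart begins in the middle of a page.
        int[] unaligned = resumePosition(committed, 8);
        System.out.println("committed=" + committed
                + " page=" + unaligned[0] + " offset=" + unaligned[1]);
    }
}
```

Under these assumptions the failed chunk (items 20 to 29) is read again on restart in both cases, and a page size equal to the chunk size only keeps the resume point on a page boundary. The sketch stops applying once slaveStep() itself runs with a TaskExecutor: several threads then share one reader, the saved count no longer describes a contiguous run of processed items, and the JdbcPagingItemReader Javadoc advises saveState=false for multi-threaded use, which means no restart. That may be what I am observing.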
