In my understanding "chunk oriented processing" in Spring Batch helps me to efficiently process multiple items in a single transaction. This includes efficient use of interfaces from external systems. As external communication includes overhead, it should be limited and chunk-oriented too. That's why we have the commit-level for the ItemWriter
.
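For context, here is roughly how I imagine the chunk-oriented step in Java config; the step name, item types, and the commit-interval of 50 are just placeholders:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class WebserviceStepConfig {

    @Bean
    public Step webserviceStep(StepBuilderFactory steps,
                               ItemReader<String> reader,
                               ItemProcessor<String, String> processor,
                               ItemWriter<String> writer) {
        // commit-interval of 50: write() receives a list of up to 50 items per transaction
        return steps.get("webserviceStep")
                .<String, String>chunk(50)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}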
So what I don't get is: why does the ItemReader still have to read item by item? Why can't I read in chunks as well?
In my step, the reader has to call a webservice, and the writer sends this information to another webservice. That's why I want to make as few calls as necessary.
The interface of the ItemWriter is chunk-oriented, as you surely know:
public abstract void write(List<? extends T> paramList) throws Exception;
But the ItemReader is not:
public abstract T read() throws Exception;
As a workaround I implemented a ChunkBufferingItemReader, which reads a list of items, stores them, and returns them one by one whenever its read() method is called.
But when it comes to exception handling and restarting the job, this approach gets messy. I get the feeling that I'm doing work here that the framework should do for me.
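To illustrate what I mean by messy: to make the buffering reader restartable, I would have to add bookkeeping roughly like this (the key name and the readCount field are my own invention, not framework API):

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.ItemStreamSupport;

public class ChunkBufferingReaderState extends ItemStreamSupport {

    private static final String READ_COUNT_KEY = "chunkBufferingReader.read.count";
    private long readCount;

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        if (executionContext.containsKey(READ_COUNT_KEY)) {
            // on restart: find out how far we got, then re-read and discard
            // that many items to reposition the buffer
            this.readCount = executionContext.getLong(READ_COUNT_KEY);
        }
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // called before each commit: persist progress for a possible restart
        executionContext.putLong(READ_COUNT_KEY, this.readCount);
    }
}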
So am I missing something? Is there any existing functionality in Spring Batch I just overlooked?
In another post it was suggested to change the return type of the ItemReader to a List. But then my ItemProcessor would have to emit multiple outputs from a single input. Is this the right approach?
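If I understand that suggestion correctly, it would look roughly like this; ListItemProcessor and the delegate wiring are my own sketch, not an existing framework class:

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.item.ItemProcessor;

public class ListItemProcessor<I, O> implements ItemProcessor<List<I>, List<O>> {

    // the element-level processor I would otherwise plug into the step directly
    private final ItemProcessor<I, O> delegate;

    public ListItemProcessor(ItemProcessor<I, O> delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<O> process(List<I> chunk) throws Exception {
        // one "item" is now an entire chunk; apply the element-level processor
        // to each member (null results, which normally mean "filter this item",
        // would need extra handling here)
        List<O> outputs = new ArrayList<O>(chunk.size());
        for (I input : chunk) {
            outputs.add(delegate.process(input));
        }
        return outputs;
    }
}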
I'm grateful for any best practices. Thanks in advance :-)
This is a draft of my implementation of the read() interface method:
public T read() throws Exception {
    // refill the buffer until it holds at least one item,
    // or the source signals the end of input
    while (this.items.isEmpty()) {
        final List<T> newItems = readChunk();
        if (newItems == null) {
            return null; // no more data
        }
        this.items.addAll(newItems);
    }
    return this.items.pop();
}
Please note that items is a buffer for items that have already been read in chunks but not yet requested by the framework.
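For completeness, the surrounding class looks roughly like this; fetchNextPageFromWebservice() stands in for my actual webservice client:

import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

import org.springframework.batch.item.ItemReader;

public class ChunkBufferingItemReader<T> implements ItemReader<T> {

    // items fetched from the webservice but not yet handed to the framework
    private final Deque<T> items = new LinkedList<T>();

    @Override
    public T read() throws Exception {
        while (this.items.isEmpty()) {
            final List<T> newItems = readChunk();
            if (newItems == null) {
                return null; // input exhausted
            }
            this.items.addAll(newItems);
        }
        return this.items.pop();
    }

    private List<T> readChunk() {
        // one bulk call to the remote webservice,
        // returning null once there is no more data
        return fetchNextPageFromWebservice();
    }

    private List<T> fetchNextPageFromWebservice() {
        return null; // actual webservice client call omitted
    }
}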