Since Hibernate 5.2, we can use the stream() method instead of scroll() when we want to fetch a large amount of data.
However, when using scroll() with ScrollableResults, we can hook into the retrieval process and free up memory by evicting each object from the persistence context after processing it and/or clearing the entire session every now and then.
My questions:
With the stream() method, what happens behind the scenes?
If hibernate.jdbc.fetch_size is set to some number (e.g. 1000) in the JPA properties, how does this combine with scrollable results?

The following works for me:
DataSourceConfig.java
@Bean
public LocalSessionFactoryBean sessionFactory() {
    // Link your data source to your session factory
    ...
}

@Bean("hibernateTxManager")
public HibernateTransactionManager hibernateTxManager(@Qualifier("sessionFactory") SessionFactory sessionFactory) {
    // Link your session factory to your transaction manager
    ...
}
MyServiceImpl.java
@Service
@Transactional(propagation = Propagation.REQUIRES_NEW, transactionManager = "hibernateTxManager", readOnly = true)
public class MyServiceImpl implements MyService {

    @Autowired
    private MyRepo myRepo;
    ...
    Stream<MyEntity> stream = myRepo.getStream();
    // Do your streaming and CLOSE the stream afterwards
    ...
MyRepoImpl.java
@Repository
@Transactional(propagation = Propagation.MANDATORY, transactionManager = "hibernateTxManager", readOnly = true)
public class MyRepoImpl implements MyRepo {

    @Autowired
    private SessionFactory sessionFactory;

    @Autowired
    private MyDataSource myDataSource;

    public Stream<MyEntity> getStream() {
        return sessionFactory.openStatelessSession(DataSourceUtils.getConnection(myDataSource))
                .createNativeQuery("my_query", MyEntity.class)
                .setReadOnly(true)
                .setFetchSize(1000)
                .stream();
    }
    ...
Just remember that when you stream, memory is only a concern at the point of object materialisation; that is the only part of the operation susceptible to memory problems. In my case I chunk the stream into batches of 1000 objects, serialise them with Gson and send them to a JMS broker immediately. The garbage collector does the rest.
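The chunking step could look roughly like the sketch below. It is not the exact code I run: StreamChunker and forEachChunk are hypothetical names, and the handler is a stand-in for whatever you do with each batch (in my case the Gson serialisation and the JMS send). It uses only the JDK, so it works with any Stream, including the one returned by getStream().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hibernate or Spring.
class StreamChunker {

    /**
     * Pulls elements from the stream one at a time, hands them to the handler
     * in batches of at most chunkSize, and closes the stream when finished.
     */
    static <T> void forEachChunk(Stream<T> source, int chunkSize, Consumer<List<T>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // try-with-resources closes the stream even if the handler throws
        try (Stream<T> stream = source) {
            Iterator<T> it = stream.iterator();
            List<T> chunk = new ArrayList<>(chunkSize);
            while (it.hasNext()) {
                chunk.add(it.next());
                if (chunk.size() == chunkSize) {
                    handler.accept(chunk);
                    // Start a fresh list so the processed batch becomes garbage
                    chunk = new ArrayList<>(chunkSize);
                }
            }
            // Flush the last, possibly smaller, batch
            if (!chunk.isEmpty()) {
                handler.accept(chunk);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in data: 2500 integers instead of entities from the database
        List<Integer> sizes = new ArrayList<>();
        forEachChunk(IntStream.range(0, 2500).boxed(), 1000, c -> sizes.add(c.size()));
        System.out.println(sizes); // prints [1000, 1000, 500]
    }
}
```

In the service you would call something like StreamChunker.forEachChunk(myRepo.getStream(), 1000, batch -> ...) inside the transactional method, so that only one batch of materialised objects is referenced at any time.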
It's worth noting that Spring's transaction boundary handling closes the connection to the DB at the end without needing to be told explicitly.