How to refresh JPA entities when backend database changes asynchronously?

andand picture andand · Nov 6, 2012 · Viewed 92.3k times · Source

I have a PostgreSQL 8.4 database with some tables and views which are essentially joins on some of the tables. I used NetBeans 7.2 (as described here) to create REST based services derived from those views and tables and deployed those to a Glassfish 3.1.2.2 server.

There is another process which asynchronously updates contents in some of tables used to build the views. I can directly query the views and tables and see these changes have occured correctly. However, when pulled from the REST based services, the values are not the same as those in the database. I am assuming this is because JPA has cached local copies of the database contents on the Glassfish server and JPA needs to refresh the associated entities.

I have tried adding a couple of methods to the AbstractFacade class NetBeans generates:

public abstract class AbstractFacade<T> {
    private Class<T> entityClass;
    private String entityName;
    private static boolean _refresh = true;

    public static void refresh() { _refresh = true; }

    public AbstractFacade(Class<T> entityClass) {
        this.entityClass = entityClass;
        this.entityName = entityClass.getSimpleName();
    }

    private void doRefresh() {
        if (_refresh) {
            EntityManager em = getEntityManager();
            em.flush();

            for (EntityType<?> entity : em.getMetamodel().getEntities()) {
                if (entity.getName().contains(entityName)) {
                    try {
                        em.refresh(entity);
                        // log success
                    }
                    catch (IllegalArgumentException e) {
                        // log failure ... typically complains entity is not managed
                    }
                }
            }

            _refresh = false;
        }
    }

...

}

I then call doRefresh() from each of the find methods NetBeans generates. What normally happens is the IllegalArgumentsException is thrown stating somethng like Can not refresh not managed object: EntityTypeImpl@28524907:MyView [ javaType: class org.my.rest.MyView descriptor: RelationalDescriptor(org.my.rest.MyView --> [DatabaseTable(my_view)]), mappings: 12].

So I'm looking for some suggestions on how to correctly refresh the entities associated with the views so it is up to date.

UPDATE: Turns out my understanding of the underlying problem was not correct. It is somewhat related to another question I posted earlier, namely the view had no single field which could be used as a unique identifier. NetBeans required I select an ID field, so I just chose one part of what should have been a multi-part key. This exhibited the behavior that all records with a particular ID field were identical, even though the database had records with the same ID field but the rest of it was different. JPA didn't go any further than looking at what I told it was the unique identifier and simply pulled the first record it found.

I resolved this by adding a unique identifier field (never was able to get the multipart key to work properly).

Answer

Craig Ringer picture Craig Ringer · Nov 7, 2012

I recommend adding an @Startup @Singleton class that establishes a JDBC connection to the PostgreSQL database and uses LISTEN and NOTIFY to handle cache invalidation.

Update: Here's another interesting approach, using pgq and a collection of workers for invalidation.

Invalidation signalling

Add a trigger on the table that's being updated that sends a NOTIFY whenever an entity is updated. On PostgreSQL 9.0 and above this NOTIFY can contain a payload, usually a row ID, so you don't have to invalidate your entire cache, just the entity that has changed. On older versions where a payload isn't supported you can either add the invalidated entries to a timestamped log table that your helper class queries when it gets a NOTIFY, or just invalidate the whole cache.

Your helper class now LISTENs on the NOTIFY events the trigger sends. When it gets a NOTIFY event, it can invalidate individual cache entries (see below), or flush the entire cache. You can listen for notifications from the database with PgJDBC's listen/notify support. You will need to unwrap any connection pooler managed java.sql.Connection to get to the underlying PostgreSQL implementation so you can cast it to org.postgresql.PGConnection and call getNotifications() on it.

An an alternative to LISTEN and NOTIFY, you could poll a change log table on a timer, and have a trigger on the problem table append changed row IDs and change timestamps to the change log table. This approach will be portable except for the need for a different trigger for each DB type, but it's inefficient and less timely. It'll require frequent inefficient polling, and still have a time delay that the listen/notify approach does not. In PostgreSQL you can use an UNLOGGED table to reduce the costs of this approach a little bit.

Cache levels

EclipseLink/JPA has a couple of levels of caching.

The 1st level cache is at the EntityManager level. If an entity is attached to an EntityManager by persist(...), merge(...), find(...), etc, then the EntityManager is required to return the same instance of that entity when it is accessed again within the same session, whether or not your application still has references to it. This attached instance won't be up-to-date if your database contents have since changed.

The 2nd level cache, which is optional, is at the EntityManagerFactory level and is a more traditional cache. It isn't clear whether you have the 2nd level cache enabled. Check your EclipseLink logs and your persistence.xml. You can get access to the 2nd level cache with EntityManagerFactory.getCache(); see Cache.

@thedayofcondor showed how to flush the 2nd level cache with:

em.getEntityManagerFactory().getCache().evictAll();

but you can also evict individual objects with the evict(java.lang.Class cls, java.lang.Object primaryKey) call:

em.getEntityManagerFactory().getCache().evict(theClass, thePrimaryKey);

which you can use from your @Startup @Singleton NOTIFY listener to invalidate only those entries that have changed.

The 1st level cache isn't so easy, because it's part of your application logic. You'll want to learn about how the EntityManager, attached and detached entities, etc work. One option is to always use detached entities for the table in question, where you use a new EntityManager whenever you fetch the entity. This question:

Invalidating JPA EntityManager session

has a useful discussion of handling invalidation of the entity manager's cache. However, it's unlikely that an EntityManager cache is your problem, because a RESTful web service is usually implemented using short EntityManager sessions. This is only likely to be an issue if you're using extended persistence contexts, or if you're creating and managing your own EntityManager sessions rather than using container-managed persistence.