Conceptual problems with IndexedDB (relationships etc.)

Felix picture Felix · Jan 1, 2012 · Viewed 8.8k times · Source

I'm writing a thesis about offline abilities of web applications. My task is to show the possibilities of offline storage through a web application with a server-side relational database and Ajax/JSON traffic between client and server. My first implementation used an approach with localStorage, saving each Ajax response as value with the request URL as key. The app works just fine. In the next step however, I want to (i.e. the thesis requires) implement a more advanced version with a client-side database. Since the server maintains a relational database, Web SQL Database would have been the intuitive choice. But, as we know, the standard is deprecated and I don't want to use a technology whose future is uncertain. Thus, I want to use IndexedDB to implement client-side database logic. Unfortunately, after reading a lot of material on the web that mostly keeps very much scratching the surface (todo-notes applications etc.), I still don't know how to proceed.

My task seems rather straightforward: implement the server-side database on the client with IndexedDB to replicate all data that was once fetched from the server. The problems, which make this far less straightforward are:

  • The server-side database is relational, IndexedDB is (more or less) object-oriented
  • There is no intuitive way to synchronize client- and server-side databases
  • There is no intuitive way to implement the relationships in IndexedDB that are implemented with foreign keys and JOINs on the server

Right now, I have a concept in mind which I'm really afraid to start to implement. I thought about creating an object store for every table in the server-database and program the relations objects in different object stores manually. In my application, which, in short, manages courses of a university, I'd have 7 object stores.

I want to demonstrate my idea with an example for a JSON response from the server (/* these are comments */):

{ "course": { /* course object */
    "id":1, 
    "lecturer": { "id":"1", /* lecturer object with many attributes */ },
    "semester": { "id":"1", /* semester object with many attributes */ }, 
    /* more references and attributes */
}}

The algorithm to store the data with IndexedDB would store each object that applies to an object store in the appropriate object store and replace the objects with references to these objects. For example, the above course object would look like the following in the object store 'course':

{ "course": { /* course object */
    "id":1, 
    "lecturer": 
    { "reference": { /* reference to the lecturer in the object store 'lecturer' */
        "objectstore":"lecturer",
        "id":"1" }
    },
    "semester":
    { "reference": { /* reference to the semester in the object store 'semester' */
        "objectstore":"semester",
        "id":"1" }
    }
    /* more references and attributes */
}}

The algorithm to retrieve data with IndexedDB would then do the following (I have a recursive pattern vaguely in mind):

Retrieve the course object with id=1 from the object store 'course'
For each reference object in the retrieved course object, do
   Retrieve the object with id=reference.id from the object store reference.objectstore
   Replace the reference object with the retrieved object

It is clearly visible that this implementation would be really cumbersome, especially due to the asynchronous nature of IndexedDB. It would also result in many different transactions to the database just to retrieve a course object and performance would suffer a lot (I don't really know what the performance of IndexedDB transactions looks like anyway).

How could I do this better and simpler?

I already looked at these threads which represent similar problems: link1, link2. I don't see any simpler solutions in these. Moreover, I'd prefer to avoid using an IndexedDB wrapper framework due to several reasons.

I could also imagine that I'm totally on the wrong track with IndexedDB for my problem.

Edit:

I finally ended up pursuing my approach to store the references in the objects themselves in the IndexedDB. This can result in some performance problems in cases of large amounts of data with many references. If used smartly, however, tremendous amounts of iteration and database hits can be avoided in most cases, and there is no need to store a complex database schema in memory or in the IndexedDB itself.

Generally I must say, that I get the impression that I'm misinterpreting the dynamic and straight idea with IndexedDB as a schemaless database in some way. But whatever, I implemented everything in JavaScript, it works fine and there is no chance for any inconsistencies.

Answer

Matt Browne picture Matt Browne · Jan 15, 2012

I'm new to IndexedDB myself but I too have been thinking a lot about how I would use IndexedDB for purposes like this. The first thing I would suggest, if you haven't done it already, is to look at how other key-value / document databases (CouchDB, MongoDB, etc.) work, since that's essentially the type of database that IndexedDB is.

There are several different approaches to handling relationships in a document database...as to synchronizing with your relational server-side database, you will probably need to create some kind of custom mapping because some of the approaches to relationships that would make sense for IndexedDB will not map very cleanly to a relational database. However, I think setting up such a mapping is definitely doable, and the bigger issue is how to handle relationships in IndexedDB, so that's what I'll focus on here...

As to your proposed solution, I think it could actually work well, and you could write a simple querying library that helped consolidate the plumbing code (more on that below). Key-value stores are built to be very efficient at looking up items by key, so doing so for each related object might not be as inefficient as you think...however, I came up with another idea that makes better use of indexes...

First, for my proposed solution, you'd need to store the "objectstore" metadata somewhere other than within the "reference" object itself...it doesn't necessarily even need to be stored in IndexedDB at all; you could just use an in-memory schema for that:

var schema = {
    Course: {
        fields: [id, title],
        relationships: {
            lecturers: {objectstore: 'lecturer'},
            semester: {objectstore: 'semester'},
        }
    },
    Lecturer: { ... }
    ...
};

(By the way, your JSON example has an error...you can't have more than one key called "reference" - it would need to be a "references" array.)

This frees you up to store the ID values directly in the relationship fields, so that you can create indexes on them (I've used letter prefixes for clarity even though in reality all of these would probably have an ID of 1, since ID values don't need to be unique across stores):

var course1 = {
    id:'C1',
    lecturers:['L1'],
    semester:1
};

var lecturer1 = {
    id:'L1',
    courses:['C1']
}

var semester1 = {
    id:'S1',
    courses:['C1']
}

You would, of course, have to be careful that all storage/retrieval operations happened through data access functions (e.g. insert(), update(), delete()) that were smart enough to ensure that the relationships were always updated correctly on both ends...actually you may not need that depending on how you plan to query the data, but it seems like a good idea since you might sometimes just want to get the IDs of the related objects (to be looked up later, or not) rather than actually retrieving them.

Let's say you have an index on the "courses" field in the lecturer store. Using an index, you could look up all of the lecturers associated with a particular course ID in one fell swoop:

lecturerStore.index("courses").get("C1").onsuccess = …

For that example it doesn't matter much because courses will generally only have 1-2 lecturers, but consider how an index could be used to efficiently look up all the courses in a particular semester:

coursesStore.index("semester").get("S1").onsuccess = …

Note that in the lecturer example (a many-to-many relationship), the index would need to be specified as "multientry," meaning that if you have a field whose value is an array, each element of the array will be added to the index. (See https://developer.mozilla.org/en/IndexedDB/IDBObjectStore#createIndex ...I'm not sure what the browser support is on this.)

And I'm sure you could do other clever things with indexing too, using cursors and IDBKeyRange to help do some sort of "join" operation. For ideas, check out this link, which demonstrates ways of handling relationships in CouchDB:

http://wiki.apache.org/couchdb/EntityRelationship

That link also mentions using embedded documents, which is something you should definitely consider -- not all objects necessarily need to have their own object store, especially for "aggregation" relationships.

(By the way, I'm not sure how helpful it would be to you since it doesn't provide much in the way of querying, but someone has actually implemented a CouchDB-like database on top of IndexedDB: https://github.com/mikeal/pouchdb)

In addition to indexes, implementing a caching mechanism would probably help a lot too.

Now, as to simplifying the querying process, I know you mentioned not wanting to use a wrapper library...but I had an idea about a convenient API that could be created, that would accept an object like this:

//select all courses taught by 'Professor Wilkins'
{
from: 'lecturer',  //open cursor on lecturer store 
where: function(lecturer) { return lecturer.name=='Professor Wilkins' }, //evaluate for each item found
select: function(lecturer) { return lecturer.courses }, //what to return from previous step
//this should be inferred in this case, but just to make it clear...
eagerFetch: function(lecturer) { return lecturer.courses }
}

I'm not sure how difficult it would be to implement, but it definitely seems like it would make life easier.

I've rambled long enough, but I wanted to mention one last thing, which is that I've also been thinking about borrowing some ideas from graph databases, since they're much better at handling relationships than document databases, and I do think it would be possible to implement a graph database on top of IndexedDB, I'm just not yet sure how practical it would be.

Good luck!