Insertion of thousands of contact entries using applyBatch is slow

Anders · Apr 8, 2011 · Viewed 20.6k times

I'm developing an application where I need to insert a lot of Contact entries. At the moment there are approx. 600 contacts with a total of 6000 phone numbers. The biggest contact has 1800 phone numbers.

Status as of today is that I have created a custom Account to hold the Contacts, so the user can choose to see the contacts in the Contacts view.

But the insertion of the contacts is painfully slow. I insert the contacts using ContentResolver.applyBatch. I've tried different sizes for the ContentProviderOperation list (100, 200, 400), but the total running time is approximately the same. Inserting all the contacts and numbers takes about 30 minutes!
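For reference, the batched code looks roughly like this (a simplified sketch, one contact per batch; the account type/name and the helper method are placeholders, not my real values):

    import android.content.ContentProviderOperation;
    import android.content.ContentResolver;
    import android.content.OperationApplicationException;
    import android.os.RemoteException;
    import android.provider.ContactsContract;
    import java.util.ArrayList;

    public class BatchInsertSketch {
        // Builds one raw contact plus its data rows and applies them as a batch.
        static void insertContact(ContentResolver resolver, String name, String[] numbers)
                throws RemoteException, OperationApplicationException {
            ArrayList<ContentProviderOperation> ops = new ArrayList<ContentProviderOperation>();

            int rawContactIndex = ops.size();
            ops.add(ContentProviderOperation.newInsert(ContactsContract.RawContacts.CONTENT_URI)
                    .withValue(ContactsContract.RawContacts.ACCOUNT_TYPE, "com.example.account") // placeholder
                    .withValue(ContactsContract.RawContacts.ACCOUNT_NAME, "MyAccount")           // placeholder
                    .build());

            // Name row, back-referencing the raw contact inserted above.
            ops.add(ContentProviderOperation.newInsert(ContactsContract.Data.CONTENT_URI)
                    .withValueBackReference(ContactsContract.Data.RAW_CONTACT_ID, rawContactIndex)
                    .withValue(ContactsContract.Data.MIMETYPE,
                            ContactsContract.CommonDataKinds.StructuredName.CONTENT_ITEM_TYPE)
                    .withValue(ContactsContract.CommonDataKinds.StructuredName.DISPLAY_NAME, name)
                    .build());

            // One operation per phone number, all back-referencing the same raw contact.
            for (String number : numbers) {
                ops.add(ContentProviderOperation.newInsert(ContactsContract.Data.CONTENT_URI)
                        .withValueBackReference(ContactsContract.Data.RAW_CONTACT_ID, rawContactIndex)
                        .withValue(ContactsContract.Data.MIMETYPE,
                                ContactsContract.CommonDataKinds.Phone.CONTENT_ITEM_TYPE)
                        .withValue(ContactsContract.CommonDataKinds.Phone.NUMBER, number)
                        .build());
            }

            resolver.applyBatch(ContactsContract.AUTHORITY, ops);
        }
    }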

Most issues I've found regarding slow insertion in SQLite bring up transactions. But since I use the ContentResolver.applyBatch method I don't control this, and I would assume that the ContentResolver takes care of transaction management for me.

So, to my question: Am I doing something wrong, or is there anything I can do to speed this up?

Anders

Edit: @jcwenger: Oh, I see. Good explanation!

So then I will have to first insert into the raw_contacts table, and then into the data table with the name and numbers. What I'll lose is the back reference to the raw_id that I use with applyBatch.

So I'll have to get all the IDs of the newly inserted raw_contacts rows to use as foreign keys in the data table?
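Something along these lines, I suppose (just a sketch; it assumes I write my own key into SOURCE_ID when inserting the raw contacts, so I can map the provider-assigned _ID values back to my data; the account type is a placeholder):

    import android.content.ContentResolver;
    import android.database.Cursor;
    import android.provider.ContactsContract;
    import java.util.HashMap;
    import java.util.Map;

    public class RawContactIdLookupSketch {
        // Maps my own SOURCE_ID keys to the _ID values the provider assigned,
        // so the Data rows can be built with the right RAW_CONTACT_ID.
        static Map<String, Long> lookupRawContactIds(ContentResolver resolver) {
            Map<String, Long> idsBySourceId = new HashMap<String, Long>();
            Cursor cursor = resolver.query(
                    ContactsContract.RawContacts.CONTENT_URI,
                    new String[] { ContactsContract.RawContacts._ID,
                                   ContactsContract.RawContacts.SOURCE_ID },
                    ContactsContract.RawContacts.ACCOUNT_TYPE + " = ?",
                    new String[] { "com.example.account" },  // placeholder account type
                    null);
            if (cursor != null) {
                try {
                    while (cursor.moveToNext()) {
                        idsBySourceId.put(cursor.getString(1), cursor.getLong(0));
                    }
                } finally {
                    cursor.close();
                }
            }
            return idsBySourceId;
        }
    }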

Answer

jcwenger · Apr 8, 2011

Use ContentResolver.bulkInsert(Uri url, ContentValues[] values) instead of applyBatch().

applyBatch (1) uses transactions and (2) locks the ContentProvider once for the whole batch instead of locking/unlocking once per operation. Because of this, it is slightly faster than doing the operations one at a time (non-batched).

However, since each Operation in the batch can have a different URI and so on, there's a huge amount of overhead. "Oh, a new operation! I wonder what table it goes in... Here, I'll insert a single row... Oh, a new operation! I wonder what table it goes in..." ad infinitum. Since most of the work of turning URIs into tables involves lots of string comparisons, it's obviously very slow.

By contrast, bulkInsert applies a whole pile of values to the same table. It goes, "Bulk insert... find the table, okay, insert! insert! insert! insert! insert!" Much faster.
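On the caller side it could look something like this sketch (it assumes the raw contact IDs are already known, e.g. queried back after inserting the raw contacts first):

    import android.content.ContentResolver;
    import android.content.ContentValues;
    import android.provider.ContactsContract;

    public class BulkInsertCallerSketch {
        // Inserts many phone-number rows into the Data table with a single provider call.
        static int bulkInsertPhones(ContentResolver resolver, long rawContactId, String[] numbers) {
            ContentValues[] rows = new ContentValues[numbers.length];
            for (int i = 0; i < numbers.length; i++) {
                ContentValues row = new ContentValues();
                row.put(ContactsContract.Data.RAW_CONTACT_ID, rawContactId);
                row.put(ContactsContract.Data.MIMETYPE,
                        ContactsContract.CommonDataKinds.Phone.CONTENT_ITEM_TYPE);
                row.put(ContactsContract.CommonDataKinds.Phone.NUMBER, numbers[i]);
                rows[i] = row;
            }
            // One URI, one table, one call for the whole array.
            return resolver.bulkInsert(ContactsContract.Data.CONTENT_URI, rows);
        }
    }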

It will, of course, require your ContentProvider to implement bulkInsert efficiently. Most do, unless you wrote the provider yourself, in which case it will take a bit of coding.
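If you do end up writing it yourself, the usual trick is to override bulkInsert and wrap all the inserts in a single transaction, roughly like this sketch (class, table, and helper names are made up, and the other provider methods are stubbed out):

    import android.content.ContentProvider;
    import android.content.ContentValues;
    import android.database.Cursor;
    import android.database.sqlite.SQLiteDatabase;
    import android.database.sqlite.SQLiteOpenHelper;
    import android.net.Uri;

    public class MyContactSourceProvider extends ContentProvider {
        private SQLiteOpenHelper helper;  // created in onCreate() in a real provider (omitted here)

        @Override
        public int bulkInsert(Uri uri, ContentValues[] values) {
            SQLiteDatabase db = helper.getWritableDatabase();
            db.beginTransaction();
            try {
                for (ContentValues row : values) {
                    db.insert("my_table", null, row);  // placeholder table name
                }
                db.setTransactionSuccessful();         // commit all rows at once
            } finally {
                db.endTransaction();
            }
            return values.length;
        }

        // Remaining ContentProvider methods stubbed out for brevity in this sketch.
        @Override public boolean onCreate() { return true; }
        @Override public Cursor query(Uri uri, String[] p, String s, String[] a, String o) { return null; }
        @Override public String getType(Uri uri) { return null; }
        @Override public Uri insert(Uri uri, ContentValues values) { return null; }
        @Override public int delete(Uri uri, String s, String[] a) { return 0; }
        @Override public int update(Uri uri, ContentValues v, String s, String[] a) { return 0; }
    }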