Fastest way to insert many rows into sqlite db on iPhone

Crystal picture Crystal · Aug 2, 2012 · Viewed 8.5k times · Source

Can someone explain what the best way to insert a lot of data on the iPhone using FMDB is? I see things like using the beginTransaction command. I'm honestly not sure what this or setShouldCacheStatements do. I followed what code my coworker did so far, and this is what it looks like:

BOOL oldshouldcachestatements = _db.shouldCacheStatements;
[_db setShouldCacheStatements:YES];
[_db beginTransaction];
NSString *insertQuery = [[NSString alloc] initWithFormat:@"INSERT INTO %@ values(null, ?, ?, ?, ?, ?, ?, ?);", tableName];
[tableName release];
BOOL success;

bPtr += 2;
int *iPtr = (int *)bPtr;
int numRecords = *iPtr++;

for (NSInteger record = 0; record < numRecords; record++) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    // Some business logic to read the binary stream  

    NSNumber *freq = aFreq > 0 ? [NSNumber numberWithDouble:100 * aFreq / 32768]: [NSNumber numberWithDouble:-1.0];

    // these fields were calculated in the business logic section
    success = [_db executeUpdate:insertQuery,
               cID,                                                                                                   
               [NSNumber numberWithInt:position],                                                                       
               [NSString stringWithFormat:@"%@%@", [self stringForTypeA:typeA], [self stringForTypeB:typeB]],     // methods are switch statements that look up the decimal number and return a string
               [NSString stringWithFormat:@"r%i", rID],                                                               
               [self stringForOriginal:original],                                                                    
               [self stringForModified:modified],                                                                    
               freq];

    [pool drain];
}
[outerPool drain];

[_db commit];
[_db setShouldCacheStatements:oldshouldcachestatements];

Is this the fastest I can do? Is the writing the limitation of sqlite? I saw this: http://www.sqlite.org/faq.html#q19 and wasn't sure if this implementation was the best with fmdb, or if there was any other thing I can do. Some other coworkers mentioned something about bulk inserts and optimziing that, but I'm not honestly sure what that means since this is my first sqlite encounter. Any thoughts or directions I can go research? Thanks!

Answer

Sebastian Hojas picture Sebastian Hojas · Aug 7, 2012

First of all, in most cases you do not have to be concerned about the performance of sqlite3, if you are not using it completely wrong.

The following things boost the performance of INSERT statements:

Transactions

As you already mentioned, transactions are the most important feature. Especially if you have a large amount of queries, transaction will speed up your INSERTs by ~10 times.

PRAGMA Config

Sqlite3 provides several mechanism which avoid the corruption of your database in the worst cases. In some scenarios, this is not needed. In others, it is absolutely essential. The following sqlite3 commands may speed up your INSERT statements. A normal crash of your app will not corrupt the database, but a crash of the OS could.

PRAGMA synchronous=OFF -- may cause corruption if the OS fails
PRAGMA journal_mode=MEMORY -- Insert statements will be written to the disk at the end of the transaction
PRAGMA cache_size = 4000 -- If your SELECTs are really big, you may need to increase the cache
PRAGMA temp_store=MEMORY -- Attention: increases RAM use

Deactivate Indicies

Any SQL Index slows a INSERT statement down. Check if your table has some indices:

.indices <table_name>

If yes, DROP the INDEX and CREATE it after the transaction.

One Select

I do not see a way of using a BULK insert as you are generating new data. However, you could collect data and just perform one INSERT statement. This may boost up your performance dramatically, but it also rises the possibility of failure (syntax, for instance). One hack meets another hack: As sqlite3 does not support this directly, you have to use the UNION command to collect all insert statements accordingly.

INSERT INTO 'tablename'
      SELECT 'data1' AS 'column1', 'data2' AS 'column2'
UNION SELECT 'data3', 'data4'
UNION SELECT 'data5', 'data6'
UNION SELECT 'data7', 'data8'

Statement Caching

I would suggest to avoid the use of statement caching as there is a unfixed issue with this feature (and far as I know, it does not influence the performance dramatically).

Memory Management

The last point I'd like to mention is about ObjC. Compared to basic operations, memory management needs very very much time. Maybe you could avoid some stringWithFormat: or numberWithDouble: by preparing these variable outside the loop.

Summary

All in all, I don't think that you will have a problem with the speed of sqlite3 if you simply use transactions.