Linux MMAP internals

Mr Jay · Apr 15, 2009

I have several questions regarding the mmap implementation in Linux systems which don't seem to be well documented:

When mapping a file to memory using mmap, how would you handle prefetching the data in such file?

That is, what happens when you read data from the mmaped region? Is that data moved to the L1/L2 caches? Is it read directly from the disk cache? Do prefetchnta and similar ASM instructions work on mmaped zones?

What's the overhead of the actual mmap call? Is it proportional to the amount of mapped data, or constant?

Hope somebody has some insight into this. Thanks in advance.

Answer

Will Hartung · Apr 15, 2009

mmap is basically programmatic access to the virtual memory (VM) subsystem.

When you have, say, a 1G file, and you mmap it, you get a pointer to "the entire" file as if it were in memory.

However, at this stage nothing has happened save the actual mapping operation, which reserves a range of virtual addresses for the file. No file data has been read yet. (The mapping call itself is cheap: on Linux its cost depends far less on the file size than the later page-ins do.)

In order to start reading data from the file, you simply access it through the pointer you were returned in the mmap call.
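As a rough illustration of reading through the returned pointer, here is a minimal sketch. The helper name `count_byte_in_file` is made up for this example, and it assumes a regular file mapped read-only with `MAP_PRIVATE`; error handling is reduced to returning -1.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts how often `needle` occurs in the file at
 * `path` by mapping the file read-only. Returns -1 on any error. */
long count_byte_in_file(const char *path, char needle) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  /* mmap rejects length 0 */

    size_t len = (size_t)st.st_size;
    void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED) return -1;

    const char *data = map;
    long count = 0;
    for (size_t i = 0; i < len; i++) {
        /* The first access to each page faults it in from the file. */
        if (data[i] == needle) count++;
    }

    munmap(map, len);
    return count;
}
```

Nothing here calls read(): the loop only dereferences the pointer, and the VM brings in pages as they are touched.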

If you wish to preload parts of the file, just visit the area you'd like to preload. Make sure you visit ALL of the pages you want to load, since the VM will only load the pages you access. For example, say that within your 1G file you have a 10MB index area that you'd like to have in memory. The simplest way is to walk your index, or whatever data structure you have, letting the VM page in data as necessary. Or, if you know that the index is the first 10MB of the file and that your VM page size is, say, 4K, you can cast the mmap pointer to a char pointer and touch one byte in each page.

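#include <stddef.h>

void load_mmap(const char *mmapPtr) {
    // Touch one byte in each 4K page of the first 10MB to fault the pages in
    const size_t length = 10 * 1024 * 1024;
    const size_t page_size = 4 * 1024;
    for (size_t offset = 0; offset < length; offset += page_size) {
        // Read through a volatile pointer so the compiler cannot drop the load
        const volatile char *p = mmapPtr + offset;
        char c = *p;
        (void)c;
    }
}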
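A variation on the loop above, offered as a sketch rather than a recommendation: it asks the system for the page size instead of assuming 4K, and first hints the kernel with `madvise(MADV_WILLNEED)` so it can start reading ahead. The helper name `preload_range` is invented here, and `base` is assumed to be the page-aligned pointer returned by mmap. The hint is advisory, so its result is ignored.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes madvise() under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: hints the kernel that [base, base + length) will be
 * needed, then touches one byte per page. Returns the number of pages
 * touched. `base` is assumed to be page-aligned (as mmap returns it). */
size_t preload_range(const void *base, size_t length) {
    long ps = sysconf(_SC_PAGESIZE);
    size_t page_size = ps > 0 ? (size_t)ps : 4096;  /* fallback is a guess */

    /* Advisory only: a failure here does not affect correctness. */
    (void)madvise((void *)base, length, MADV_WILLNEED);

    const volatile char *p = base;
    size_t touched = 0;
    for (size_t off = 0; off < length; off += page_size) {
        char c = p[off];  /* volatile read forces the page to be resident */
        (void)c;
        touched++;
    }
    return touched;
}
```

For the 10MB index case this would be called as `preload_range(mmapPtr, 10 * 1024 * 1024)`.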

As for the L1 and L2 caches, mmap has nothing to do with them. That's all about how you access the data.

Since you're using the underlying VM system, anything that addresses data within the mmap'd block will work (even from assembly).
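To make that concrete for the prefetch question, here is a small sketch that issues software prefetch hints while scanning a buffer, which could just as well be a pointer obtained from mmap. It assumes GCC or Clang, whose `__builtin_prefetch` with locality 0 typically compiles to prefetchnta on x86. The function name and the prefetch distance are arbitrary choices for illustration. As far as I know, a prefetch hint never raises a page fault, so it is no substitute for touching a page that is not yet resident.

```c
#include <stddef.h>

/* Hypothetical helper: sums `len` bytes while hinting the CPU to fetch data
 * a little ahead of the current position. Works on any readable memory,
 * including an mmap'd region. */
unsigned long sum_with_prefetch(const unsigned char *data, size_t len) {
    const size_t ahead = 256;  /* arbitrary illustrative distance in bytes */
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++) {
        /* One hint per 64 bytes (a common cache line size), kept in bounds. */
        if (i % 64 == 0 && i + ahead < len) {
            __builtin_prefetch(data + i + ahead, 0 /* read */, 0 /* non-temporal */);
        }
        sum += data[i];
    }
    return sum;
}
```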

If you don't change any of the mmap'd data, the VM will automatically discard old pages as new pages are needed. If you do change them, the VM will write those pages back for you.
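A short sketch of the write-back case, under the assumption that the file is mapped with `MAP_SHARED` (with `MAP_PRIVATE`, changes stay in the process and never reach the file). The helper name `overwrite_first_byte` is invented for this example. The `msync` call is optional: the VM writes dirty pages back on its own eventually, and `msync` only forces that to happen before the function returns.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: replaces the first byte of the file at `path` by
 * writing through a shared mapping. Returns 0 on success, -1 on error. */
int overwrite_first_byte(const char *path, char value) {
    int fd = open(path, O_RDWR);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return -1; }

    size_t len = (size_t)st.st_size;
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED) return -1;

    ((char *)map)[0] = value;           /* dirties the first page */
    int rc = msync(map, len, MS_SYNC);  /* force the write-back now */
    munmap(map, len);
    return rc == 0 ? 0 : -1;
}
```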