How to decode /proc/pid/pagemap entries in Linux?

Naruto picture Naruto · Jun 10, 2013 · Viewed 11k times · Source

I am trying to decipher how to use /proc/pid/pagemap to get the physical address of a given set of pages. Suppose from the /proc/pid/maps, I get the virtual address afa2d000-afa42000 which corresponds to the heap. My question is how do I use this info to traverse the pagemap file and find the physical page frames correspond to the address afa2d000-afa42000.

The /proc/pid/pagemap entry is in binary format. Is there any tools to help parsing of this file?

Answer

Linux kernel documentation

Linux kernel doc describing the format: https://github.com/torvalds/linux/blob/v4.9/Documentation/vm/pagemap.txt

* Bits 0-54  page frame number (PFN) if present
* Bits 0-4   swap type if swapped
* Bits 5-54  swap offset if swapped
* Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
* Bit  56    page exclusively mapped (since 4.2)
* Bits 57-60 zero
* Bit  61    page is file-page or shared-anon (since 3.5)
* Bit  62    page swapped
* Bit  63    page present

C parser function

GitHub upstream.

#define _XOPEN_SOURCE 700
#include <fcntl.h> /* open */
#include <stdint.h> /* uint64_t  */
#include <stdlib.h> /* size_t */
#include <unistd.h> /* pread, sysconf */

typedef struct {
    uint64_t pfn : 54;
    unsigned int soft_dirty : 1;
    unsigned int file_page : 1;
    unsigned int swapped : 1;
    unsigned int present : 1;
} PagemapEntry;

/* Parse the pagemap entry for the given virtual address.
 *
 * @param[out] entry      the parsed entry
 * @param[in]  pagemap_fd file descriptor to an open /proc/pid/pagemap file
 * @param[in]  vaddr      virtual address to get entry for
 * @return 0 for success, 1 for failure
 */
int pagemap_get_entry(PagemapEntry *entry, int pagemap_fd, uintptr_t vaddr)
{
    size_t nread;
    ssize_t ret;
    uint64_t data;

    nread = 0;
    while (nread < sizeof(data)) {
        ret = pread(pagemap_fd, ((uint8_t*)&data) + nread, sizeof(data) - nread,
                (vaddr / sysconf(_SC_PAGE_SIZE)) * sizeof(data) + nread);
        nread += ret;
        if (ret <= 0) {
            return 1;
        }
    }
    entry->pfn = data & (((uint64_t)1 << 54) - 1);
    entry->soft_dirty = (data >> 54) & 1;
    entry->file_page = (data >> 61) & 1;
    entry->swapped = (data >> 62) & 1;
    entry->present = (data >> 63) & 1;
    return 0;
}

Example runnable programs using it: