Linux Zero-Copy: Transfer memory pages between two processes with vmsplice

nosid · May 17, 2012

I am currently trying to understand the value of splice/vmsplice. Regarding the IPC use case, I stumbled upon the following answer on Stack Overflow: https://stackoverflow.com/a/1350550/1305501

Question: How to transfer memory pages from one process to another process using vmsplice without copying data (i.e. zero-copy)?

The answer mentioned above claims that it is possible. However, it doesn't contain any source code. If I understand the documentation of vmsplice correctly, the following function will transfer the memory pages into a pipe (kernel buffer) without copying, provided the memory is properly allocated and aligned. Error handling is omitted for ease of presentation.

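#include <fcntl.h>    // vmsplice, SPLICE_F_GIFT (needs _GNU_SOURCE, which g++ on Linux defines by default)
#include <sys/uio.h>  // struct iovec

// data is aligned to page boundaries,
// and length is a multiple of the page size
void transfer_to_pipe(int pipe_out, char* data, size_t length)
{
    size_t offset = 0;
    while (offset < length) {
        // Describe the part of the buffer that has not been transferred yet.
        struct iovec iov { data + offset, length - offset };
        // vmsplice may accept fewer bytes than requested, hence the loop.
        // It returns -1 on failure, which is deliberately not checked here.
        offset += vmsplice(pipe_out, &iov, 1, SPLICE_F_GIFT);
    }
}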

But how can the memory pages be accessed from user space without copying? Apparently the following methods don't work:

  • vmsplice: This function can also be used for the reverse direction. But according to the comments in the kernel sources, the data will be copied.
  • read: I can imagine that this function does some magic if the memory is properly aligned, but I doubt it.
  • mmap: Not possible on pipe. But is there some kind of virtual file that can be used instead, i.e. splice the memory pages to the virtual file and mmap it?
  • ... ?
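
To make the first option concrete, here is a minimal sketch of vmsplice used in the reverse direction, from the read end of a pipe into user memory. The helper names `vmsplice_from_pipe` and `reverse_demo` are made up for illustration. The code only shows that the call behaves like a read; it does not demonstrate whether the kernel copies the data.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // vmsplice is Linux-specific
#endif
#include <fcntl.h>    // vmsplice
#include <sys/uio.h>  // struct iovec
#include <unistd.h>   // pipe, write, close
#include <string>

// Hypothetical helper: read up to max_len bytes from the read end of a pipe
// into user memory by calling vmsplice on the read end.
std::string vmsplice_from_pipe(int pipe_rd, size_t max_len)
{
    std::string out(max_len, '\0');
    struct iovec iov { &out[0], out.size() };
    ssize_t n = vmsplice(pipe_rd, &iov, 1, 0);
    out.resize(n > 0 ? static_cast<size_t>(n) : 0);  // empty string on error
    return out;
}

// Hypothetical single-process demo: write a short message into a pipe
// and fetch it back with vmsplice.
std::string reverse_demo()
{
    int fds[2];
    if (pipe(fds) != 0)
        return {};
    const char msg[] = "payload";
    const ssize_t len = static_cast<ssize_t>(sizeof msg - 1);
    std::string got;
    if (write(fds[1], msg, sizeof msg - 1) == len)
        got = vmsplice_from_pipe(fds[0], 64);
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

From the caller's point of view this is equivalent to read(2) on the pipe, which is consistent with the concern that it offers no zero-copy benefit on the receiving side.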

Isn't it possible at all with vmsplice?

Answer

dtatulea · May 17, 2012

As R.. mentioned, you only need to pass the fd to the receiving process somehow and on the other side use it as a normal fd.

edit: Actually, you have to use vmsplice() on the sending side to map the buffer into the pipe, and splice() on the receiving side, at the other end of the pipe. See an example here.
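
A rough sketch of that combination is shown below; it is not the example the answer links to. Both ends run in a single process to keep it self-contained. The helper names (`send_pages`, `receive_pages`, `demo_round_trip`) are made up for illustration, and using memfd_create (glibc 2.27 or later) as the receiver's destination is an assumption; any file or socket descriptor would do. Whether the kernel really avoids copies on this path depends on the kernel version and the destination.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE    // vmsplice, splice, memfd_create are Linux-specific
#endif
#include <fcntl.h>     // vmsplice, splice
#include <sys/mman.h>  // mmap, munmap, memfd_create
#include <sys/uio.h>   // struct iovec
#include <unistd.h>    // pipe, pread, close, sysconf
#include <cstring>
#include <vector>

// Sending side: map the page-aligned buffer into the pipe.
bool send_pages(int pipe_wr, char* data, size_t length)
{
    size_t offset = 0;
    while (offset < length) {
        struct iovec iov { data + offset, length - offset };
        ssize_t n = vmsplice(pipe_wr, &iov, 1, SPLICE_F_GIFT);
        if (n < 0)
            return false;
        offset += static_cast<size_t>(n);
    }
    return true;
}

// Receiving side: move the pipe contents to another descriptor with splice.
bool receive_pages(int pipe_rd, int fd_out, size_t length)
{
    size_t done = 0;
    while (done < length) {
        ssize_t n = splice(pipe_rd, nullptr, fd_out, nullptr,
                           length - done, SPLICE_F_MOVE);
        if (n <= 0)
            return false;
        done += static_cast<size_t>(n);
    }
    return true;
}

// Round trip of four pages: buffer -> pipe -> in-memory file, then verify.
bool demo_round_trip()
{
    const size_t length = 4 * static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void* mem = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);  // page-aligned
    if (mem == MAP_FAILED)
        return false;
    char* buf = static_cast<char*>(mem);
    std::memset(buf, 'x', length);  // fill before gifting; no writes afterwards

    int fds[2];
    int out = memfd_create("vmsplice-demo", 0);
    bool ok = out >= 0 && pipe(fds) == 0;
    if (ok) {
        ok = send_pages(fds[1], buf, length)
          && receive_pages(fds[0], out, length);
        std::vector<char> check(length);
        ok = ok
          && pread(out, check.data(), length, 0) == static_cast<ssize_t>(length)
          && std::memcmp(check.data(), buf, length) == 0;
        close(fds[0]);
        close(fds[1]);
    }
    if (out >= 0)
        close(out);
    munmap(mem, length);
    return ok;
}
```

In a real two-process setup the sender would hold the write end and the receiver the read end of the pipe. Note that splice only moves the data to another descriptor; it does not by itself make the pages addressable in the receiver's user space.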

Another option would be to use a shared memory mapping (mmap with MAP_SHARED).
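
As a minimal sketch of the shared-mapping route: the function name `shared_mapping_demo` is hypothetical, and the use of an anonymous MAP_SHARED mapping inherited across fork is an assumption made to keep the example self-contained. Unrelated processes would need a file-backed or shm_open-based mapping instead.

```cpp
#include <sys/mman.h>  // mmap, munmap
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // fork, _exit, sysconf
#include <cstring>
#include <string>

// The child writes into a shared mapping and the parent reads the same
// physical page afterwards. Returns an empty string on failure.
std::string shared_mapping_demo()
{
    const size_t length = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void* mem = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return {};
    char* shared = static_cast<char*>(mem);

    pid_t pid = fork();
    if (pid == 0) {
        // Child: the write lands directly in the page the parent also maps.
        std::strcpy(shared, "hello from child");
        _exit(0);
    }

    std::string result;
    if (pid > 0 && waitpid(pid, nullptr, 0) == pid)
        result = shared;  // Parent: read the data once the child has exited.
    munmap(mem, length);
    return result;
}
```

Here waitpid doubles as the synchronization point. Long-lived processes would need an explicit mechanism such as a semaphore or a pipe to signal when the data is ready.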