According to this tutorial, asynchronous disk file I/O can easily be achieved with AIO on Linux, at least from a programming/API point of view. But before and after reading this tutorial I had seen many posts and articles claiming that this either cannot be done, or that you should use libevent with a patch, among other issues. Another point of confusion was the event loop: I thought I would have to wait for a signal, but according to this tutorial I can use a callback mechanism instead, which obviously makes AIO much easier to use.
Now, I am not a Linux programmer, not even by a long shot; I just wanted to find a straightforward way to support asynchronous disk file I/O on Linux, learn it, and add it to an async disk I/O library that I need for a personal project. Currently I'm using overlapped I/O on Windows and I/O worker threads on non-Windows platforms. Since the multithreaded solution can be tricky, I wanted to replace it with AIO on Linux.
So, what is wrong with AIO as described in this tutorial? Is it performance? Is there a restriction on the operations that can be done using AIO?
P.S. I don't care if the code will not be portable to other POSIX-compliant platforms, as long as it works on major Linux distributions. And all I care about is regular disk file I/O.
Thanks.
The tutorial gives an overview of asynchronous I/O in general and explains that there is kernel support for it. Then it goes on to discuss POSIX AIO (the standardized API for accessing asynchronous I/O), implying that using the POSIX AIO API on Linux will give you access to the kernel's AIO support. This is not the case.
On Linux, there are really two separate AIO implementations:

1. Kernel AIO, i.e. the io_setup()/io_submit()/io_getevents() family of functions: asynchronous I/O supported by the kernel itself, but with significant restrictions (most notably, it only works properly on files opened with O_DIRECT).
2. POSIX AIO, i.e. aio_read(), aio_write() and friends: glibc's implementation of the standardized API. It has no kernel support at all; glibc emulates asynchronous I/O by performing ordinary blocking I/O on a pool of user-space threads.
So, in short: if you already have a generic implementation of disk I/O on multiple worker threads, you might be better off keeping it than switching to glibc's implementation (because you have slightly more control over your own).
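For completeness, here is a minimal sketch of what the glibc POSIX AIO route with a completion callback (the mechanism the question mentions) looks like. The file name and the helper name `handle_completion` are my own placeholders; under the hood glibc services this request on an internal thread pool, so your callback runs on one of glibc's threads:

```c
/* Minimal POSIX AIO sketch with a completion callback (SIGEV_THREAD).
 * glibc runs this on its internal thread pool; no kernel AIO involved.
 * Build with: gcc aio_demo.c -lrt */
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void handle_completion(union sigval sv)  /* hypothetical name */
{
    struct aiocb *cb = sv.sival_ptr;
    /* aio_return() yields the byte count once aio_error() reports 0 */
    if (aio_error(cb) == 0)
        printf("read %zd bytes\n", aio_return(cb));
}

int main(void)
{
    static char buf[4096];
    int fd = open("some_file", O_RDONLY);       /* assumed path */
    if (fd < 0) { perror("open"); return 1; }

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof buf;
    cb.aio_offset = 0;
    cb.aio_sigevent.sigev_notify          = SIGEV_THREAD;
    cb.aio_sigevent.sigev_notify_function = handle_completion;
    cb.aio_sigevent.sigev_value.sival_ptr = &cb;

    if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

    /* Real code would keep working here; the sleep just keeps the
     * process alive long enough for the callback to fire. */
    sleep(1);
    close(fd);
    return 0;
}
```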
If you're committed to actually using the io_submit() family of functions, you may have to do quite a lot of work to get around their restrictions.
Kernel AIO requires your files to be opened with O_DIRECT, which in turn requires all your file offsets and read/write sizes to be aligned to blocks on the disk. This is typically fine if you're just using one large file and can make it work much like the OS page cache. For reading and writing arbitrary files at arbitrary offsets and lengths, it gets messy.
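To make the alignment requirement concrete, here is a minimal sketch of a single kernel AIO read via libaio (link with -laio). The file name and the 4096-byte block size are assumptions; real code would query the device's logical block size, and note that libaio's wrappers return a negative errno rather than setting errno:

```c
/* Minimal kernel AIO read via libaio. O_DIRECT demands that the buffer
 * address, the offset and the length are all block-aligned.
 * Build with: gcc kaio_demo.c -laio */
#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK 4096              /* assumed block size */

int main(void)
{
    int fd = open("some_file", O_RDONLY | O_DIRECT);  /* assumed path */
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, BLOCK, BLOCK))           /* aligned buffer */
        return 1;

    io_context_t ctx = 0;
    int ret = io_setup(32, &ctx);
    if (ret < 0) { fprintf(stderr, "io_setup: %d\n", ret); return 1; }

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, BLOCK, 0);  /* aligned length and offset */

    ret = io_submit(ctx, 1, cbs);
    if (ret != 1) { fprintf(stderr, "io_submit: %d\n", ret); return 1; }

    struct io_event ev;
    io_getevents(ctx, 1, 1, &ev, NULL);     /* blocks until completion */
    printf("result: %ld\n", (long)ev.res);  /* bytes read, or -errno */

    io_destroy(ctx);
    free(buf);
    close(fd);
    return 0;
}
```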
If you end up giving kernel AIO a shot, I would highly recommend looking into tying one or more eventfds to your iocbs so that you can wait on completions using epoll/select rather than having to block in io_getevents().
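A sketch of that eventfd wiring, under the same assumptions as the previous snippet (libaio, an already opened O_DIRECT fd, an aligned buffer, and an initialized ctx): io_set_eventfd() attaches an eventfd to the iocb, so the completion becomes pollable and io_getevents() is only called once it is known not to block. Error handling is omitted for brevity:

```c
/* Sketch: tie an eventfd to an iocb so completions can be waited on
 * with epoll instead of blocking in io_getevents(). Assumes fd, buf,
 * len and ctx are set up as in the previous snippet. */
#include <libaio.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

static void submit_and_poll(io_context_t ctx, int fd, void *buf, size_t len)
{
    int efd = eventfd(0, EFD_NONBLOCK);

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, len, 0);
    io_set_eventfd(&cb, efd);            /* completion bumps the eventfd */
    io_submit(ctx, 1, cbs);

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    epoll_ctl(ep, EPOLL_CTL_ADD, efd, &ev);

    /* Wait alongside sockets, timers, etc. on the same epoll set. */
    epoll_wait(ep, &ev, 1, -1);

    uint64_t n;
    read(efd, &n, sizeof n);             /* n = number of completions */

    /* Now io_getevents() is guaranteed not to block. */
    struct io_event events[1];
    struct timespec zero = { 0, 0 };
    io_getevents(ctx, 0, 1, events, &zero);
}
```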