.NET 4.5 added new classes for working with zip archives. Now you can do something like this:
    using (ZipArchive archive = ZipFile.OpenRead(zipFilePath))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            // Extract it to the file
            entry.ExtractToFile(entry.Name);
            // or do whatever you want
            using (Stream stream = entry.Open())
            {
                ...
            }
        }
    }
Obviously, if you work with large archives, it may take seconds or even minutes to read the files from the archive. So if you were writing a GUI app (WinForms or WPF), you would probably run such code on a separate thread; otherwise you would block the UI thread and make your app's users very upset.
However, all I/O operations in this code execute in blocking mode, which is considered "not cool" in 2016. So there are two questions:
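- Is it possible to get async I/O with System.IO.Compression classes (or maybe with some other third-party .NET library)?
- Does it even make sense to do that? I mean compressing/extracting algorithms are very CPU-consuming anyway, so even if we switch from CPU-bound I/O to async I/O, the performance gain can be relatively small (as a percentage, of course, not in absolute terms).

UPDATE: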
To reply to the answer from Peter Duniho: yes, you're right. For some reason I didn't think about this option:
    using (Stream zipStream = entry.Open())
    using (FileStream fileStream = new FileStream(...))
    {
        await zipStream.CopyToAsync(fileStream);
    }
which definitely works. Thanks!
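For completeness, here is a minimal sketch of that idea as a reusable helper. The names `ZipAsyncSketch` and `ExtractEntryAsync`, the buffer size, and the choice of `FileMode.Create` are my own assumptions for illustration, not part of any framework API. Note also that only the write side is opened for asynchronous I/O here; reads from the archive may still be synchronous underneath, depending on how the archive's own file stream was opened.

```csharp
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

public static class ZipAsyncSketch
{
    // Hypothetical helper: copies a single entry to destinationPath
    // using the Stream.CopyToAsync approach shown above.
    public static async Task ExtractEntryAsync(ZipArchiveEntry entry, string destinationPath)
    {
        using (Stream zipStream = entry.Open())
        // useAsync: true asks the OS for overlapped (asynchronous) file writes.
        // The 81920-byte buffer is an assumed value, not a tuned one.
        using (FileStream fileStream = new FileStream(
            destinationPath, FileMode.Create, FileAccess.Write, FileShare.None,
            bufferSize: 81920, useAsync: true))
        {
            await zipStream.CopyToAsync(fileStream);
        }
    }
}
```

Inside the `foreach` loop this would be called as `await ZipAsyncSketch.ExtractEntryAsync(entry, entry.Name);`.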
By the way,
    await Task.Run(() => entry.ExtractToFile(entry.Name));
will still be a CPU-bound, blocking I/O operation; it just runs on a separate thread and ties up a thread-pool thread for the duration of the I/O.
However, as far as I can see, the .NET developers still use blocking I/O for some archive operations (for example, the code that enumerates the entries in the archive: ZipArchive.cs on dotnet@github). I also found an open issue about the lack of an asynchronous API for the ZipFile APIs.
I guess that at this time we have partial async support, but it is far from complete.
- Is it possible to get async I/O with System.IO.Compression classes (or maybe with some other third-party .NET library)?
Depending on what you actually mean by "async I/O", you can do it with the built-in .NET types. For example:
    using (ZipArchive archive = await Task.Run(() => ZipFile.OpenRead(zipFilePath)))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            // Extract it to the file
            await Task.Run(() => entry.ExtractToFile(entry.Name));
            // or do whatever you want
            using (Stream stream = entry.Open())
            {
                // use XXXAsync() methods on Stream object
                ...
            }
        }
    }
Wrap these in XXXAsync() extension methods if you like.
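As a rough illustration of such wrappers, assuming you are happy with the `Task.Run()` approach: the class and method names below (`ZipTaskExtensions`, `ExtractToFileInBackgroundAsync`, `OpenReadInBackgroundAsync`) are hypothetical and chosen only for this sketch. They simply move the blocking calls onto a thread-pool thread; they do not make the underlying I/O asynchronous.

```csharp
using System.IO.Compression;
using System.Threading.Tasks;

public static class ZipTaskExtensions
{
    // Hypothetical wrapper: runs the blocking ExtractToFile() call
    // on a thread-pool thread so the caller can await it.
    public static Task ExtractToFileInBackgroundAsync(
        this ZipArchiveEntry entry, string destinationFileName)
    {
        return Task.Run(() => entry.ExtractToFile(destinationFileName));
    }

    // Hypothetical wrapper: opens the archive on a thread-pool thread.
    // ZipFile is a static class, so this cannot be an extension method.
    public static Task<ZipArchive> OpenReadInBackgroundAsync(string archiveFileName)
    {
        return Task.Run(() => ZipFile.OpenRead(archiveFileName));
    }
}
```

With these in place, the loop body above becomes `await entry.ExtractToFileInBackgroundAsync(entry.Name);`.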
- Does it even make sense to do that? I mean compressing/extracting algorithms are very CPU-consuming anyway, so even if we switch from CPU-bound I/O to async I/O, the performance gain can be relatively small (as a percentage, of course, not in absolute terms).
At least three reasons to do it: