VB.NET - download zip in Memory and extract file from memory to disk

Mark picture Mark · Jul 7, 2011 · Viewed 7k times · Source

I'm having some trouble with this, despite finding examples. I think it may be an encoding problem, but I'm just not sure. I am trying to programitally download a file from a https server, that uses cookies (and hence I'm using httpwebrequest). I'm debug printing the capacity of the streams to check, but the output [raw] files look different. Have tried other encoding to no avail.

Code:

    Sub downloadzip(strURL As String, strDestDir As String)

    Dim request As HttpWebRequest
    Dim response As HttpWebResponse

    request = Net.HttpWebRequest.Create(strURL)
    request.UserAgent = strUserAgent
    request.Method = "GET"
    request.CookieContainer = cookieJar
    response = request.GetResponse()

    If response.ContentType = "application/zip" Then
        Debug.WriteLine("Is Zip")
    Else
        Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
        Exit Sub
    End If

    Dim intLen As Int64 = response.ContentLength
    Debug.WriteLine("response length: " + intLen.ToString)

    Using srStreamRemote As StreamReader = New StreamReader(response.GetResponseStream(), Encoding.Default)
        'Using ms As New MemoryStream(intLen)
        Dim fullfile As String = srStreamRemote.ReadToEnd

        Dim memstream As MemoryStream = New MemoryStream(New UnicodeEncoding().GetBytes(fullfile))

        'test write out to flie
        Dim data As Byte() = memstream.ToArray()
        Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
            filestrm.Write(data, 0, data.Length)
        End Using

        Debug.WriteLine("Memstream capacity " + memstream.Capacity.ToString)
        'Dim strData As String = srStreamRemote.ReadToEnd
        memstream.Seek(0, 0)
        Dim buffer As Byte() = New Byte(2048) {}
        Using zip As New ZipInputStream(memstream)
            Debug.WriteLine("zip stream cap " + zip.Length.ToString)
            zip.Seek(0, 0)
            Dim e As ZipEntry

            Dim flag As Boolean = True
            Do While flag ' daft, but won't assign e=zip... tries to evaluate
                e = zip.GetNextEntry
                If IsNothing(e) Then
                    flag = False
                    Exit Do
                Else
                    e.UseUnicodeAsNecessary = True
                End If

                If Not e.IsDirectory Then
                    Debug.WriteLine("Writing out " + e.FileName)
                    '    e.Extract(strDestDir)

                    Using output As FileStream = File.Open(Path.Combine(strDestDir, e.FileName), _
                                                          FileMode.Create, FileAccess.ReadWrite)
                        Dim n As Integer
                        Do While (n = zip.Read(buffer, 0, buffer.Length) > 0)
                            output.Write(buffer, 0, n)
                        Loop
                    End Using

                End If
            Loop
        End Using
        'End Using
    End Using 'srStreamRemote.Close()
    response.Close()
End Sub

So I get the right size file downloaded, but dotnetzip does not recognise it, and the files that get copied out are incomplete/invalid zips. I've spent most of today on this, and am ready to give up.

Answer

Smudge202 picture Smudge202 · Jul 7, 2011

I think the answer will be to break down the problem, and perhaps change a couple aspects in the code.

For example, lets get rid of converting the response stream to a string:

Dim memStream As MemoryStream
Using rdr As System.IO.Stream = response.GetResponseStream
    Dim count = Convert.ToInt32(response.ContentLength)
    Dim buffer = New Byte(count) {}
    Dim bytesRead As Integer
    Do
        bytesRead += rdr.Read(buffer, bytesRead, count - bytesRead)
    Loop Until bytesRead = count
    rdr.Close()
    memStream = New MemoryStream(buffer)
End Using

Next, there's an easier way to output the contents of a memory stream to a file. Consider your code

Dim data As Byte() = memstream.ToArray()
Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    filestrm.Write(data, 0, data.Length)
End Using

can be replaced with

Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    memstream.WriteTo(filestrm)
End Using

That eliminates the need to transfer your memory stream into another byte array, and then push the byte array down the stream, when in fact the memory stream can transfer data directly to file (via the filestream) saving the middle-man buffer.

I'll admit I haven't worked with the Zip/compression libraries you're using, but with the above amendments you have removed unnecessary transfers between streams, byte arrays, strings, etc, and hopefully eliminated the encoding issues you were having.

Give that a try and let us know how you get on. Consider attempting to open the file that you saved ("C:\temp\debug.zip") to see if it is listed as corrupt. If not, then you know at least as far as that in the code, it is working ok.