VB.Net editing a CSV (or delimited file)

donbyte picture donbyte · Jun 15, 2012 · Viewed 8.8k times · Source

I would like to know a way (of course I want to know the best way) to edit a csv file, preferably without having to read the original and write the results to a new one.

Example 1: I want to split the contents of a column into two columns 123abc into 123, abc

Example 2: I want to remove letters from the beginning of a column AB123 into 123

Clarification: I do not need help solving my examples; I just need to be pointed in the right direction of reading and editing the same file.

Here is a code example of using a temporary file that seems to me is too slow

    Dim currentLine as string()
tempFile.AutoFlush = True

Do Until origFile.EndOfData
    currentLine = origFile.ReadFields
    currentLine(1) = currentLine(1).Replace("ABC","") 'would convert ABC123 to 123
    For index as Integer = 0 to currentLine.Count - 2
        tempFile.Write(currentLine(index) & ",")
    Next
        tempFile.Write(currentLine(currentLine.Count - 1))
        tempFile.WriteLine()
Loop

tempFile.Flush()
tempFile.Dispose()
origFile.Dispose()
IO.File.Copy(tempFile,OrigFile,True)
IO.File.Delete(tempFile)

Answer

Andrew Morton picture Andrew Morton · Jun 15, 2012

You really will be best off writing a new file. You can write to a temporary file, delete the old file and rename the new file as the old file.

Something like:

Imports System.IO

Module Module1

    Sub Main()
        Dim tempfile = Path.GetTempFileName()
        Dim inputFile = "C:\Temp\input.txt"

        Using sw = New StreamWriter(tempfile)
            Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(inputFile)
                MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
                MyReader.Delimiters = New String() {","}
                Dim currentRow As String()
                While Not MyReader.EndOfData
                    Try
                        currentRow = MyReader.ReadFields()
                        If currentRow.Count >= 2 Then
                            currentRow(1) = currentRow(1).Replace("ABC", "")
                        End If
                        sw.WriteLine(String.Join(",", currentRow))
                    Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                        MsgBox("Line " & ex.Message & " is invalid.  Skipping")
                    End Try
                End While
            End Using
        End Using

        File.Delete(inputFile)
        File.Move(tempfile, inputFile)

    End Sub

End Module

(Basically copying the TextFieldParser example from http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.readfields.aspx )