parse text file and remove commas inside double quotes

Internet Engineer · Mar 27, 2012 · Viewed 21.8k times

I have a text file that needs to be converted into a CSV file. My plan is to:

  • parse the file line by line
  • find commas inside double quotes and replace each with a space
  • then delete all double quotes
  • append the line to a new CSV file

Question: I need a function that recognizes a comma inside a pair of double quotes and replaces it.

Here is a sample line:

"MRS Brown","4611 BEAUMONT ST"," ","WARRIOR RUN, PA"

Answer

Pradeep Kumar · Mar 27, 2012

Your file already seems to be in a CSV-compliant format. Any good CSV reader should be able to read it correctly.

If the only problem is that the commas inside quoted values split your fields when you read the file, use a parser that understands quoted fields instead of splitting each line on commas yourself.

Here is one way to do it:

    // In a .NET Framework C# project, add a reference to Microsoft.VisualBasic.dll.
    using Microsoft.VisualBasic.FileIO;

    private void button1_Click(object sender, EventArgs e)
    {
        TextFieldParser tfp = new TextFieldParser("C:\\Temp\\Test.csv");
        tfp.Delimiters = new string[] { "," };
        tfp.HasFieldsEnclosedInQuotes = true;
        while (!tfp.EndOfData)
        {
            string[] fields = tfp.ReadFields();

            // ReadFields has already split the line on the delimiter and stripped
            // the enclosing quotes, so any comma left in a field is an embedded one.
            // Replace those with spaces and drop any remaining double quotes.
            for (int i = 0; i < fields.Length; i++)
            {
                fields[i] = fields[i].Replace(","," ").Replace("\"","");
            }

            // Show the result; a multiline TextBox needs CR+LF to break lines.
            textBox1.AppendText(String.Join("\t", fields) + Environment.NewLine);
        }
        tfp.Close();
    }
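
To finish the conversion described in the question, the cleaned fields can be written to a new file rather than to a text box. The following is only a minimal sketch under some assumptions: the class name `CsvCleaner`, the method name `Convert`, and both path parameters are hypothetical and not part of the original code, and the output is joined with plain commas on the assumption that no field contains a comma or quote after cleaning.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvCleaner
{
    // Hypothetical helper (not from the original answer): reads a quoted,
    // comma-delimited file and writes a new file in which embedded commas
    // are replaced with spaces and double quotes are removed.
    public static void Convert(string inputPath, string outputPath)
    {
        // The using statements close both files even if an exception is thrown.
        using (var parser = new TextFieldParser(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                for (int i = 0; i < fields.Length; i++)
                {
                    fields[i] = fields[i].Replace(",", " ").Replace("\"", "");
                }

                // Safe to join with bare commas only because the fields
                // no longer contain commas or quotes.
                writer.WriteLine(string.Join(",", fields));
            }
        }
    }
}
```

Calling `CsvCleaner.Convert("C:\\Temp\\Test.csv", "C:\\Temp\\Clean.csv")` would process the whole file in one pass; the second path is just an example. Note that `TextFieldParser` skips blank lines, so they would not appear in the output.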

EDIT:

I just noticed that the question is tagged both C# and VB.NET-2010. Here is the VB.NET version of the same button-click handler, in case you are coding in VB.

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim tfp As New FileIO.TextFieldParser("C:\Temp\Test.csv")
    tfp.Delimiters = New String() {","}
    tfp.HasFieldsEnclosedInQuotes = True
    While Not tfp.EndOfData
        Dim fields() As String = tfp.ReadFields()

        ' do whatever you want to do with the fields now...
        ' e.g. remove the commas and double-quotes from the fields.
        For i As Integer = 0 To fields.Length - 1
            fields(i) = fields(i).Replace(",", " ").Replace("""", "")
        Next
        ' this is to show what we got as the output
        TextBox1.AppendText(Join(fields, vbTab) & vbCrLf)
    End While
    tfp.Close()
End Sub
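
If you would still like a standalone function like the one the question asks for, one possible approach is to scan the line character by character and track whether the current position is inside double quotes. This is a rough C# sketch, not a full CSV parser: the names `QuotedCommaReplacer` and `Clean` are made up for illustration, and it assumes each record fits on a single line.

```csharp
using System;
using System.Text;

public static class QuotedCommaReplacer
{
    // Hypothetical helper: replaces every comma that falls between double
    // quotes with a space and removes the double quotes themselves.
    public static string Clean(string line)
    {
        var result = new StringBuilder(line.Length);
        bool insideQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // Toggle the state and drop the quote character.
                insideQuotes = !insideQuotes;
            }
            else if (c == ',' && insideQuotes)
            {
                // A comma inside a quoted field becomes a space.
                result.Append(' ');
            }
            else
            {
                // Field delimiters and ordinary text are copied unchanged.
                result.Append(c);
            }
        }

        return result.ToString();
    }
}
```

For the sample line in the question, `Clean` returns `MRS Brown,4611 BEAUMONT ST, ,WARRIOR RUN  PA` (note the two spaces where the comma used to be). Unlike `TextFieldParser`, this sketch does not handle quoted fields that span several lines.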