GrammatonCleric
GrammatonCleric

Reputation: 211

Replacing specific commas in a comma-delimited file

I have a file that I stream into a list object to save to a temp in my database, because I have to use the data in the temp table to join to other tables for a final result, then export this final result to a .csv file.

Everything works except for the stream read of the original file.

The file is comma-delimited and structured very specifically, and the structure never changes.

The problem I have is this:

The "AccountHolder" field has a comma amongst the characters that make up the string, so my FileStream sees this as a delimiter. How do I replace the comma in the AccountHolder string without breaking the comma-delimiter the FileStream has to adhere to?

List<object[]> fileContent = new List<object[]>();

            using (FileStream reader = File.OpenRead(ofd.FileName))
            using (TextFieldParser parser = new TextFieldParser(reader))
            {
                parser.TrimWhiteSpace = true;
                parser.Delimiters = new[] { "," };
                parser.HasFieldsEnclosedInQuotes = true;
                while (!parser.EndOfData)
                {
                    object[] line = parser.ReadFields();
                    fileContent.Add(line);
                    lstRegNo.Add(line[0].ToString().Trim());
                    lstAccHolder.Add(line[1].ToString().Trim().Replace(',', ' '));
                    lstAmount.Add(line[2].ToString().Trim().Replace(',', ' '));
                    lstAccNo.Add(line[3].ToString().Trim());
                    lstBranch.Add(line[4].ToString().Trim());
                    lstDate.Add(line[5].ToString().Trim());
                    lstCode.Add(line[6].ToString().Trim());
                    lstOrphenColumn.Add(line[7].ToString().Trim());
                }

Here's a sample of the file I'm streaming in:

000001,A WHATEVER,00000000001,0000000000001,000001,160510,17,0
000002,B WHATEVER,00000000002,0000000000002,000002,160510,17,0
000003,C, WHATEVER,00000000003,0000000000003,000003,160510,17,0
000004,D WHATEVER,00000000004,0000000000004,000004,160510,17,0
000005,E WHATEVER,00000000005,0000000000005,000005,160510,17,0

As you can see, on line 3, there's a comma in the AccountHolder's name. I need the value of this to be "C WHATEVER", not "C, WHATEVER" I want to eliminate that comma but still be able to stream the file into my List object splitting the fields in the file by commas.

Please note that the file's data will be different everytime I receive it, so simply looking for a static value won't cut it.

How do I do this?

Upvotes: 2

Views: 1751

Answers (3)

GrammatonCleric
GrammatonCleric

Reputation: 211

Ultimately what I did was the following:

List<string[]> fileContents = new List<string[]>();

            var lines = File.ReadAllLines(ofd.FileName).ToList();
            foreach (var item in lines)
            {
                string RegNo = string.Format("{0}", item.ToString().Substring(0, 19));
                string accHolder = string.Format("{0}", item.ToString().Substring(21, 30));
                string amount = string.Format("{0}", item.ToString().Substring(52, 11));
                string accNo = string.Format("{0}", item.ToString().Substring(64, 13));
                string branch = string.Format("{0}", item.ToString().Substring(78, 6));
                string date = string.Format("{0}-{1}-{2}", "20" + item.ToString().Substring(85, 2), item.ToString().Substring(87, 2), item.ToString().Substring(89, 2));
                string code = string.Format("{0}", item.ToString().Substring(92, 2));
                string orphenColumn = string.Format("{0}", item.ToString().Substring(95, 1));

                lstRegNo.Add(RegNo.Trim());
                lstAccHolder.Add(ExtensionMethods.RemoveSpecialCharacters(accHolder.Trim()));
                lstAmount.Add(amount.Trim());
                lstAccNo.Add(accNo.Trim());
                lstBranch.Add(branch.Trim());
                lstDate.Add(date);
                lstCode.Add(code.Trim());
                lstOrphenColumn.Add(orphenColumn);
            }

This seems to work for every file I've imported so far.

Thanks for all your suggestions!

Upvotes: 0

Jeff Pang
Jeff Pang

Reputation: 161

One idea is that whenever you read a line, you count the number of commas. So assuming your number of columns will always be 7. The code below will capture all the extra commas on the second column.

    string filename="sample.csv";
    var stream=new StreamReader(filename);
    string l=null;

    while((l = stream.ReadLine())!=null)
    {
        var ss=l.Split(',');
        int count = ss.Length - 1;//number of commas found
        int extraCommas = count - 7;

        var regNo = ss[0];
        StringBuilder accHolder = new StringBuilder();

        //loops from 1 and captures x number of extra commas
        for (int x = 1; x <= extraCommas+1; x++)
        {
            accHolder.Append(ss[x]);
        }

        var amount = ss[2 + extraCommas];
        var accNo = ss[3 + extraCommas];
        var branch = ss[4 + extraCommas];
        var date = ss[5 + extraCommas];   
        //etc....             
    }

Upvotes: 1

Clay
Clay

Reputation: 5084

Well...if the fields are fixed with, then you're not setting up your parser correctly. It knows how to handle fixed-width fields which is somewhat different than a canonical CSV file.

BTW, I'd be inclined to make some constants that reflect the names of the "columns":

//--> indexes into parsed line...
const int RegNo = 0;
const int AcctHolder = 2;
const int Amount = 4;
const int AcctNo = 6;
//--> ...etc.

using Microsoft.VisualBasic.FileIO;
//...
List<object[]> fileContent = new List<object[]>();
using (FileStream reader = File.OpenRead(ofd.FileName))
using (TextFieldParser parser = new TextFieldParser(reader))
{
  parser.TextFieldType = FieldType.FixedWidth;
  parser.SetFieldWidths
  (
     6  1, //--> width of RegNo, width of ignored comma
    10, 1, //--> width of AcctHolder, width of ignored comma
    10, 1, //--> width of Amount, width of ignored comma
    13, 1, //--> etc...
     6, 1, 
     6, 1, 
     2, 1,
     1
  );
  while (!parser.EndOfData)
  {
    object[] line = parser.ReadFields();
    fileContent.Add(line);
    lstRegNo.Add( line[ RegNo ].ToString( ));
    lstAccHolder.Add(line[ AcctHolder ].ToString().Replace(',', ' '));
    lstAmount.Add(line[ Amount ].ToString().Trim().Replace(',', ' '));
    lstAccNo.Add(line[ AcctNo ].ToString().Trim());
    //--> etc...
  }
}

For more information, check this out.

Upvotes: 1

Related Questions