Jeff Brady

Reputation: 1498

Replace character in string with tab & must contain 4 total tabs

I have a tab delimited file, and some of the strings contain a ý character which needs to be replaced with a \t. Also, the string needs to contain 4 tabs total, with any extra tabs tacked on to the end. For example, the strings:

1234ý5678
1234
ý1234ý5678

should look like

1234\t5678\t\t\t
1234\t\t\t\t
\t1234\t5678\t\t

Here's what I have so far:

string[] input_file = (string[])(e.Data.GetData(DataFormats.FileDrop));
string output_file = @"c:\filename.txt";

foreach (string file in input_file)
{
    string[] lines = File.ReadAllLines(file);

    for (int i = 0; i < lines.Length; i++)
    {
        string line = lines[i];

        string[] values = line.Split('\t');

        // Look at each value in values, replace any ý with a tab, and add
        // tabs at the end of the value so there are 4 total.

        lines[i] = String.Join("\t", values);

    }
    File.WriteAllLines(output_file, lines);
}

EDIT: some clarification - the entire line might look like this:

331766*ALL1 16ý7    14561ý8038  14560ý8037  ausername  11:54:05  12 Nov 2007

I need to look at each string that makes up the line, and replace any ý with a \t, and add \t's to the end so each string has a total of 4. Here's what the result should look like:

331766*ALL1 16\t7\t\t\t 14561\t8038\t\t\t   14560\t8037\t\t\t   ausername  11:54:05  12 Nov 2007

Upvotes: 1

Views: 2549

Answers (3)

System Down

Reputation: 6270

What you do is:

  1. Split each line into strings using \t as the separator.

  2. Iterate through the strings.

  3. For each string replace ý with \t.

  4. Count the \t characters now in the string, and append more \t until there are 4.

Here's some code:

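// Requires: using System.Collections.Generic; using System.IO; using System.Linq;
string[] lines = File.ReadAllLines(input_file);
var result = new List<string>();
foreach (var line in lines)
{
    var fields = line.Split('\t');
    for (int i = 0; i < fields.Length; i++)
    {
        // Replace each ý with a tab, then pad with tabs up to 4 in total.
        var field = fields[i].Replace('ý', '\t');
        var count = field.Count(c => c == '\t');
        if (count < 4)
            field += new string('\t', 4 - count);
        fields[i] = field;
    }
    // Join the fields rather than appending "\t" after every one,
    // which would leave an extra tab at the end of each line.
    result.Add(string.Join("\t", fields));
}
File.WriteAllLines(output_file, result);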

This could probably be made faster with StringBuilder, but it's a good start.
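
For reference, here is a minimal sketch of what a StringBuilder version might look like. The names FieldPadder, PadField and PadLine are made up for illustration, and it assumes every tab-separated field should be padded to 4 tabs, the same as the loop above. Fields that already hold 4 or more tabs are left as they are.

```csharp
using System;
using System.Text;

public static class FieldPadder
{
    // Hypothetical helper: turns each 'ý' into a tab, then appends tabs
    // until the field contains at least requiredTabs tabs.
    public static string PadField(string field, int requiredTabs = 4)
    {
        var sb = new StringBuilder(field.Length + requiredTabs);
        int tabs = 0;
        foreach (char c in field)
        {
            if (c == 'ý' || c == '\t')
            {
                sb.Append('\t');
                tabs++;
            }
            else
            {
                sb.Append(c);
            }
        }
        // Only pad when the field is short of tabs; never remove any.
        if (tabs < requiredTabs)
            sb.Append('\t', requiredTabs - tabs);
        return sb.ToString();
    }

    // Hypothetical helper: splits a line on tabs, pads every field,
    // and joins the fields back together with single tabs.
    public static string PadLine(string line)
    {
        string[] fields = line.Split('\t');
        var sb = new StringBuilder(line.Length + fields.Length * 4);
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0)
                sb.Append('\t');
            sb.Append(PadField(fields[i]));
        }
        return sb.ToString();
    }
}
```

With this in place, the body of the outer loop would reduce to result.Add(FieldPadder.PadLine(line)).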

Upvotes: 1

Austin Salonen

Reputation: 50225

private static string SplitAndPadded(string line, string joinedWith = "\t", char splitOn = 'ý')
{
    // 4 required splits yields 5 items ( 1 | 2 | 3 | 4 | 5 )
    // could/should be a parameter; this allowed for the cleaner comment
    const int requiredItems = 5;

    // the empty string case
    var required = Enumerable.Repeat(string.Empty, requiredItems);

    // keep empty items; 3rd test case
    var parts = line.Split(new[] { splitOn });

    // this will exclude items when parts.Count() > requiredItems
    return string.Join(joinedWith, parts.Concat(required).Take(requiredItems));
}


// usage
// SplitAndPadded has optional parameters, so it cannot be passed as a
// method group to Select; wrap it in a lambda instead
var lines = File.ReadAllLines(file).Select(line => SplitAndPadded(line)).ToArray();
File.WriteAllLines(outputFile, lines);

// if input and output files are different, you don't need the ToArray (you can stream)
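
A rough sketch of the streaming variant mentioned in the last comment, assuming File.ReadLines (which reads lazily) is used in place of File.ReadAllLines and that the input and output paths are different files. StreamingRewrite and Rewrite are hypothetical names; the transform parameter would be something like line => SplitAndPadded(line).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class StreamingRewrite
{
    // Hypothetical helper: reads inputPath lazily, applies transform to
    // each line, and writes the results to outputPath as they are produced.
    // Assumes inputPath and outputPath are different files, because the
    // input is still open for reading while the output is being written.
    public static void Rewrite(string inputPath, string outputPath,
                               Func<string, string> transform)
    {
        IEnumerable<string> lines = File.ReadLines(inputPath).Select(transform);
        File.WriteAllLines(outputPath, lines);
    }
}
```

Nothing is buffered here beyond the current line, so this should hold up on large files, at the cost of not being able to overwrite the input in place.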

Upvotes: 1

Hossein Narimani Rad

Reputation: 32481

Try this:

string[] lines = System.IO.File.ReadAllLines(input_file);

for (int i = 0; i < lines.Length; i++)
{
    string line = lines[i].Replace("ý", "\t");
    // Count the tabs already present in the line.
    int n = line.Split('\t').Length - 1;
    // Pad up to 4 tabs; Math.Max guards against a negative count
    // when the line already contains more than 4.
    line += new string('\t', System.Math.Max(0, 4 - n));
    lines[i] = line;
}

System.IO.File.WriteAllLines(output_file, lines);

Upvotes: 1
