Reputation: 1498
I have a tab-delimited file, and some of the strings contain a ý character which needs to be replaced with a \t. Also, each string needs to contain 4 tabs total, with any extra tabs tacked on to the end. For example, the strings:
1234ý5678
1234
ý1234ý5678
should look like
1234\t5678\t\t\t
1234\t\t\t\t
\t1234\t5678\t\t
Here's what I have so far:
string[] input_file = (string[])(e.Data.GetData(DataFormats.FileDrop));
string output_file = @"c:\filename.txt";
foreach (string file in input_file)
{
    string[] lines = File.ReadAllLines(file);
    for (int i = 0; i < lines.Length; i++)
    {
        string line = lines[i];
        string[] values = line.Split('\t');
        //look at each value in values, replace any ý with a tab, and add
        //tabs at the end of the value so there are 4 total
        lines[i] = String.Join("\t", values);
    }
    File.WriteAllLines(output_file, lines);
}
EDIT: some clarification - the entire line might look like this:
331766*ALL1 16ý7 14561ý8038 14560ý8037 ausername 11:54:05 12 Nov 2007
I need to look at each string that makes up the line, and replace any ý with a \t, and add \t's to the end so each string has a total of 4. Here's what the result should look like:
331766*ALL1 16\t7\t\t\t 14561\t8038\t\t\t 14560\t8037\t\t\t ausername 11:54:05 12 Nov 2007
Upvotes: 1
Views: 2549
Reputation: 6270
What you do is:
1. Split each line into strings, using \t as the separator.
2. Iterate through the strings.
3. In each string, replace ý with \t.
4. Count the \t characters now in the string, and append more until there are 4.
Here's some code:
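// requires: using System.Collections.Generic; using System.IO; using System.Linq;
string[] lines = File.ReadAllLines(input_file);
var result = new List<string>();
foreach (var line in lines)
{
    var fields = new List<string>();
    foreach (var s in line.Split('\t'))
    {
        // replace each ý with a tab, then pad the field up to 4 tabs
        var newString = s.Replace('ý', '\t');
        var count = newString.Count(c => c == '\t');
        if (count < 4)
            newString += new string('\t', 4 - count);
        fields.Add(newString);
    }
    // join the fields rather than appending "\t" after each one,
    // which would leave a stray tab at the end of every line
    result.Add(string.Join("\t", fields));
}
File.WriteAllLines(output_file, result);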
This could possibly be optimized better for speed using StringBuilder, but it's a good start.
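As a rough illustration of the StringBuilder idea, here is a minimal sketch. The class name TabPadding, the method name PadLine and the tabsPerField parameter are made up for this example, and it assumes, like the code above, that every tab-separated field gets padded.

```csharp
using System;
using System.Text;

static class TabPadding
{
    // Hypothetical helper: inside every tab-separated field, replaces each 'ý'
    // with a tab and then pads the field so it holds at least tabsPerField tabs.
    public static string PadLine(string line, int tabsPerField = 4)
    {
        var sb = new StringBuilder(line.Length + 16);
        string[] fields = line.Split('\t');
        for (int f = 0; f < fields.Length; f++)
        {
            if (f > 0) sb.Append('\t'); // separator between fields
            int tabs = 0;
            foreach (char c in fields[f])
            {
                if (c == 'ý') { sb.Append('\t'); tabs++; }
                else sb.Append(c);
            }
            // Append the missing tabs in one call instead of a += loop.
            if (tabs < tabsPerField) sb.Append('\t', tabsPerField - tabs);
        }
        return sb.ToString();
    }
}
```

With this in place, the inner loop of the code above could be swapped for result.Add(TabPadding.PadLine(line)). Whether it is noticeably faster depends on the file size, so measure before relying on it.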
Upvotes: 1
Reputation: 50225
private static string SplitAndPadded(string line, string joinedWith = "\t", char splitOn = 'ý')
{
    // 4 required splits yields 5 items ( 1 | 2 | 3 | 4 | 5 )
    // could/should be a parameter; this allowed for the cleaner comment
    const int requiredItems = 5;
    // the empty string case
    var required = Enumerable.Repeat(string.Empty, requiredItems);
    // keep empty items; 3rd test case
    var parts = line.Split(new[] { splitOn });
    // this will exclude items when parts.Count() > requiredItems
    return string.Join(joinedWith, parts.Concat(required).Take(requiredItems));
}
//usage (requires using System.IO; and using System.Linq;)
// a lambda is used because method-group conversion, .Select(SplitAndPadded),
// does not fill in the optional parameters
var lines = File.ReadAllLines(file).Select(line => SplitAndPadded(line)).ToArray();
File.WriteAllLines(outputFile, lines);
// if input and output files are different, you don't need the ToArray (you can stream)
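To check the method against the three sample strings from the question, and to apply it field by field as the question's edit asks, something like the sketch below might help. SplitAndPadDemo and PadEachField are hypothetical names, and the method body is copied here only so the block compiles on its own.

```csharp
using System;
using System.Linq;

static class SplitAndPadDemo
{
    // Same logic as SplitAndPadded above, repeated so this block is self-contained.
    public static string SplitAndPadded(string field, string joinedWith = "\t", char splitOn = 'ý')
    {
        const int requiredItems = 5; // 5 items joined by 4 separators
        var padding = Enumerable.Repeat(string.Empty, requiredItems);
        return string.Join(joinedWith, field.Split(splitOn).Concat(padding).Take(requiredItems));
    }

    // Hypothetical wrapper: applies the method to every tab-separated field of a line.
    public static string PadEachField(string line)
    {
        return string.Join("\t", line.Split('\t').Select(f => SplitAndPadded(f)));
    }
}
```

For the question's samples, SplitAndPadded("1234ý5678") gives "1234\t5678\t\t\t", SplitAndPadded("1234") gives "1234\t\t\t\t", and SplitAndPadded("ý1234ý5678") gives "\t1234\t5678\t\t". Note that PadEachField pads every field, including ones that never contained a ý.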
Upvotes: 1
Reputation: 32481
Try this:
string[] lines = System.IO.File.ReadAllLines(input_file);
for (int i = 0; i < lines.Length; i++)
{
    string line = lines[i].Replace("ý", "\t");
    // number of tabs = number of split parts minus one
    int n = line.Split('\t').Length - 1;
    // pad up to 4 tabs; Math.Max avoids a negative count when the line already has more than 4
    line += new string('\t', Math.Max(0, 4 - n));
    lines[i] = line;
}
System.IO.File.WriteAllLines(output_file, lines);
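The loop above treats each line as a single string. If, as the example in the question's edit seems to suggest, only the tab-separated fields that contain a ý should be expanded and padded, a per-field variant could look like the sketch below. That reading is an assumption (it differs from the question's first example, where a plain 1234 is padded too), and MarkerFields and ExpandMarkedFields are hypothetical names.

```csharp
using System;
using System.Linq;

static class MarkerFields
{
    // Hypothetical helper: expands only the fields that contain the marker,
    // padding each of them up to `tabs` tabs; other fields pass through unchanged.
    public static string ExpandMarkedFields(string line, char marker = 'ý', int tabs = 4)
    {
        var fields = line.Split('\t').Select(field =>
        {
            if (field.IndexOf(marker) < 0) return field;
            string expanded = field.Replace(marker, '\t');
            int present = expanded.Count(c => c == '\t');
            return expanded + new string('\t', Math.Max(0, tabs - present));
        });
        return string.Join("\t", fields);
    }
}
```

Inside the loop this would be used as lines[i] = MarkerFields.ExpandMarkedFields(lines[i]);.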
Upvotes: 1