Reputation: 4180
My goal is to write a C# program that reads a .txt file with thousands of rows and reformats the rows that meet a condition.
I will read each row and check whether it contains 1901; if it does, I want to change the row.
For example this line,
T1.hello < 1901, AS bye
I want to replace it with this,
T2.hello AS bye
In this case, hello and bye are the two pieces of data I am going to keep. The format should change from T1.data1 < 1901, AS data2 to T2.data1 AS data2 if and only if the old row contains 1901.
Please note that data1 and data2 can be any data; they are not always hello and bye.
I have never used file I/O in C#, so I am running out of ideas. The code I have so far is below. I am stuck at the if statement and need some guidance on how to handle this situation:
string path = @"C:\Users\jzhu\Desktop\test1.txt";
StreamReader reader = new StreamReader(File.OpenRead(path));
string fileContent = reader.ReadToEnd();
reader.Close();
List<string> lines = new List<string>(File.ReadAllLines(path));
for (int i = 0; i < lines.Count; i++)
{
if(lines[i].Contains("1901"))
{
//here is the part I need guidance
}
}
StreamWriter writer = new StreamWriter(File.Create(path));
writer.Write(fileContent);
writer.Close();
Upvotes: 0
Views: 110
Reputation: 13138
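Use Regex.Replace. Lines that do not match the pattern are returned unchanged, so no separate Contains check is needed:
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

string path = @"C:\Users\jzhu\Desktop\test1.txt";
List<string> lines = new List<string>(File.ReadAllLines(path));
for (int i = 0; i < lines.Count; i++)
{
    // $1 re-inserts whatever was captured between "T1." and the first space.
    lines[i] = Regex.Replace(lines[i], @"T1\.([^ ]*) < 1901, AS", "T2.$1 AS");
}
File.WriteAllLines(path, lines);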
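To try the replacement without touching a file, the same call can be wrapped in a small helper and run on in-memory strings. This is only a sketch: the class name RowRewriter and the method Rewrite are hypothetical names introduced for illustration, and the second sample line is made up to show a row that should stay unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowRewriter
{
    // Same pattern as in the answer: capture the text between "T1." and the first space.
    private static readonly Regex Pattern = new Regex(@"T1\.([^ ]*) < 1901, AS");

    // Hypothetical helper: returns the rewritten row, or the row unchanged if it does not match.
    public static string Rewrite(string line)
    {
        return Pattern.Replace(line, "T2.$1 AS");
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite("T1.hello < 1901, AS bye")); // prints "T2.hello AS bye"
        Console.WriteLine(Rewrite("T1.other AS thing"));       // no match, printed unchanged
    }
}
```

Keeping the rewrite in a function of one string makes it easy to check a few sample rows before overwriting the real file.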
Upvotes: 0
Reputation: 9500
I think this is a case for a regular expression, because you want to capture a variable amount of data after the T1 and preserve it. Try something like this:
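// The dot after T1 is escaped so that it matches a literal '.' rather than any character.
string pattern = @"T1\.([^ ]+) < 1901,( .*)";
Regex rgx = new Regex(pattern);
for (int i = 0; i < lines.Count; i++)
{
    Match m = rgx.Match(lines[i]);
    if (m.Success)
    {
        // Groups[1] is data1; Groups[2] is everything after "1901,", including the leading space.
        lines[i] = rgx.Replace(lines[i], "T2." + m.Groups[1].Value + m.Groups[2].Value);
    }
}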
The parts of the pattern in parentheses are what gets captured into groups on the Match object (the first group on the match, index 0, is the whole matched text itself).
So ([^ ]+) finds everything after 'T1.' that is not a space, up to the first space, and stores it in the second Match group (index 1).
( .*) finds everything after '1901,', beginning with a space and followed by any characters repeated any number of times (.*), and stores it in the third group (index 2). Since these items are preserved in groups, you can retrieve them when you write the replacement string.
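To see what ends up in each group, a short sketch can print them for the sample row from the question. The class name GroupDemo and the method Capture are hypothetical, and this version of the pattern escapes the dot after T1 so that it matches a literal period.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns groups 0, 1 and 2 for a row, or an empty array if there is no match.
    public static string[] Capture(string line)
    {
        Match m = Regex.Match(line, @"T1\.([^ ]+) < 1901,( .*)");
        if (!m.Success)
        {
            return new string[0];
        }
        return new[] { m.Groups[0].Value, m.Groups[1].Value, m.Groups[2].Value };
    }

    public static void Main()
    {
        // Brackets make the leading space in group 2 visible.
        foreach (string g in Capture("T1.hello < 1901, AS bye"))
        {
            Console.WriteLine("[" + g + "]");
        }
        // prints:
        // [T1.hello < 1901, AS bye]
        // [hello]
        // [ AS bye]
    }
}
```

Because group 2 already starts with a space, concatenating "T2." with group 1 and group 2 gives "T2.hello AS bye" with no extra spacing to add.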
Upvotes: 1
Reputation: 1009
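What you could do is build the output with a StringBuilder:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < lines.Count; i++)
{
    if (lines[i].Contains("1901"))
    {
        // Remove the condition together with its leading space, then switch the prefix from T1 to T2.
        sb.AppendLine(lines[i].Replace(" < 1901,", "").Replace("T1.", "T2."));
    }
    else
    {
        sb.AppendLine(lines[i]);
    }
}
using (StreamWriter writer = new StreamWriter(path))
{
    writer.Write(sb.ToString());
}
This assumes you know the exact text to remove (" < 1901,") and that the prefix to change is always T1. Note that Replace changes every occurrence of "T1." in the row, not just the first.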
Upvotes: 2