OPK

Reputation: 4180

Replacing string in a .txt file

My goal is to write a C# program that reads a .txt file with thousands of rows and reformats the rows that meet a condition.

I will read each row and check whether it contains 1901; if it does, I want to change the row.

For example, this line:

 T1.hello < 1901, AS bye

I want to replace it with this:

 T2.hello AS bye

Here, hello and bye are the two pieces of data I want to keep. The row should change from the format T1.data1 < 1901, AS data2 to the format T2.data1 AS data2 if and only if the old row contains 1901.

Please note that data1 and data2 can be any data; they are not always hello and bye.

I have never used file I/O in C#, so I am running out of ideas. The code I have so far is below. I am stuck at the if statement and need some guidance on how to handle this situation:

string path = @"C:\Users\jzhu\Desktop\test1.txt";
StreamReader reader = new StreamReader(File.OpenRead(path));
string fileContent = reader.ReadToEnd();
reader.Close();
List<string> lines = new List<string>(File.ReadAllLines(path));
for (int i = 0; i < lines.Count; i++)
{
     if(lines[i].Contains("1901"))
     {         
         //here is the part I need guidance
     }
}
StreamWriter writer = new StreamWriter(File.Create(path));
writer.Write(fileContent);
writer.Close();

Upvotes: 0

Views: 110

Answers (3)

Guillaume

Reputation: 13138

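Use Regex.Replace (from the System.Text.RegularExpressions namespace). Lines that do not match the pattern are returned unchanged, so no separate Contains check is needed: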

string path = @"C:\Users\jzhu\Desktop\test1.txt";
List<string> lines = new List<string>(File.ReadAllLines(path));
for (int i = 0; i < lines.Count; i++)
{
     lines[i] = Regex.Replace(lines[i], @"T1\.([^ ]*) < 1901, AS", "T2.$1 AS");
}
File.WriteAllLines(path, lines);
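
To see what the replacement does to a single row without touching a file, here is a minimal sketch. The RowRewriter class and its Rewrite method are names made up for illustration; the pattern and the replacement string are the ones used above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowRewriter
{
    // Same pattern as above: capture the name after "T1." up to the first space.
    private static readonly Regex Pattern = new Regex(@"T1\.([^ ]*) < 1901, AS");

    // Returns the row unchanged when the pattern does not match.
    public static string Rewrite(string row)
    {
        // $1 in the replacement refers to the first captured group.
        return Pattern.Replace(row, "T2.$1 AS");
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite(" T1.hello < 1901, AS bye"));   // prints " T2.hello AS bye"
        Console.WriteLine(Rewrite(" T1.other < 2000, AS thing")); // printed unchanged
    }
}
```

Note that the pattern requires the exact spacing shown in the question; rows with different spacing around < or the comma would be left as they are.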

Upvotes: 0

DWright

Reputation: 9500

I think this is a case for a regular expression, because you want to capture a variable amount of data after the T1 and preserve it. Try something like this:

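// lines is the List<string> from the question; requires using System.Text.RegularExpressions;
// The dot after T1 is escaped so that it only matches a literal period.
string pattern = @"T1\.([^ ]+) < 1901,( .*)";
Regex rgx = new Regex(pattern);
for (int i = 0; i < lines.Count; i++)
{
    Match m = rgx.Match(lines[i]);
    if (m.Success)
    {
        // Rebuild the matched text from the two captured groups.
        lines[i] = rgx.Replace(lines[i], "T2." + m.Groups[1].Value + m.Groups[2].Value);
    }
}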

The parts of the pattern in parentheses are captured into groups on the Match object. Group 0 is always the entire matched text, so the captures start at index 1.

([^ ]+) matches everything after 'T1.' that is not a space, up to the first space, and stores it in group 1.

( .*) matches everything after '1901,', starting with a space and followed by any characters repeated any number of times (.*), and stores it in group 2. Since these pieces are preserved in groups, you can retrieve them when you build the replacement string.
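
To check what ends up in each group, the sample row from the question can be run through the pattern. This is only a sketch: the GroupDemo class and the CaptureGroups helper are names made up for illustration, and the dot after T1 is escaped here so that it matches a literal period.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    private static readonly Regex Rgx = new Regex(@"T1\.([^ ]+) < 1901,( .*)");

    // Returns groups 0, 1 and 2 of the first match, or null when the row does not match.
    public static string[] CaptureGroups(string row)
    {
        Match m = Rgx.Match(row);
        if (!m.Success)
        {
            return null;
        }
        return new[] { m.Groups[0].Value, m.Groups[1].Value, m.Groups[2].Value };
    }

    public static void Main()
    {
        string[] g = CaptureGroups(" T1.hello < 1901, AS bye");
        Console.WriteLine("[" + g[0] + "]"); // [T1.hello < 1901, AS bye]
        Console.WriteLine("[" + g[1] + "]"); // [hello]
        Console.WriteLine("[" + g[2] + "]"); // [ AS bye]
    }
}
```

Group 0 starts at T1, not at the leading space of the row, because the match begins where the pattern first fits.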

Upvotes: 1

Dominic Scanlan

Reputation: 1009

What you could do is something like this:

StringBuilder sb = new StringBuilder();
for (int i = 0; i < lines.Count; i++)
{
    if(lines[i].Contains("1901"))
    {         
         sb.AppendLine(lines[i].Replace("< 1901,",""));
    }
    else
    {
        sb.AppendLine(lines[i]);
    }
}

using (StreamWriter writer = new StreamWriter(path))
{
    writer.Write(sb.ToString());
}

This assumes that the only change you need is to replace "< 1901," with an empty string.
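
The Replace call above removes the comparison but keeps the T1. prefix and the space that preceded "<". If the rows really are as regular as the example in the question, a plain-string variant that also swaps the prefix might look like the sketch below. PlainRewriter and Rewrite are hypothetical names, and the code assumes that "T1." appears nowhere else in a row that contains 1901.

```csharp
using System;

public static class PlainRewriter
{
    // Rewrites "T1.x < 1901, AS y" to "T2.x AS y" using only string operations.
    public static string Rewrite(string row)
    {
        if (!row.Contains("1901"))
        {
            return row; // rows without 1901 are left alone
        }
        // Remove the comparison together with its leading space, then swap the prefix.
        // Assumption: "T1." does not occur anywhere else in the row.
        return row.Replace(" < 1901,", "").Replace("T1.", "T2.");
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite(" T1.hello < 1901, AS bye")); // prints " T2.hello AS bye"
    }
}
```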

Upvotes: 2
