user7157732

Reputation: 347

Extract contents between specific tags in a textfile - C#

I have a text file with the following information:

ALLOC

apple1
orange1
banana1

ALLOC
apple2
orange2
banana2

ALLOC
apple3
orange3
banana3

With help from the Stack Overflow community, I am now able to read the whole file. I also found out that to extract the contents between tags, for example ALLOC, I could write:

var filelocation = @"c:\Fruits.txt";
var sectionLines = File.ReadAllLines(filelocation).TakeWhile(l => !l.StartsWith("ALLOC"));

But this gives me a single IEnumerable<string>:

apple1
orange1
banana1    
apple2
orange2
banana2    
apple3
orange3

How do I create 3 separate strings as

string1 = apple1 orange1 banana1
string2 = apple2 orange2 banana2
string3 = apple3 orange3

In short, I need to extract the contents between tags.

Upvotes: 3

Views: 348

Answers (2)

kat1330

Reputation: 5332

Here is one approach that returns the result you want:

string[] words = { "ALLOC", "apple1", "orange1", "banana1", "ALLOC", "apple2", "orange2", "banana2", "ALLOC" };

var result = string.Join(" ", words)
        .Split(new string[] { "ALLOC" }, StringSplitOptions.RemoveEmptyEntries)            
        .Select(p => p.Trim(' '));

First I join all the words into a single string. Then I split that string on "ALLOC" and select the trimmed pieces.

Result is:

string[] result = { "apple1 orange1 banana1", "apple2 orange2 banana2" };

For your case,

var filelocation = @"c:\Fruits.txt";
var allLines = File.ReadAllLines(filelocation);
var sectionLines = string.Join(" ", allLines)
            .Split(new string[] { "ALLOC" }, StringSplitOptions.RemoveEmptyEntries)            
            .Select(p => p.Trim(' '));
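
One caveat with joining and then splitting on the text "ALLOC": any line that merely contains "ALLOC" inside a longer word would also be split, and blank lines in the file leave double spaces in the output. A minimal sketch of a line-based alternative follows, assuming the tag always sits on a line of its own. The names TagSections and SplitByMarker are hypothetical and only for illustration; this is not the code from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagSections
{
    // Collects the lines that follow each marker line into one space-separated string.
    // Assumption: a marker is a line whose trimmed text equals the tag exactly.
    public static List<string> SplitByMarker(IEnumerable<string> lines, string tag)
    {
        var sections = new List<List<string>>();

        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0)
            {
                continue;                                   // skip blank lines
            }

            if (line == tag)
            {
                sections.Add(new List<string>());           // a marker starts a new section
            }
            else if (sections.Count > 0)
            {
                sections[sections.Count - 1].Add(line);     // lines before the first marker are ignored
            }
        }

        return sections
            .Where(s => s.Count > 0)                        // drop markers with nothing after them
            .Select(s => string.Join(" ", s))
            .ToList();
    }
}
```

For the file in the question it could be called as TagSections.SplitByMarker(File.ReadAllLines(@"c:\Fruits.txt"), "ALLOC"), which needs using System.IO; as well.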

Upvotes: 3

Mohit S

Reputation: 14064

This might do the trick for you:

string fullstr = File.ReadAllText("c:\\Fruits.txt");
string[] parts = fullstr.Split(new string[] { "ALLOC" }, StringSplitOptions.RemoveEmptyEntries);
List<string> outputstr = new List<string>();
foreach(string p in parts)
{
    outputstr.Add(p.Replace("\r\n", " ").Trim(' '));
}

Here we read all the text at once using File.ReadAllText and split it on ALLOC. Each piece is then added to outputstr after replacing \r\n (the Windows newline) with a space and trimming the result.
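
Note that replacing only "\r\n" leaves the text untouched if the file uses plain "\n" line endings, and blank lines turn into extra spaces. A possible variation, offered as a sketch and not as the code from this answer, is to split each piece on any whitespace and join the tokens again. The names AllocText and Sections are hypothetical, and it assumes that no fruit name itself contains a space or the tag text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllocText
{
    // Splits the raw file text on the tag, then collapses every run of whitespace
    // (spaces, tabs, "\r\n", "\n", blank lines) inside each piece to a single space.
    public static List<string> Sections(string text, string tag)
    {
        var whitespace = new[] { ' ', '\t', '\r', '\n' };

        return text
            .Split(new[] { tag }, StringSplitOptions.RemoveEmptyEntries)
            .Select(piece => string.Join(" ",
                piece.Split(whitespace, StringSplitOptions.RemoveEmptyEntries)))
            .Where(s => s.Length > 0)                       // drop pieces that were only whitespace
            .ToList();
    }
}
```

With the file from the question this would be called as AllocText.Sections(File.ReadAllText("c:\\Fruits.txt"), "ALLOC"), again with using System.IO; added.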


Upvotes: 1
