user1181429

Reputation: 31

Using LINQ to process a text file

Text File Format

headerinfo = "abc"
**part1=001**
element1
element2....
...
element15
end_element
**part2=002**
element1
element2....
...
element15
end_element
......
end_header

I want to select all rows of text starting from part1=001 up to but not including part2=002.

So far I have:

var res = (from line in File.ReadAllLines(sExecPath + @"\" + sFileName)
           where line == "part1=001"
           select line).ToList();

I am also trying a "between"-style filter in LINQ, but it does not seem to return any results:

var part1= (from prt in File.ReadAllLines(sExecPath + @"\" + sFileName)
            where prt.CompareTo("part1=001") >=0  
            && prt.CompareTo("part=002") >= 0
            select prt);

Upvotes: 3

Views: 3052

Answers (4)

Tigran

Reputation: 62248

The simplest and most straightforward solution that comes to mind is something like this:

var lines = File.ReadAllLines(@"C:\Sample.txt")
    .SkipWhile(line => !line.Contains("part1"))
    .Skip(1)
    .TakeWhile(line => !line.Contains("part2"));

This returns the lines between the two markers. The logic is simple:

  • SkipWhile skips lines until it reaches one that contains "part1".
  • Skip(1) then skips that "part1" line itself; drop this call if you want to keep the header line, as the question asks.
  • TakeWhile finally takes lines until it reaches the one containing "part2".
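
The three steps above can be tried without touching the file system. The following is only a minimal sketch: the `lines` array is a hypothetical in-memory stand-in for the file (markers written without asterisks, which is an assumption), and it uses top-level statements, so it needs C# 9 or later. It shows the pipeline both with and without Skip(1).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the file contents; not read from disk.
string[] lines =
{
    "headerinfo = \"abc\"",
    "part1=001",
    "element1",
    "element2",
    "end_element",
    "part2=002",
    "element1",
    "end_element",
    "end_header",
};

// Skip everything before the part1 header, then stop at the part2 header.
List<string> withHeader = lines
    .SkipWhile(line => !line.Contains("part1"))
    .TakeWhile(line => !line.Contains("part2"))
    .ToList();

// Same pipeline with Skip(1): the part1 header line itself is dropped.
List<string> withoutHeader = lines
    .SkipWhile(line => !line.Contains("part1"))
    .Skip(1)
    .TakeWhile(line => !line.Contains("part2"))
    .ToList();

Console.WriteLine(string.Join(", ", withHeader));    // part1=001, element1, element2, end_element
Console.WriteLine(string.Join(", ", withoutHeader)); // element1, element2, end_element
```

Whether to keep Skip(1) depends only on whether the "part1" header line should be part of the result.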

Upvotes: 1

Chris Shain

Reputation: 51319

I think you are looking for TakeWhile:

var linesInPartOne = File
       .ReadAllLines(sExecPath + @"\" + sFileName)
       .SkipWhile(line => !line.StartsWith("**part1="))
       // To skip the part 1 header line, uncomment the line below:
       // .Skip(1)
       .TakeWhile(line => !line.StartsWith("**part2="));

To generalize this to retrieve any given numbered part, something like this would do:

public static IEnumerable<String> ReadHeaderPart(String filePath, int part) {
    return File
        .ReadAllLines(filePath)
        .SkipWhile(line => !line.StartsWith("**part" + part + "="))
        // To skip the part header line, uncomment the line below:
        // .Skip(1)
        // Stop at the next part header, or at end_header for the last part.
        .TakeWhile(line =>
            !line.StartsWith("**part" + (part + 1) + "=")
            &&
            !line.StartsWith("end_header"))
        .ToList();
}

EDIT: I had a Skip(1) in there to skip the part 1 header. Removed it since you seem to want to keep that line.
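
To see how the generalized version behaves for the last part, where only end_header terminates it, here is a rough sketch that runs the same predicates against an in-memory sequence instead of a file. `TakePart` and the `sample` array are hypothetical names introduced for illustration; the "**partN=" marker format is assumed as in ReadHeaderPart. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample lines using the assumed "**partN=" markers.
string[] sample =
{
    "headerinfo = \"abc\"",
    "**part1=001**",
    "element1",
    "end_element",
    "**part2=002**",
    "element1",
    "end_element",
    "end_header",
};

// Hypothetical in-memory variant of ReadHeaderPart: same predicates, no file I/O.
static List<string> TakePart(IEnumerable<string> lines, int part) =>
    lines
        .SkipWhile(line => !line.StartsWith("**part" + part + "="))
        .TakeWhile(line =>
            !line.StartsWith("**part" + (part + 1) + "=") &&
            !line.StartsWith("end_header"))
        .ToList();

List<string> partOne = TakePart(sample, 1); // stops at the part 2 header
List<string> partTwo = TakePart(sample, 2); // no part 3 header, so stops at end_header

Console.WriteLine(string.Join(", ", partOne)); // **part1=001**, element1, end_element
Console.WriteLine(string.Join(", ", partTwo)); // **part2=002**, element1, end_element
```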

Upvotes: 8

Jeff Mercado

Reputation: 134811

public static IEnumerable<string> GetLinesBetween(
    string path,
    string fromInclusive,
    string toExclusive)
{
    return File.ReadLines(path)
        .SkipWhile(line => line != fromInclusive)
        .TakeWhile(line => line != toExclusive);
}

var path = Path.Combine(sExecPath, sFileName); // prefer Path.Combine over concatenating with @"\"
var result = GetLinesBetween(path, "part1=001", "part2=002").ToList();

Upvotes: 6

eouw0o83hf

Reputation: 9588

LINQ probably isn't your best bet here. Just try something like this:

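var lines = File.ReadAllLines(filename);
var linesICareAbout = new List<string>();
// Advance to the part 1 header, then copy lines until part 2 starts.
int i = 0;
while (i < lines.Length && !lines[i].Contains("part1=001")) ++i;
for (; i < lines.Length && !lines[i].Contains("part2=002"); ++i)
{
    linesICareAbout.Add(lines[i]);
}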

Then do whatever you want with the lines you read in.

However, if you're really dedicated to using LINQ, try TakeWhile:

http://msdn.microsoft.com/en-us/library/bb534804.aspx

Upvotes: 0
