Cylen

Reputation: 119

Analysing line by line and storing if meets criteria, else ignore

I've dug around a lot on this one and not found quite what I was looking for.

INPUT: Multiple lines of ASCII text (hundreds, occasionally thousands), each ranging from 97 to over 500 characters long. Whether I want to keep a line depends purely on its first 3 characters (always digits; the arbitrary values 100, 200 and 300 are the ones I'm interested in).

The output required is only those lines that start with 100, 200 or 300; the rest I can ignore.

This is what I have as my StreamReader code, which currently outputs every line to the console:

using System;
using System.Collections.Generic;
using System.IO;

class Program
{
public void Do()
{

    // Read in a file line-by-line, and store in a List.

    List<string> list = new List<string>();
    using (StreamReader reader = new StreamReader("File.dat"))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            list.Add(line); // Add to list.
            Console.WriteLine(line); // Write to console.
        //    Console.ReadLine();
        }
    }
}
}

I was hoping to put in a line that says

IF {
FIRST3CHAR != (100,200,300) }
then skip,

but I'm unsure how to define the FIRST3CHAR class. This is the only filter that will be done on the raw data.

Afterwards, I will be analysing this filtered data set based on other criteria contained within it, but I'll give that a shot myself before asking for any assistance.

Upvotes: 2

Views: 152

Answers (3)

Tim Schmelter

Reputation: 460238

This code is more readable and does what you want:

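// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
var allowedNumbers = new[]{ "100", "200", "300" };
// File.ReadLines reads the file lazily; Where keeps only lines with a wanted prefix.
IEnumerable<String> lines = File
                   .ReadLines("File.dat")
                   .Where(l => allowedNumbers.Any(num => l.StartsWith(num)));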

Now you can enumerate the lines, for example with a foreach:

foreach(string line in lines)
{
    Console.WriteLine(line); // Write to console.
}

Since you want to add those lines to a List<string> anyway, you can use Enumerable.ToList instead of the foreach:

List<string> list = lines.ToList();

Upvotes: 5

Marc Gravell

Reputation: 1063619

At the simplest level:

if(line.StartsWith("100") || line.StartsWith("200") || line.StartsWith("300"))
{
    list.Add(line); // Add to list.
    Console.WriteLine(line); // Write to console.
}

If the file is huge (as in, hundreds of thousands of lines), it might also be worth looking at implementing it as an iterator block. But the "starts" test is pretty simple.
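As a rough sketch of the iterator-block idea, assuming a hypothetical helper named `LineFilter.FilterLines` (not part of the original code), the filter could yield matching lines one at a time instead of buffering them all in a list:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LineFilter
{
    // Hypothetical helper: lazily yields only the lines that start with
    // one of the three wanted codes. Nothing is read until the caller
    // starts enumerating the result.
    public static IEnumerable<string> FilterLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Ordinal comparison: the codes are plain ASCII digits, so
            // culture-sensitive matching is not needed.
            if (line.StartsWith("100", StringComparison.Ordinal) ||
                line.StartsWith("200", StringComparison.Ordinal) ||
                line.StartsWith("300", StringComparison.Ordinal))
            {
                yield return line; // handed to the caller immediately
            }
        }
    }
}
```

A caller would wrap a `StreamReader` over "File.dat" in a `using` block and `foreach` over `LineFilter.FilterLines(reader)`, so only the current line needs to be held in memory.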

If you need more flexibility, I would consider a regex; for example:

static readonly Regex re = new Regex("^[123]00", RegexOptions.Compiled);

...
while (...)
{
    if(re.IsMatch(line))
    {
        list.Add(line); // Add to list.
        Console.WriteLine(line); // Write to console.
    }
}
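Because the question describes 100, 200 and 300 as arbitrary values, a character class will not extend cleanly to other codes. A minimal sketch, assuming a hypothetical `PrefixRegex.Build` helper that is not part of the original answer, builds an anchored alternation from a list of literal codes instead:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class PrefixRegex
{
    // Hypothetical helper: for the codes "100", "200", "300" this
    // produces the pattern ^(?:100|200|300).
    public static Regex Build(params string[] codes)
    {
        // Regex.Escape keeps any special characters in a code literal.
        string alternatives = string.Join("|", codes.Select(c => Regex.Escape(c)));
        return new Regex("^(?:" + alternatives + ")", RegexOptions.Compiled);
    }
}
```

With `var re = PrefixRegex.Build("100", "200", "300");`, the `re.IsMatch(line)` test in the loop stays unchanged, and adding another code only means adding another argument.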

Upvotes: 2

Daniel Hilgarth

Reputation: 174397

Is there a reason why you don't just add this condition to your loop?

while ((line = reader.ReadLine()) != null)
{
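    // Substring(0, 3) throws on lines shorter than three characters (e.g. a blank line), so skip those first.
    if(line.Length < 3)
        continue;
    var beginning = line.Substring(0, 3);
    if(beginning != "100" && beginning != "200" && beginning != "300")
        continue;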
    list.Add(line); // Add to list.
    Console.WriteLine(line); // Write to console.
}
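If the list of codes is likely to grow, the chain of comparisons could be replaced with a set lookup. This is only a sketch: the `PrefixCheck` class and its `IsWanted` method are hypothetical names, not something from the question:

```csharp
using System;
using System.Collections.Generic;

static class PrefixCheck
{
    // Assumed set of wanted codes; add further codes here as needed.
    private static readonly HashSet<string> Wanted =
        new HashSet<string>(StringComparer.Ordinal) { "100", "200", "300" };

    // Hypothetical helper: true when the first three characters of the
    // line are one of the wanted codes.
    public static bool IsWanted(string line)
    {
        // The length check comes first because Substring(0, 3) throws
        // on strings shorter than three characters.
        return line != null
            && line.Length >= 3
            && Wanted.Contains(line.Substring(0, 3));
    }
}
```

Inside the loop the test then becomes `if (!PrefixCheck.IsWanted(line)) continue;`.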

Upvotes: 1
