user1282609

Reputation: 575

Read specific data from text file

I have a text file like the following (the real file has more than a hundred thousand lines):

Header
AGROUP1
ADATA1|0000
ADATA2|0001
ADATA3|0002
D0000|TNE
D0001|TNE
D0002|TNE
AGROUP2
ADATA1|0000
ADATA2|0001
ADATA3|0002
D0000|TNE
D0001|TNE
D0002|TNE
AGROUP3
ADATA1|0000
ADATA2|0001
ADATA3|0002
D0000|TNE
D0001|TNE
D0002|TNE

In fact, the real file has more than a hundred thousand lines.

I need to read data based on the group. For example, in a method:

public void ReadData(string strGroup)
{
    if (strGroup == "AGROUP2")
    {
        // Read from the text file starting at line "AGROUP2" and stopping at "AGROUP3"
        // (i.e. the lines under AGROUP2).
    }
}

What I have tried is:

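public IEnumerable<string> ReadData(string strGroup)
{
    // reader is a StreamReader opened on the text file elsewhere.
    string line;
    bool start = false;
    while ((line = reader.ReadLine()) != null)
    {
        // A line of length 5 is treated as a group header.
        if (line == strGroup && line.Length == 5)
            start = true;
        else if (line.Length == 5)
            start = false;
        if (start)
            yield return line;
    }
}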

It works, but performance is poor: the text file is very large, and the method evaluates an if condition on every line.

Is there a better way to do this?

Upvotes: 0

Views: 1216

Answers (2)

Emond

Reputation: 50712

If you know anything about the structure of the file, you can use it to your advantage:

  • If the list is sorted, you know when to stop parsing.
  • If the list contains jump tables or an index, you can skip lines.
  • If the groups have a fixed number of lines, you can skip over them.
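As an illustration of stopping early, here is a minimal sketch that returns the lines under one group and stops reading at the next group header. The names `GroupReader`, `IsGroupHeader`, `ReadGroup` and `ReadGroupFromFile` are hypothetical, and it assumes headers look like "AGROUP1", "AGROUP2", ... as in the sample data. Unlike the code in the question, it does not return the header line itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class GroupReader
{
    // Hypothetical helper: decides whether a line starts a new group.
    // Assumption: headers begin with "AGROUP", as in the sample file.
    public static bool IsGroupHeader(string line) =>
        line.StartsWith("AGROUP", StringComparison.Ordinal);

    // Yields the lines under the requested group and stops at the next header,
    // so the rest of the input is never enumerated.
    public static IEnumerable<string> ReadGroup(IEnumerable<string> lines, string group)
    {
        return lines
            .SkipWhile(l => l != group)          // skip everything before the header
            .Skip(1)                             // drop the header line itself
            .TakeWhile(l => !IsGroupHeader(l));  // stop at the next group header
    }

    // File.ReadLines streams the file lazily, so reading ends once TakeWhile stops.
    public static IEnumerable<string> ReadGroupFromFile(string path, string group) =>
        ReadGroup(File.ReadLines(path), group);
}
```

This still scans every line before the requested group, so it mainly helps when the group is near the top of the file or when the remainder of the file is large.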

If not, you have to search from top to bottom, and you can only gain speed through technical tricks:

  • Read batches of lines instead of single lines.
  • Avoid creating many tiny objects (strings) in your code, which can put pressure on the garbage collector.
  • If you need a lot of random access (going back and forth throughout the file), consider indexing or splitting the file first.
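The indexing idea can be sketched as follows: scan the file once, remember the byte offset just after each group header, and later seek straight to that offset. This is only a sketch under assumptions: `GroupIndex`, `Build` and `ReadGroup` are hypothetical names, headers are assumed to start with "AGROUP", and the file is assumed to be ASCII or UTF-8 with "\n" or "\r\n" line endings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class GroupIndex
{
    // One pass over the file: map each group header to the byte offset
    // of the first line that follows it.
    public static Dictionary<string, long> Build(string path)
    {
        var index = new Dictionary<string, long>();
        var current = new List<byte>();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, 1 << 16))
        {
            int b;
            while ((b = fs.ReadByte()) != -1)
            {
                if (b != '\n') { current.Add((byte)b); continue; }
                string line = Encoding.UTF8.GetString(current.ToArray()).TrimEnd('\r');
                if (line.StartsWith("AGROUP", StringComparison.Ordinal))
                    index[line] = fs.Position;   // first byte after the header line
                current.Clear();
            }
        }
        return index;
    }

    // Seeks directly to the group's data and reads until the next header.
    public static IEnumerable<string> ReadGroup(string path, Dictionary<string, long> index, string group)
    {
        if (!index.TryGetValue(group, out long offset)) yield break;
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);
            using (var reader = new StreamReader(fs, Encoding.UTF8))
            {
                string line;
                while ((line = reader.ReadLine()) != null &&
                       !line.StartsWith("AGROUP", StringComparison.Ordinal))
                    yield return line;
            }
        }
    }
}
```

Building the index costs one full scan, so this only pays off when several groups are read from the same unchanged file.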

Upvotes: 1

DXM

Reputation: 1249

What if you use a bash command to cut the huge file into smaller ones, each with an AGROUP# line as its first line? I think bash commands are better optimized for this.
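For comparison, the same splitting step can also be done in C# rather than bash. The following is a rough sketch, not a tuned implementation: `GroupSplitter` and `Split` are hypothetical names, headers are assumed to start with "AGROUP" and to be unique, and each group is written to a file named after its header.

```csharp
using System;
using System.IO;

public static class GroupSplitter
{
    // Writes each group to "<outputDir>/<header>.txt", header line first.
    // Lines before the first header (such as "Header") are skipped.
    // Assumption: header names are unique; a repeated header overwrites its file.
    public static void Split(string inputPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        StreamWriter writer = null;
        try
        {
            foreach (string line in File.ReadLines(inputPath))
            {
                if (line.StartsWith("AGROUP", StringComparison.Ordinal))
                {
                    writer?.Dispose();   // close the previous group's file
                    writer = new StreamWriter(Path.Combine(outputDir, line + ".txt"));
                }
                writer?.WriteLine(line); // null until the first header is seen
            }
        }
        finally
        {
            writer?.Dispose();
        }
    }
}
```

After splitting, reading a group is just a matter of opening its own small file, for example `File.ReadLines(Path.Combine(outputDir, "AGROUP2.txt"))`.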

Upvotes: 0
