satheesh kumar

Reputation: 139

How to directly read specific lines from a large text file without searching every line, in C or Java

I have a very large text file (more than 7 GB) of account details. Each line contains the details of a single account, plus other information. I want to read only the accounts whose first 3 characters are "XBB". Searching line by line takes a long time, so I want to jump directly to the lines that start with "XBB".

Is there any way to do that in Java, VB, or VB.NET?

Upvotes: 3

Views: 336

Answers (4)

prprcupofcoffee

Reputation: 2970

It doesn't matter what language you use; the only way to find something is to search for it. You can use a search tool like Lucene to do the searching ahead of time, i.e., create a full-text search index, or you can do the searching when you need to, as you're doing now, but you won't be able to escape the searching part.
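For reference, the "search when you need to" option might look like the sketch below: a single buffered, sequential pass that keeps only lines starting with the prefix. The class and method names are hypothetical, and ISO-8859-1 is assumed so that every byte decodes without error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: baseline linear scan, no index required.
class LinearPrefixScan {

    // Reads the file once, front to back, and collects lines that start with prefix.
    static List<String> scan(String path, String prefix) throws IOException {
        List<String> hits = new ArrayList<>();
        // Assumption: single-byte encoding; ISO-8859-1 never fails to decode a byte.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(path), StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(prefix)) {
                    hits.add(line);
                }
            }
        }
        return hits;
    }
}
```

This still touches every line, so its cost grows with the size of the file; it is only the baseline that an index or a sorted file would improve on.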

Upvotes: 1

Sam Axe

Reputation: 33738

You can do this only if you have an index file, and that index file covers the particular column of data you want to search on.

The other option would be to load the file into a database, like SQL Server Express, and run a SQL query on it.

Upvotes: 0

Luis Cruz

Reputation: 23

Use regular expressions (regex). With these you can define a pattern that matches only those specific letters. A Scanner can then look for only that sequence of letters.

Upvotes: -1

Ted Hopp

Reputation: 234857

If the lines are sorted by their first 3 characters, then you can do a binary search. This is straightforward if the lines are a fixed length. Otherwise, you will need to search for the start of each line at each step of the binary search.
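A minimal sketch of the fixed-length case, under these assumptions: every record (newline included) is exactly `recordLen` bytes, the file is ASCII, and records are sorted in byte order by their leading characters. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: binary search over sorted, fixed-length records.
class FixedLengthPrefixSearch {

    // Reads the record at a 0-based index; the offset is simply index * recordLen.
    static String readRecord(RandomAccessFile file, long index, int recordLen) throws IOException {
        byte[] buf = new byte[recordLen];
        file.seek(index * recordLen);
        file.readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    // Returns every record that starts with prefix (requires prefix.length() <= recordLen).
    static List<String> findByPrefix(String path, String prefix, int recordLen) throws IOException {
        List<String> matches = new ArrayList<>();
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long count = file.length() / recordLen;
            long lo = 0, hi = count;
            // Lower-bound search: find the first record whose prefix is >= the target.
            while (lo < hi) {
                long mid = (lo + hi) >>> 1;
                String key = readRecord(file, mid, recordLen).substring(0, prefix.length());
                if (key.compareTo(prefix) < 0) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            // In a sorted file the matches are contiguous; read forward until the prefix changes.
            for (long i = lo; i < count; i++) {
                String record = readRecord(file, i, recordLen);
                if (!record.startsWith(prefix)) {
                    break;
                }
                matches.add(record.trim()); // drop the trailing newline and any padding
            }
        }
        return matches;
    }
}
```

Each step of the search costs one seek and one small read, so even a multi-gigabyte file needs only a few dozen probes before the forward read begins.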

If you know the index of the line, you can try going to it directly. Again, this is trivial if the lines are a fixed length; otherwise you will still have to probe and search a bit.
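For variable-length lines, the "probe and search a bit" step could be sketched as follows: jump to an arbitrary byte offset, then skip forward to the next line boundary. The helper names are hypothetical, and the sketch assumes a single-byte encoding with `\n` or `\r\n` line endings.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: align an arbitrary byte offset to the start of a line.
class LineProbe {

    // Returns the offset of the first line starting at or after pos,
    // or the file length if no line starts there.
    static long nextLineStart(RandomAccessFile file, long pos) throws IOException {
        if (pos <= 0) {
            return 0;
        }
        if (pos >= file.length()) {
            return file.length();
        }
        // Start one byte early: if that byte is the newline, pos is already a line start.
        file.seek(pos - 1);
        file.readLine(); // discard the rest of the line containing byte pos - 1
        return file.getFilePointer();
    }

    // Reads the first complete line at or after pos; returns null at end of file.
    static String lineAtOrAfter(String path, long pos) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(nextLineStart(file, pos));
            return file.readLine();
        }
    }
}
```

A binary search over variable-length lines would call something like `nextLineStart` at each midpoint before comparing prefixes.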

In Java, the tool to use for this is RandomAccessFile. I don't know about VB/VB.net.

Following the suggestion by Peter Lawrey, if you are willing to scan the file once, you can build an index of the offset into the file at which each 3-character prefix starts. You can then use this to very quickly get to the correct line.
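One way such an index might be built is sketched below. It assumes an ASCII file with `\n` line endings in which lines sharing a prefix are grouped together (for example, because the file is sorted), so one offset per prefix is enough. The class and method names are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: prefix -> byte offset of the first line with that prefix.
class PrefixOffsetIndex {
    static final int PREFIX_LEN = 3;

    // One sequential pass over the file, tracking byte offsets by hand.
    static Map<String, Long> build(String path) throws IOException {
        Map<String, Long> index = new HashMap<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path), 1 << 16)) {
            long offset = 0;    // offset of the byte being examined
            long lineStart = 0; // offset where the current line began
            StringBuilder prefix = new StringBuilder(PREFIX_LEN);
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    lineStart = offset + 1;
                    prefix.setLength(0);
                } else if (prefix.length() < PREFIX_LEN) {
                    prefix.append((char) b);
                    if (prefix.length() == PREFIX_LEN) {
                        // Keep only the first offset seen for each prefix.
                        index.putIfAbsent(prefix.toString(), lineStart);
                    }
                }
                offset++;
            }
        }
        return index;
    }

    // Seeks to the indexed offset and reads lines until the prefix changes.
    static List<String> lookup(String path, Map<String, Long> index, String prefix) throws IOException {
        List<String> lines = new ArrayList<>();
        Long start = index.get(prefix);
        if (start == null) {
            return lines;
        }
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(start);
            String line;
            while ((line = file.readLine()) != null && line.startsWith(prefix)) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

The map could be saved to a small side file after the first scan, so later lookups need no scan at all. If the file is not grouped by prefix, the index would have to store every matching offset rather than just the first.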

Upvotes: 4
