Bentley Carr

Reputation: 702

Efficient Way To Get Line from StreamReader which Matches Regex in C#

I have a file, and I want to get the line of the file which matches a regex query.

My code is something like this:

Assembly assembly = typeof(EmbeddedResourceGetter).GetTypeInfo().Assembly;
Stream stream = assembly.GetManifestResourceStream(resourcePath);
StreamReader sr = new StreamReader(stream);

return sr.ReadToEnd()
    .Split('\n').ToList()
    .Find(l => Regex.IsMatch(l, "regex-query-here"));

However, I feel this is quite inefficient, and if I need to repeat it multiple times, it can take a long time to complete.

So is there a more efficient way to get a line which matches a regex query without reading the whole file, or will I have to refactor my code in a different way to make it more efficient?

Upvotes: 0

Views: 1050

Answers (2)

M.kazem Akhgary

Reputation: 19149

Find only returns the first match, so if that is all you want, don't read the whole file; that is inefficient. Read the file line by line using File.ReadLines.

Also, calling the static Regex.IsMatch on every iteration adds overhead. Create the Regex object once and reuse it.

Regex regex = new Regex("regex-query-here");
return File.ReadLines(path).FirstOrDefault(l => regex.IsMatch(l));

File.ReadLines enumerates the file lazily, yielding one line at a time. FirstOrDefault stops iterating as soon as the first match is found, so if your match is on the 23rd line, only the first 23 lines are enumerated before you get your result.

Reading the whole file into memory may be faster, but that is a trade-off between memory and performance.

Another thing I have to mention is that splitting on \n is not a cross-platform way to get lines: files with Windows-style line endings use \r\n, so each line would keep a trailing \r.
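Since the question reads from an embedded resource stream rather than a file path, File.ReadLines cannot be applied to it directly. The following is a minimal sketch of the same lazy, stop-at-first-match idea over any Stream. The names LineFinder, ReadLines and FirstMatchingLine are hypothetical helpers made up for this illustration, not part of the framework.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LineFinder
{
    // Lazily yields lines from a stream, one at a time.
    // StreamReader.ReadLine treats \n, \r and \r\n as line endings.
    public static IEnumerable<string> ReadLines(Stream stream)
    {
        using (var reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // Returns the first line that matches the regex, or null if no line matches.
    // FirstOrDefault stops enumerating as soon as a match is found.
    public static string FirstMatchingLine(Stream stream, Regex regex)
    {
        return ReadLines(stream).FirstOrDefault(line => regex.IsMatch(line));
    }
}
```

With the code from the question, this could be called as LineFinder.FirstMatchingLine(assembly.GetManifestResourceStream(resourcePath), regex), where regex is a Regex instance created once and reused across calls.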

Upvotes: 2

Ferit

Reputation: 9657

You should read the file once and store it in a variable, because I/O operations are expensive. Then run the regex on the variable.

When you read the file into a variable, you copy it from the hard disk into RAM. Accessing RAM is fast, while the hard disk is slow, so it is best to read from the hard disk only once.

Also, reading line by line fails if you want to match a multiline pattern.

For example:

Can
you
match
me
if
you
read
me
line
by
line?

The regex "Can\s+you" would fail to match in this case, because "Can" and "you" never end up in the same string.
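To make the difference concrete, here is a small sketch that runs the same regex against the whole text and against each line separately. MultilineDemo and its two methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class MultilineDemo
{
    // Runs the regex over the whole text, so \s can match across line breaks.
    public static bool MatchesWholeText(string text, Regex regex)
    {
        return regex.IsMatch(text);
    }

    // Runs the regex over each line separately, so a match can never span two lines.
    public static bool MatchesAnyLine(string text, Regex regex)
    {
        return text
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
            .Any(line => regex.IsMatch(line));
    }
}
```

For the text "Can\nyou\nmatch\nme" and the regex Can\s+you, MatchesWholeText returns true and MatchesAnyLine returns false, because \s also matches the newline between the two words. If the pattern only ever needs to match within a single line, this limitation does not apply and reading line by line remains an option.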

Upvotes: 1
