Reputation: 31857
I need to count the number of lines in a string. Any kind of line break can be present in the string (CR, LF or CRLF).
So possible new line chars:
* \n
* \r
* \r\n
For example, with the following input:
This is [\n]
an string that [\r]
has four [\r\n]
lines
The method should return 4 lines. Do you know of any built-in function for this, or has someone already implemented it?
static int GetLineCount(string input)
{
// could you provide a good implementation for this method?
// I want to avoid string.Split since it performs really badly
}
NOTE: Performance is important for me, because I could read large strings.
Upvotes: 3
Views: 5605
Reputation: 113272
int count = 0;
int len = input.Length;
for (int i = 0; i != len; ++i)
    switch (input[i])
    {
        case '\r':
            ++count;
            // Treat \r\n as a single line break.
            if (i + 1 != len && input[i + 1] == '\n')
                ++i;
            break;
        case '\n':
        // Uncomment below to include all other line break sequences
        // ('\n' is already '\u000A', so it needs no label of its own)
        // case '\v':
        // case '\f':
        // case '\u0085':
        // case '\u2028':
        // case '\u2029':
            ++count;
            break;
    }
Simply scan through, counting the line breaks, and in the case of \r test whether the next character is \n and skip it if it is.
Performance is important for me, because I could read large strings.
If at all possible, avoid reading large strings into memory in the first place. For example, if the text comes from a stream, the count is easy to compute directly on the stream, because no more than one character of read-ahead is ever needed.
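As a rough sketch of the streaming idea, the same scan can run over any TextReader instead of a string. The class and method names here (StreamLineCounter, CountLineBreaks) are made up for illustration, and instead of peeking one character ahead this version remembers whether the previous character was \r, which amounts to the same thing:

```csharp
using System.IO;

static class StreamLineCounter
{
    // Counts line breaks (\r, \n, or \r\n treated as one) from a TextReader,
    // so the whole text never has to be held in memory as a single string.
    public static int CountLineBreaks(TextReader reader)
    {
        int count = 0;
        bool previousWasCR = false;
        int c;
        while ((c = reader.Read()) != -1)   // Read() returns -1 at end of input
        {
            if (c == '\r')
            {
                ++count;
                previousWasCR = true;
            }
            else
            {
                // A \n directly after \r belongs to the same \r\n break.
                if (c == '\n' && !previousWasCR)
                    ++count;
                previousWasCR = false;
            }
        }
        return count;
    }
}
```

For the question's example wrapped in a StringReader this counts 3 line breaks, i.e. 4 lines. With a file you would pass a StreamReader instead; reading into a char[] buffer rather than one character at a time may be faster, but that is untested here.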
Here's another variant that doesn't count newlines at the very end of a string:
int count = 1;
int len = input.Length - 1;
for (int i = 0; i < len; ++i)
    switch (input[i])
    {
        case '\r':
            if (input[i + 1] == '\n')
            {
                // \r\n at the very end of the string: do not count it.
                if (++i >= len)
                {
                    break;
                }
            }
            goto case '\n';
        case '\n':
        // Uncomment below to include all other line break sequences
        // ('\n' is already '\u000A', so it needs no label of its own)
        // case '\v':
        // case '\f':
        // case '\u0085':
        // case '\u2028':
        // case '\u2029':
            ++count;
            break;
    }
This therefore considers "", "a line", "a line\n" and "a line\r\n" to each be one line only, and so on.
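To make those cases easy to check, here is a minimal sketch that wraps the same counting rule in the GetLineCount signature from the question. The LineCounter class name is only a placeholder, and the loop is written with plain if statements rather than the switch above, so treat it as an equivalent illustration and not as the exact code:

```csharp
static class LineCounter
{
    // Counts lines, treating \r, \n and \r\n as one break each and
    // ignoring a single line break at the very end of the string.
    public static int GetLineCount(string input)
    {
        int count = 1;
        int len = input.Length - 1;   // the last character is never counted
        for (int i = 0; i < len; ++i)
        {
            char c = input[i];
            // Step over the \n of a \r\n pair so it is counted only once.
            if (c == '\r' && input[i + 1] == '\n')
                ++i;
            // Count the break unless it ended on the final character.
            if ((c == '\r' || c == '\n') && i < len)
                ++count;
        }
        return count;
    }
}
```

With this, GetLineCount("This is \nan string that \rhas four \r\nlines") gives 4, and "", "a line", "a line\n" and "a line\r\n" each give 1.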
Upvotes: 5
Reputation: 1861
Completely manual implementation (you aren't going to get much faster than this):
public static int GetLineCount(string input)
{
    int lineCount = 0;
    for (int i = 0; i < input.Length; i++)
    {
        switch (input[i])
        {
            case '\r':
                lineCount++;
                // Consume the '\n' of a "\r\n" pair so it is not counted twice.
                if (i + 1 < input.Length && input[i + 1] == '\n')
                {
                    i++;
                }
                break;
            case '\n':
                lineCount++;
                break;
            default:
                break;
        }
    }
    // Number of line breaks found in the input.
    return lineCount;
}
Upvotes: 1
Reputation: 11228
If you want to get the number of lines you should count only \n, as \r means a carriage return and doesn't advance to a new line:
// Requires: using System.Linq;
static int GetLineCount(string input)
{
    return input.Count(c => c == '\n');
}
Upvotes: -1
Reputation: 160
Here is an example similar to how Microsoft does it while reading lines from a file:
int numberOfLines = 0;
string line;
// path and encoding are assumed to be defined by the caller
using (StreamReader sr = new StreamReader(path, encoding))
    while ((line = sr.ReadLine()) != null)
        numberOfLines += 1;
For reference/reading: http://referencesource.microsoft.com/#mscorlib/system/io/file.cs,8d10107b7a92c5c2 http://referencesource.microsoft.com/#mscorlib/system/io/file.cs,675b2259e8706c26
Upvotes: 0
Reputation: 2362
What about this discussion? The simple
private static int Count4(string s)
{
    int n = 0;
    foreach (var c in s)
    {
        if (c == '\n') n++;
    }
    return n + 1;
}
should be very fast, even with larger strings; numerous other algorithms have been tested there. What speaks against this implementation? Unless you want to go as far as parallel execution, I would try this very simple approach.
Upvotes: 1
Reputation: 1227
Is your string from a file?
I think this one does the job, and does it pretty fast (it needs System.IO and System.Linq):
int count = File.ReadLines(path).Count();
From: How to get Number Of Lines without Reading File To End
Upvotes: 2