Reputation: 15197
I am removing text from a string and want to replace each removed line with a blank line.
Some background: I am writing a compare function that compares two strings. It all works fine, and the two strings are displayed in two separate web browsers. When I scroll down in the browsers, the strings are different lengths. I want to replace the text I am removing with blank lines so that both strings are the same length.
In the code below I am looking to count how many lines aDiff.text has.
Here is my code:
public string diff_prettyHtmlShowInserts(List<Diff> diffs)
{
StringBuilder html = new StringBuilder();
foreach (Diff aDiff in diffs)
{
string text = aDiff.text.Replace("&", "&amp;").Replace("<", "&lt;")
.Replace(">", "&gt;").Replace("\n", "<br>"); // escape HTML, then turn newlines into <br>
switch (aDiff.operation)
{
case Operation.DELETE:
//foreach('\n' in aDiff.text)
// {
// html.Append("\n"); // Would like to replace each line with a blankline
// }
break;
case Operation.EQUAL:
html.Append("<span>").Append(text).Append("</span>");
break;
case Operation.INSERT:
html.Append("<ins style=\"background:#e6ffe6;\">").Append(text)
.Append("</ins>");
break;
}
}
return html.ToString();
}
Upvotes: 59
Views: 104190
Reputation: 60596
I benchmarked all the answers.
Stack:
| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|
| Test_Replace | 978.1 ns | 153.54 ns | 8.42 ns | 1.2722 | - | 7984 B |
| Test_IndexOfInCycle | 336.0 ns | 13.42 ns | 0.74 ns | - | - | - |
| Test_CycleOverString | 2,815.7 ns | 148.98 ns | 8.17 ns | - | - | - |
| Test_Split | 1,253.2 ns | 85.83 ns | 4.70 ns | 1.4648 | 0.0477 | 9192 B |
| Test_RegexMatchesCount | 11,221.4 ns | 1,196.62 ns | 65.59 ns | 1.3428 | 0.0305 | 8480 B |
| Test_CountCharUnsafe | 3,054.4 ns | 272.66 ns | 14.95 ns | - | - | - |
The winner is IndexOfInCycle
private static int IndexOfInCycle(string str)
{
int index = -1;
int count = 0;
while (-1 != (index = str.IndexOf('\n', index + 1)))
count++;
return count + 1;
}
UPDATE: there were errors in my benchmark, updated the results.
I also tried iterating over the string with unsafe code; it still loses to the IndexOf loop.
Upvotes: 4
Reputation: 1639
Here's my version, based on @NathanielDoldersum 's answer but modified to check for empty strings and more accurately count the last line. I consider a string ending with a newline to not have an additional line after that newline; the last line ends at the end of the string in that case.
It's only the third fastest method according to @AlexfromJitbit 's benchmark, but it doesn't allocate any memory.
/// <summary>
/// Counts the number of lines in a string. If there is a non-empty
/// substring beyond the last newline character, it is also counted as a
/// line, but if the string ends with a newline, it is not considered to have
/// a final line after that newline.
/// Empty and null strings are considered to have no lines.
/// </summary>
/// <param name="str">The string whose lines are to be counted.</param>
/// <returns>The number of lines in the string.</returns>
public static int countLines(string str)
{
if (string.IsNullOrEmpty(str))
{
return 0;
}
int count = 0;
for (int i = 0; i < str.Length; i++)
{
if (str[i] == '\n') count++;
}
if (str.EndsWith("\n"))
{
return count;
}
return count + 1;
}
Here's an xUnit test for it (all cases pass):
[Theory]
[InlineData("1", 1)]
[InlineData("1\n", 1)]
[InlineData("1\r\n", 1)]
[InlineData("1\n2\n3\n", 3)]
[InlineData("1\n2\n3", 3)]
[InlineData("1\r\n2\r\n3\r\n", 3)]
[InlineData("1\r\n2\r\n3", 3)]
[InlineData(null, 0)]
[InlineData("", 0)]
public void countLinesReturnsExpectedValue(string str, int expected)
{
Assert.Equal(expected, CUtils.countLines(str));
}
Upvotes: 0
Reputation: 2251
You could use Regex. Try this code:
StringBuilder html = new StringBuilder();
//...
int lineCount = Regex.Matches(html.ToString(), Environment.NewLine).Count;
Upvotes: 1
Reputation: 7866
Method 1:
int numLines = aDiff.text.Length
- aDiff.text.Replace(Environment.NewLine, string.Empty).Length;
Method 2:
int numLines = aDiff.text.Split('\n').Length;
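Both relate to the number of lines in the text: Method 1 gives the number of newline characters removed (two per line break when Environment.NewLine is \r\n), and Method 2 gives the number of '\n'-separated segments.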
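To tie this back to the question, here is a minimal sketch of how such a count could be turned into blank-line padding for the DELETE case. It assumes the goal is to keep both views vertically aligned, so it emits one `<br>` per '\n' in the deleted text. `DiffPadding` and `BlankLinesFor` are hypothetical names introduced for illustration; they are not part of the question's code.

```csharp
using System.Text;

public static class DiffPadding
{
    // Hypothetical helper: returns one "<br>" for every '\n' in the deleted text,
    // so the removed lines still take up vertical space in the rendered HTML.
    public static string BlankLinesFor(string deletedText)
    {
        if (string.IsNullOrEmpty(deletedText))
            return string.Empty;

        // Split('\n') yields one more segment than there are '\n' characters.
        int lineBreaks = deletedText.Split('\n').Length - 1;

        var padding = new StringBuilder();
        for (int i = 0; i < lineBreaks; i++)
            padding.Append("<br>");
        return padding.ToString();
    }
}
```

In the question's switch statement this would be called as `html.Append(DiffPadding.BlankLinesFor(aDiff.text));` inside `case Operation.DELETE:`. Whether a deleted fragment without a trailing newline should also get a `<br>` depends on how the other view is rendered, so treat the count as a starting point.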
Upvotes: 103
Reputation: 41
public static int CalcStringLines(string text)
{
int count = 1;
for (int i = 0; i < text.Length; i++)
{
if (text[i] == '\n') count++;
}
return count;
}
That's the fastest/easiest/no memory allocation way to do it...
Upvotes: 4
Reputation: 51
Late to the party here, but I think this handles all lines, even the last line (at least on Windows):
Regex.Matches(text, "$", RegexOptions.Multiline).Count;
Upvotes: 5
Reputation: 91
int newLineLen = Environment.NewLine.Length;
int numLines = aDiff.text.Length - aDiff.text.Replace(Environment.NewLine, string.Empty).Length;
if (newLineLen != 0)
{
numLines /= newLineLen;
numLines++;
}
Slightly more robust, accounting for the first line that will not have a line break in it.
Upvotes: 7
Reputation: 698
Efficient, and uses little memory:
Regex.Matches("Your String", System.Environment.NewLine).Count;
Of course, we can also wrap this in an extension method for string:
using System.Text.RegularExpressions ;
public static class StringExtensions
{
/// <summary>
/// Get the number of lines in the string.
/// </summary>
/// <returns>Number of lines</returns>
public static int LineCount(this string str)
{
return Regex.Matches( str , System.Environment.NewLine).Count ;
}
}
Reference: µBio, Dieter Meemken
Upvotes: 4
Reputation: 3471
using System.Text.RegularExpressions;
Regex.Matches(text, "\n").Count
I think counting the occurrences of '\n' is the most efficient way, considering speed and memory usage.
Using Split('\n') is a bad idea because it allocates new arrays of strings, so it performs poorly, especially when your string gets larger and contains more lines.
Replacing the '\n' character with an empty string and calculating the length difference is not efficient either, because it involves several operations: searching, creating new strings, allocating memory, and so on.
You can do just one operation, i.e. search. So you can simply count the occurrences of the '\n' character in the string, as @lokimidgard suggested.
It is worth mentioning that searching for the '\n' character is better than searching for "\r\n" (or Environment.NewLine on Windows), because the former works for both Unix and Windows line endings.
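The line-ending point can be checked with a small sketch. `LineEndingDemo` and `CountOccurrences` are hypothetical names used only for this illustration; the method counts non-overlapping occurrences of a token using ordinal comparison, and the literal "\r\n" stands in for Environment.NewLine on Windows.

```csharp
using System;

public static class LineEndingDemo
{
    // Counts non-overlapping occurrences of 'token' in 'text' (ordinal comparison).
    public static int CountOccurrences(string text, string token)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (string.IsNullOrEmpty(token))
            throw new ArgumentException("Token must be non-empty.", nameof(token));

        int count = 0;
        int index = 0;
        // IndexOf returns -1 once no further occurrence exists.
        while ((index = text.IndexOf(token, index, StringComparison.Ordinal)) != -1)
        {
            count++;
            index += token.Length; // continue searching after this match
        }
        return count;
    }
}
```

With Unix-style input "a\nb\nc", searching for "\r\n" finds 0 matches while searching for "\n" finds 2. With Windows-style input "a\r\nb\r\nc", both searches find 2.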
Upvotes: 5
Reputation: 3157
You can also use LINQ to count the lines, like this:
int numLines = aDiff.text.Count(c => c == '\n') + 1; // requires using System.Linq;
Late, but offers alternative to other answers.
Upvotes: 25
Reputation: 1119
A variant that does not allocate new strings or arrays of strings:
private static int CountLines(string str)
{
if (str == null)
throw new ArgumentNullException("str");
if (str == string.Empty)
return 0;
int index = -1;
int count = 0;
while (-1 != (index = str.IndexOf(Environment.NewLine, index + 1)))
count++;
return count + 1;
}
Upvotes: 18
Reputation: 24368
I did a bunch of performance testing of different methods (Split, Replace, for loop over chars, Linq.Count) and the winner was the Replace method (the Split method was slightly faster when strings were smaller than 2 KB, but not by much).
But there are two bugs in the accepted answer. One is that when the last line doesn't end with a newline, it won't be counted. The other is that if you're reading a file with UNIX line endings on Windows, it won't count any lines, since Environment.NewLine is \r\n and won't be found (you can always just use \n, since it's the last char of a line ending on both UNIX and Windows).
So here's a simple extension method...
public static int CountLines(this string text)
{
int count = 0;
if (!string.IsNullOrEmpty(text))
{
count = text.Length - text.Replace("\n", string.Empty).Length;
// if the last char of the string is not a newline, make sure to count that line too
if (text[text.Length - 1] != '\n')
{
++count;
}
}
return count;
}
Upvotes: 7
Reputation: 1967
To make things easy, I put the solution from poncha in a nice extension method, so you can use it simply like this:
int numLines = aDiff.text.LineCount();
The code:
/// <summary>
/// Extension class for strings.
/// </summary>
public static class StringExtensions
{
/// <summary>
/// Get the number of lines in the string.
/// </summary>
/// <returns>Number of lines</returns>
public static int LineCount(this string str)
{
return str.Split('\n').Length;
}
}
Have fun...
Upvotes: 3
Reputation: 17724
Inefficient, but still:
var newLineCount = aDiff.text.Split('\n').Length - 1;
Upvotes: 8