Pomster

Reputation: 15197

How to count lines in a string?

I am removing text from a string and want to replace each removed line with a blank line.

Some background: I am writing a compare function that compares two strings. It is all working fine, and the results are displayed in two separate web browsers. When I scroll down in the browsers, the strings are different lengths, so I want to replace the text I am removing with blank lines so that both strings are the same length.

In the code below, I am looking to count how many lines aDiff.text has.

Here is my code:

public string diff_prettyHtmlShowInserts(List<Diff> diffs)
    {
        StringBuilder html = new StringBuilder();

        foreach (Diff aDiff in diffs)
        {
            string text = aDiff.text.Replace("&", "&amp;").Replace("<", "&lt;")
              .Replace(">", "&gt;").Replace("\n", "<br>"); //&para;
            switch (aDiff.operation)
            {

                case Operation.DELETE:                              
                   //foreach('\n' in aDiff.text)
                   // {
                   //     html.Append("\n"); // Would like to replace each line with a blankline
                   // }
                    break;
                case Operation.EQUAL:
                    html.Append("<span>").Append(text).Append("</span>");
                    break;
                case Operation.INSERT:
                    html.Append("<ins style=\"background:#e6ffe6;\">").Append(text)
                        .Append("</ins>");
                    break;
            }
        }
        return html.ToString();
    }

Upvotes: 59

Views: 104190

Answers (14)

Alex from Jitbit

Reputation: 60596

I benchmarked all the answers.

Stack:

  • BenchmarkDotNet
  • .NET 6
  • Intel Core i7-9700K
  • HTML file with 50 lines

Method                        Mean       Error    StdDev    Gen0    Gen1  Allocated
Test_Replace              978.1 ns   153.54 ns   8.42 ns  1.2722       -     7984 B
Test_IndexOfInCycle       336.0 ns    13.42 ns   0.74 ns       -       -          -
Test_CycleOverString    2,815.7 ns   148.98 ns   8.17 ns       -       -          -
Test_Split              1,253.2 ns    85.83 ns   4.70 ns  1.4648  0.0477     9192 B
Test_RegexMatchesCount 11,221.4 ns 1,196.62 ns  65.59 ns  1.3428  0.0305     8480 B
Test_CountCharUnsafe    3,054.4 ns   272.66 ns  14.95 ns       -       -          -

The winner is IndexOfInCycle

private static int IndexOfInCycle(string str)
{
    int index = -1;
    int count = 0;
    while (-1 != (index = str.IndexOf('\n', index + 1)))
        count++;
    return count + 1;
}

UPDATE: there were errors in my benchmark; I have updated the results.

I also tried iterating over the string with unsafe code, and it still loses to the IndexOf loop.

Upvotes: 4

Christopher Hamkins

Reputation: 1639

Here's my version, based on @NathanielDoldersum's answer but modified to check for empty strings and to count the last line more accurately. I consider a string ending with a newline not to have an additional line after that newline; in that case the last line ends at the end of the string.

It's only the third fastest method according to @AlexfromJitbit's benchmark, but it doesn't allocate any memory.

        /// <summary>
        /// Counts the number of lines in a string. If there is a non-empty
        /// substring beyond the last newline character, it is also counted as a
        /// line, but if the string ends with a newline, it is not considered to have
        /// a final line after that newline.
        /// Empty and null strings are considered to have no lines.
        /// </summary>
        /// <param name="str">The string whose lines are to be counted.</param>
        /// <returns>The number of lines in the string.</returns>
        public static int countLines(string str)
        {
            if (string.IsNullOrEmpty(str))
            {
                return 0;
            }
            int count = 0;
            for (int i = 0; i < str.Length; i++)
            {
                if (str[i] == '\n') count++;
            }
            if (str.EndsWith("\n"))
            {
                return count;
            }
            return count + 1;
        }

Here's an xUnit test for it (all cases pass, of course):

        [Theory]
        [InlineData("1", 1)]
        [InlineData("1\n", 1)]
        [InlineData("1\r\n", 1)]
        [InlineData("1\n2\n3\n", 3)]
        [InlineData("1\n2\n3", 3)]
        [InlineData("1\r\n2\r\n3\r\n", 3)]
        [InlineData("1\r\n2\r\n3", 3)]
        [InlineData(null, 0)]
        [InlineData("", 0)]
        public void countLinesReturnsExpectedValue(string str, int expected)
        {
            Assert.Equal(expected, CUtils.countLines(str));
        }

Upvotes: 0

MB_18

Reputation: 2251

You could use Regex. Try this code:

StringBuilder html = new StringBuilder();
//...
int lineCount = Regex.Matches(html.ToString(), Environment.NewLine).Count;

Upvotes: 1

poncha

Reputation: 7866

Method 1:

int numLines = aDiff.text.Length - aDiff.text.Replace
                   (Environment.NewLine, string.Empty).Length;

Method 2:

int numLines = aDiff.text.Split('\n').Length;

Both will give you number of lines in text.

Upvotes: 103

Nathaniel Doldersum

Reputation: 41

public static int CalcStringLines(string text)
{
    int count = 1;
    for (int i = 0; i < text.Length; i++)
    {
        if (text[i] == '\n') count++;
    }

    return count;
}

That's the fastest/easiest/no memory allocation way to do it...

Upvotes: 4

tmr6183

Reputation: 51

Late to the party here, but I think this handles all lines, even the last line (at least on Windows):

Regex.Matches(text, "$", RegexOptions.Multiline).Count; 
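
A quick way to see how this pattern treats a trailing newline is the sketch below. The class and method names are made up for illustration and are not part of the answer. With RegexOptions.Multiline, "$" matches before every '\n' and once more at the very end of the string, so a string that ends with a newline gets one extra match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the one-liner above, for illustration only.
public static class RegexLineCount
{
    // Counts the positions where "$" matches in multiline mode:
    // before each '\n' and at the end of the string.
    public static int CountLineEnds(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return Regex.Matches(text, "$", RegexOptions.Multiline).Count;
    }
}
```

For "a\nb" this gives 2, while for "a\nb\n" it gives 3, so whether the result matches your idea of a line count depends on how you want to treat a trailing newline.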

Upvotes: 5

Graham Bedford

Reputation: 91

int newLineLen = Environment.NewLine.Length;
int numLines = aDiff.text.Length - aDiff.text.Replace(Environment.NewLine, string.Empty).Length;
if (newLineLen != 0)
{
    numLines /= newLineLen;
    numLines++;
}

Slightly more robust, accounting for the first line that will not have a line break in it.
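
To make the division step easier to follow, here is a minimal sketch of the same idea with the line separator passed in explicitly, so the result does not depend on the platform's Environment.NewLine. The names ReplaceLineCount and CountLines are assumptions made for this example.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ReplaceLineCount
{
    // Removes every occurrence of newLine, divides the number of removed
    // characters by the separator length to get the separator count,
    // and adds one for the line that has no separator after it.
    public static int CountLines(string text, string newLine)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (string.IsNullOrEmpty(newLine))
            throw new ArgumentException("Separator must not be empty.", nameof(newLine));

        int removed = text.Length - text.Replace(newLine, string.Empty).Length;
        return removed / newLine.Length + 1;
    }
}
```

Without the division, a two-character separator such as "\r\n" would count every line break twice. Note that this sketch, like the snippet above, counts an extra empty line when the text ends with the separator.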

Upvotes: 7

Radian Jheng

Reputation: 698

Efficient, and it uses the least memory:

Regex.Matches( "Your String" , System.Environment.NewLine).Count ;

Of course, we can also wrap this in an extension method on string:

using System.Text.RegularExpressions ;

public static class StringExtensions
{
    /// <summary>
    /// Get the number of lines in the string.
    /// </summary>
    /// <returns>Number of lines</returns>
    public static int LineCount(this string str)
    {
        return Regex.Matches( str , System.Environment.NewLine).Count ;
    }
}

reference : µBio, Dieter Meemken

Upvotes: 4

Majid

Reputation: 3471

using System.Text.RegularExpressions;

Regex.Matches(text, "\n").Count

I think counting the occurrences of '\n' is the most efficient way, considering both speed and memory usage.

Using Split('\n') is a bad idea because it creates a new array of strings, so it performs poorly, especially when your string gets larger and contains more lines.

Replacing the '\n' character with an empty string and calculating the difference is not efficient either, because it involves several operations: searching, creating new strings, allocating memory, and so on.

You can do it with just one operation, namely the search. So you can simply count the occurrences of the '\n' character in the string, as @lokimidgard suggested.

It is worth mentioning that searching for the '\n' character is better than searching for "\r\n" (or Environment.NewLine on Windows), because the former works for both Unix and Windows line endings.
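
To illustrate the last point, here is a small sketch that compares the two searches. The class and method names are hypothetical and chosen only for this example.

```csharp
using System;

// Hypothetical helpers, for illustration only.
public static class NewlineCounting
{
    // Counts '\n' characters in a single pass without allocating.
    // Works for both "\n" and "\r\n" endings, since both end in '\n'.
    public static int CountLineFeeds(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        int count = 0;
        foreach (char c in text)
        {
            if (c == '\n') count++;
        }
        return count;
    }

    // Counts only "\r\n" pairs, so Unix-style text yields zero.
    public static int CountCrLf(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        int count = 0;
        int index = 0;
        while ((index = text.IndexOf("\r\n", index, StringComparison.Ordinal)) != -1)
        {
            count++;
            index += 2; // skip past the pair that was just found
        }
        return count;
    }
}
```

For "a\r\nb\r\nc" both methods return 2, but for "a\nb\nc" CountLineFeeds still returns 2 while CountCrLf returns 0.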

Upvotes: 5

CrnaStena

Reputation: 3157

You can also use Linq to count occurrences of lines, like this:

int numLines = aDiff.text.Count(c => c == '\n') + 1;

Late, but it offers an alternative to the other answers.

Upvotes: 25

lokimidgard

Reputation: 1119

A variant that does not allocate new strings or an array of strings:

private static int CountLines(string str)
{
    if (str == null)
        throw new ArgumentNullException("str");
    if (str == string.Empty)
        return 0;
    int index = -1;
    int count = 0;
    while (-1 != (index = str.IndexOf(Environment.NewLine, index + 1)))
        count++;

   return count + 1;
}

Upvotes: 18

JeremyWeir

Reputation: 24368

I did a bunch of performance testing of different methods (Split, Replace, for loop over chars, Linq.Count) and the winner was the Replace method (Split method was slightly faster when strings were less than 2KB, but not much).

But there are two bugs in the accepted answer. The first is that when the last line doesn't end with a newline, it won't count the last line. The second is that if you're reading a file with UNIX line endings on Windows, it won't count any lines, since Environment.NewLine is \r\n and won't occur in the text (you can always just use \n, since it's the last char of a line ending on both UNIX and Windows).

So here's a simple extension method...

public static int CountLines(this string text)
{
    int count = 0;
    if (!string.IsNullOrEmpty(text))
    {
        count = text.Length - text.Replace("\n", string.Empty).Length;

        // if the last char of the string is not a newline, make sure to count that line too
        if (text[text.Length - 1] != '\n')
        {
            ++count;
        }
    }

    return count;
}

Upvotes: 7

Dieter Meemken

Reputation: 1967

To make things easy, I put poncha's solution into an extension method, so you can use it simply like this:

int numLines = aDiff.text.LineCount();

The code:

/// <summary>
/// Extension class for strings.
/// </summary>
public static class StringExtensions
{
    /// <summary>
    /// Get the number of lines in the string.
    /// </summary>
    /// <returns>Number of lines</returns>
    public static int LineCount(this string str)
    {
        return str.Split('\n').Length;
    }
}

Have fun...

Upvotes: 3

nunespascal

Reputation: 17724

Inefficient, but still:

var newLineCount = aDiff.text.Split('\n').Length - 1;

Upvotes: 8
