user32826

Determine what line ending is used in a text file

What's the best way in C# to determine the line endings used in a text file (Unix, Windows, Mac)?

Upvotes: 12

Views: 14312

Answers (7)

Don

Reputation: 9661

There is Environment.NewLine, but it only reports the line ending used on the current system, so it won't help when reading files from various sources.

When reading, I usually look for \n (Edit: apparently some files use only \r) and assume that the line ends there.

Upvotes: 0

Konrad Rudolph

Reputation: 545628

Note that text files may have inconsistent line endings, and your program should not choke on that. Calling ReadLine on a StreamReader (and similar methods) handles CR, LF, and CRLF automatically.
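As a minimal sketch of that point, the snippet below reads a string with deliberately mixed line endings through ReadLine. The class and method names (ReadLineDemo, ReadAll) are hypothetical and only serve the illustration; a StringReader stands in for a StreamReader over a real file.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ReadLineDemo
{
    // Hypothetical helper: collects every line that ReadLine returns.
    public static List<string> ReadAll(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        // ReadLine returns null once the input is exhausted.
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

Calling `ReadLineDemo.ReadAll(new StringReader("a\r\nb\rc\nd"))` should give four lines regardless of which terminator ends each one. The terminators themselves are discarded, so this approach reads the file safely but does not tell you which convention it used.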

If you manually read lines from a file, make sure to accept any line endings, even if inconsistent. In practice, this is quite easy using the following algorithm:

  • Scan ahead until you find either CR or LF.
  • If you read CR, peek ahead at the next character;
  • If the next character is LF, consume it (otherwise, put it back).
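The steps above can be sketched in C# roughly as follows. This is only an illustration under a few assumptions: the names TolerantLineReader and ReadLines are made up, and Peek is used in place of "putting back" a character, since a TextReader lets you look at the next character without consuming it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TolerantLineReader
{
    // Yields lines from the reader, treating CR, LF, and CRLF each as one line break.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        var line = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\r')
            {
                // CR found: peek at the next character and consume it only if it is LF.
                if (reader.Peek() == '\n')
                {
                    reader.Read();
                }
                yield return line.ToString();
                line.Clear();
            }
            else if (c == '\n')
            {
                yield return line.ToString();
                line.Clear();
            }
            else
            {
                line.Append((char)c);
            }
        }
        // Emit a trailing line that has no terminator.
        if (line.Length > 0)
        {
            yield return line.ToString();
        }
    }
}
```

For example, iterating over `TolerantLineReader.ReadLines(new StringReader("a\r\nb\rc\nd"))` should produce the same four lines whichever terminator is used.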

Upvotes: 16

unbeli

Reputation: 30228

Here is some advanced guesswork: read the file, count CRs and LFs

if (CR > LF*2) then "Mac" 
else if (LF > CR*2) then "Unix"
else "Windows"

Also note that newer Macs (Mac OS X) use Unix line endings.
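A rough C# rendering of the counting heuristic above might look like this. The names LineEndingGuesser and Guess are hypothetical, and the "Unknown" result for input without any CR or LF is an added assumption that the pseudocode does not cover.

```csharp
using System;
using System.IO;

public static class LineEndingGuesser
{
    // Hypothetical helper: counts CRs and LFs and applies the 2x thresholds.
    public static string Guess(TextReader reader)
    {
        long cr = 0;
        long lf = 0;
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\r') cr++;
            else if (c == '\n') lf++;
        }

        // Assumption (not in the original heuristic): no line breaks at all.
        if (cr == 0 && lf == 0) return "Unknown";

        if (cr > lf * 2) return "Mac";
        if (lf > cr * 2) return "Unix";
        return "Windows";
    }
}
```

Because it counts over the whole input, the guess tolerates a few stray or inconsistent line endings, at the cost of reading the entire file.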

Upvotes: 3

nothrow

Reputation: 16168

When reading most textual formats, I usually look for \n and then Trim() the whole string (whitespace at the beginning and end is often redundant).

Upvotes: 0

Hans Olsson

Reputation: 55009

I'd just search the file for the first \r or \n. If it's a \n, I'd look at the previous character to see whether it's a \r; if so, the line ending is \r\n, otherwise it's whichever character was found.
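One way to sketch this first-match approach in C# is shown below. Note that it is a variation, not the exact procedure described: it scans forward and, on finding a \r, peeks at the following character rather than looking back from a \n, which gives the same answer when reading sequentially. The names FirstLineEnding and Detect are hypothetical.

```csharp
using System;
using System.IO;

public static class FirstLineEnding
{
    // Returns "\r\n", "\r", or "\n" for the first line break found,
    // or null when the input contains no line break at all.
    public static string Detect(TextReader reader)
    {
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\n')
            {
                return "\n";
            }
            if (c == '\r')
            {
                // A CR directly followed by LF is a Windows-style ending.
                return reader.Peek() == '\n' ? "\r\n" : "\r";
            }
        }
        return null;
    }
}
```

This stops at the first line break, so it is cheap, but it assumes the file uses one convention consistently.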

Upvotes: 2

zildjohn01

Reputation: 11515

If it were me, I'd just read the file one char at a time until I came across the first \r or \n. This assumes you have sensible input.

Upvotes: 0

Curtis White

Reputation: 6353

I would imagine you couldn't know for sure and would have to set this in the editor. You could use some AI; the algorithm would be:

  1. Search for each type of line ending by looking for those specific characters.
  2. Measure the distances between them.
  3. If one type tends to repeat, assume that's the type. Count the repeats and use some measure of dispersion.

So, for example, if you had repeats of CRLF at 38, 40, 45, and that was within tolerance, you'd default to assuming the line ending was CRLF.

Upvotes: 0
