RidingThisToTheTop

How can I remove a trailing newline?

How can I remove the last character of a string if it is a newline?

"abc\n"  -->  "abc"

Upvotes: 2069

Views: 2482961

Answers (28)

Lyle Z

Reputation: 1363

How about:

line = line.rstrip('\r\n')

Upvotes: -2

mihaicc

Reputation: 3102

"line 1\nline 2\r\n...".replace('\n', '').replace('\r', '')
>>> 'line 1line 2...'

or you could always get geekier with regexps

Upvotes: 38

Rich Bradshaw

Reputation: 73045

Try the method rstrip() (see doc Python 2 and Python 3)

>>> 'test string\n'.rstrip()
'test string'

Python's rstrip() method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp.

>>> 'test string \n \r\n\n\r \n\n'.rstrip()
'test string'

To strip only newlines:

>>> 'test string \n \r\n\n\r \n\n'.rstrip('\n')
'test string \n \r\n\n\r '

In addition to rstrip(), there are also the methods strip() and lstrip(). Here is an example with the three of them:

>>> s = "   \n\r\n  \n  abc   def \n\r\n  \n  "
>>> s.strip()
'abc   def'
>>> s.lstrip()
'abc   def \n\r\n  \n  '
>>> s.rstrip()
'   \n\r\n  \n  abc   def'

Upvotes: 2324

user1464878

s = '''Hello  World \t\n\r\tHi There'''
# import the string module
import string
# use translate to delete every whitespace character
s.translate({ord(c): None for c in string.whitespace})
>>'HelloWorldHiThere'

With regex

import re
s = '''  Hello  World 
\t\n\r\tHi '''
print(re.sub(r"\s+", "", s), sep='')  # \s matches any whitespace character
>HelloWorldHi

Replace \n,\t,\r

s.replace('\n', '').replace('\t','').replace('\r','')
>'  Hello  World Hi '

With regex

s = '''Hello  World \t\n\r\tHi There'''
regex = re.compile(r'[\n\r\t]')
regex.sub("", s)
>'Hello  World Hi There'

With join (this also collapses each run of whitespace into a single space)

s = '''Hello  World \t\n\r\tHi There'''
' '.join(s.split())
>'Hello World Hi There'

Upvotes: 9

Venfah Nazir

Reputation: 330


This works for both Windows and Linux line endings (re.sub is a bit expensive, so use it only if you specifically want a regex-based solution):

import re 
if re.search("(\\r|)\\n$", line):
    line = re.sub("(\\r|)\\n$", "", line)

Upvotes: 0

Alien Life Form

Reputation: 1944

This would replicate exactly perl's chomp (minus behavior on arrays) for "\n" line terminator:

def chomp(x):
    if x.endswith("\r\n"): return x[:-2]
    if x.endswith("\n") or x.endswith("\r"): return x[:-1]
    return x

(Note: it does not modify string 'in place'; it does not strip extra trailing whitespace; takes \r\n in account)
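
A quick check of the function above, restated so the snippet runs on its own. The sample strings are made up for illustration:

```python
def chomp(x):
    # Remove at most one trailing line terminator: "\r\n", "\n", or "\r".
    if x.endswith("\r\n"):
        return x[:-2]
    if x.endswith("\n") or x.endswith("\r"):
        return x[:-1]
    return x

# Illustrative inputs (not from the original answer).
print(repr(chomp("abc\n")))    # 'abc'
print(repr(chomp("abc\r\n")))  # 'abc'
print(repr(chomp("abc\n\n")))  # 'abc\n'  (only one newline removed)
print(repr(chomp("abc  ")))    # 'abc  '  (other whitespace untouched)
```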

Upvotes: 35

Taylor D. Edmiston

Reputation: 13036

I'm bubbling up my regular expression based answer from one I posted earlier in the comments of another answer. I think using re is a clearer more explicit solution to this problem than str.rstrip.

>>> import re

If you want to remove one or more trailing newline chars:

>>> re.sub(r'[\n\r]+$', '', '\nx\r\n')
'\nx'

If you want to remove newline chars everywhere (not just trailing):

>>> re.sub(r'[\n\r]+', '', '\nx\r\n')
'x'

If you want to remove only 1-2 trailing newline chars (i.e., \r, \n, \r\n, \n\r, \r\r, \n\n)

>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r\n')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n')
'\nx'

I have a feeling what most people really want here, is to remove just one occurrence of a trailing newline character, either \r\n or \n and nothing more.

>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n\n', count=1)
'\nx\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n\r\n', count=1)
'\nx\r\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n', count=1)
'\nx'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n', count=1)
'\nx'

(The ?: is to create a non-capturing group.)

(By the way, this is not what '...'.rstrip('\n').rstrip('\r') does, which may not be clear to others stumbling upon this thread. str.rstrip strips as many of the trailing characters as possible, so a string like foo\n\n\n would be reduced to foo, whereas you may have wanted to preserve the other newlines after stripping a single trailing one.)
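
One detail worth knowing: without re.MULTILINE, $ matches at the very end of the string and also just before a final newline, while \Z matches only at the very end. Here is a minimal sketch of the "remove exactly one trailing newline" idea anchored with \Z. The names chomp_once and _TRAILING_NEWLINE are my own, not from any library:

```python
import re

# Hypothetical names; the pattern matches one "\r\n" or "\n" at the very end.
_TRAILING_NEWLINE = re.compile(r'(?:\r\n|\n)\Z')

def chomp_once(text):
    # Remove a single trailing line ending, if there is one.
    return _TRAILING_NEWLINE.sub('', text, count=1)

print(repr(chomp_once('foo\n\n\n')))      # 'foo\n\n'
print(repr(chomp_once('foo\r\n')))        # 'foo'
print(repr(chomp_once('foo')))            # 'foo'
print(repr('foo\n\n\n'.rstrip('\r\n')))   # 'foo'  (rstrip removes them all)
```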

Upvotes: 9

kuzzooroo

Reputation: 7408

I find it convenient to be able to get the chomped lines via an iterator, parallel to the way you can get the un-chomped lines from a file object. You can do so with the following code:

import operator

def chomped_lines(it):
    return map(operator.methodcaller('rstrip', '\r\n'), it)

Sample usage:

with open("file.txt") as infile:
    for line in chomped_lines(infile):
        process(line)
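
If you want to try this without a file on disk, here is a small self-contained sketch. It assumes Python 3 (where map is lazy), and io.StringIO stands in for the open file purely for the demo:

```python
import io
import operator

def chomped_lines(it):
    # Lazily strip trailing "\r" and "\n" characters from each line.
    return map(operator.methodcaller('rstrip', '\r\n'), it)

# io.StringIO is a stand-in for an open text file (demo data only).
fake_file = io.StringIO("first\nsecond\nthird\n")
lines = list(chomped_lines(fake_file))
print(lines)  # ['first', 'second', 'third']
```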

Upvotes: 9

teichert

Reputation: 4713

It looks like there is not a perfect analog for perl's chomp. In particular, rstrip cannot handle multi-character newline delimiters like \r\n. However, splitlines does as pointed out here. Following my answer on a different question, you can combine join and splitlines to remove/replace all newlines from a string s:

''.join(s.splitlines())

The following removes exactly one trailing newline (as chomp would, I believe). Passing True as the keepends argument to splitlines retains the delimiters. Then, splitlines is called again to remove the delimiters on just the last "line":

def chomp(s):
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''
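
A few checks on this function, restated so the snippet runs by itself. One thing to keep in mind is that str.splitlines treats more than \n and \r as line boundaries (form feed \x0c, for example), so this chomp drops one of those too. The inputs are made up for illustration:

```python
def chomp(s):
    # Only the last "line" loses its terminator; earlier ones keep theirs.
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''

print(repr(chomp("abc\n")))     # 'abc'
print(repr(chomp("abc\r\n")))   # 'abc'
print(repr(chomp("abc\n\n")))   # 'abc\n'
print(repr(chomp("a\nb")))      # 'a\nb'
print(repr(chomp("abc\x0c")))   # 'abc'  (form feed counts as a line boundary)
```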

Upvotes: 8

internetional

Reputation: 322

There are three types of line endings that we normally encounter: \n, \r and \r\n. A rather simple regular expression in re.sub, namely r"\r?\n?$", is able to catch them all.

(And we gotta catch 'em all, am I right?)

import re

re.sub(r"\r?\n?$", "", the_text, 1)

With the last argument, we limit the number of occurrences replaced to one, mimicking chomp to some extent. Example:

import re

text_1 = "hellothere\n\n\n"
text_2 = "hellothere\n\n\r"
text_3 = "hellothere\n\n\r\n"

a = re.sub(r"\r?\n?$", "", text_1, 1)
b = re.sub(r"\r?\n?$", "", text_2, 1)
c = re.sub(r"\r?\n?$", "", text_3, 1)

... where a == b == c is True.

Upvotes: 3

user7121455

>>> '   spacious   '.rstrip()
'   spacious'
>>> "AABAA".rstrip("A")
'AAB'
>>> "ABBA".rstrip("AB") # both AB and BA are stripped
''
>>> "ABCABBA".rstrip("AB")
'ABC'
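
The reason both AB and BA get stripped is that the argument to rstrip is a set of characters, not a suffix or a pattern. That matters for newline handling: regex-looking arguments such as '\r*\n' or '\r|\n' just add '*' or '|' to the set. A small sketch with made-up strings:

```python
# Each character in the argument is stripped individually from the right.
with_star = 'price: 5*\n'.rstrip('\r*\n')   # '*' is in the set, so it goes too
with_pipe = 'a|b|\r\n'.rstrip('\r|\n')      # same for '|'
plain = 'price: 5*\r\n'.rstrip('\r\n')      # only "\r" and "\n" are removed

print(repr(with_star))  # 'price: 5'
print(repr(with_pipe))  # 'a|b'
print(repr(plain))      # 'price: 5*'
```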

Upvotes: 5

Hackaholic

Reputation: 19763

You can use strip (note that it removes leading whitespace as well as trailing):

line = line.strip()

demo:

>>> "\n\n hello world \n\n".strip()
'hello world'

Upvotes: 32

slec

Reputation: 539

s = s.rstrip()

will remove all trailing whitespace, including newlines, from the end of the string s. The assignment is needed because rstrip returns a new string instead of modifying the original string.

Upvotes: 39

kiriloff

Reputation: 26333

You may use line = line.rstrip('\n'). This will strip all newlines from the end of the string, not just one.

Upvotes: 43

Jamie

Reputation: 593

I might use something like this:

import os
s = s.rstrip(os.linesep)

I think the problem with rstrip("\n") is that you'll probably want to make sure the line separator is portable (some antiquated systems are rumored to use "\r\n"). The other gotcha is that rstrip will strip out repeated whitespace. Hopefully os.linesep will contain the right characters. The above works for me.

Upvotes: 56

Sameer Siruguri

Note that rstrip doesn't act exactly like Perl's chomp() because it doesn't modify the string. That is, in Perl:

$x="a\n";

chomp $x

results in $x being "a".

but in Python:

x="a\n"

x.rstrip()

will mean that the value of x is still "a\n". Even x=x.rstrip() doesn't always give the same result, as it strips all whitespace from the end of the string, not just one newline at most.
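
If you are on Python 3.9 or later, str.removesuffix gets closer to the "one newline at most" behaviour. This is only a sketch, and the helper name chomp_one is my own:

```python
def chomp_one(line):
    # Hypothetical helper: drop one "\r\n" if present, otherwise one "\n".
    # Requires Python 3.9+ for str.removesuffix.
    if line.endswith("\r\n"):
        return line.removesuffix("\r\n")
    return line.removesuffix("\n")

x = "a\n\n"
print(repr(x.rstrip()))      # 'a'    (all trailing whitespace gone)
print(repr(chomp_one(x)))    # 'a\n'  (only one newline removed)
print(repr(x))               # 'a\n\n' (the original string is unchanged)
```

As with rstrip, the result has to be assigned back (x = chomp_one(x)) because Python strings are immutable.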

Upvotes: 103

Mike

Reputation: 3803

The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.

>>> 'Mac EOL\r'.rstrip('\r\n')
'Mac EOL'
>>> 'Windows EOL\r\n'.rstrip('\r\n')
'Windows EOL'
>>> 'Unix EOL\n'.rstrip('\r\n')
'Unix EOL'

Using '\r\n' as the parameter to rstrip means that it will strip out any trailing combination of '\r' or '\n'. That's why it works in all three cases above.

This nuance matters in rare cases. For example, I once had to process a text file which contained an HL7 message. The HL7 standard requires a trailing '\r' as its EOL character. The Windows machine on which I was using this message had appended its own '\r\n' EOL character. Therefore, the end of each line looked like '\r\r\n'. Using rstrip('\r\n') would have taken off the entire '\r\r\n' which is not what I wanted. In that case, I simply sliced off the last two characters instead.
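
A rough sketch of that slicing approach, with a guard so it only cuts when the line really ends in '\r\n'. The helper name strip_one_crlf and the sample string are made up; the sample is just a stand-in for an HL7 line with a Windows EOL appended:

```python
def strip_one_crlf(line):
    # Hypothetical helper: drop exactly one trailing "\r\n", nothing more.
    if line.endswith("\r\n"):
        return line[:-2]
    return line

sample = "segment\r\r\n"  # stand-in data, not a real HL7 message
print(repr(strip_one_crlf(sample)))  # 'segment\r'  (the HL7 "\r" survives)
print(repr(sample.rstrip("\r\n")))   # 'segment'    (rstrip takes the whole run)
```

rstrip('\r\n'), by contrast, removes every trailing '\r' and '\n' it finds.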

Note that unlike Perl's chomp function, rstrip will strip all specified characters at the end of the string, not just one:

>>> "Hello\n\n\n".rstrip("\n")
'Hello'

Upvotes: 173

minopret

Reputation: 4806

An example in Python's documentation simply uses line.strip().

Perl's chomp function removes one linebreak sequence from the end of a string only if it's actually there.

Here is how I plan to do that in Python, if process is conceptually the function that I need in order to do something useful to each line from this file:

import os
sep_pos = -len(os.linesep)
with open("file.txt") as f:
    for line in f:
        if line[sep_pos:] == os.linesep:
            line = line[:sep_pos]
        process(line)
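
A caveat, assuming Python 3 and the default arguments to open(): in text mode the default newline handling translates '\r\n' and '\r' to '\n' on input, so lines read this way end in '\n' regardless of platform, and comparing against os.linesep can miss on Windows. A small sketch using a temporary file with mixed endings (demo data only):

```python
import os
import tempfile

# Write a file with mixed line endings in binary mode (demo data only).
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"one\r\ntwo\nthree")
    path = tmp.name

try:
    with open(path) as f:  # text mode, default newline=None
        raw = list(f)
finally:
    os.remove(path)

# Remove at most one trailing "\n" from each line.
stripped = [line[:-1] if line.endswith("\n") else line for line in raw]
print(raw)       # ['one\n', 'two\n', 'three']
print(stripped)  # ['one', 'two', 'three']
```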

Upvotes: 19

Help me

Reputation: 51

Just use:

line = line.rstrip("\n")

or

line = line.strip("\n")

You don't need any of this complicated stuff.

Upvotes: 4

Andrew Grimm

Reputation: 81631

I don't program in Python, but I came across an FAQ at python.org advocating S.rstrip("\r\n") for python 2.2 or later.

Upvotes: 13

Stephen Miller

Reputation: 522

If you are concerned about speed (say you have a looong list of strings) and you know the nature of the newline char, string slicing is actually faster than rstrip. A little test to illustrate this:

import time

loops = 50000000

def method1(loops=loops):
    # Slicing: always drops the last character, whatever it is.
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):  # use xrange on Python 2
        out_string = test_string[:-1]
    t1 = time.time()
    print('Method 1: ' + str(t1 - t0))

def method2(loops=loops):
    # rstrip(): strips all trailing whitespace.
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):  # use xrange on Python 2
        out_string = test_string.rstrip()
    t1 = time.time()
    print('Method 2: ' + str(t1 - t0))

method1()
method2()

Output:

Method 1: 3.92700004578
Method 2: 6.73000001907
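
For anyone who wants to rerun the comparison, here is a sketch using the standard timeit module instead of hand-rolled loops. The iteration count is arbitrary, the lambda adds some call overhead to both measurements, and the numbers will vary by machine and Python version:

```python
import timeit

test_string = 'num\n'

# Time each approach over the same number of iterations.
slice_time = timeit.timeit(lambda: test_string[:-1], number=200_000)
rstrip_time = timeit.timeit(lambda: test_string.rstrip(), number=200_000)

print('slicing: %.4f s' % slice_time)
print('rstrip:  %.4f s' % rstrip_time)
```

Keep in mind that slicing is only safe when you know the last character is a newline; otherwise it silently eats a real character.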

Upvotes: 1

user4178860

Reputation: 45

A catch all:

line = line.rstrip('\r\n')

Upvotes: -3

user1151618

import re

r_unwanted = re.compile("[\n\t\r]")
r_unwanted.sub("", your_text)

Upvotes: 13

Leozj

Reputation: 109

If your question is how to clean up all the line breaks in a multi-line str object (oldstr), you can split it into a list on the delimiter '\n' and then join this list into a new str (newstr).

newstr = "".join(oldstr.split('\n'))

Upvotes: 10

Chij

Reputation: 89

A workaround for a special case:

If the newline character is the last character (as is the case with most file inputs), then for any element in the collection you can slice as follows:

foobar = foobar[:-1]

to cut off your newline character.

Upvotes: 8

Carlos Valiente

Reputation: 912

Careful with "foo".rstrip(os.linesep): That will only chomp the newline characters for the platform where your Python is being executed. Imagine you're chomping the lines of a Windows file under Linux, for instance:

$ python
Python 2.7.1 (r271:86832, Mar 18 2011, 09:09:48) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys
>>> sys.platform
'linux2'
>>> "foo\r\n".rstrip(os.linesep)
'foo\r'
>>>

Use "foo".rstrip("\r\n") instead, as Mike says above.

Upvotes: 19

ingydotnet

Reputation: 2613

rstrip doesn't do the same thing as chomp, on so many levels. Read http://perldoc.perl.org/functions/chomp.html and see that chomp is very complex indeed.

However, my main point is that chomp removes at most 1 line ending, whereas rstrip will remove as many as it can.

Here you can see rstrip removing all the newlines:

>>> 'foo\n\n'.rstrip(os.linesep)
'foo'

A much closer approximation of typical Perl chomp usage can be accomplished with re.sub, like this:

>>> re.sub(os.linesep + r'\Z','','foo\n\n')
'foo\n'

Upvotes: 20

Ryan Ginstrom

Reputation: 14121

And I would say the "pythonic" way to get lines without trailing newline characters is splitlines().

>>> text = "line 1\nline 2\r\nline 3\nline 4"
>>> text.splitlines()
['line 1', 'line 2', 'line 3', 'line 4']
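
To show how this differs from split('\n'), here is a short sketch with a made-up string that has mixed line endings and a trailing newline:

```python
text = "line 1\nline 2\r\nline 3\n"

no_ends = text.splitlines()        # line endings removed, no empty tail
with_ends = text.splitlines(True)  # keepends=True keeps each ending
naive = text.split('\n')           # leaves "\r" behind and adds an empty tail
empty = "".splitlines()            # an empty string gives an empty list

print(no_ends)    # ['line 1', 'line 2', 'line 3']
print(with_ends)  # ['line 1\n', 'line 2\r\n', 'line 3\n']
print(naive)      # ['line 1', 'line 2\r', 'line 3', '']
print(empty)      # []
```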

Upvotes: 186
