Derek Mancina

Reputation: 11

How to strip the first two numbers and a period in Python?

I'm a complete newbie in Python. I've been trying to strip the first two digits and the period that follows them from each line of a file that contains this data:

12.This a line

13. This is a line too
14. 12 and 13 please stop fighting

I want to strip the 12. from line 1. I also want to remove the newline. In line 3 there is a space after the period, and I need to remove that too.

So far this is what I've tried:

import re

with open('linex.txt', 'r+') as lines:
    for line in lines:
        line = line[2:]
        lines.write(line)

Can someone guide me to get this thing done?

Upvotes: 0

Views: 785

Answers (2)

Martijn Pieters

Reputation: 1121834

Use str.partition() to get everything after the first dot, then str.strip() to remove all leading and trailing whitespace:

line = line.partition('.')[-1].strip()

Demo:

>>> sample = '''\
... 12.This a line
... 13. This is a line too
... 14. 12 and 13 please stop fighting
... '''
>>> for line in sample.splitlines(True):
...     print(repr(line.partition('.')[-1].strip()))
... 
'This a line'
'This is a line too'
'12 and 13 please stop fighting'

Note that str.partition() results in an empty string if there is no . in the line. An alternative is to use str.split() with a separator and a limit of 1:

line = line.split('.', 1)[-1].strip()

which will result in the original line (but stripped) if there is no period at all.

A quick demo showing the differences:

>>> 'foo bar baz'.partition('bar')
('foo ', 'bar', ' baz')
>>> 'foo bar baz'.partition('bar')[-1]
' baz'
>>> 'foo baz'.partition('bar')
('foo baz', '', '')
>>> 'foo baz'.partition('bar')[-1]
''
>>> 'foo bar baz'.split('bar', 1)
['foo ', ' baz']
>>> 'foo bar baz'.split('bar', 1)[-1]
' baz'
>>> 'foo baz'.split('bar', 1)
['foo baz']
>>> 'foo baz'.split('bar', 1)[-1]
'foo baz'
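Applied to a file like the one in the question, a minimal sketch might look like the following. It reads all lines first and only then rewrites the file, because writing to the same handle while iterating over it (as in the question's attempt) mixes read and write positions. The helper names strip_prefix and clean_file are hypothetical, and dropping lines that end up empty is an assumption about what "remove the newline" means; the demo uses a temporary file rather than linex.txt.

```python
import os
import tempfile

def strip_prefix(line):
    # Keep only what follows the first period, then trim whitespace.
    # If the line has no period, split() returns the whole line.
    return line.split('.', 1)[-1].strip()

def clean_file(path):
    # Read everything before writing anything back.
    with open(path) as f:
        cleaned = [strip_prefix(line) for line in f]
    # Assumption: lines that are empty after stripping are dropped.
    cleaned = [text for text in cleaned if text]
    with open(path, 'w') as f:
        f.write('\n'.join(cleaned) + '\n')
    return cleaned

# Demo on a temporary copy of the question's data.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('12.This a line\n\n13. This is a line too\n'
              '14. 12 and 13 please stop fighting\n')
result = clean_file(tmp.name)
os.remove(tmp.name)
print(result)
```

Swapping strip_prefix for the partition() variant would instead turn lines without a period into empty strings, which this sketch would then drop.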

Upvotes: 0

dom0

Reputation: 7486

line = re.sub(r"^\d{2}\.", "", line).strip()

^ matches only at the start of the line, \d{2} matches exactly two digits, and \. matches a literal dot. re.sub() then replaces whatever the pattern matched with an empty string (the second argument). strip() then removes whitespace from both ends of the result.
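As a rough illustration of the pattern above on the question's sample lines (the strip_prefix helper name and the extra line without a prefix are assumptions added for the demo):

```python
import re

sample_lines = [
    "12.This a line\n",
    "13. This is a line too\n",
    "14. 12 and 13 please stop fighting\n",
    "no prefix here\n",
]

def strip_prefix(line):
    # Remove exactly two leading digits followed by a literal dot,
    # then trim surrounding whitespace (including the newline).
    return re.sub(r"^\d{2}\.", "", line).strip()

cleaned = [strip_prefix(line) for line in sample_lines]
for text in cleaned:
    print(repr(text))
```

Because the pattern is anchored with ^ and requires two digits, the "12" inside the third line is left alone, and a line without the prefix passes through unchanged apart from the stripping.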

Reference: https://docs.python.org/3/library/re.html#re.sub

Upvotes: 1
