MAAHE

Reputation: 169

Find and remove any line that has this format from a text file

I have text files that contain several lines in different formats. I need to delete every line that contains only a number in this format (number.). For example, I want to delete only these lines (01., 19., 31., 20.). I can't rely on specific numbers or line positions, because both differ from one file to another.

0.01        0.01        
80.            1
01. 
19. 
31. 
20. 
51. t4           0.
24. t3           0.
06. t2           0.
01. t1           0.

This is what I am trying:

import re
with open("file.txt", "r") as f:
    lines = f.readlines()
with open("file.txt", "w") as f:
    for line in lines:
        if line.strip("\n") != re.match('[0-100].', line):
            f.write(line)

The output I am looking for:

0.01        0.01        
80.            1
51. t4           0.
24. t3           0.
06. t2           0.
01. t1           0.

Upvotes: 0

Views: 52

Answers (1)

Sebcworks

Reputation: 139

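As said in the comments, there is a problem with the regexp: [0-100] is a character class that matches a single character, 0 or 1, not a range of numbers, and the unescaped . matches any character. On top of that, re.match returns a match object or None, so comparing it to the line with != is always true and every line gets written back. In your case, you'll want a condition like this one: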

if not re.match(r'^[0-9]{1,3}\.$', line.strip()):

This matches anything from 0. to 999. If you really want to restrict it to 100. at most, always with a leading 0 (that is, 00. to 99., plus 100.), you can do something like this:

if not re.match(r'^(?:[0-9]{2}|100)\.$', line.strip()):
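
To make the difference between the patterns concrete, here is a small sketch that runs the pattern from the question and the two patterns above against a few strings. The sample strings and variable names are only illustrative, not taken from your file.

```python
import re

# Pattern from the question: [0-100] is the same as [01], and . is any character.
original = re.compile(r'[0-100].')
# One to three digits followed by a literal dot, and nothing else.
loose = re.compile(r'^[0-9]{1,3}\.$')
# Exactly two digits, or 100, followed by a literal dot.
strict = re.compile(r'^(?:[0-9]{2}|100)\.$')

# Illustrative inputs (already stripped of surrounding whitespace).
samples = ["01.", "7.", "100.", "999.", "0.01        0.01", "51. t4           0."]

# For each sample: (original matches?, loose matches?, strict matches?)
results = {
    s: (bool(original.match(s)), bool(loose.match(s)), bool(strict.match(s)))
    for s in samples
}

for s, flags in results.items():
    print(repr(s), flags)
```

Note that the original pattern matches "0.01        0.01" (a 0 followed by any character) but not "7.", which is the opposite of what you want.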

You can test your regexp with websites like this one: https://regex101.com/ (don't forget to select Python on the left side)
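
Putting it together, a minimal sketch of the whole clean-up could look like the following. The helper names (drop_number_only_lines, clean_file) are made up for this example, and it assumes the file is small enough to read into memory, as in your original code.

```python
import re

# One to three digits followed by a dot, with nothing else on the line.
NUMBER_DOT = re.compile(r'^[0-9]{1,3}\.$')


def drop_number_only_lines(lines):
    """Return only the lines that are NOT just digits followed by a dot."""
    return [line for line in lines if not NUMBER_DOT.match(line.strip())]


def clean_file(path):
    """Rewrite the file at `path` in place without the number-only lines."""
    with open(path, "r") as f:
        lines = f.readlines()
    with open(path, "w") as f:
        f.writelines(drop_number_only_lines(lines))


# Demo on the sample data from the question, without touching any file.
sample = [
    "0.01        0.01        \n",
    "80.            1\n",
    "01. \n",
    "19. \n",
    "31. \n",
    "20. \n",
    "51. t4           0.\n",
    "24. t3           0.\n",
    "06. t2           0.\n",
    "01. t1           0.\n",
]
kept = drop_number_only_lines(sample)
print("".join(kept), end="")
```

Calling clean_file("file.txt") would then apply the same filter to your file. The strip() call matters because some of your lines have a trailing space after the dot.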

Upvotes: 2
