Rahul

Reputation: 191

Regular expression removing all the content from the file instead of replacing the matched pattern in Python

I have the following input file:

Input File: https://drive.google.com/open?id=1BkRRUKn_AvtRmV4L1pQlNLIPL7pVRAoY

I'm trying to add spaces to the lines of the text file that match the two cases below, while keeping the unmatched lines as they are.

The output, written either to a new file or back to the modified input file, should look like this:

Output File: https://drive.google.com/open?id=1BkXjlrMG39yusKQ5dw8gYCGYo25xCrXF

What I have tried:

import re
import fileinput

pattern1 = re.compile('[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ]+([0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]+)$')
pattern2 = re.compile('PRODUCT:+/^[A-Z]{0,10}$/')

for line in fileinput.input('Input_File_S.txt', inplace = True, backup='.bak'):
   if re.search(pattern1,line):
       line = re.sub(pattern1, '                                                                                                                                      \1', line)
   elif re.search(pattern2,line):
       line = re.sub(pattern2, '                                                                                                                                      \1', line)

However, this just removes all the content from my file. Any help on what I am doing wrong, or any corrections, would be highly appreciated.

Upvotes: 1

Views: 234

Answers (1)

Michał Turczyn

Reputation: 37377

Try this pattern: ^(?:Product.+| {14}[a-zA-Z]{4}$)

Explanation:

^ - match beginning of a line

(?:...) - non-capturing group

Product.+ - match the literal Product followed by one or more of any character (except a newline)

| - alternation (or)

 {14} - match a space exactly 14 times (the quantifier applies to the space before it)

[a-zA-Z]{4} - match a lowercase or uppercase letter 4 times

$ - match end of a line
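
As for why the file ends up empty: with inplace=True, fileinput redirects standard output into the file, so whatever is printed becomes the new content. The loop in the question never prints anything, so nothing is written back. The '\1' replacement also needs a raw string (r'\1'), otherwise Python reads it as the control character \x01. Below is a minimal sketch of how the pattern could be applied line by line. The names pad_line, rewrite_in_place and PAD are made up for illustration, and the padding width is an assumption, since the exact spacing you want is only visible in the linked output file.

```python
import fileinput
import re

# Pattern from this answer, applied to one line at a time.
PATTERN = re.compile(r'^(?:Product.+| {14}[a-zA-Z]{4}$)')

# Hypothetical padding; the real width depends on the desired output file.
PAD = ' ' * 10


def pad_line(line):
    """Return the line with PAD prepended if it matches PATTERN, else unchanged."""
    text = line.rstrip('\n')
    if PATTERN.search(text):
        text = PAD + text
    return text


def rewrite_in_place(path):
    """Rewrite the file at path, keeping a .bak copy of the original."""
    with fileinput.input(path, inplace=True, backup='.bak') as f:
        for line in f:
            # stdout is redirected into the file here, so every line must be
            # printed, including the ones that did not match.
            print(pad_line(line))
```

Calling rewrite_in_place('Input_File_S.txt') would then update the file from the question and leave the original in Input_File_S.txt.bak. Note that the pattern is case-sensitive: if the lines start with PRODUCT: in uppercase, change the literal or pass re.IGNORECASE to re.compile.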

Upvotes: 2
