Rahul Varma

Reputation: 550

regex replace unable to substitute in Python with regex variables

We have a huge number of files that we need to transform to JSON. Here is the sample data from one file:

{
1=2,
4=tt,
6=9
}
{
1=gg,
2=bd,
6=bb
}

I am using Python to convert the data. The regular expression itself works fine, but the same expression does not work when I use it in my Python code. Here is the code:

import numpy as np
f = open('/Users/rahulvarma/Downloads/2020120911.txt', 'r')
content = f.read()
import re
regex = r"([0-9]+)(=)((.*)+)"
subst = "\"$1\":\"$3\","
result = re.sub(regex, subst, content,  0, re.MULTILINE)

if result:
    print (result)

But my output was:

{
"$1":"$3",
"$1":"$3",
"$1":"$3"
}
{
"$1":"$3",
"$1":"$3",
"$1":"$3"
}

My expected output is:

{
"1":"2",
"4":"tt",
"6":"9"
}
{
"1":"gg",
"2":"bd",
"6":"bb"
}

Upvotes: 2

Views: 101

Answers (1)

anubhava

Reputation: 785058

You can search using this regex:

(\d+)=([^,\n]*)(,|$)

And replace using:

"\1":"\2"\3


Code:

import re
regex = r"(\d+)=([^,\n]*)(,|$)"

# input_str holds the file contents; count and flags are passed by keyword
result = re.sub(regex, r'"\1":"\2"\3', input_str, count=0, flags=re.MULTILINE)
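
The reason the original attempt printed "$1" and "$3" literally is that Python's re module does not treat $1 as a backreference in the replacement string; it uses \1 or \g<1> instead. A minimal sketch of the difference, using one line of the sample data (the variable names here are only illustrative):

```python
import re

line = "4=tt,"
pattern = r"(\d+)=([^,\n]*)(,|$)"

# "$1" has no special meaning in a Python replacement string,
# so it is copied into the result as plain text.
literal = re.sub(pattern, '"$1":"$2"$3', line)

# Python spells backreferences as \1 (in a raw string) or \g<1>.
numbered = re.sub(pattern, r'"\1":"\2"\3', line)
explicit = re.sub(pattern, r'"\g<1>":"\g<2>"\g<3>', line)

print(literal)
print(numbered)
print(explicit)
```

The \g<1> form is useful when the group reference is followed directly by a digit, where \1 would be ambiguous.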

RegEx Details:

  • (\d+): Match 1+ digits in captured group #1
  • =: Match = character
  • ([^,\n]*): Match 0 or more of any characters that are not , and not \n in captured group #2
  • (,|$): Match comma or end of line in captured group #3
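
Putting the pieces together, here is a small self-contained sketch that runs the pattern over the sample data from the question. The inline sample string stands in for the file contents, and the final json.loads step is an extra check that was not part of the question; it assumes each {...} block is meant to be a separate JSON object.

```python
import json
import re

# Sample data copied from the question; in practice this would be f.read().
sample = """{
1=2,
4=tt,
6=9
}
{
1=gg,
2=bd,
6=bb
}"""

# re.MULTILINE lets $ match at the end of every line, not only at the
# end of the whole string, so the last pair in each block is matched too.
pattern = re.compile(r"(\d+)=([^,\n]*)(,|$)", flags=re.MULTILINE)
converted = pattern.sub(r'"\1":"\2"\3', sample)
print(converted)

# Optional check: each brace-delimited block should now parse as JSON.
records = [json.loads(block) for block in re.findall(r"\{[^{}]*\}", converted)]
print(records)
```

Note that every value comes out as a string ("2" rather than 2), which matches the expected output in the question.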

Upvotes: 3
