MaciejPL

Reputation: 1049

Python: replace string with regex

My problem is that I have a huge number of XML files, each containing an ID. I have a set of source files and a set of target files. A source file has name A and ID = B. A target file has name B and ID = B. I need to match the source ID B with the target name B, and then replace the target's ID = B with the source name A. I hope that's clear.

Here is my code

import os
import re

sourcepath = input('Path to source folder:\n')
targetpath = input('Path to target folder:\n')
for root,dir,source in os.walk(sourcepath):
    for each_file in source:
        os.chdir(root)
        correctID = each_file[:16]
        each_xml = open(each_file, 'r', encoding='utf8').read()
        findsourceID = re.findall('id="\w{3}\d{13}"', each_xml)
        StringID = str(findsourceID)
        correctFilename = StringID[6:22]
        IDtoreplace = 'id="' + correctID + '"'
        print(IDtoreplace)
        for main,folder,target in os.walk(targetpath):
            for each_target in target:
                os.chdir(main)
                targetname = each_target[:16]
                if targetname == correctFilename:
                    with open(each_target, 'r+', encoding='utf8') as each_targ:
                        each_targ.read()
                        findtargetID = re.sub('id="\w{3}\d{13}"',str(IDtoreplace), each_targ)
                        each_targ.close()

And here is the error

File "C:/Users/ms/Desktop/Project/test.py", line 23, in <module>
    findtargetID = re.sub('id="\w{3}\d{13}"',str(IDtoreplace), each_targ)
  File "C:\Users\ms\AppData\Local\Programs\Python\Python35\lib\re.py", line 182, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

Upvotes: 0

Views: 103

Answers (1)

mike3996

Reputation: 17517

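You call each_targ.read(), but you don't store the returned string anywhere.

Instead, you pass the file handle each_targ itself to re.sub, which expects a string, and that causes the TypeError. You could pass the file's contents instead (and drop the earlier bare each_targ.read() call, because a second read() on the same handle returns an empty string):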

findtargetID = re.sub(r'id="\w{3}\d{13}"', IDtoreplace, each_targ.read())
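
Keep in mind that re.sub only returns the new string; it does not change the file on disk. Here is a minimal sketch of the full read, substitute, write-back cycle, assuming the goal is to overwrite the target file in place. The helper name replace_id_in_file and the ID_PATTERN constant are made up for illustration and are not part of the question's code:

```python
import re

# Same pattern as in the question, written as a raw string.
ID_PATTERN = re.compile(r'id="\w{3}\d{13}"')


def replace_id_in_file(path, new_id_attr):
    """Replace every matching id attribute in the file at `path`.

    Hypothetical helper: returns the number of replacements made.
    """
    # Read the whole file into a string first; re works on strings,
    # not on file handles.
    with open(path, 'r', encoding='utf8') as handle:
        text = handle.read()

    # A function as the replacement keeps new_id_attr literal
    # (no backslash escapes are interpreted in it).
    new_text, count = ID_PATTERN.subn(lambda match: new_id_attr, text)

    # Only rewrite the file if something actually changed.
    if count:
        with open(path, 'w', encoding='utf8') as handle:
            handle.write(new_text)
    return count
```

In the question's loop this would be called as replace_id_in_file(each_target, IDtoreplace) once targetname matches correctFilename. The with block closes the file on its own, so the explicit each_targ.close() is not needed.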

Upvotes: 1
