Reputation: 21
I am trying to take a cpp file that has already been written and add header files to its list of includes using a Python script. Currently, I build a string containing all of the includes that I want to add, and then use the re module to replace one of the existing includes with my string. All of the includes have a "\t" in their name, and this is causing issues: instead of writing the line as expected (#include "abc\type\GenericTypeMT.h"), I am getting #include "abc ype\GenericTypeMT.h". When I print my string to the console, it has the expected form, which leads me to believe that this is an re.sub issue and not an issue with writing to the file. Below is the code.
import re
import string

INCLUDE = "#include \"abc\\type\\"

with open("file.h", "r+") as f:
    a = ""
    b = ""
    # Read the whole file into a
    for line in f:
        a = a + line
    f.seek(0,0)
    # Build one include line per type name into b
    types = open("types.txt", "r+")
    for t in types:
        head = INCLUDE + t.strip() + "MT.h\""
        b = b + head + "\n"
    # This substitution is where the "\t" turns into a tab
    a = re.sub(r'#include "abc\\type\\GenericTypeMT\.h"', b, a)
    types.close()
    print b
    print a
    f.write(a)
The output for b is:
#include "abc\type\GenericTypeMT.h"
#include "abc\type\ServiceTypeMT.h"
#include "abc\type\AnotherTypeMT.h"
The (truncated) output for a is:
/* INCLUDES *********************************/
#include "abc ype\GenericTypeMT.h"
#include "abc ype\ServiceTypeMT.h"
#include "abc ype\AnotherTypeMT.h"
#include <map>
...
The closest thing to my question that I could find was How to write \t to file using Python, but that is different than my problem, since mine seems to stem from the substitutions done by the regular expression, as shown by the print before the write.
Upvotes: 1
Views: 1537
Reputation: 1123860
The re.sub() function expands meta-characters (escape sequences) in the replacement string too. The \t character sequence (consisting of two characters, \ and t) in your replacement string is interpreted by the re module as the escape sequence for a tab character:
>>> import re
>>> re.sub(r'^.', '\\t', 'foo')
'\too'
>>> print(re.sub(r'^.', '\\t', 'foo'))
	oo
But if you use a function for the replacement value, no such expansion takes place. Note that this also means placeholders (backreferences such as \1 or \g<name>) are not processed; you'd have to use the match object passed into the function to build your own placeholder insertion logic.
You don't have any placeholders in your code, so a lambda to create the function should suffice:
a = re.sub(r'#include "abc\\type\\GenericTypeMT\.h"', lambda m: b, a)
Demo on the same contrived foo sample string from before:
>>> re.sub(r'^.', lambda m: '\\t', 'foo')
'\\too'
>>> print(re.sub(r'^.', lambda m: '\\t', 'foo'))
\too
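If the replacement did need to reuse part of the match, the function can pull it from the match object itself. A minimal sketch, assuming a hypothetical pattern that captures the type name and a hypothetical helper named retarget_include that swaps the suffix to a made-up "ST.h"; neither appears in the question:

```python
import re

# Hypothetical pattern: captures the type name between "abc\type\" and "MT.h".
PATTERN = re.compile(r'#include "abc\\type\\(\w+)MT\.h"')

def retarget_include(match):
    # The return value is inserted verbatim, so the backslashes survive;
    # match.group(1) stands in for what \1 would do in a template string.
    return '#include "abc\\type\\' + match.group(1) + 'ST.h"'

line = '#include "abc\\type\\GenericTypeMT.h"'
result = PATTERN.sub(retarget_include, line)
print(result)
```

The function form trades the template shorthand for explicit group lookups, which is why it is safe for text containing backslashes.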
The re.escape() function is unfortunately too greedy: it adds \ backslashes to many more characters than just the replacement meta-characters, so you'd end up with many more backslashes than you started with.
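If a plain replacement string is still preferred over a function, one commonly used alternative is to double only the backslashes, since the backslash is the only character that is special in a replacement template. A small sketch with a made-up sample string, not the question's actual file:

```python
import re

# Made-up sample data; both strings contain real single backslashes.
replacement = '#include "abc\\type\\ServiceTypeMT.h"'
source = '#include "abc\\type\\GenericTypeMT.h"\n#include <map>\n'

# Doubling each backslash makes re.sub() emit it literally instead of
# reading \t as a tab.
safe = replacement.replace('\\', '\\\\')
result = re.sub(r'#include "abc\\type\\GenericTypeMT\.h"', safe, source)
print(result)
```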
Note that because you don't actually do any pattern matching in your substitution, you may as well just use str.replace() to do the job:
a = a.replace(r'#include "abc\type\GenericTypeMT.h"', b)
The \ and . characters are no longer meta-characters as they would be in a regular expression, so they don't need escaping either.
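Putting the str.replace() approach together, the include-building step could look roughly like this. It is only a sketch: add_includes is a hypothetical helper name, the sample text is invented, and reading file.h and types.txt and writing the result back are left out.

```python
def add_includes(source_text, type_names):
    """Replace the GenericTypeMT include with one include per type name."""
    prefix = '#include "abc\\type\\'
    includes = "\n".join(prefix + name.strip() + 'MT.h"' for name in type_names)
    # str.replace() treats both arguments literally: no escapes are expanded.
    return source_text.replace(r'#include "abc\type\GenericTypeMT.h"', includes)

# Invented sample input standing in for the contents of file.h.
source = '/* INCLUDES */\n#include "abc\\type\\GenericTypeMT.h"\n#include <map>\n'
updated = add_includes(source, ["GenericType", "ServiceType", "AnotherType"])
print(updated)
```

Joining the include lines with "\n" rather than appending a newline after each one avoids leaving a blank line after the last inserted include.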
Upvotes: 2