Reputation: 111
I'm trying to write a Python program that will basically take as input a raw Python source file and change all variable names to V, and all method or function names to F and leave everything else as is. So I'm trying to achieve this:
Input:
def gcd(x, y):
    """Sample docstring."""
    if x == y:
        return x
    if x > y:
        if y == 0: return x
        return gcd(x-y, y)
    if y > x:
        if x == 0: return y
        # Sample comment.
        return gcd(x, y-x)
Output:
def F(V, V):
    """Sample docstring."""
    if V == V:
        return V
    if V > V:
        if V == 0: return V
        return F(V-V, V)
    if V > V:
        if V == 0: return V
        # Sample comment.
        return F(V, V-V)
Everything else in the Python source file should be kept as is, including comments, docstrings, etc. I'm not entirely sure how to do this, because later on I would also like to apply the same processing to Java and C++ files (i.e. change all variable names to V and all function/method names to F). Using something like the Python tokenize module or the Python ast module therefore might not be the best option, since it would limit me to Python source files and rule out Java or C++ files.
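For reference, the Python-only route through the tokenize module could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `anonymize_with_tokenize` is made up here, and the rule "a name directly followed by `(` is a function, any other non-keyword name is a variable" is a simplification (builtins such as `print` and attribute names are renamed too). Comments and docstrings are separate token types, so they are left untouched.

```python
import io
import keyword
import tokenize


def anonymize_with_tokenize(source):
    """Rename called/defined names to F and all other names to V (rough sketch)."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    lines = source.splitlines(keepends=True)

    # Collect (row, start_col, end_col, replacement) for every identifier.
    edits = []
    for i, tok in enumerate(tokens):
        if tok.type != tokenize.NAME or keyword.iskeyword(tok.string):
            continue  # skip keywords, operators, strings, comments, ...
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # Assumption: a name immediately followed by "(" is a function.
        is_function = (
            nxt is not None and nxt.type == tokenize.OP and nxt.string == "("
        )
        row, start_col = tok.start
        _, end_col = tok.end
        edits.append((row, start_col, end_col, "F" if is_function else "V"))

    # Apply the edits from last to first so earlier columns stay valid.
    for row, start_col, end_col, replacement in reversed(edits):
        line = lines[row - 1]
        lines[row - 1] = line[:start_col] + replacement + line[end_col:]
    return "".join(lines)


if __name__ == "__main__":
    sample = "def gcd(x, y):\n    return gcd(x, y - x)  # x stays\n"
    print(anonymize_with_tokenize(sample))
```

Because it edits the original lines in place instead of regenerating the source, the layout, comments and docstrings survive unchanged. It does not carry over to Java or C++, which is the limitation described above.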
I have already made an attempt using PLY, but the problem is that I have to specify all the regular expression rules and the whole Python grammar. Is there an easier way to achieve what I'm trying to do? Or is that the only way possible if I plan on dealing with Java and C++ source code files at a later stage?
Would be really great if I could get some idea or feedback on what the best option is to go about this.
Upvotes: 5
Views: 2587
Reputation: 31025
If your code looks like that sample, you could use two regexes: one to match the function name and the other to match the variables.
If you use this regex:
\w+(?=\()|\b[xy]\b
It matches everything you need in the sample.
So you can apply the two regexes separately and replace each match with the content you want. The first step is to replace the function name using:
\w+(?=\()
The second regex replaces x and y with V:
\b[xy]\b
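Put together in Python, the two steps could be sketched as follows. The helper name `anonymize_with_regex` is made up for this example, and the patterns are exactly the two shown above, so it only works for code shaped like the sample: it renames just `x` and `y`, it treats any word directly followed by `(` as a function name, and it would also rewrite matches inside comments or strings.

```python
import re

# Any run of word characters immediately followed by "(".
FUNCTION_NAME = re.compile(r"\w+(?=\()")
# A standalone x or y (the variable names used in the sample).
VARIABLE_NAME = re.compile(r"\b[xy]\b")


def anonymize_with_regex(source):
    """Apply the two substitutions in order: functions first, then variables."""
    source = FUNCTION_NAME.sub("F", source)
    return VARIABLE_NAME.sub("V", source)


if __name__ == "__main__":
    print(anonymize_with_regex("def gcd(x, y):"))      # def F(V, V):
    print(anonymize_with_regex("return gcd(x, y-x)"))  # return F(V, V-V)
```

For anything beyond this kind of sample (other variable names, names appearing in comments or strings), the patterns would need to be adapted.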
Upvotes: 1