Reputation: 3
I am working on a Git repo and need to share its folder hierarchy and file names with an external vendor so they can perform some code analysis. I have the whole hierarchy available in a CSV file.
The problem is that I cannot provide the actual folder paths or file names, as they contain protected information. For the code analysis, the external vendor only needs folder paths and file names; they can use that information to produce the analysis output. Internally, we need a mapping between the actual and the obfuscated file paths/names.
An example of this mapping would be:
conf1/conf2/conf3.txt -> dsdasd/dsadsd/dadssd.txt
conf1/conf2/conf4.py -> dsdasd/dsadsd/dasdsd.py
Manual mapping is not feasible, as the repo contains over 200k files in a 20-level-deep folder hierarchy. There are 2 requirements for this conversion:
Upvotes: 0
Views: 381
Reputation: 5808
I'll describe how I'd go about this in pseudocode.
    NEXT := 1
    MAP := empty
    for each full path P in your repos
        split P using '/' as the delimiter
        for each element E of the split path
            if it is the last element, remove the extension
            if E is in the MAP
                CODE := MAP[E]
            else
                CODE := NEXT
                increase NEXT
                MAP[E] := CODE
            replace E with CODE
            if it is the last element, put back the extension
        join the transformed elements using '/' as the delimiter
        print the result
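For reference, here is a minimal Python sketch of the pseudocode above. The function name `obfuscate_paths` is my own choice, and it assumes the paths use '/' as the separator and that `os.path.splitext` is an acceptable way to detect the extension (a name like `.gitignore` is treated as having no extension).

```python
import os

def obfuscate_paths(paths):
    """Replace every path element with a sequential code.

    Returns (obfuscated_paths, mapping), where mapping goes from the
    original element name (without extension) to its code.
    """
    mapping = {}    # original element name -> code
    next_code = 1
    result = []
    for path in paths:
        parts = path.split("/")
        out = []
        for i, part in enumerate(parts):
            ext = ""
            if i == len(parts) - 1:
                # Last element: set the extension aside so it survives.
                part, ext = os.path.splitext(part)
            if part not in mapping:
                mapping[part] = str(next_code)
                next_code += 1
            out.append(mapping[part] + ext)
        result.append("/".join(out))
    return result, mapping

new_paths, mapping = obfuscate_paths([
    "conf1/conf2/conf3.txt",
    "conf1/conf2/conf4.py",
])
for p in new_paths:
    print(p)
```

The returned `mapping` is what you would keep internally to translate the vendor's results back to the real names.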
This will convert:
conf1/conf2/conf3.txt -> 1/2/3.txt
conf1/conf2/conf4.py -> 1/2/4.py
and meets your requirements. If you need to literally obfuscate the path, then you should use some unique random word instead of NEXT in the pseudocode above.
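One possible way to get such unique random words in Python is sketched below. The helper names `random_code` and `write_mapping` are hypothetical, the 8-character hex token length is an arbitrary assumption, and the CSV column names are only a suggestion for the internal mapping file.

```python
import csv
import io
import secrets

def random_code(used, nbytes=4):
    """Return a random hex token that is not in `used`, and record it there."""
    while True:
        token = secrets.token_hex(nbytes)   # 2 * nbytes hex characters
        if token not in used:
            used.add(token)
            return token

def write_mapping(mapping, fileobj):
    """Write the original -> obfuscated mapping as CSV for internal use."""
    writer = csv.writer(fileobj)
    writer.writerow(["original", "obfuscated"])
    for original, code in mapping.items():
        writer.writerow([original, code])

# Usage: assign a random code to each distinct path element.
used = set()
mapping = {}
for name in ["conf1", "conf2", "conf3", "conf1"]:
    if name not in mapping:
        mapping[name] = random_code(used)

buf = io.StringIO()
write_mapping(mapping, buf)
print(buf.getvalue())
```

Calling `random_code(used)` where the pseudocode uses NEXT keeps the rest of the algorithm unchanged; the `used` set guards against the unlikely case of two identical random tokens.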
Upvotes: 1