jO.

Reputation: 3522

Python search and replace

I have a file (file 1) containing a string of text with names that I would like to replace. A second file (file 2) has two columns of names, A and B. Column A contains the same names that appear in the string in file 1, and I would like to replace each of those names with the corresponding name in column B. I have tried doing this in Python, but I'm still too much of a beginner to pull it off. Any pointers would be greatly appreciated.


File1               
NameA.....NameB....NameC....etc

File2                 
A     B    
NameA NameD         
NameB NameE          
NameC NameF

Desired result:

File1                       
NameD....NameE....NameF....etc

Upvotes: 0

Views: 277

Answers (5)

jO.

Reputation: 3522

Thanks for the replies. Unfortunately, none of them worked properly, probably because of the nature of the string in file1 (Newick format). This is what I was originally working on. It is probably not very good, but if I could get a replace function to work it might do the trick.

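import re

# Read the whole Newick string from file1.
with open("file1.txt", "r") as f:
    LineString = f.read()

# Raw string so the regex escapes are not interpreted by Python first.
pattern = re.compile(r'\d+OTU_\d+_\w+_\d+')
words = pattern.findall(LineString)

colA = []
colB = []

with open("file2.txt", "r") as f:
    for line in f:
        parts = line.split()
        if len(parts) > 0:
            colA.append(parts[0])
        if len(parts) > 1:
            colB.append(parts[1])

# Doesn't work: str.replace() takes two strings, not two lists,
# and it returns a new string rather than changing LineString.
if words == colA:
    LineString.replace(colA, colB)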

The string in file1 looks like: (((((((((('1OTU_1_769_wint_446':0.00156420,'1OTU_1_822_wint_445':0.00000000)0.5700:0.00156410,'1OTU_1_851_wint_454':0.00000000) etc...

Entries in words, colA, and colB look like, e.g., 1OTU_1_769_wint_446
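
For what it's worth, here is a minimal sketch of how the replacement step might be made to work: build a dictionary from the two columns and let re.sub look each matched label up in it, so the whole string is rewritten in one pass. The helper name rename_labels and the sample mapping (SampleX, SampleY) are made up for illustration, and the pattern is meant to be equivalent to the one above. Labels with no entry in the mapping are left unchanged.

```python
import re

# Equivalent to the pattern above, written as a raw string.
LABEL = re.compile(r"\d+OTU_\d+_\w+_\d+")


def rename_labels(tree_text, mapping):
    """Return tree_text with every matched label replaced via mapping.

    Labels that have no entry in mapping are left as they are.
    """
    return LABEL.sub(lambda m: mapping.get(m.group(0), m.group(0)), tree_text)


# Hypothetical mapping; in practice it could be built from file2.txt,
# e.g. mapping = dict(zip(colA, colB)).
mapping = {"1OTU_1_769_wint_446": "SampleX", "1OTU_1_822_wint_445": "SampleY"}
tree = "('1OTU_1_769_wint_446':0.00156420,'1OTU_1_822_wint_445':0.00000000)0.5700"
renamed = rename_labels(tree, mapping)
print(renamed)
```

Doing the lookup inside a single re.sub call means each label is matched as a whole, so a short name cannot accidentally be replaced inside a longer one.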

Upvotes: 0

Davit

Reputation: 81

I think you need code like this:

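# Read the whole text to be edited.
with open("File1", "r") as file1:
    text = file1.read()

# Apply each tab-separated A -> B pair from File2 in turn.
with open("File2", "r") as file2:
    for line in file2:
        if not line.strip():
            continue  # skip blank lines
        A, B = line.strip().split('\t')
        # str.replace returns a new string, so keep the result.
        text = text.replace(A, B)

# Write the edited text to a new file.
with open("File3", "w") as file3:
    file3.write(text)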

Upvotes: 0

Yarkee

Reputation: 9424

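# Read the whitespace-separated names from File1.
with open('File1', 'r') as fd:
    keys = fd.read().split()

# Map each name in column A to its replacement in column B.
name_map = {}

with open('File2', 'r') as fd:
    for line in fd:
        parts = line.split()
        if len(parts) >= 2:  # skip blank or incomplete lines
            name_map[parts[0]] = parts[1]

# Overwrite File1 with the replaced names, keeping any name
# that has no entry in File2 unchanged.
with open('File1', 'w') as fd:
    new_names = [name_map.get(k, k) for k in keys]
    fd.write(" ".join(new_names))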

Upvotes: 1

Luka Rahne

Reputation: 10517

# read first file as a list
with open("file1") as f:
    names1 = f.read().strip().split()

# read file2 as a dictionary
with open("file2") as f:
    names2 = dict(i.strip().split() for i in f.readlines())

# write replacements to file3
with open("file3", "w") as f:
    f.write(" ".join(names2[i] for i in names1))

Upvotes: 1

hillmandj

Reputation: 83

I would consider using regular expressions (the re module in Python). They let you search for specific text patterns, and if you build the pattern with re.compile() and apply it with re.search(), you can extract selected "groups" of text from the resulting match object with its group() method. The module is quite extensive, so here is a link to the documentation:

http://docs.python.org/2/library/re.html

I would also check out an online tutorial, such as this one:

http://www.youtube.com/watch?v=DRR9fOXkfRE
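
To make the group() idea a bit more concrete, here is a small sketch. The pattern and the group names (prefix, otu, sample, season, index) are only a guess at how a label such as 1OTU_1_769_wint_446 (taken from the asker's follow-up) might be broken down; nothing in the data format defines them.

```python
import re

# Assumed label structure: <digits>OTU_<digits>_<digits>_<letters>_<digits>.
# The group names are illustrative only.
label_re = re.compile(
    r"(?P<prefix>\d+)OTU_(?P<otu>\d+)_(?P<sample>\d+)_(?P<season>[A-Za-z]+)_(?P<index>\d+)"
)

text = "('1OTU_1_769_wint_446':0.00156420,'1OTU_1_822_wint_445':0.00000000)"

match = label_re.search(text)      # first label in the text, or None if absent
whole = match.group(0)             # the entire matched label
season = match.group("season")     # a single named group
sample = match.group("sample")

# finditer yields one match object per label found in the text.
all_samples = [m.group("sample") for m in label_re.finditer(text)]

print(whole, season, sample, all_samples)
```

In real code it is worth checking that search() did not return None before calling group() on the result.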

Upvotes: 0
