re.findall on each sentence of a list

Question

I have got a list of sentences:

['home twn cafe nr link rd',
 'taj lands ends hotel..',
 'SILVER PALACE705BPALI MALA ROADBANDRA WEST',
 'turner rd lemon rd 4 fountain  pali rd junctio...',
 ' FLAT 657 FLOOR AIR INDIA APTS 61B PALI HILL',
 'bungalow 9 Mt Mary Bandra West',
 'shabbir apt charklie rajan rd abv icici ban...',
 'st peters church backyard loun hill rd',
 'Union Park Road ',
 'Flat 32 Building No 8',
 'mehboob studio',
 'ONGC Colony',
 'Nargis Dutt Road Grand Canyon Building Appa']

I need to use re.findall to find every occurrence of the word 'rd' and replace it with 'road'. I tried this:

data2 = [nltk.sent_tokenize(lines) for lines in data]  
c = [re.findall('nr',sent) for sent in data2]

and I got this error:

TypeError: expected string or buffer

How do I use re.findall in a loop over the list? I don't know how to convert the items to strings. Please help.

thefourtheye · Accepted Answer

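The TypeError occurs because nltk.sent_tokenize returns a list, so each item in data2 is a list rather than a string, and re.findall only accepts strings. Note also that re.findall only finds matches; re.sub is what replaces them. Tokenizing is not needed here. I would use a simple regex with word boundaries and a list comprehension, like this: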

import re
# \b matches a word boundary, so only the standalone word "rd" is replaced
pattern = re.compile(r"\brd\b")
print([pattern.sub("road", line) for line in data])

Output

['home twn cafe nr link road',
 'taj lands ends hotel..',
 'SILVER PALACE705BPALI MALA ROADBANDRA WEST',
 'turner road lemon road 4 fountain  pali road junctio...',
 ' FLAT 657 FLOOR AIR INDIA APTS 61B PALI HILL',
 'bungalow 9 Mt Mary Bandra West',
 'shabbir apt charklie rajan road abv icici ban...',
 'st peters church backyard loun hill road',
 'Union Park Road ',
 'Flat 32 Building No 8',
 'mehboob studio',
 'ONGC Colony',
 'Nargis Dutt Road Grand Canyon Building Appa']
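
If the matches themselves are also wanted, re.findall can be called on each string in the same way. The following is only a minimal sketch, assuming data is a plain list of strings (a shortened sample of the question's list is used here) and reusing the same word-boundary pattern; the names matches and replaced are made up for illustration.

```python
import re

# Shortened sample of the list from the question (assumed to be plain strings).
data = ['home twn cafe nr link rd',
        'taj lands ends hotel..',
        'Union Park Road ',
        'turner rd lemon rd 4 fountain  pali rd junctio...']

# \b anchors the match to whole words, so "rd" inside "backyard" is left alone.
pattern = re.compile(r"\brd\b")

# re.findall expects a string, so call it once per element of the list.
matches = [pattern.findall(line) for line in data]

# re.sub performs the actual replacement, again one string at a time.
replaced = [pattern.sub("road", line) for line in data]

print(matches)
print(replaced)
```

The pattern is case-sensitive, so an upper-case 'RD' would not be touched; passing re.IGNORECASE to re.compile would be one way to change that, if that behaviour is wanted.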
