Malaikatu Kargbo

Reputation: 313

Manipulating a string, in a list of lists

I am writing a function that takes a list as its parameter. The parameter is a list of lists of strings, and each string contains a first and a last name separated by a space. In each sublist, I am supposed to check whether a first name is repeated and, if so, add it to a new list of repeated names. A name counts as repeated only if it is repeated within its own sublist. E.g.

    >>> findAgents([["John Knight", "John Doe", "Erik Peterson"], ["Fred Douglas", "John Stephans", "Mike Dud", "Mike Samuels"]])

would yield

    ['John', 'Mike']

So far I have been able to iterate through the list and access the first names, but I don't know how to keep them grouped by sublist so that I can check whether a name is repeated within JUST that sublist. This is my code:

def findAgents(listOlists):
    newlist = []
    x = 0
    for alist in listOlists:
        for name in alist:
            space = name.find(" ")
            firstname = name[0:space]
            print(firstname)

Upvotes: 2

Views: 97

Answers (3)

gregory

Reputation: 12895

I'd use regex and pluck out the duplicate name from each list:

import re

names = [["John Knight", "John Doe", "Erik Peterson"],["Fred Douglas", "John Stephans", "Mike Dud", "Mike Samuels"]]

def extractDups(names):
    res = []
    for eachlist in names:
        # Join the sublist into one string and capture any word that appears again later in it
        res.extend(re.findall(r'\b(\w+)\b.*\1', ' '.join(eachlist)))
    return res

example:

    >>> extractDups(names)
    ['John', 'Mike']
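
One caveat with the pattern above is that it looks at every word in the joined string, so a repeated surname (for example "Ann Lee" and "Bob Lee") would also be reported, and the backreference can match inside a longer word. A minimal sketch of a stricter variant, assuming every entry is non-empty and the first name is its first whitespace-separated word (the name `extract_first_name_dups` is made up for this illustration):

```python
import re

def extract_first_name_dups(names):
    res = []
    for each_list in names:
        # Keep only the first word of each entry so surnames cannot match.
        firsts = " ".join(full.split()[0] for full in each_list)
        # \b on both sides of \1 avoids matching inside a longer word.
        res.extend(re.findall(r'\b(\w+)\b.*\b\1\b', firsts))
    return res


names = [["John Knight", "John Doe", "Erik Peterson"],
         ["Fred Douglas", "John Stephans", "Mike Dud", "Mike Samuels"]]
print(extract_first_name_dups(names))  # ['John', 'Mike']
```

This still shares one limitation with the original pattern: each match consumes the text between the two occurrences, so interleaved repeats such as John, Mike, John, Mike in one sublist report only John. A counting approach avoids that.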

Upvotes: 0

Jarvis

Reputation: 8564

You can try this:

def func(temp):
    dic = {}
    for i in temp:
        for j in i:
            # Count each first name (note: counted across all sublists)
            first = j.split(" ")[0]
            dic[first] = dic.get(first, 0) + 1
    return dic

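Now we need to get all names whose count is greater than or equal to 2. This can be done with a single iteration over the dictionary returned by func:

dic = func(listOlists)  # listOlists is the input list of lists
temp = []
for i in dic:
    if dic[i] >= 2:
        temp.append(i)  # append the name itself, not its count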

The list temp will contain the desired result.
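
Because the question only treats a name as repeated when it recurs within its own sublist, the same dictionary-counting idea can be applied with a fresh dictionary per sublist. A minimal sketch of that variant, assuming the first name is everything before the first space (the name `find_agents` is only illustrative):

```python
def find_agents(list_of_lists):
    """Return first names that occur more than once within the same sublist."""
    repeated = []
    for sublist in list_of_lists:
        counts = {}  # a fresh dictionary per sublist keeps the groups separate
        for full_name in sublist:
            first = full_name.split(" ")[0]
            counts[first] = counts.get(first, 0) + 1
        # dicts keep insertion order (Python 3.7+), so names appear in order of first occurrence
        repeated.extend(name for name, n in counts.items() if n >= 2)
    return repeated


names = [["John Knight", "John Doe", "Erik Peterson"],
         ["Fred Douglas", "John Stephans", "Mike Dud", "Mike Samuels"]]
print(find_agents(names))  # ['John', 'Mike']
```

With this version, a name that appears once in each of two different sublists is not reported.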

Upvotes: 0

Jean-François Fabre

Reputation: 140168

I'd rewrite that using collections.Counter in a flattened list comprehension: count the first names in each sublist (extracted with str.partition) and keep those that occur more than once:

import collections

l = [["John Knight", "John Doe", "Erik Peterson"],["Fred Douglas", "John Stephans", "Mike Dud", "Mike Samuels"]]

# For each sublist sl, count its first names and keep those seen more than once
result = [k for sl in l for k,v in collections.Counter(name.partition(" ")[0] for name in sl).items() if v>1]
print(result)

result:

['John', 'Mike']
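
For readers who find the one-liner dense, the same Counter-based idea can be spelled out as an explicit loop. This is only a sketch of an equivalent form; the names `repeated_first_names` and `groups` are chosen here for illustration:

```python
from collections import Counter

def repeated_first_names(groups):
    result = []
    for group in groups:
        # partition(" ")[0] is the text before the first space,
        # or the whole string when there is no space at all.
        counts = Counter(full_name.partition(" ")[0] for full_name in group)
        result.extend(first for first, n in counts.items() if n > 1)
    return result


groups = [["John Knight", "John Doe", "Erik Peterson"],
          ["Fred Douglas", "John Stephans", "Mike Dud", "Mike Samuels"]]
print(repeated_first_names(groups))  # ['John', 'Mike']
```

One reason to prefer str.partition over slicing with name.find(" "): when an entry has no space, find returns -1 and name[0:-1] silently drops the last character, whereas partition returns the whole string as the first element.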

Upvotes: 1
