Reputation: 1
I'm trying to find duplicates in a list of lists, where each inner list is a different row of a document. I want to find the words that appear in more than one row.
def helper(a):
    for x in range(len(a)-1):
        for y in range(len(a[x])):
            for i in range(len(a)):
                for j in range(len(a[x])-1):
                    if(a[x][y]==a[i][j]):
                        if(x!=i and y!=j):
                            print(a[i][j])
a = [["i", "will", "always", "be", "very", "happy"], ["happy", "people", "are", "cool", "very"]]
This only prints happy, when I want both happy and very to be printed. If I change the -1 in the for loops, I get an index out of bounds error.
Upvotes: 0
Views: 441
Reputation: 25
Okay, let's flatten this out using functools.reduce first, and then use the built-in set datatype to wipe duplicates out.
from functools import reduce

def helper(a):
    b = reduce(lambda x, y: x + y, a)
    # TODO: rest of the function
This will give us a 1D list, but it will still have several repeating elements. Essentially, what we are doing is that we are adding all the member lists together into one big list. Let me explain this part.
A lambda function is a way to define a nameless function in one line, often inside the argument of a function call. It is written using the keyword lambda. That is what the lambda x, y: x + y does here: it defines a lambda function which adds two entities. Note that when this function receives two lists as arguments, it returns a single list containing all their members.

The reduce function from the functools module takes two arguments: a function (in this case, the lambda function that adds), and a list (or iterable, if you know what those are). The function it takes as an argument must be a reducing function, that is, a function which takes in two arguments and returns a single value. Essentially, the reduce function will take the first two members of the list and apply the function it received as its argument, then take the result of this and apply the function again to this result and the third member of the list, then apply the function to that result and the fourth member of the list, and so on. In the end, it will have reduced the entire list to a single entity.

Next, we remove the repeated elements, using the built-in set datatype to do this. To obtain the unique values of a list L, use:
L_unique = list(set(L))
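To make the reduce and set steps concrete, here is a small trace on made-up sample data (the name rows and its contents are only for illustration, not from the question). It is a sketch of the two steps described above, not a complete solution.

```python
from functools import reduce

rows = [["a", "b"], ["b", "c"], ["c", "d"]]  # made-up sample data

# reduce applies the function pairwise, left to right:
#   step 1: ["a", "b"] + ["b", "c"]           -> ["a", "b", "b", "c"]
#   step 2: ["a", "b", "b", "c"] + ["c", "d"] -> ["a", "b", "b", "c", "c", "d"]
flat = reduce(lambda x, y: x + y, rows)
print(flat)  # ['a', 'b', 'b', 'c', 'c', 'd']

# set() drops the repeats but does not keep order, so sort for a stable display
unique = sorted(set(flat))
print(unique)  # ['a', 'b', 'c', 'd']
```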
Applying this inside our function:
def helper(a):
    b = reduce(lambda x, y: x + y, a)
    c = list(set(b))
    # TODO: Print out the members
This is done; all that remains is to print out everything inside the new list c. This list contains every unique value in a.
To wit:
from functools import reduce

def helper(a):
    """
    Prints out unique values in the
    2D list named `a`
    """
    b = reduce(lambda x, y: x + y, a)  # flatten the rows into one list
    c = list(set(b))                   # keep one copy of each value
    for item in c:
        print(item)
If you wish to implement this using algorithms instead of using builtins and libraries, then this method ought to do:
def helper(a):
    # 1. Linearize the 2D list
    b = []
    for item in a:
        b = b + item
    # 2. Print unique values in 1D list
    c = []
    for item in b:
        if item in c:
            continue
        else:
            c.append(item)
            print(item)
I am hoping that this is self-explanatory. If it is confusing, kindly comment on what you find difficult in this answer.
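One difference between the two versions above is worth noting: a set does not keep the original order of the words, while the loop version prints them in first-seen order. If order matters, a possible sketch is the following. The helper name unique_in_order is hypothetical, and the code assumes Python 3.7 or later, where dict preserves insertion order.

```python
from itertools import chain

def unique_in_order(rows):
    """Return each word once, in first-seen order (hypothetical helper)."""
    # chain.from_iterable walks through every row in turn;
    # dict.fromkeys keeps only the first occurrence of each key, in order.
    return list(dict.fromkeys(chain.from_iterable(rows)))

a = [["i", "will", "always", "be", "very", "happy"],
     ["happy", "people", "are", "cool", "very"]]
print(unique_in_order(a))
# ['i', 'will', 'always', 'be', 'very', 'happy', 'people', 'are', 'cool']
```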
Upvotes: 0
Reputation: 1620
Here is a concise answer using a list comprehension, which makes it easy to build a list.
word for word in a[0] is quite explicit: it loops over the words of the first row.
if word in a[1] retains only the words that also belong to the 2nd row.
duplicates = [word for word in a[0] if word in a[1]]
print(duplicates) # ['very', 'happy']
The two in keywords have nothing to do with each other.
The 1st in is part of the for loop construct.
The 2nd in is a membership operator.
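As a small variation, the membership test can be done against a set instead of a list, which may help when the rows are long. This is only a sketch: the name second_row is introduced here for illustration, and the data is the two-row example from the question.

```python
a = [["i", "will", "always", "be", "very", "happy"],
     ["happy", "people", "are", "cool", "very"]]

# Build the set once; "word in second_row" is then a hash lookup
# rather than a scan through the whole list.
second_row = set(a[1])

# 1st "in": iterates over a[0]; 2nd "in": tests membership in second_row
duplicates = [word for word in a[0] if word in second_row]
print(duplicates)  # ['very', 'happy']
```

The result keeps the order of the first row, because the comprehension iterates over a[0].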
Upvotes: 1
Reputation: 163
a = [["i", "will", "always", "be", "very", "happy"], ["happy", "people", "are", "cool", "very"]]
for i in range(len(a)-1):
    res = set(a[i]) & set(a[i+1])
    print(res)
Using sets, you can achieve this with only one loop.
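Note that the loop above only compares each row with the row right after it. If the document has more than two rows and words shared by any two rows should be reported, one possible sketch is to count how many rows each word appears in. The function name words_in_multiple_rows and the third sample row are made up for illustration.

```python
from collections import Counter

def words_in_multiple_rows(rows):
    """Return the set of words found in at least two rows (hypothetical helper)."""
    # set(row) ensures a word repeated inside one row is counted only once
    counts = Counter(word for row in rows for word in set(row))
    return {word for word, n in counts.items() if n > 1}

rows = [["i", "will", "always", "be", "very", "happy"],
        ["happy", "people", "are", "cool", "very"],
        ["always", "story"]]  # made-up third row, shares a word with row 0 only
print(sorted(words_in_multiple_rows(rows)))  # ['always', 'happy', 'very']
```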
Upvotes: 1
Reputation: 302
Your for loop is kind of complicated. I would solve it like this:
same_words = list()
for scanning_list in a:
    for scanned_list in a:
        if scanning_list == scanned_list:
            continue
        for scanning_item in scanning_list:
            if scanning_item in scanned_list and scanning_item not in same_words:
                print(scanning_item)
                same_words.append(scanning_item)
Upvotes: -1