Benjie Perez

Reputation: 76

How to remove duplicate words in an array?

I have these arrays in Python:

noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
wordsArr = ['"fool barr', '"fool corp"']

What is the best way to avoid appending "fool barr" and "fool corp" to noDuplicateArr, given that the words "barr" and "corp" are already present in noDuplicateArr?

Upvotes: 0

Views: 187

Answers (2)

Ben

Reputation: 2472

To phrase this more precisely: you want to skip appending a string to a list of strings if it contains a word that already appears in one of the strings in that list. You can use a set to keep track of the words that have already been seen.

noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
wordsArr = ['"fool barr', '"fool corp"']

# Collect every word already present in noDuplicateArr.
seen_words = set()
for words in noDuplicateArr:
    seen_words |= set(words.strip('"').split())

for words in wordsArr:
    words = words.strip('"')
    # Append only if none of the words has been seen before.
    if not any(word in seen_words for word in words.split()):
        noDuplicateArr.append(words)
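
The same idea can be wrapped in a reusable function. This is only a sketch: the function name append_if_no_shared_words and the extra sample entry '"baz qux"' are made up for illustration, and it assumes words are separated by whitespace and that surrounding double quotes should be stripped, as in the loop above. It uses set.isdisjoint to test whether a candidate shares any word with the existing entries.

```python
def append_if_no_shared_words(existing, candidates):
    """Return a new list: `existing` plus each candidate that shares no word with it.

    Hypothetical helper; assumes whitespace-separated words and strips
    surrounding double quotes before comparing.
    """
    # Every word that appears in the existing entries.
    seen_words = set()
    for phrase in existing:
        seen_words.update(phrase.strip('"').split())

    result = list(existing)  # copy, so the caller's list is not modified
    for phrase in candidates:
        cleaned = phrase.strip('"')
        # isdisjoint is True only when no word of the candidate was seen before.
        if seen_words.isdisjoint(cleaned.split()):
            result.append(cleaned)
    return result


noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
wordsArr = ['"fool barr', '"fool corp"', '"baz qux"']  # last entry added for illustration
print(append_if_no_shared_words(noDuplicateArr, wordsArr))
```

Like the loop above, this compares candidates only against the words in the original list, not against candidates accepted earlier in the same call.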

Upvotes: 1

jcf

Reputation: 602

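list(set(noDuplicateArr + wordsArr))

This will give you a list with unique entries only. It removes exact duplicate strings (it does not compare individual words), and the resulting order is not guaranteed.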
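
If the original order of the entries matters, one possible variant builds the result with dict.fromkeys, which keeps the first occurrence of each string (dictionaries preserve insertion order in Python 3.7+). This is a sketch only: the helper name unique_strings is hypothetical, and like set() it removes exact duplicate strings rather than comparing words.

```python
def unique_strings(*lists):
    """Concatenate the given lists and drop exact duplicates, keeping first-seen order.

    Hypothetical helper for illustration.
    """
    combined = []
    for items in lists:
        combined.extend(items)  # extend mutates `combined` in place and returns None
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(combined))


noDuplicateArr = ['"foo barr', '"foo corp', '"barr corp']
wordsArr = ['"fool barr', '"foo corp']  # second entry repeats an existing string
print(unique_strings(noDuplicateArr, wordsArr))
```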

Upvotes: 0
