sulav_lfc

Reputation: 782

Removing duplicate content in Python

I've got a text file containing multiple job titles. I want to remove the titles that occur more than once. I created two empty lists, one for all job titles and another that stores the non-duplicate values. The code I've used is:

with open('jobtitle.txt') as fp:
    jobtitle = []
    jobtitle_original = []
    for line in fp:
        jobtitle.append(line)
    for i in range(0, len(jobtitle)):
        for j in range(0, len(jobtitle_original)):
            if jobtitle_original[j] == jobtitle[i]:
                continue
            else:
                jobtitle_original.append(jobtitle[i])
print jobtitle_original

But it prints an empty list. I'm using Python 2.7.

Upvotes: 0

Views: 49

Answers (2)

ElmoVanKielmo

Reputation: 11300

Combining your file input with the set solution:

with open('jobtitle.txt') as fp:
    result = set(fp.readlines())
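
One caveat: readlines() keeps the trailing newline on each line, so a final line without a newline would not compare equal to an otherwise identical earlier line. A minimal sketch that strips whitespace first, assuming one title per line and that blank lines should be ignored (unique_titles is a hypothetical helper name, not part of the answer above):

```python
def unique_titles(lines):
    """Return the set of distinct, non-empty lines with surrounding whitespace removed."""
    return set(line.strip() for line in lines if line.strip())

# Usage with the file from the question (assumed to hold one title per line):
# with open('jobtitle.txt') as fp:
#     result = unique_titles(fp)
```

The function accepts any iterable of strings, so an open file object can be passed directly without calling readlines().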

Upvotes: 1

sshashank124

Reputation: 32189

You can simply use set:

jobs = ['engineer','artist','mechanic','teacher','teacher','engineer','engineer']

print list(set(jobs))
# e.g. ['engineer', 'artist', 'mechanic', 'teacher'] (element order may vary)

A simpler demonstration:

>>> lst = [1,4,2,4,3,5,3,5,3,5,4,5,4]
>>> print list(set(lst))
[1, 2, 3, 4, 5]

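set takes any iterable, such as a list, and builds a set of its distinct items. You can then convert it back to a list with list(set(something)). Note that a set is unordered, so the resulting list may not keep the original order.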
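
If the first-seen order of the titles matters, one option is to track the items already seen in a set while appending to a list. A minimal sketch (dedupe_keep_order is a hypothetical helper name, not something from the answer above):

```python
def dedupe_keep_order(items):
    """Return a list of the distinct items, in order of first appearance."""
    seen = set()      # items encountered so far (fast membership test)
    result = []       # distinct items, in their original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

jobs = ['engineer', 'artist', 'mechanic', 'teacher', 'teacher', 'engineer', 'engineer']
unique_jobs = dedupe_keep_order(jobs)
# unique_jobs == ['engineer', 'artist', 'mechanic', 'teacher']
```

This does a single pass over the input, whereas comparing every title against every other title, as in the nested loops from the question, grows quadratically with the number of lines.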

Upvotes: 1
