siddpro

Reputation: 81

Removing duplicate strings from a list?

I wrote a program to extract all email addresses from a text file, taking the lines that contain 'From:'. I store all extracted email addresses in one list, then create a second list that holds only the unique email addresses by removing the duplicates. The program produces output, but the word 'set' also appears when the new list is printed, i.e. in the output of "print Unique_list".

Note: the original text file is not attached, as I don't know how to do that.

Thank you

  print "Please enter the file path:\n"
  text_file = raw_input("Enter the file name:")
  print "Opening File\n"
  # Using try and except to print an error message if the file is not found
  try:
      open_file = open(text_file)
      print "Text file " + text_file + " is opened \n"
  except IOError:
      # Printing error message
      print "File not found"
      # Using "raise SystemExit" for program termination if file not found
      raise SystemExit
  # Creating a list to store the Email addresses starting from 'From:'
  Email_list = []
  # Using loop to extract the Email addresses starting from 'From:'
  for line in open_file:
      if 'From:' in line:
          # Adding each extracted Email address at the end of the list
          Email_list.append(line)
  print "Printing extracted Email addresses\n"
  print Email_list, "\n"
  print "Before removing duplicate Email addresses, the total no. of Email addresses starting with 'From:'", len(Email_list), "\n"
  # Removing duplicate Email addresses
  Unique_list = set(Email_list)
  #print Email_list.count()
  print "Printing Unique Email addresses\n"
  print (Unique_list)
  print "After removing duplicate Email addresses, the total no. of Email addresses starting with From:, ", len(Unique_list), "\n"

Upvotes: 2

Views: 1300

Answers (2)

Storvig

Reputation: 51

The answer may depend on the goal. It is not clear from the question whether you need the addresses printed in one specific format or simply in some readable form. If you want a particular format, you are better served by controlling the output yourself than by relying on the built-in string representation of the objects you print.

An example: instead of print Email_list,"\n" use print ','.join(Email_list), "\n"

If you would like to emulate the representation of a list, you could use something like print "['{list}']".format(list="', '".join(Email_list)), "\n" or maybe something more cohesive. In any case, you control how the output is printed.

If you rely on the internally determined representations of objects for printing, questions of output may end up driving your coding decisions, such as which container type to use, and that gets in the way of choosing what is best for the program logic itself.
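As a rough sketch of what controlling the output could look like: the sample lines below are made up to stand in for the contents of Email_list, and format_addresses is a hypothetical helper name, not something from the question. It strips the trailing newline from each line and joins the results, so nothing of the list or set representation reaches the output.

```python
# Hypothetical sample data standing in for the lines read from the file.
Email_list = [
    "From: a@example.com\n",
    "From: b@example.com\n",
    "From: a@example.com\n",
]


def format_addresses(lines):
    """Return the entries, stripped of surrounding whitespace, one per line."""
    return "\n".join(line.strip() for line in lines)


# A single-argument print call behaves the same in Python 2 and Python 3.
print(format_addresses(Email_list))
```

The same helper accepts a set as well as a list, so the choice of container no longer affects how the output looks (apart from the order of the items).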

Or, did I misunderstand your question?

Upvotes: 1

Brian Cain

Reputation: 14619

getting output which shows 'set' before printing the new list, i.e. after "print Unique_list"

Just convert it back to a list again.

Unique_list = set(Email_list)
Unique_list = list(Unique_list)
#print Email_list.count()
print "Printing Unique Email addresses\n"
print (Unique_list)
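One thing to keep in mind is that a set has no defined order, so list(set(...)) may return the addresses in a different order than they appeared in the file. If the original order matters, a minimal sketch like the following could be used instead. The sample data and the helper name unique_in_order are assumptions for illustration only.

```python
def unique_in_order(items):
    """Return the items without duplicates, keeping first-seen order."""
    seen = set()      # values already encountered
    result = []       # unique values in their original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Hypothetical sample data standing in for the lines read from the file.
Email_list = [
    "From: a@example.com\n",
    "From: b@example.com\n",
    "From: a@example.com\n",
]

# Strip the trailing newlines first so the printed list is easier to read.
Unique_list = unique_in_order(line.strip() for line in Email_list)
print(Unique_list)
```

Because the result is a plain list, printing it shows square brackets rather than the set(...) wrapper from the question.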

Upvotes: 1
