user951487

Reputation: 855

Compare two files and find matching words in Python

I have two files. The first one contains terms and their frequencies:

table 2
apple 4
pencil 89

The second file is a dictionary:

abroad
apple
bread
...

I want to check whether the first file contains any words from the second file. For example, both the first file and the second file contain "apple". I am new to Python. I tried the following, but it does not work. Could you help me? Thank you.

for line in dictionary:
    words = line.split()
    print words[0]

for line2 in test:
    words2 = line2.split()
    print words2[0]

Upvotes: 0

Views: 25865

Answers (4)

Ashwini Chaudhary

Reputation: 251166

Something like this:

with open("file1") as f1, open("file2") as f2:
    # Build a set of words from the dictionary file (file1).
    words = set(line.strip() for line in f1)

    # Why a set? Lookups take O(1) time on average, so the overall complexity is O(N).

    # Now loop over each line of the other file (file2, the word/frequency file).
    for line in f2:
        word, freq = line.split()   # split the line into word and frequency
        if word in words:           # the word is also in the dictionary set
            print(word)

output:

apple

Upvotes: 4

kiriloff

Reputation: 26333

I put your first file (the word and frequency pairs) in try.txt and the dictionary word list in try_match.txt.

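# try.txt holds "word frequency" pairs; try_match.txt holds one word per line.
with open('try.txt', 'r') as f, open('try_match.txt', 'r') as f_match:
    dictionary = []
    for line in f:
        a, b = line.split()   # a is the word, b is its frequency
        dictionary.append(a)

    for line in f_match:
        word = line.strip()
        if word in dictionary:
            print(word)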

Upvotes: 2

snehal

Reputation: 1846

This may help:

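# file1.txt holds "word frequency" lines, so keep only the first field of each line.
file1 = set(line.split()[0] for line in open('file1.txt') if line.strip())
file2 = set(line.strip() for line in open('file2.txt'))

# The intersection contains the words present in both files.
for word in file1 & file2:
    if word:
        print(word)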

Upvotes: 3

jadkik94

Reputation: 7078

Here's what you should do:

  • First, you need to put all the dictionary words in some place where you can easily look them up. If you don't do that, you'd have to read the whole dictionary file every time you want to check one single word in the other file.

  • Second, you need to check whether each word in the other file is among the words you extracted from the dictionary file.
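The two steps above can be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the function names load_dictionary and find_matches are hypothetical, and it assumes every non-blank line of the frequency file starts with the word, followed by whitespace and the count. Both functions take any iterable of lines, so they work with open file objects as well as with the plain lists used here.

```python
def load_dictionary(lines):
    """Step 1: collect every dictionary word into a set for fast lookup."""
    return {line.strip() for line in lines if line.strip()}


def find_matches(freq_lines, dictionary_words):
    """Step 2: return the words from the frequency lines found in the dictionary."""
    matches = []
    for line in freq_lines:
        parts = line.split()
        if not parts:          # skip blank lines
            continue
        word = parts[0]        # assumed layout: "<word> <frequency>"
        if word in dictionary_words:
            matches.append(word)
    return matches


# Sample data taken from the question; with real files, pass open file objects instead.
dictionary_words = load_dictionary(["abroad\n", "apple\n", "bread\n"])
matches = find_matches(["table 2\n", "apple 4\n", "pencil 89\n"], dictionary_words)
print(matches)
```

With real files, you could call load_dictionary(open(...)) on the dictionary file once and then pass the frequency file to find_matches, so the dictionary is read only one time.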

For the first part, you need to use either a list or a set. The difference between the two is that a list keeps the order in which you added the items, while a set is unordered. Here the order in which you read the words from the dictionary file does not matter, so a set is fine. A set is also faster when you look up an item, because it is designed for membership tests.

To check whether an item is in a set, use the expression item in my_set, which evaluates to either True or False.
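As a small illustration of the difference, the membership test is written the same way for a list and a set; only the way the lookup is carried out differs. The variable names below are made up for this example, and the words are the sample data from the question.

```python
dictionary_list = ["abroad", "apple", "bread"]   # a list keeps insertion order
dictionary_set = set(dictionary_list)            # a set has no defined order

# The test reads the same for both, but the set uses hashing
# (constant time on average) while the list is scanned item by item.
in_list = "apple" in dictionary_list
in_set = "apple" in dictionary_set
missing = "pencil" in dictionary_set

print(in_list, in_set, missing)
```

For a dictionary of only a few words the difference is negligible; it starts to matter when the dictionary file is large and many words have to be checked against it.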

Upvotes: 2
