user2464966

Reputation: 3

Searching content of one file in another file (Python)

I am trying to look up the names from file1 in file2 and merge some data from the matching lines.

file1:

A   28  sep 1980
B   28  jan 1985
C   25  feb 1990    
D   27  march   1995

file2:

A   hyd
B   alig
C   slg 
D   raj

Using this:

import sys
data1 = open(sys.argv[1]).read().rstrip('\n')
data2 = open(sys.argv[2]).read().rstrip('\n')
list1 = data1.split('\n')
list2 = data2.split('\n')

for line in list1:
    for item in list2:
        if line.split('\t')[0] in item.split('\t')[0]:
            print(item,'\t',line.split('\t')[3])

Result:

A   hyd      1980
B   alig     1985
C   slg  1990
D   raj      1995

Two questions (for clarifying the concept):

1 - I expected that changing the order of the lines in file2 would give fewer matches, but I still get all the matches. Why?

2 - Although this program serves the purpose, how memory-efficient is it expected to be? Please suggest improvements.

Thanks

Upvotes: 0

Views: 205

Answers (1)

Ignacio Vazquez-Abrams

Reputation: 799024

1 - I expected that changing the order of the lines in file2 would give fewer matches, but I still get all the matches. Why?

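Your program does a full cross join: every line of file1 is compared against every line of file2, so the order of the lines in either file cannot change which pairs match. You will always get the full set of results.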
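To see why the order has no effect, here is a small sketch that runs the same comparison on in-memory lists instead of files. The function name cross_join and the hard-coded sample lines are only for illustration, and I am assuming the files are tab-separated, as the split('\t') in the question implies.

```python
import itertools

# Sample lines from the question, assumed to be tab-separated.
file1_lines = ["A\t28\tsep\t1980", "B\t28\tjan\t1985",
               "C\t25\tfeb\t1990", "D\t27\tmarch\t1995"]
file2_lines = ["A\thyd", "B\talig", "C\tslg", "D\traj"]


def cross_join(lines1, lines2):
    """Compare every line of lines1 with every line of lines2 (hypothetical helper)."""
    matches = []
    # product() iterates lines1 in the outer position and lines2 in the inner
    # position, exactly like the two nested for loops in the question.
    for line, item in itertools.product(lines1, lines2):
        if line.split("\t")[0] in item.split("\t")[0]:
            matches.append(item + "\t" + line.split("\t")[3])
    return matches


original = cross_join(file1_lines, file2_lines)
reordered = cross_join(file1_lines, list(reversed(file2_lines)))
print(original == reordered)
```

Because every pair is visited either way, reversing file2 yields the same matches. Note also that `in` on two strings is a substring test, not an equality test, so a name such as "A" would also match a line in file2 whose first field is "AB".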

2- Although this program serves the purpose, how memory efficient it is expected to be? please suggest.

Awful. Both files are read into memory in full. Read only the shorter file into memory and iterate over the lines of the longer one once.

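with open('littlefile.txt', 'r') as littlefile:  # hypothetical name for the shorter file
    littlefiledata = littlefile.read().splitlines()

with open('bigfile.txt', 'r') as bigfile:
    for bigline in bigfile:  # streams the longer file one line at a time
        for littleline in littlefiledata:
            ...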
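If the names are meant to match exactly, one possible refinement is to index the shorter file in a dict so that each line of the longer file needs a single lookup instead of an inner loop. This is only a sketch under assumptions: merge_by_key is a made-up name, the file with the dates is assumed to be the shorter one, and the fields are assumed to be tab-separated. It uses exact equality on the first field, which is stricter than the `in` test in the question.

```python
import io


def merge_by_key(small_lines, big_lines):
    """Yield each line of big_lines with the last field of the matching small line appended."""
    # Index the shorter input once: first field -> last field (the year).
    year_by_name = {}
    for line in small_lines:
        fields = line.rstrip("\n").split("\t")
        year_by_name[fields[0]] = fields[-1]

    # Stream the longer input; only one line of it is held at a time.
    for line in big_lines:
        fields = line.rstrip("\n").split("\t")
        year = year_by_name.get(fields[0])
        if year is not None:
            yield "\t".join(fields + [year])


# io.StringIO objects stand in for open file handles in this demo.
small = io.StringIO("A\t28\tsep\t1980\nB\t28\tjan\t1985\n")
big = io.StringIO("B\talig\nA\thyd\nZ\tnone\n")
result = list(merge_by_key(small, big))
for row in result:
    print(row)
```

With real files, the two arguments would be the file objects from open(), since iterating over a file object yields its lines one at a time.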

Upvotes: 1
