How can I output the file name with its word content in such format in python?

Question

Say I have a file test.txt containing :

1:text1.txt
2:text2.txt

text1.txt contains:

I am a good person

text2.txt contains:

Bla bla

I would like to output :

I 1
Bla 2    
am 1    
bla 2    
good 1
a 1
person 1

That is, I want to output each word together with the index of the file it came from. I would post my code, but it is ugly and far from the solution. I'm new to Python, so please be nice. The output order does not matter; the sample output above is in random order, just to give you an idea of what I'm looking for.

This is my code

with open("text.txt", "r") as f:
    text=f.readlines()

for line in text:
    splitted=line.split(":")

    splitsplit=splitted[1].split("\n")
    files=splitsplit[0]

    splittedindicies=splitted[0].split("\n")
    indicies=splittedindicies[0]

    print indicies[0]
    files_list=list(files)
    files_l=files.split(" ")
    for x in files_l:
        fileshandle=open(x,"r")
        read=fileshandle.readlines()

        for y in read:
            words=y.split(" ")
            words.sort()
            for j in words:
                print j

My output is:

1 I am a good person 2 Bla bla

Again, please be nice: I'm an R programmer dealing with Python for the first time.

Aaditya Ura · Accepted Answer

You could try a regex approach here.

As you asked in the comments:

how can I store the output

The output is stored in the values of the dict, so you can work with them from there.

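import re

track = {}
# One or more digits, an optional colon, then a .txt file name, e.g. "1:text1.txt"
pattern = r'(\d+):?(\w+\.txt)'
with open('test.txt', 'r') as file_name:
    for line in file_name:
        for finding in re.finditer(pattern, line):
            index = int(finding.group(1))
            with open(finding.group(2)) as file_name_2:
                for item in file_name_2:
                    # extend instead of assign, so a file with several
                    # lines keeps the words from every line
                    track.setdefault(index, []).extend(item.split())

# Print every word next to the index of the file it came from
for key, value in track.items():
    for item in value:
        print(key, item)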

output:

1 I
1 am
1 a
1 good
1 person
2 Bla
2 bla
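
The question asked for the word first and the file index second, and for a way to store the result. A minimal sketch of that variant follows. The helper name `word_index_pairs` is made up for illustration, and it assumes that the listed file names are relative to the folder holding the listing file. To stay runnable on its own, the demo recreates the sample files from the question in a temporary folder.

```python
import os
import re
import tempfile

# Matches listing lines such as "1:text1.txt" (index, colon, file name).
LINE_PATTERN = re.compile(r"(\d+):(\S+)")


def word_index_pairs(listing_path):
    """Return a list of (word, index) pairs for every listed file.

    Hypothetical helper, not part of the answer above.
    """
    # Assumption: listed file names are relative to the listing's folder.
    base_dir = os.path.dirname(os.path.abspath(listing_path))
    pairs = []
    with open(listing_path) as listing:
        for line in listing:
            match = LINE_PATTERN.match(line.strip())
            if match is None:
                continue  # skip blank or malformed lines
            index = int(match.group(1))
            with open(os.path.join(base_dir, match.group(2))) as text_file:
                # read().split() splits on any whitespace, across all lines
                for word in text_file.read().split():
                    pairs.append((word, index))
    return pairs


# Demo: rebuild the sample files from the question in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    samples = {
        "test.txt": "1:text1.txt\n2:text2.txt\n",
        "text1.txt": "I am a good person\n",
        "text2.txt": "Bla bla\n",
    }
    for name, content in samples.items():
        with open(os.path.join(folder, name), "w") as handle:
            handle.write(content)
    pairs = word_index_pairs(os.path.join(folder, "test.txt"))

# Word first, file index second, as in the question.
for word, index in pairs:
    print(word, index)
```

Because the pairs are kept in a plain list, they can be sorted, filtered or written to a file afterwards instead of only being printed.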
