user15051990

Reputation: 1895

Apply regex to text files and save result in dictionary python

I have multiple text files. I want to clean them and store the results in a dictionary, with the file name as the key and the cleaned text as the value. I have reproduced the text files below as a.txt and b.txt.

a.txt 
2018/03/21-17:08:48.638553  508     7FF4A8F3D704     snononsonfvnosnovoosr
2018/03/21-17:08:48.985053 346K     7FE9D2D51706     ahelooa afoaona woom
2018/03/21-17:08:50.486601 1.5M     7FE9D3D41706     qojfcmqcacaeia
2018/03/21-17:08:50.980519  16K     7FE9BD1AF707     user: number is 93823004
2018/03/21-17:08:50.981908 1389     7FE9BDC2B707     user 7fb31ecfa700
2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0
2018/03/21-17:08:51.066968    1     7FE9BDC91700     std:ZMD:

b.txt
2018/03/21-17:08:48.638553  508     7FF4A8F3D704     snononsonfvnosnovoosr
2018/03/21-17:08:48.985053 346K     7FE9D2D51706     ahelooa afoaona woom
2018/03/21-17:08:50.486601 1.5M     7FE9D3D41706     qojfcmqcacaeia
2018/03/21-17:08:50.980519  16K     7FE9BD1AF707     user: number is 93823004
2018/03/21-17:08:50.981908 1389     7FE9BDC2B707     user 7fb31ecfa700
2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0
2018/03/21-17:08:51.066968    1     7FE9BDC91700     std:ZMD:

My Solution:

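import glob
import os
import re

mydict = {}
for files in glob.glob("*.txt"):
    # Use the file name without its extension as the key ("a.txt" -> "a").
    file_name = os.path.splitext(os.path.basename(files))[0]
    my_list = []  # start a fresh list for every file
    with open(files, 'r') as f:
        for lines in f:
            # Drop the first 53 characters (timestamp, size and ID columns).
            remove = re.sub(r"^.{53}", "", lines)
            my_list.append(remove)
    mydict[file_name] = my_list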

It stores every cleaned line as a separate list element, because I append each line to the list:

dict = {a: [snononsonfvnosnovoosr, ahelooa afoaona woom, qojfcmqcacaeia, user: number is 93823004, Exit Status = 0x0, std:ZMD:],
b: [snononsonfvnosnovoosr, ahelooa afoaona woom, qojfcmqcacaeia, user: number is 93823004, Exit Status = 0x0, std:ZMD:]}

Expected Result:

dict = {a: [snononsonfvnosnovoosr ahelooa afoaona woom qojfcmqcacaeia user: number is 93823004 Exit Status = 0x0 std:ZMD:],
b: [snononsonfvnosnovoosr ahelooa afoaona woom qojfcmqcacaeia user: number is 93823004 Exit Status = 0x0 std:ZMD:]}

Upvotes: 1

Views: 69

Answers (1)

Matt.G

Reputation: 3609

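Join the list into a single string before storing it. Try changing the assignment to: mydict[file_name] = [' '.join(my_list)]

Each line read from the file still ends with a newline, so strip it before appending (for example with lines.rstrip('\n')) if you want the pieces separated by a single space only.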
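
For reference, here is a minimal sketch of the whole loop with the join applied. It assumes every line carries the same 53-character prefix as in the samples, and the names clean_logs, PREFIX and directory are made up for illustration; they are not from the question.

```python
import re
from pathlib import Path

# Hypothetical name: matches the fixed-width prefix
# (timestamp, size and ID columns) seen in the sample files.
PREFIX = re.compile(r"^.{53}")


def clean_logs(directory="."):
    """Map each .txt file's name (without extension) to its cleaned text."""
    result = {}
    for path in sorted(Path(directory).glob("*.txt")):
        with path.open() as f:
            # Strip the trailing newline, then drop the 53-character prefix.
            cleaned = [PREFIX.sub("", line.rstrip("\n")) for line in f]
        # One joined string per file, wrapped in a list to match
        # the expected result shown in the question.
        result[path.stem] = [" ".join(cleaned)]
    return result
```

Calling clean_logs() in the folder that holds a.txt and b.txt should return a dictionary with the keys 'a' and 'b'. If a plain string is more convenient than a one-element list, drop the surrounding brackets in the assignment.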

Upvotes: 1
