AVilenius

Reputation: 27

Python - Need help writing dictionary of dictionaries to CSV file

I'm still very new to Python and I'm trying to create a report from a syslog file containing info and error messages, written to a CSV file with 3 columns. The first column should hold the username, the second the number of error messages for that user, and the third the number of info messages for that user.

I will then convert the CSV in Excel so I can get this result:

[Image: the expected result]

To do this I have this code:

import re
import csv
import operator
from collections import Counter

test_list = []
test_list2 = []


with open(r"syslog.txt", "r") as log:
  for i in log:
    if re.findall("ERROR.*", i):
      test_list.append(re.findall(r"ticky:.*ERROR [\w '].*\(([\w\.]*).*$", i))
    elif re.findall("INFO.*", i):
      test_list2.append(re.findall(r"ticky:.*INFO [\w '].*\(([\w\.]*).*$", i))

flattened = [val for sublist in test_list for val in sublist]
test_dict = Counter(flattened)


flattened2 = [val for sublist in test_list2 for val in sublist]
test_dict2 = Counter(flattened2)


error = sorted(test_dict.items(), key=operator.itemgetter(0))
info = sorted(test_dict2.items(), key=operator.itemgetter(0))
username = {'info': info, 'error': error}
users = {'username': username}


userNames = username.get("error", "")
info_amount = username.get("info", "")
error_amount = username.get("error", "")


usernames_final = [x[0] for x in userNames]
info_message_amount = [x[1] for x in info_amount]
error_message_amount = [x[1] for x in error_amount]

with open('emails.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["User", "Info", "Error"])
    for (a, b, c) in zip(usernames_final, info_message_amount, error_message_amount):
        csvfile.write(a + "," + str(b) + "," + str(c) + '\n')

Here are a few lines from syslog.txt:

Jan 31 06:59:57 ubuntu.local ticky: INFO Commented on ticket [#7255] (oren)
Jan 31 07:59:56 ubuntu.local ticky: ERROR Ticket doesn't exist (flavia)
Jan 31 08:01:40 ubuntu.local ticky: ERROR Tried to add information to closed ticket (jackowens)
Jan 31 08:03:19 ubuntu.local ticky: INFO Closed ticket [#1712] (britanni)
Jan 31 08:22:37 ubuntu.local ticky: INFO Created ticket [#2860] (mcintosh)
Jan 31 08:28:07 ubuntu.local ticky: ERROR Timeout while retrieving information (montanap)

I've managed to get a dict of dictionaries that looks like this (it's the 'users' variable):

{'username': {'info': [('ac', 2),
                       ('ahmed.miller', 2),
                       ('blossom', 2),
                       ('breee', 1),
                       ('britanni', 1),
                       ('enim.non', 2),
                       ('jackowens', 2),
                       ('kirknixon', 2),
                       ('mcintosh', 4),
                       ('mdouglas', 2),
                       ('noel', 6),
                       ('nonummy', 2),
                       ('oren', 2),
                       ('rr.robinson', 2),
                       ('sri', 2)],
              'error': [('ac', 2),
                        ('ahmed.miller', 4),
                        ('blossom', 6),
                        ('bpacheco', 2),
                        ('breee', 5),
                        ('britanni', 1),
                        ('enim.non', 3),
                        ('flavia', 5),
                        ('jackowens', 4),
                        ('kirknixon', 1),
                        ('mai.hendrix', 3),
                        ('mcintosh', 3),
                        ('mdouglas', 3),
                        ('montanap', 4),
                        ('noel', 3),
                        ('nonummy', 3),
                        ('oren', 7),
                        ('rr.robinson', 1),
                        ('sri', 2),
                        ('xlg', 4)]}}

It has all the information I need and it's sorted, but I can't figure out how to turn it into a CSV that fits my criteria.

The last code block, which writes the CSV, is almost correct, except that it doesn't include all the usernames and it adds 1 to the info counts of only certain users. I suspect it only iterates over the usernames that exist in both info_message_amount and error_message_amount rather than over all of them, which is why some users are missing. I have no clue where the extra numbers come from.
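
For illustration, both symptoms can be reproduced with a tiny made-up data set (the names and counts below are hypothetical, not taken from the real log). Since zip() pairs items by position and stops at the shortest input, a user missing from the info list likely shifts every later info count onto the wrong user and drops rows from the end:

```python
# Hypothetical miniature of the sorted (user, count) lists from the code above.
error = [("ac", 2), ("bpacheco", 2), ("breee", 5)]
info = [("ac", 2), ("breee", 1)]  # "bpacheco" has no info messages

usernames = [user for user, _ in error]
info_counts = [count for _, count in info]
error_counts = [count for _, count in error]

# zip() matches items by position, not by username, and stops at the
# shortest list, so the info counts drift out of step with the users.
rows = list(zip(usernames, info_counts, error_counts))
print(rows)  # [('ac', 2, 2), ('bpacheco', 1, 2)]
```

In this sketch "bpacheco" receives the info count that belongs to "breee", and "breee" is missing from the output altogether.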

If anyone could help me with this I would be very grateful, I'm just not able to figure it out.

Thanks!

EDIT: I should also mention that this is for an exercise I'm doing and they expect me to accomplish this without using pandas. Only the modules/packages already imported should be used. We have not covered pandas yet so I don't know how to use it.

Upvotes: 0

Views: 114

Answers (3)

AVilenius

Reputation: 27

Thanks for all the tips!

I was able to make it work by using this:

usernames_final = [x[0] for x in userNames]
info_message_amount = [x[1] for x in info_amount]
info_users = [x[0] for x in info_amount]
error_message_amount = [x[1] for x in error_amount]

# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('emails.csv', 'w', newline='') as csvfile:
    i = 0  # index of the next unused entry in info_message_amount
    writer = csv.writer(csvfile)
    writer.writerow(["User", "Info", "Error"])
    for user, error in zip(usernames_final, error_message_amount):
        if user in info_users:
            # Both lists are sorted by username, so the i-th info count belongs
            # to this user (this relies on every info user also having errors)
            writer.writerow([user, info_message_amount[i], error])
            i += 1
        else:
            # This user has no info messages
            writer.writerow([user, 0, error])
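
A possible alternative, sketched here under assumptions: since the question already builds two Counter objects (test_dict for errors, test_dict2 for info), the counts could be looked up by username instead of by position. The data below is made up, build_rows is a hypothetical helper name, and io.StringIO only stands in for the real file so the sketch can be run without touching disk:

```python
import csv
import io
from collections import Counter

# Hypothetical counts standing in for test_dict (errors) and test_dict2 (info).
error_counts = Counter({"ac": 2, "bpacheco": 2, "breee": 5})
info_counts = Counter({"ac": 2, "breee": 1, "zed": 3})


def build_rows(info_counts, error_counts):
    """Return one [user, info, error] row per user seen in either Counter."""
    users = sorted(set(info_counts) | set(error_counts))
    # A Counter returns 0 for a missing key, so no membership test is needed.
    return [[user, info_counts[user], error_counts[user]] for user in users]


rows = build_rows(info_counts, error_counts)

buffer = io.StringIO()  # stands in for open("emails.csv", "w", newline="")
writer = csv.writer(buffer)
writer.writerow(["User", "Info", "Error"])
writer.writerows(rows)
print(buffer.getvalue())
```

Unlike the index-based loop, this version does not depend on every info user also appearing in the error list, because the set of usernames is the union of both.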

Upvotes: 0

fluffykitten

Reputation: 130

Given the example dictionary you posted in the question, you could do something like this (I'm assuming the dictionary is named "dic"). No pandas needed:


err_list=dic['username']['error']
info_list=dic['username']['info']
# Append each user's info count to their (name, errors) tuple
for i in range(len(err_list)):
  look_for=err_list[i][0]
  found=False
  for j in range(len(info_list)):
    if look_for==info_list[j][0]:
      found=True
      # (name, errors) becomes (name, errors, infos)
      err_list[i]=err_list[i]+(info_list[j][1],)
      break
  if not found:
    # This user has no info messages
    err_list[i]=err_list[i]+(0,)

print(err_list)

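csvstr=''
for i in range(len(err_list)):
    # Columns: user, info count, error count
    csvstr+=str(err_list[i][0])+","+str(err_list[i][2])+","+str(err_list[i][1])+"\n"

# The with block closes the file, so the data is flushed to disk
with open("emails.csv", "w") as f:
    f.write(csvstr)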



Upvotes: 1

fluffykitten

Reputation: 130

Maybe you could try manually writing the csv instead of using a library since CSV is a simple format. Something like this:

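# info and error are the sorted lists of (user, count) tuples from the question
info_by_user=dict(info)
csvstr=''
for user, error_count in error:
    # Look up the info count by username; users without info messages get 0
    csvstr+=user+","+str(info_by_user.get(user, 0))+","+str(error_count)+"\n"

# The with block closes the file once the string is written
with open("emails.csv", "w") as f:
    f.write(csvstr)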
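
One caveat with building the CSV by hand: it works as long as no field contains a comma, a quote, or a newline. The usernames in the sample log look safe, but as a hedged illustration, here is how csv.writer would handle a made-up username containing a comma (io.StringIO is used only so the sketch does not write a file):

```python
import csv
import io

# The second username is invented to show the quoting behaviour.
rows = [["oren", 2, 7], ["smith, j", 1, 0]]

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)

lines = buffer.getvalue().splitlines()
print(lines)  # ['oren,2,7', '"smith, j",1,0']
```

Plain string concatenation would have produced four columns for the second row instead of three.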

Upvotes: 0
