user14800447

Reputation: 51

Parsing text file in Python into a dictionary

I have a CSV file and I want to create a dictionary from its data. I am not allowed to use pandas. The data looks like this:

enter image description here

This is what I tried, but the rows do not all have the same number of values. How can I create a dictionary?

filename=" data.dat"
file=open(filename, encoding="latin-1").read().split(' , ')
dictt={}

for row in file:
    dictt[row[0]] = {'values', row[1]}

I have a file as shown above. First, I need to create a dict. After that, I will print the daily number of unique measurements in descending order by date.

Final Expected result:

enter image description here

Upvotes: 1

Views: 1280

Answers (3)

Mrigank Prasoon

Reputation: 1

The code below should work. Note that it returns a list of (date, count) tuples rather than a dict, so that the result can be sorted.

from typing import List, Tuple


def create_data_dictionary(keys, values) -> List[Tuple[str, int]]:
    # Pair each date with its count of unique measurements.
    return list(zip(keys, values))


def parser(path: str) -> List[Tuple[str, int]]:
    _keys = []
    _values = []

    with open(path, "r") as fptr:
        fptr.readline()  # skip the header row
        for line in fptr:
            row = line.strip().split(",")
            _keys.append(row[0].strip())
            # A set keeps only the distinct weights for this date.
            st = {weight.strip() for weight in row[1:]}
            _values.append(len(st))

    return create_data_dictionary(_keys, _values)


if __name__ == "__main__":
    path = "resources/test2.csv"
    data_dictionary = parser(path)
    # Sort by the unique count, largest first.
    data_dictionary.sort(key=lambda x: x[1], reverse=True)
    print(data_dictionary)

Below is the data I parsed to create the data dictionary.

Date,weight
2020-06-12 00:00:00+03:00 , 91.5,91.9,91.9,91.9,92.55,92.55,92.1,92.1,93.3,93.3
2020-06-13 00:00:00+03:00 , 91.6,91.6,92.85,92.85,92.3,92.3,92.1,92.1,94.1,94.1
2020-06-14 00:00:00+03:00 , 91.5,91.5,91.65,91.5,91.5,92.9,92.9
2020-06-15 00:00:00+03:00 , 91.85,91.85,91.6,91.6,91.85,92.55,92.4,92.4,93.7,93.7,93.35,93.35
2020-06-16 00:00:00+03:00 , 91.5,91.9,91.9,91.9,92.55,92.55,92.1,92.1,93.3,93.3,98.7,94.7
2020-06-17 00:00:00+03:00 , 91.5,91.9,91.9,91.9,92.55,92.55,92.1,92.1,93.3,93.3,94.0
2020-06-18 00:00:00+03:00 , 91.5,91.9,91.9,91.9,92.55,92.55
2020-06-19 00:00:00+03:00 , 91.5,91.9,91.9,91.9,92.55,92.55,92.1,92.1,93.3,93.3
2020-06-20 00:00:00+03:00

Here is what the output looks like.

[('2020-06-16 00:00:00+03:00', 7), ('2020-06-15 00:00:00+03:00', 6), ('2020-06-17 00:00:00+03:00', 6), ('2020-06-12 00:00:00+03:00', 5), ('2020-06-13 00:00:00+03:00', 5), ('2020-06-19 00:00:00+03:00', 5), ('2020-06-14 00:00:00+03:00', 3), ('2020-06-18 00:00:00+03:00', 3), ('2020-06-20 00:00:00+03:00', 0)]

You can now print the output accordingly or store it in a text/csv file.
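For storing the result, one option is the standard csv module. This is only a minimal sketch, assuming the list of (date, count) tuples shown above; the function name write_counts and the column name unique_measurements are made up for illustration.

```python
import csv


def write_counts(rows, out_path):
    """Write (date, unique_count) pairs to a CSV file with a header row.

    write_counts and the header names are hypothetical, not part of the
    answer's original code.
    """
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["Date", "unique_measurements"])
        writer.writerows(rows)
```

It could be called as write_counts(data_dictionary, "unique_counts.csv") after the sort, where the output file name is again just an example.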

Upvotes: 0

azibom

Reputation: 1934

Hi, this should do what you want:

with open("./test.txt") as myFile:
    formattedData = dict()
    for line in myFile:
        line = line.strip()  # drop the trailing newline
        try:
            # Split the date from the comma-separated numbers.
            date, numbers = line.split(' , ')
            formattedData[date] = len(set(numbers.split(',')))
        except ValueError:
            # No ' , ' separator: the line has a date but no numbers.
            formattedData[line] = 0
print(formattedData)
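The question also asks for the counts in descending order by date. A possible sketch of that step follows, assuming every key of the dictionary is a timestamp string in the format shown in the sample data (so a header row such as "Date,weight" would have to be skipped first). The helper name sort_by_date_desc is hypothetical.

```python
from datetime import datetime


def sort_by_date_desc(counts):
    """Return (date, count) pairs ordered from newest to oldest date.

    Assumes every key parses with datetime.fromisoformat (Python 3.7+).
    """
    return sorted(
        counts.items(),
        key=lambda item: datetime.fromisoformat(item[0]),
        reverse=True,
    )


# Illustrative data only, not the asker's file.
counts = {
    "2020-06-12 00:00:00+03:00": 5,
    "2020-06-14 00:00:00+03:00": 3,
    "2020-06-13 00:00:00+03:00": 5,
}
for date, n in sort_by_date_desc(counts):
    print(date, n)
```

Parsing the keys into datetime objects avoids relying on string comparison, which would only give the right order while all timestamps share the same UTC offset.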

Upvotes: 1

PiyushG

Reputation: 121

Firstly, do not call your variables 'list': it is the name of a Python built-in type (not a reserved keyword), and shadowing it will cause confusion. I can't reproduce this because I don't have the file, but I think changing the line in the for loop to this should work.

newvariablename[row[0]] = row[1]
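For that assignment to behave as intended, row has to be a list of fields rather than a chunk of text. A rough sketch of one way to get there with the standard csv module is below. Since the original file is not available, it assumes rows shaped like the sample data posted in another answer; the function name rows_to_dict is made up.

```python
import csv
import io


def rows_to_dict(lines):
    """Map the first field of each row to the list of remaining fields.

    rows_to_dict is a hypothetical helper; `lines` can be an open file
    or any iterable of text lines.
    """
    result = {}
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        key = row[0].strip()
        # Rows may have different lengths; keep whatever values exist.
        result[key] = [v.strip() for v in row[1:] if v.strip()]
    return result


# Illustrative input standing in for the real file.
sample = io.StringIO(
    "2020-06-12 00:00:00+03:00 , 91.5,91.9,91.9\n"
    "2020-06-20 00:00:00+03:00\n"
)
data = rows_to_dict(sample)
```

Keeping the values as a list per date means rows of different lengths are not a problem, and len(set(values)) then gives the number of unique measurements for a day.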

Upvotes: 0
