Parsing a CSV file and plotting with Python

Question

I'm new to Python development and I have to implement a project on data analysis. I have a data.txt file which has the following values:

ID,name,date,confirmedInfections
DE2,BAYERN,2020-02-24,19
.
.
DE2,BAYERN,2020-02-25,19
DE1,BADEN-WÜRTTEMBERG,2020-02-24,1
.
.
DE1,BADEN-WÜRTTEMBERG,2020-02-26,7
.
.(lot of other names and data)

What I'm trying to do?

As you can see in the file above, each name represents a city with COVID infections. For each city, I need to save a data frame and plot a time series graph that uses the date as the index on the x-axis and confirmedInfections on the y-axis. An example:

Because the data file I was given is large and has four columns, I think I'm making a mistake in parsing the file and selecting the correct values. Here is an example of my code:

# Getting the data from Bayern city
data = pd.read_csv("data.txt", index_col="name")
first = data.loc["BAYERN"]
print(first)

# Plotting the timeseries
series = read_csv('data.txt' ,header=0, index_col=0, parse_dates=True, squeeze=True)
series.plot()
pyplot.show()

And here is a photo of the result:

As you can see, on the x-axis I get all the different IDs that are included in data.txt. I want to exclude the ID and plot only the stats of each city.

Thanks for your time.

Krunal Sonparate · Accepted Answer

You need to parse the date after reading from the CSV.

import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt
# The file already has a header row; header=0 replaces it with these names
headers = ['ID','name','date','confirmedInfections']
data = pd.read_csv('data.txt', names=headers, header=0)

# Dates in the file look like 2020-02-24, so the format uses dashes
data['date'] = data['date'].map(lambda x: datetime.strptime(str(x), '%Y-%m-%d'))
x = data['date']
y = data['confirmedInfections']

# Plot using matplotlib
plt.plot(x,y)
# display chart
plt.show()

I haven't tested this particular code. I hope this will work for you
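The code above plots every row in one line, while the question asks for one data frame and one graph per name. A minimal sketch of that step follows, assuming the four-column layout shown in the question. The helper names split_by_name and plot_name are made up for illustration, and the sample text only repeats the four rows visible in the question so the snippet runs without data.txt.

```python
import io

import pandas as pd

# The four rows visible in the question, in the same layout as data.txt.
SAMPLE = """ID,name,date,confirmedInfections
DE2,BAYERN,2020-02-24,19
DE2,BAYERN,2020-02-25,19
DE1,BADEN-WÜRTTEMBERG,2020-02-24,1
DE1,BADEN-WÜRTTEMBERG,2020-02-26,7
"""


def split_by_name(source):
    """Return {name: DataFrame indexed by date}, one frame per distinct name.

    `source` can be a file path such as "data.txt" or a file-like object.
    """
    # parse_dates converts the date column to datetime64 while reading
    data = pd.read_csv(source, parse_dates=["date"])
    frames = {}
    for name, group in data.groupby("name"):
        # Keep only the value column; the ID is dropped on purpose
        frames[name] = group.set_index("date").sort_index()[["confirmedInfections"]]
    return frames


def plot_name(frames, name):
    """Plot confirmedInfections over time for a single name."""
    # Imported here so the splitting step also works without matplotlib
    import matplotlib.pyplot as plt

    ax = frames[name]["confirmedInfections"].plot(title=name)
    ax.set_xlabel("date")
    ax.set_ylabel("confirmedInfections")
    plt.show()


frames = split_by_name(io.StringIO(SAMPLE))
print(frames["BAYERN"])
```

With the real file, split_by_name("data.txt") followed by plot_name(frames, "BAYERN") should give one graph for that name, with dates on the x-axis instead of IDs.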
