Reputation: 173
I have tried to strip and parse the data in a .txt file so that I can plot a simple graph, but I can't seem to get the data into the format I want. Could someone guide me in the right direction?
Below is a short example of the data in the text file. In Python I am trying to .read() the text file and then plot a simple graph, using the headings in the text file itself if possible.
Date,Value
2016-03-31,0.7927
2016-03-30,0.7859
2016-03-29,0.7843
2016-03-24,0.7893
2016-03-23,0.792
2016-03-22,0.7897
2016-03-21,0.7818
2016-03-18,0.778
2016-03-17,0.781
2016-03-16,0.7855
2016-03-15,0.7845
The Python code that I have tried so far (this won't be perfect code, as I am still working through it):
import numpy as np
import matplotlib.pyplot as plt
with open("EURGBP DATA.txt") as f:
    data = f.read()
data = data.split('\n')
x = [row.split()[0] for row in data]
y = [row.split()[1] for row in data]
index = [i for i,val in enumerate(x)]
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.set_title("Plot DAta")
ax1.set_xlabel('x')
ax1.set_ylabel('y')
ax1.set_xticklabels(x)
ax1.plot(index ,y, c='r', label='the data')
leg = ax1.legend()
plt.locator_params(nbins=len(index)-1)
plt.show()
Upvotes: 2
Views: 5758
Reputation: 55589
These lines split the data rows on spaces, not commas:
x = [row.split()[0] for row in data]
y = [row.split()[1] for row in data]
You need to specify the character to split on (whitespace characters are the default):
x = [row.split(',')[0] for row in data]
y = [row.split(',')[1] for row in data]
EDIT: additional data cleaning
If the data file has a trailing newline, then
y = [row.split(',')[1] for row in data]
will raise an IndexError, because the empty string left after the trailing newline has no second element:
>>> data = 'a,b\nc,d\n'.split('\n')
>>> print(data)
['a,b', 'c,d', '']
>>> print(data[0].split(','))
['a', 'b']
>>> print(data[-1].split(','))
['']
Defend against this by testing that the row is not an empty string before splitting the values:
x = [row.split(',')[0] for row in data if row]
y = [row.split(',')[1] for row in data if row]
You also need to remove the column header names from the values that you are passing to matplotlib. Do this by omitting the first row when creating the x and y values:
>>> data = 'First,Second\na,b\nc,d\n'.split('\n')
>>> print(data)
['First,Second', 'a,b', 'c,d', '']
>>> x = [row.split(',')[0] for row in data[1:] if row]
>>> print(x)
['a', 'c']
>>> y = [row.split(',')[1] for row in data[1:] if row]
>>> print(y)
['b', 'd']
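Putting these fixes together, here is a minimal sketch of one way to do the whole job. It swaps the manual split(',') for the standard library's csv module and also converts the strings to dates and floats, which the snippets above do not do. The names parse_rows, plot_series and SAMPLE are made up for illustration, and the sample text stands in for the real file.

```python
import csv
from datetime import datetime

# Stand-in for the file contents (first three data rows from the question).
SAMPLE = """Date,Value
2016-03-31,0.7927
2016-03-30,0.7859
2016-03-29,0.7843
"""

def parse_rows(lines):
    """Return (header, dates, values) from an iterable of comma-separated lines."""
    # Skip blank lines, such as the one left by a trailing newline.
    reader = csv.reader(line for line in lines if line.strip())
    header = next(reader)  # first row holds the column names
    dates, values = [], []
    for date_text, value_text in reader:
        dates.append(datetime.strptime(date_text, "%Y-%m-%d").date())
        values.append(float(value_text))
    return header, dates, values

def plot_series(header, dates, values):
    """Plot values against dates, labelling the axes from the header row."""
    import matplotlib.pyplot as plt  # imported here so parsing works without it
    fig, ax = plt.subplots()
    ax.plot(dates, values, c="r", label=header[1])
    ax.set_xlabel(header[0])
    ax.set_ylabel(header[1])
    ax.legend()
    fig.autofmt_xdate()  # slant the date labels so they do not overlap
    plt.show()

header, dates, values = parse_rows(SAMPLE.splitlines())
```

To use the real file, pass the open file object instead of the sample lines (a file iterates line by line, so parse_rows(f) inside the with block should work), then call plot_series(header, dates, values).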
Upvotes: 2
Reputation: 947
With pandas (I use import pandas as pd below), this can actually be done with one line:
pd.read_table('datafile.txt', parse_dates = True, index_col = 0, sep = ',').plot()
where the parse_dates keyword tells pandas to try to convert the index to datetime. The result is a line plot of Value against Date.
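To see what that one-liner does step by step, here is a small sketch that loads the same kind of data and stops before plotting. It uses pd.read_csv, whose default separator is already a comma, in place of read_table with sep=','. The load_series helper and the SAMPLE text are assumptions for illustration, and sorting the index is an extra step so the dates run oldest to newest.

```python
import io
import pandas as pd

# Stand-in for the file contents (first three data rows from the question).
SAMPLE = """Date,Value
2016-03-31,0.7927
2016-03-30,0.7859
2016-03-29,0.7843
"""

def load_series(source):
    """Read comma-separated Date,Value data with Date as a datetime index."""
    # index_col=0 makes the first column the index;
    # parse_dates=True asks pandas to parse that index as dates.
    df = pd.read_csv(source, parse_dates=True, index_col=0)
    return df.sort_index()  # oldest date first

df = load_series(io.StringIO(SAMPLE))
# df.plot() would now draw Value against the datetime index (needs matplotlib).
```

load_series also accepts a file path, so load_series('datafile.txt').plot() should behave much like the one-liner above.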
Upvotes: 4
Reputation: 58865
The DataFrame object in pandas already has a plot() method, which is very helpful. After copying your example to the clipboard, I could produce the plot with just:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_clipboard(delimiter=',')
df.plot()
ax = plt.gca()
ax.set_xticklabels(df.Date)
plt.savefig('test.png')  # pass the file name positionally; newer matplotlib names this parameter fname
Upvotes: 3