Reputation: 9
I have a really long text file that I want to plot in Python. I've imported the text file using this:
import matplotlib.pyplot as plt
plt.figure()
with open('6-18-2015 14.2.9.txt') as f:
    for line in f:
        line = [float(line)]
        plt.plot(line)
Every time I run the code, I get: ValueError: invalid literal for float(): How do I solve this problem? Any help is much appreciated.
Upvotes: 0
Views: 261
Reputation: 2173
You should have a look at pandas. It makes such tasks really trivial. For example, assuming you have a .csv file named data.csv which looks like this:
x,y
1, 1
2, 4
3, 9
...
then you can plot it as follows:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("data.csv")
plt.plot(df.x, df.y)
plt.show()
EDIT:
You can transpose your 4x10000 data and change it to 10000x4. Here's an example showing how to plot the 10000x4 data using matplotlib.
4ddata.csv
x,y,z,u
10.39, 73.32, 2.02, 28.26
11.13, 68.71, 1.86, 27.83
12.71, 74.27, 1.89, 28.26
11.46, 91.06, 1.63, 28.26
11.72, 85.38, 1.51, 28.26
13.39, 78.68, 1.89, 28.26
13.02, 68.02, 2.01, 28.26
12.08, 64.37, 2.18, 28.26
11.58, 60.71, 2.28, 28.26
8.94, 65.67, 1.92, 27.04
11.61, 59.57, 2.32, 27.52
19.06, 74.49, 1.69, 63.35
17.52, 73.62, 1.73, 63.51
19.52, 71.52, 1.79, 63.51
18.76, 67.55, 1.86, 63.51
19.84, 53.34, 2.3, 63.51
20.19, 59.82, 1.97, 63.51
17.43, 57.89, 2.05, 63.38
17.9, 59.95, 1.89, 63.51
18.97, 57.84, 2, 63.51
19.22, 57.74, 2.05, 63.51
17.55, 55.66, 1.99, 63.51
19.22, 101.31, 6.76, 94.29
19.41, 99.47, 6.07, 94.15
18.99, 94.01, 7.32, 94.08
19.88, 103.57, 6.98, 94.58
19.08, 95.38, 5.66, 94.14
20.36, 100.43, 6.13, 94.47
20.13, 98.78, 7.37, 94.47
20.36, 89.36, 8.79, 94.71
20.96, 84.48, 8.33, 94.01
21.02, 83.97, 6.78, 94.72
19.6, 95.64, 6.56, 94.57
plot.py
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
df = pd.read_csv("4ddata.csv")
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(df.x, df.y, df.z, s=df.u)
plt.show()
This example represents the 4th dimension as marker area: the s argument of scatter is given in points squared, i.e. (point size)^2.
As you have a very long file, you may want to use
ax.scatter(df.x, df.y, df.z, c=df.u)
instead of
ax.scatter(df.x, df.y, df.z, s=df.u)
This will represent the 4th dimension as color, thus preventing unnecessary visual clutter.
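On the transposition mentioned in the EDIT: here is a minimal sketch using only the standard library. It assumes the file holds 4 comma-separated rows of equal length (that layout is my assumption), and the inline sample is a made-up, shortened stand-in for the real file.

```python
import csv
import io

# Made-up stand-in for the 4x10000 file: 4 rows, only 3 values each here.
raw = io.StringIO("1, 2, 3\n4, 5, 6\n7, 8, 9\n10, 11, 12\n")

# Read every row and convert each field to float
# (float() ignores the space after each comma).
rows = [[float(value) for value in record] for record in csv.reader(raw)]

# zip(*rows) pairs up the i-th value of every row: 4xN becomes Nx4.
columns = list(zip(*rows))

# Write the transposed data with a header, one x,y,z,u record per line.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["x", "y", "z", "u"])
writer.writerows(columns)

print(out.getvalue())
```

With the real file you would pass the opened file object to csv.reader instead of the StringIO stand-in, and write to a file opened with newline=''. If the data is already in a pandas DataFrame, DataFrame.T should give the same transposition.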
The problem in your case is that the loop "for line in f:" reads an entire line at a time. So you get something like
line = "1.23, 4.26, 5.78, 3.44\n"
Python cannot convert this whole string to a single float, hence the error. The offending character here is probably the comma (,). Also, looping over the data to plot it point by point is likely to be highly inefficient; you should use the provided library functions wherever possible, as they are highly optimized for the task they perform.
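To make the float() point concrete, here is a small sketch of splitting a line on commas before converting. The sample line is made up to match the format above, and parse_line is just a hypothetical helper name, not something from matplotlib or pandas.

```python
# Made-up sample line in the comma-separated format described above.
line = "1.23, 4.26, 5.78, 3.44\n"

try:
    float(line)  # the whole line is not a single number, so this raises
except ValueError as err:
    print("float(line) failed:", err)


def parse_line(text):
    """Split a comma-separated line and convert each field to float."""
    # float() ignores surrounding whitespace, including the trailing newline.
    return [float(field) for field in text.split(",")]


values = parse_line(line)
print(values)
```

Note that this sketch does not handle blank lines or a header row; those would have to be skipped before calling parse_line.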
Upvotes: 3