Reputation:
I want to use NumPy's loadtxt function to read .csv files for my experiment. I have three time-series data sets with different characteristics, all in the following format, where the first column is the timestamp and the second column is the value.
0.086206438,10
0.086425551,12
0.089227066,20
0.089262508,24
0.089744425,30
0.090036815,40
0.090054172,28
0.090377569,28
0.090514071,28
0.090762872,28
0.090912691,27
For reproducibility, I have shared the three time-series data I am using here.
If I do it like the following
import numpy as np
fname="data1.csv"
col_time,col_window = np.loadtxt(fname,delimiter=',').T
This works as intended. However, instead of reading only a single file, I want to pass a dictionary to np.loadtxt, as in col_time,col_window = np.loadtxt(types,delimiter=',').T, with the dictionary defined as follows:
protocols = {}
types = {"data1": "data1.csv", "data2": "data2.csv", "data3": "data3.csv"}
so that I can read multiple csv files and plot all the results at once using a single for loop, as in the following:
for protname, fname in types.items():
    col_time, col_window = protocols[protname]["col_time"], protocols[protname]["col_window"]
    rt = np.exp(np.diff(np.log(col_window)))
    plt.plot(quotient_times, quotient, ".", markersize=4, label=protname)
    plt.title(protname)
    plt.xlabel("t")
    plt.ylabel("values")
    plt.legend()
    plt.show()
But this gives me the error ValueError: could not convert string to float: b'data1'. How can I load multiple csv files as a dictionary?
Upvotes: 1
Views: 1052
Reputation: 148890
Assuming that you want to build a protocols dict that will be usable in your code, you can build it with a simple loop:
import numpy as np
types = {"data1": "data1.csv", "data2": "data2.csv", "data3": "data3.csv"}
protocols = {}
# Load each file separately and store its two columns under the file's key
for name, file in types.items():
    col_time, col_window = np.loadtxt(file, delimiter=',').T
    protocols[name] = {'col_time': col_time, 'col_window': col_window}
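np.loadtxt reads one source per call, which is why the dictionary has to be looped over rather than passed in directly. When given the dictionary itself, it appears to treat the keys as lines of data, which is consistent with the error mentioning b'data1'. The loop above can also be wrapped in a small helper. The following is only a sketch: the name load_protocols is hypothetical, and io.StringIO objects stand in for the real files so that it runs without data1.csv on disk.

```python
import io
import numpy as np

# Hypothetical in-memory stand-ins for data1.csv and data2.csv;
# real file names can be used as the dictionary values instead.
sources = {
    "data1": io.StringIO("0.086206438,10\n0.086425551,12\n0.089227066,20\n"),
    "data2": io.StringIO("0.086206438,29\n0.086425551,32\n"),
}

def load_protocols(sources):
    """Return {name: {"col_time": array, "col_window": array}} for each source."""
    protocols = {}
    for name, src in sources.items():
        # unpack=True transposes the result, so each column comes back as its own array
        col_time, col_window = np.loadtxt(src, delimiter=",", unpack=True)
        protocols[name] = {"col_time": col_time, "col_window": col_window}
    return protocols

protocols = load_protocols(sources)
for name, cols in protocols.items():
    print(name, cols["col_time"].shape, cols["col_window"])
```

unpack=True is equivalent to the .T used above; it is just a matter of taste.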
You can then successfully plot the 3 graphs:
import matplotlib.pyplot as plt
for protname, fname in types.items():
    col_time, col_window = protocols[protname]["col_time"], protocols[protname]["col_window"]
    # rt is computed as in the question but is not plotted here
    rt = np.exp(np.diff(np.log(col_window)))
    plt.plot(col_time, col_window, ".", markersize=4, label=protname)
    plt.title(protname)
    plt.xlabel("t")
    plt.ylabel("values")
    plt.legend()
    plt.show()
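A side note on rt: np.exp(np.diff(np.log(x))) is the ratio of each value to the previous one, so it has one element fewer than col_window. If the quotient in the question was meant to be this ratio (an assumption on my part, since quotient and quotient_times are not defined there), it would be plotted against col_time[1:]. A small check of that identity, using the first rows of the sample data; the names ratio and rt_times are only illustrative:

```python
import numpy as np

# First four rows of the sample data from the question
col_time = np.array([0.086206438, 0.086425551, 0.089227066, 0.089262508])
col_window = np.array([10.0, 12.0, 20.0, 24.0])

# exp(diff(log(x))) equals x[i+1] / x[i] for strictly positive x
rt = np.exp(np.diff(np.log(col_window)))
ratio = col_window[1:] / col_window[:-1]
print(np.allclose(rt, ratio))

# Each ratio belongs to the later of its two samples, so drop the first timestamp
rt_times = col_time[1:]
print(rt_times.shape == rt.shape)
```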
Upvotes: 1
Reputation: 1216
Neither pandas nor numpy loads several CSV files in a single read call. You can read each file into a DataFrame and combine them with the pandas concat function. The example below demonstrates this using pandas. Replace StringIO with a file name or file object.
data="""
0.086206438,10
0.086425551,12
0.089227066,20
0.089262508,24
0.089744425,30
0.090036815,40
0.090054172,28
0.090377569,28
0.090514071,28
0.090762872,28
0.090912691,27
"""
data2="""
0.086206438,29
0.086425551,32
0.089227066,50
0.089262508,54
"""
data3="""
0.086206438,69
0.086425551,72
0.089227066,70
0.089262508,74
"""
import pandas as pd
from io import StringIO
files={"data1":data,"data2":data2,"data3":data3}
# Load the first file into a data frame
key=list(files.keys())[0]
df=pd.read_csv(StringIO(files.get(key)),header=None,usecols=[0,1],names=['data1','data2'])
print(df.head())
# Remove the first file from the dictionary
files.pop(key,None)
print("final values")
# Efficient: concat the first data frame with the remaining files in one call
df=pd.concat([df]+[pd.read_csv(StringIO(files[i]),header=None,usecols=[0,1],names=['data1','data2']) for i in files.keys()],
             ignore_index=True)
print(df.tail())
For more insight: pandas append vs concat
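If you need to know which file each row came from, one option is to pass a dictionary of DataFrames to concat, so that the dictionary keys become the outer level of the index. This is a sketch only: the column names time and value and the index level names source and row are my own choices, and the strings stand in for real files.

```python
from io import StringIO
import pandas as pd

# Hypothetical in-memory stand-ins for the CSV files; pass real paths instead
files = {
    "data1": "0.086206438,10\n0.086425551,12\n0.089227066,20\n",
    "data2": "0.086206438,29\n0.086425551,32\n",
}

# One DataFrame per file, keyed by the same names as the input dictionary
frames = {
    name: pd.read_csv(StringIO(text), header=None, names=["time", "value"])
    for name, text in files.items()
}

# Passing a mapping to concat uses its keys as the outer index level
combined = pd.concat(frames, names=["source", "row"])

# Rows that came from one file can be selected by its key
print(combined.loc["data2"])
```

Unlike ignore_index=True, this keeps the rows grouped per file, which makes it easy to loop over the sources again when plotting.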
Upvotes: 0