Googlebot

Reputation: 15673

How to get labels with numpy loadtxt?

I have a data file in the form of

Col0   Col1  Col2
2015   1     4
2016   2     3

The data are floats, and I use numpy's loadtxt to build an ndarray. However, I have to skip the label row to get an array of the data. How can I build the ndarray from the data while also reading the labels?

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt("data.csv", skiprows=1)
# I need to skip the first row in reading the data but still get the labels.
x = data[:, 0]
a = data[:, 1]
b = data[:, 2]

plt.xlabel(COL0) # Reading the COL0 value from the file.
plt.ylabel(COL1) # Reading the COL1 value from the file.
plt.plot(x,a)

NOTE: The labels (column titles) are unknown in the script. The script should be generic to work with any input file of the same structure.

Upvotes: 4

Views: 2146

Answers (2)

Chiel

Reputation: 6194

With genfromtxt and names=True, the header row is read as field names, which are available as the tuple data.dtype.names. You can index the array by name, and you can get the n-th name into a variable with data.dtype.names[n].

import numpy as np
import matplotlib.pyplot as plt

data = np.genfromtxt('data.csv', names=True)

x = data[data.dtype.names[0]] # In this case this equals data['Col0'].
a = data[data.dtype.names[1]]
b = data[data.dtype.names[2]]

plt.figure()
plt.plot(x, a)
plt.xlabel(data.dtype.names[0])
plt.ylabel(data.dtype.names[1])
plt.show()
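
Since the question asks about loadtxt specifically, a minimal sketch that keeps np.loadtxt and reads the header line separately might look like the following. The helper name load_with_labels is hypothetical, and the sketch assumes a whitespace-delimited file with exactly one header row.

```python
import numpy as np

def load_with_labels(path):
    """Return (labels, data) for a whitespace-delimited file with one header row.

    Hypothetical helper, not part of numpy.
    """
    # Read only the first line and split it into the column labels.
    with open(path) as f:
        labels = f.readline().split()
    # Load the numeric rows, skipping the header; ndmin=2 keeps the result
    # two-dimensional even if the file has a single data row.
    data = np.loadtxt(path, skiprows=1, ndmin=2)
    return labels, data
```

With this, labels[0] and labels[1] can be passed to plt.xlabel and plt.ylabel, and data[:, 0], data[:, 1] to plt.plot, without the script knowing the column titles in advance.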

Upvotes: 4

ImportanceOfBeingErnest

Reputation: 339102

This is not really an answer to the actual question, but I feel you might be interested in knowing how to do the same with pandas instead of numpy.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data.csv", sep=r"\s+")  # split on whitespace; delim_whitespace is deprecated in recent pandas

df.set_index(df.columns[0]).plot()

plt.show()

would result in a line plot with the first column on the x-axis and each remaining column drawn as a labeled line.

There is no need to know any column name, and the plot is labeled automatically.

Of course, the data can also be plotted directly with matplotlib:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data.csv", sep=r"\s+")  # split on whitespace; delim_whitespace is deprecated in recent pandas
x = df[df.columns[0]]
a = df[df.columns[1]]
b = df[df.columns[2]]

plt.figure()
plt.plot(x, a)
plt.xlabel(df.columns[0])
plt.ylabel(df.columns[1])
plt.show()
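
If a plain ndarray is still wanted, as in the question, the labels and the numeric data can be pulled out of the DataFrame separately. The following is only a sketch: it uses an in-memory stand-in for data.csv with the sample values from the question, and the variable names are illustrative.

```python
import io
import pandas as pd

# Stand-in for data.csv so the sketch is self-contained
# (sample values taken from the question).
sample = io.StringIO("Col0   Col1  Col2\n2015   1     4\n2016   2     3\n")

df = pd.read_csv(sample, sep=r"\s+")   # split on any run of whitespace
labels = df.columns.tolist()           # header row as a list of strings
data = df.to_numpy(dtype=float)        # 2-D float ndarray, one row per data line
```

Here data[:, 0] plays the role of x in the code above, and labels[0] is its axis label.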

Upvotes: 1
