Reputation: 1129
I have the following data
admit_data = np.genfromtxt('/content/drive/My Drive/Colab/admission_predict.csv', delimiter=',')
What I need is to get a particular column header. I am using the following code to get the data, but I am not able to get the column names.
print(admit_data[1:].tolist())
Is there any function like .tolist() so that I can extract only that column's name?
Edit 1
Added sample data format
Upvotes: 0
Views: 1323
Reputation: 26896
First, you need to read the column names from the CSV with np.genfromtxt(), e.g. by specifying names=True. The column names then end up in the dtype as data.dtype.names, e.g.:
import io
import numpy as np
data = np.genfromtxt(
io.StringIO('A,B,C\n1,2,3\n4,5,6'),
dtype=None, names=True, delimiter=',', encoding='utf8')
print(data)
# [(1, 2, 3) (4, 5, 6)]
print(data.dtype.names)
# ('A', 'B', 'C')
However, please note that with data[1:] you are not selecting columns, but rows! To select columns, you have to use one of the names:
print(data[1:])
# [(4, 5, 6)]
print(data['A'])
# [1 4]
print(data[['A', 'B']])
# [(1, 2) (4, 5)]
More advanced indexing is actually a bit cumbersome:
print(data.shape)
# (2,)
print(data[1:][0][1])
# 5
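To tie this back to the original question, here is a minimal sketch of pulling out a single column's name by position and that column's values as a plain list. The inline CSV text is a hypothetical stand-in for the real file, and the variable names (csv_text, second_name, second_values) are made up for illustration.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file from the question.
csv_text = 'A,B,C\n1,2,3\n4,5,6'

# names=True makes genfromtxt read the first row as column names.
data = np.genfromtxt(
    io.StringIO(csv_text),
    dtype=None, names=True, delimiter=',', encoding='utf8')

# All column names as a list.
column_names = list(data.dtype.names)

# The name of one particular column, selected by position.
second_name = data.dtype.names[1]

# The values of that column as a plain Python list.
second_values = data[second_name].tolist()

print(column_names)   # ['A', 'B', 'C']
print(second_name)    # B
print(second_values)  # [2, 5]
```

Note that data.dtype.names is a tuple, so ordinary indexing and slicing work on it; there is no need for a separate .tolist()-like function for the names.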
On the other hand, Pandas offers a much more direct syntax, which is one of the main reasons it is the preferred tool for this use case:
import io
import pandas as pd
df = pd.read_csv(io.StringIO('A,B,C\n1,2,3\n4,5,6'))
print(df['A'])
# 0    1
# 1    4
# Name: A, dtype: int64
print(df['A'][0])
# 1
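For the column names themselves, a short sketch along the same lines: in Pandas the headers live in df.columns, which does have a .tolist() method. The inline CSV text and the variable names are again only illustrative assumptions.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file from the question.
df = pd.read_csv(io.StringIO('A,B,C\n1,2,3\n4,5,6'))

# All column names as a plain Python list.
all_names = df.columns.tolist()

# The name of one particular column, selected by position.
second_name = df.columns[1]

# The values of that column as a plain Python list.
second_values = df[second_name].tolist()

print(all_names)      # ['A', 'B', 'C']
print(second_name)    # B
print(second_values)  # [2, 5]
```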
Upvotes: 1
Reputation: 103
Could you please give more information about the data that you want to extract?
Based on your question: the tolist() function is available on Pandas Series. It would be better to convert admit_data to a Pandas Series (using the pd.Series() function). Then you can extract the first row as a list.
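A rough sketch of that suggestion, under some assumptions: the inline CSV text stands in for the real file, and the array is loaded with the same call shape as in the question (no names=True). With the default float dtype the header row cannot be parsed and comes back as nan, so the first data row is at index 1. pd.Series() expects one-dimensional data, so a single row is wrapped here rather than the whole array.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical stand-in for admission_predict.csv.
csv_text = 'A,B,C\n1,2,3\n4,5,6'

# Same call shape as in the question: default float dtype, no names=True.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=',')

# The header text cannot be converted to float, so row 0 is all nan.
header_row_is_nan = bool(np.isnan(raw[0]).all())

# Wrap one (1-D) row in a Series and convert it to a plain list.
first_data_row = pd.Series(raw[1]).tolist()

print(header_row_is_nan)  # True
print(first_data_row)     # [1.0, 2.0, 3.0]
```

This recovers the row values but not the header text, which is already lost at load time in this setup.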
Upvotes: 1