Reputation: 3382
I am new to Python. I want to grab a whole column from a *.csv file. To do so, I read that the best way is to convert the CSV to a 2-D array using:
> import numpy as np
> csv = np.genfromtxt("file_name.csv", delimiter=",")
and then, for example, to grab the 8th column I just write:
column8=csv[:,7]
My problem is that some fields are in double quotes and contain a comma, so I get a ValueError:
ValueError: Some errors were detected !
Line #6 (got 16 columns instead of 15)
Line #21 (got 16 columns instead of 15)
Line #45 (got 18 columns instead of 15)
etc.
So, for example, if this is my CSV:
a,b,c,d
f,g,h,"i,j"
k,l,m,"n,o,p"
and I want to grab the 4th column, I want the answer to be:
d
i,j
n,o,p
Any ideas?
Thank you!
Upvotes: 0
Views: 427
Reputation: 8011
Similar to taleinat's solution, but for when you know the header name and want to return the column as a list.
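import csv

# get_column is an illustrative name; the original snippet was the body of an unnamed function
def get_column(filename, required_header="name", delimiter=","):
    # newline="" lets the csv module handle line endings itself (Python 3)
    with open(filename, "r", newline="") as media:
        # Default quoting keeps commas inside double-quoted fields intact
        csv_file = csv.reader(media, delimiter=delimiter)
        headers = next(csv_file)  # the first row holds the column names
        position = headers.index(required_header)
        return [row[position] for row in csv_file]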
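If the file has a header row, `csv.DictReader` is another way to pick a column by name. Below is a minimal sketch, assuming the sample data from the question with `a,b,c,d` treated as the header row. `column_by_name` is a hypothetical helper name, and `io.StringIO` stands in for a real file opened with `open(..., newline="")`.

```python
import csv
import io

# Sample data from the question; the first row is treated as the header.
SAMPLE = 'a,b,c,d\nf,g,h,"i,j"\nk,l,m,"n,o,p"\n'


def column_by_name(text_file, header_name):
    """Return every value under header_name (hypothetical helper)."""
    # Default quoting keeps "i,j" together as a single field.
    reader = csv.DictReader(text_file)
    return [row[header_name] for row in reader]


values = column_by_name(io.StringIO(SAMPLE), "d")
print(values)  # ['i,j', 'n,o,p']
```

Unlike the index-based version, this does not include the header itself in the result, and a misspelled header name raises a `KeyError` on the first row.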
Upvotes: 0
Reputation: 706
pandas is very good for reading CSV files.
Try:
import pandas
df = pandas.read_csv("filename.csv", delimiter=",")  # pass header=None if the file has no header row
After this, to access a column:
df['colname'] # or df[col_ind] if you set header=None
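To see both access styles on the data from the question, here is a short sketch. It assumes pandas is installed and uses `io.StringIO` in place of a real file; the names `by_name` and `by_index` are only for illustration.

```python
import io

import pandas as pd

# Sample data from the question.
SAMPLE = 'a,b,c,d\nf,g,h,"i,j"\nk,l,m,"n,o,p"\n'

# With a header row: select the column by its name.
df = pd.read_csv(io.StringIO(SAMPLE))
by_name = df["d"].tolist()

# Without a header row: columns are labelled 0, 1, 2, ...
df_no_header = pd.read_csv(io.StringIO(SAMPLE), header=None)
by_index = df_no_header[3].tolist()

print(by_name)   # ['i,j', 'n,o,p']
print(by_index)  # ['d', 'i,j', 'n,o,p']
```

The double-quoted fields are parsed as single values because `read_csv` treats `"` as the quote character by default.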
Upvotes: 0
Reputation: 8701
Python's built-in csv module takes care of this nicely with the default settings. So this should just work:
import csv
with open("file_name.csv", "r", newline='') as f:
    reader = csv.reader(f)
    column8 = [row[7] for row in reader]
This is a slight variation on the first example in the module's documentation, which contains additional useful information.
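Applied to the small sample from the question, the same approach looks like the sketch below. It uses `io.StringIO` in place of a real file, and the `naive` line is added only to show why splitting on every comma gives the wrong number of columns.

```python
import csv
import io

# Sample data from the question.
SAMPLE = 'a,b,c,d\nf,g,h,"i,j"\nk,l,m,"n,o,p"\n'

# csv.reader respects double quotes, so "i,j" stays one field.
reader = csv.reader(io.StringIO(SAMPLE))
column4 = [row[3] for row in reader]
print(column4)  # ['d', 'i,j', 'n,o,p']

# Splitting on every comma breaks the quoted fields apart.
naive = [line.split(",")[3] for line in SAMPLE.splitlines()]
print(naive)  # ['d', '"i', '"n']
```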
Upvotes: 1
Reputation: 774
The pandas package will solve your problem. It provides a wide variety of functions for reading different file formats.
import pandas as pd
df = pd.read_csv("filename.csv")
print(df["column4"])  # replace "column4" with the actual header name of the column
Upvotes: 0