Reputation: 455
I am using read_csv in Pandas v0.18.1 to read in some data. I want to choose a subset of columns and rows from the csv, so I have tried:
df_a = pd.read_csv(filepath, index_col = False, usecols=cols_to_use, skiprows=1)
This raises ValueError: Usecols do not match names. Note that cols_to_use is a list of column names. If I leave out the skiprows part:
df_a = pd.read_csv(filepath, index_col = False, usecols=cols_to_use)
it works fine, and similarly if I leave out the usecols bit and put skiprows back in, that works fine too.
Could this be a bug (that you can't use usecols and skiprows at the same time)? I've tried looking in the documentation but couldn't find any mention of it. Or perhaps there is a logical reason that you can't use both?
(Also if there is a better/more obvious way of picking out a subset of columns and rows from a csv that would be appreciated too!)
Thanks in advance!
Upvotes: 3
Views: 3710
Reputation: 2414
If the first row of your csv file contains the column names, then skiprows=1 skips that header row. The next line is then read as the header, the names in usecols can no longer be matched against it, and you run into the error.
If you want to skip specific rows, you can provide the row numbers as a list, e.g. skiprows=[1]. The line numbers are 0-indexed, so the column names are on line 0 and the first data line is line 1.
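Here is a minimal sketch of the difference. The CSV contents and the column names a, b, c are made up for illustration, and the exact wording of the error message may differ between pandas versions:

```python
import io

import pandas as pd

# Hypothetical stand-in for the file: header on line 0, data on lines 1-3.
CSV_TEXT = "a,b,c\n1,2,3\n4,5,6\n7,8,9\n"
cols_to_use = ["a", "c"]


def read(skiprows):
    """Read the sample CSV with the same arguments as in the question."""
    return pd.read_csv(
        io.StringIO(CSV_TEXT),
        index_col=False,
        usecols=cols_to_use,
        skiprows=skiprows,
    )


# skiprows=1 drops line 0 (the header), so "1,2,3" is treated as the header
# and the names in usecols cannot be found.
try:
    read(1)
    header_skipped_error = None
except ValueError as exc:
    header_skipped_error = exc

# skiprows=[1] keeps the header and drops only line 1 (the first data row).
df_a = read([1])
print(df_a)
```

With the list form the header stays in place, so usecols can still be matched by name and only the first data row is dropped.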
Upvotes: 2