AF7TI

Reputation: 13

pandas read_csv create new column and usecols at the same time

I'm trying to load multiple csv files into a single dataframe df while:

All of this works fine until I attempt to exclude a column with usecols, which throws the error Too many columns specified: expected 5 and found 4.

Is it possible to create a new column and pass usecols at the same time?

The reason I'm creating and populating a new 'Station' column during read_csv is that my dataframe will contain data from multiple stations. I can work around the error by calling read_csv in one statement and dropping the QD column in the next with df.drop('QD', axis=1, inplace=True), but I want to make sure I understand how to do this in the most idiomatic pandas way.

Here's the code that throws the error:

df = pd.concat(pd.read_csv("http://lgdc.uml.edu/common/DIDBGetValues?ursiCode=" + row['StationCode'] + "&charName=MUFD&DMUF=3000",
                           skiprows=17,
                           delim_whitespace=True,
                           parse_dates=[0],
                           usecols=['Time','CS','MUFD','Station'],
                           names=['Time','CS','MUFD','QD','Station']
                ).fillna(row['StationCode']
                ).set_index(['Time', 'Station']) 
                for index, row in stationdf.iterrows())

Example StationCode from stationdf: BC840. Data sample: 2016-09-19T00:00:05.000Z 100 19.34 //

Upvotes: 1

Views: 1098

Answers (1)

maxymoo

Reputation: 36555

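usecols can only select columns that actually exist in the file, so 'Station' cannot be listed there or in names. Instead, create the new column after the read, using method chaining with assign: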

df = pd.read_csv(...).assign(StationCode=row['StationCode'])
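
As a rough sketch of how this could fit into the loop from the question, the example below reads each station with usecols limited to columns present in the file, then adds the station code with assign. It makes several assumptions: the remote URL is replaced by small in-memory samples so the snippet runs offline, the second station code XX000 and all rows beyond the one quoted in the question are made up, the helper load_station is hypothetical, and the new column is named Station (rather than StationCode) so that it matches the set_index call in the question.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the per-station downloads; only the first
# BC840 row comes from the question, the rest is invented for illustration.
samples = {
    "BC840": (
        "2016-09-19T00:00:05.000Z 100 19.34 //\n"
        "2016-09-19T00:15:05.000Z 90 18.72 //\n"
    ),
    "XX000": "2016-09-19T00:00:05.000Z 80 21.05 //\n",
}


def load_station(code, text):
    """Read one station's data, drop QD at parse time, tag rows with the code."""
    return (
        pd.read_csv(
            io.StringIO(text),
            sep=r"\s+",                       # whitespace-delimited fields
            names=["Time", "CS", "MUFD", "QD"],  # only columns present in the file
            usecols=["Time", "CS", "MUFD"],      # QD is never loaded
            parse_dates=["Time"],
        )
        .assign(Station=code)                 # new column added after the read
        .set_index(["Time", "Station"])
    )


df = pd.concat([load_station(code, text) for code, text in samples.items()])
print(df)
```

With the real data, io.StringIO(text) would be replaced by the URL and the skiprows argument from the question, and the loop would iterate over stationdf instead of the samples dict. This also removes the need for fillna, which in the original code was filling the empty fifth column.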

Upvotes: 2
