How can I select discrete columns from a data frame?

Question

My data table has the following 60 columns (the output of data.columns):

['WeatherHR0', 'WeatherHR1', 'WeatherHR2', 'WeatherHR3', 'WeatherHR4',
       'WeatherHR5', 'WeatherHR6', 'WeatherHR7', 'WeatherHR8', 'WeatherHR9',
       'WeatherHR10', 'WeatherHR11', 'WeatherHR12', 'WeatherHR13',
       'WeatherHR14', 'WeatherHR15', 'WeatherHR16', 'WeatherHR17',
       'WeatherHR18', 'WeatherHR19', 'WeatherHR20', 'WeatherHR21',
       'WeatherHR22', 'WeatherHR23', 'AvgDB', 'HDD0', 'HDD5', 'HDD10', 'HDD13',
       'HDD18', 'CDD15', 'CDD18', 'Peak Average', 'Day of Week', 'Holiday',
       'HR1', 'HR2', 'HR3', 'HR4', 'HR5', 'HR6', 'HR7', 'HR8', 'HR9', 'HR10',
       'HR11', 'HR12', 'HR13', 'HR14', 'HR15', 'HR16', 'HR17', 'HR18', 'HR19',
       'HR20', 'HR21', 'HR22', 'HR23', 'HR24', 'Max']

At times, I want to select data from multiple columns that are separated by unwanted columns (unwanted for the moment; they might be needed later).

I want to do something like data.loc[['WeatherHR0':'WeatherHR23' + 'Peak Average' + 'HR1':'HR24']].

For example, I may want to select the columns WeatherHR0 to WeatherHR23, Peak Average, and HR1 to HR24, while keeping the other columns in the original data frame.

I know I can drop the unselected columns or create a new data frame, but is there a Pythonic way to select discrete columns in pandas?

jfaccioni · Accepted Answer

You can build a list of column names (Pythonically) and then use it to select the columns:

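# The hourly columns run from HR1 to HR24, so start that range at 1;
# asking for a nonexistent 'HR0' would raise a KeyError.
columns_of_interest = ([f'WeatherHR{i}' for i in range(24)] +
                       ['Peak Average'] +
                       [f'HR{i}' for i in range(1, 25)])
data[columns_of_interest]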

Of course, this strategy works best if there are common patterns in the column names.
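
If the column names do not follow a pattern, one possible alternative is to select by position instead. The sketch below is not part of the original answer: it rebuilds a dummy frame with the 60 column names from the question (the values are made up), looks up the boundary positions with data.columns.get_loc, and joins the ranges with np.r_. It assumes the column labels are unique, so that get_loc returns a single integer.

```python
import numpy as np
import pandas as pd

# Dummy frame with the 60 column names from the question; values are placeholders.
columns = ([f'WeatherHR{i}' for i in range(24)] +
           ['AvgDB', 'HDD0', 'HDD5', 'HDD10', 'HDD13', 'HDD18', 'CDD15',
            'CDD18', 'Peak Average', 'Day of Week', 'Holiday'] +
           [f'HR{i}' for i in range(1, 25)] +
           ['Max'])
data = pd.DataFrame(np.arange(3 * len(columns)).reshape(3, len(columns)),
                    columns=columns)

# Option 1: translate labels to integer positions and join the ranges.
# np.r_ slices exclude the end point, hence the "+ 1".
pos = data.columns.get_loc
positions = np.r_[pos('WeatherHR0'):pos('WeatherHR23') + 1,
                  pos('Peak Average'),
                  pos('HR1'):pos('HR24') + 1]
selected = data.iloc[:, positions]

# Option 2: concatenate label slices. Label slices in .loc include both ends.
selected_by_label = pd.concat(
    [data.loc[:, 'WeatherHR0':'WeatherHR23'],
     data[['Peak Average']],
     data.loc[:, 'HR1':'HR24']],
    axis=1,
)
```

Both options return a new frame and leave data itself untouched, so the other columns remain available for later use.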
