Using pandas to read a CSV file with whatever columns match the column names given in a list

Question

I have a couple of CSV files with about 15 columns each. I am interested in only 5 of those columns, so I stored their names in a list.

mylist=['col1','col2','col3','col4','col5']

I read the CSV file into a pandas DataFrame df.

Now when I do df[mylist] it throws an error because col4 is not present in the CSV file.

My question is: how do I still read the files even if some of the columns in my list are not present in the CSV?

Example: if the CSV file doesn't have col4, the code should just extract whichever columns match the names in the list.

jezrael · Accepted Answer

You can use the intersection of the actual column names with the list:

import pandas as pd
df = pd.read_csv('file.csv')
# Keep only the columns from mylist that actually exist in the file
df1 = df[df.columns.intersection(mylist)]
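
If you would rather not load the unwanted columns at all, another option is to pass a callable to the usecols parameter of read_csv, which is called with each column name and keeps the ones for which it returns True. The snippet below is only a sketch: the sample data in csv_text is made up and stands in for a file that lacks col4, and the names wanted, df_subset, df_full and df_matched are just for illustration. It also shows a plain list comprehension as an alternative when you want the result in the order of mylist.

```python
import io

import pandas as pd

mylist = ['col1', 'col2', 'col3', 'col4', 'col5']

# Made-up sample data standing in for a CSV file that has no col4
csv_text = "col1,col2,col3,col5,other\n1,2,3,5,9\n4,5,6,8,9\n"

# Option 1: filter while reading. usecols accepts a callable that
# receives each column name and returns True for columns to keep.
wanted = set(mylist)
df_subset = pd.read_csv(io.StringIO(csv_text),
                        usecols=lambda name: name in wanted)

# Option 2: read everything, then keep the matching columns
# in the order they appear in mylist.
df_full = pd.read_csv(io.StringIO(csv_text))
df_matched = df_full[[c for c in mylist if c in df_full.columns]]

print(df_subset.columns.tolist())
print(df_matched.columns.tolist())
```

With a real file you would pass the file path instead of io.StringIO(csv_text). Option 1 avoids parsing columns you do not need, which can matter for wide files.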
