Preserving Cases while Column Matching from List in Pandas

Question

I have a dataframe with strings with varying cases for column names as well as a list of lowercase strings.

a = [5, 5, 4, 6]
b = [4, 4, 4, 4]
c = [6, 8, 2, 3]
d = [8, 6, 4, 3]

df = pd.DataFrame({'a': a,
                   'B_B': b,
                   'CC_Cc': c,
                   'Dd_DdDd': d})

cols = ['b_b', 'cc_cc', 'dd_dddd']

I want to select the columns in df that match the strings in cols while preserving the cases of the columns in df. I've been able to match the column names by making them all lowercase, but I'm not sure how save the original cases of the dataframe columns.

In this case I would want to create a new dataframe with only the columns in df from keep cols, but with their original cases. How would I go about doing this?

Desired output:

   B_B  CC_Cc  Dd_DdDd  
0    4      6        8  
1    4      8        6  
2    4      2        4  
3    4      3        3

akuiper · Accepted Answer

You can use str.lower() to convert the column names to lower case, then construct a logical series with isin method to select the columns; the column names will not be altered in this way:

df.loc[:, df.columns.str.lower().isin(cols)]

An alternative method would be to use filter function, in regex specify a modifier (?i) to ignore case:

df.filter(regex="(?i)" + "|".join(cols))

Notice this regex method also matches column names that contain the pattern in cols list, if you want a exact match ignoring cases, you can add word boundaries in:

df.filter(regex="(?i)\b"+"\b|\b".join(cols)+"\b")

Preserving Cases while Column Matching from List in Pandas

Answers (1)

Related Questions