Reputation: 979
I have a dataframe with strings with varying cases for column names as well as a list of lowercase strings.
a = [5, 5, 4, 6]
b = [4, 4, 4, 4]
c = [6, 8, 2, 3]
d = [8, 6, 4, 3]
df = pd.DataFrame({'a': a,
'B_B': b,
'CC_Cc': c,
'Dd_DdDd': d})
cols = ['b_b', 'cc_cc', 'dd_dddd']
I want to select the columns in df that match the strings in cols while preserving the cases of the columns in df. I've been able to match the column names by making them all lowercase, but I'm not sure how save the original cases of the dataframe columns.
In this case I would want to create a new dataframe with only the columns in df from keep cols, but with their original cases. How would I go about doing this?
Desired output:
B_B CC_Cc Dd_DdDd
0 4 6 8
1 4 8 6
2 4 2 4
3 4 3 3
Upvotes: 1
Views: 250
Reputation: 214957
You can use str.lower()
to convert the column names to lower case, then construct a logical series with isin
method to select the columns; the column names will not be altered in this way:
df.loc[:, df.columns.str.lower().isin(cols)]
An alternative method would be to use filter
function, in regex specify a modifier (?i)
to ignore case:
df.filter(regex="(?i)" + "|".join(cols))
Notice this regex method also matches column names that contain the pattern in cols
list, if you want a exact match ignoring cases, you can add word boundaries in:
df.filter(regex="(?i)\\b"+"\\b|\\b".join(cols)+"\\b")
Upvotes: 2