Reputation: 27
I have a list of substrings and I want to check what columns in my data frame contains any of those substrings so I can change their data types.
For instance, if "Amount" is a substring in the list, then the columns with the names Cred_Amount, Amount_123, Amount should have their data types changed to int.
lst = ["Amount", "ID"]
df = pd.DataFrame({'ID_Loan': [2199, 3994, 8499, 74889, 83479],'Loan_Amount': ['100', '122', '222', '350', '300'],'Team': ['Boston', 'Boston', 'Boston', 'Chele', 'Barse'],'CreditAmount': [89, 79, 113, 78, 84],'College': ['MIT', 'MIT', 'MIT', 'Stanford', 'Stanford'],'ID_Num': [99999, 99994, 89999, 78889, 87779]})
df
ID_Loan Loan_Amount Team CreditAmount College ID_Num
0 2199 100 Boston 89 MIT 99999
1 3994 122 Boston 79 MIT 99994
2 8499 222 Boston 113 MIT 89999
3 74889 350 Chele 78 Stanford 78889
4 83479 300 Barse 84 Stanford 87779
I expected that the 1st, 2nd, 4th, and 6th columns be changed to int data type
Upvotes: 0
Views: 184
Reputation: 402663
Use filter
to select and convert using astype
:
u = df.filter(regex='|'.join(lst))
df[u.columns] = u.astype(int)
df.dtypes
ID_Loan int64
Loan_Amount int64
Team object
CreditAmount int64
College object
ID_Num int64
dtype: object
P.S, I've renamed your column list to lst
because using list
shadows the builtin which is bad practice.
Another option is using str.contains
to get a mask and slice with loc
:
m = df.columns.str.contains('|'.join(lst))
df.loc[:, m] = df.loc[:,m].astype(int)
df.dtypes
ID_Loan int64
Loan_Amount int64
Team object
CreditAmount int64
College object
ID_Num int64
dtype: object
Finally, you can also use astype
with a dictionary to selectively convert some columns, but return the whole DataFrame.
df = df.astype(
dict.fromkeys(df.columns[df.columns.str.contains('|'.join(lst))], 'int'))
df.dtypes
ID_Loan int64
Loan_Amount int64
Team object
CreditAmount int64
College object
ID_Num int64
dtype: object
Upvotes: 1