Reputation: 45
I have a dataframe df_alltrades with a number of similar columns where the string includes integers:
instrument Bid0Mkt Bid1Mkt Bid2Mkt Bid3Mkt Bid4Mkt ...
0 EURUSD 1.1 1.2 1.2 1.3 1.3
1 NZDUSD 0.6 0.65 0.7 0.9 0.92
. . . . . .
. . . . . .
I also have corresponding columns for OfferXMkt
and a number of other columns formatted as str.
I want to remove any columns where X
in BidXMkt
and OfferXMkt
is greater than 0 so that I end up with something like:
instrument Bid0Mkt Offer0Mkt ...
0 EURUSD 1.1
1 NZDUSD 0.6
. . .
. . .
I can do this by writing out each if statement
for column in df_alltrades.columns:
if '1' in column:
df_alltrades.drop(columns=column, inplace=True)
if '2' in column:
df_alltrades.drop(columns=column, inplace=True)
if '3' in column:
df_alltrades.drop(columns=column, inplace=True)
if '4' in column:
df_alltrades.drop(columns=column, inplace=True)
but I was hoping to clean it up with a loop, something like:
for column in df_alltrades.columns:
for C in range(1, 5):
if C in column:
df_alltrades.drop(columns=column, inplace=True)
but I'm getting the error:
TypeError: 'in <string>' requires string as left operand, not int
I have also tried things like;
for X in range(1, 5):
BidMkt = 'Bid{}Mkt'.format(X)
df_new = df_alltrades.drop([BidMkt], axis = 1)
which even though print(BidMkt)
gives the full list, only Bid4Mkt
is dropped.
Thanks for any help!
Upvotes: 2
Views: 1168
Reputation: 14103
Create a remove list using list comprehension and drop those columns
d = {'instrument': {0: 'EURUSD', 1: 'NZDUSD'},
'Bid0Mkt': {0: 1.1, 1: 0.6},
'Bid1Mkt': {0: 1.2, 1: 0.65},
'Bid2Mkt': {0: 1.2, 1: 0.7},
'Bid3Mkt': {0: 1.3, 1: 0.9},
'Bid4Mkt': {0: 1.3, 1: 0.92},
'Another': {0: 1, 1: 2}}
df = pd.DataFrame(d)
remove = [col for col in df.columns if col.startswith('Bid') and int(col[3]) > 0]
new_df = df.drop(columns=remove)
instrument Bid0Mkt Another
0 EURUSD 1.1 1
1 NZDUSD 0.6 2
Upvotes: 1
Reputation: 147
Check what columns don't have 0 in it:
for column in df_alltrades.columns:
if 'Bid' in column and '0' not in column:
df_alltrades.drop(columns=column, inplace=True)
Upvotes: 1
Reputation: 1546
The error is telling you where the problem is C needs to be a string not an integer:
for column in df_alltrades.columns:
for C in range(1, 5):
if str(C) in column:
df_alltrades.drop(columns=column, inplace=True)
but if I were you I would first do this to get the headers as a list
headers = list(df.columns.values)
Filter the necessary columns from this list and then use it to slice the Dataframe like this
df = df[headers]
Upvotes: 1