Reputation: 632
I have a dataframe like below:
df = pd.DataFrame({"location": ["north", "south", "south", "north"], "store" : ["a", "b", "c", "d"], "12345" : [2,4,3,6], "23434": [4,3,7,0], "23454": [8,9,0,1], "65432": [4,6,5,7], "34254": [0,0,9,1]})
I would like to subset it by below variable:
var = pd.Series([12345, 65432, 34254])
the final output should look like below:
my original dataframe and variable are big and I cannot use column names or use .loc and .iloc for sub setting, so any help would be appreciated.
Upvotes: 1
Views: 448
Reputation: 863751
Convert values to strings, because columns names are strings by Index.astype
and then to list and select by subset
:
df = df[['location', 'store'] + var.astype(str).tolist()]
print (df)
location store 12345 65432 34254
0 north a 2 4 0
1 south b 4 6 0
2 south c 3 5 9
3 north d 6 7 1
If possible some values in var
not match add Index.intersection
:
var = pd.Series([12345, 65432, 34254, 1000])
df = df[['location', 'store'] + df.columns.intersection(var.astype(str), sort=False).tolist()]
print (df)
location store 12345 65432 34254
0 north a 2 4 0
1 south b 4 6 0
2 south c 3 5 9
3 north d 6 7 1
Another idea is create MultiIndex
:
df = df.set_index(['location','store'])
df = df[var.astype(str).tolist()]
#if possible some values not match
#df = df[df.columns.intersection(var.astype(str), sort=False).tolist()]
print (df)
12345 65432 34254
location store
north a 2 4 0
south b 4 6 0
c 3 5 9
north d 6 7 1
Upvotes: 2