Reputation: 71
So I have a dataframe of categories of venues in each neighbourhood. It looks like:
The values in each row represent the no. of each venue in the specific neighbourhood.
I want to find out the total number of restaurants in each neighbourhood. To do so, I know I have to sum up the values in a row where the column contains the string "Restaurant".
I've tried using str.contains
function but that sums up True cases - how many times a column containing the string restaurant
has a value >0
in that row. But instead, what I'd like is, to sum up, the total no. of restaurants in the neighbourhood instead.
Upvotes: 3
Views: 4629
Reputation: 20669
You can use pd.Index.str.contains
with df.loc
here.
df['sum_rest'] = df.loc[:,df.columns.str.contains('Restaurant')].sum(axis=1)
Upvotes: 5
Reputation: 1572
Define a list of columns containing "Restaurant" :
lr = ["Afgan Restaurant", "American Restaurant", "Argentinian Restaurant"]
Then parse the result and put it in a column :
df["sum_restaurant"] = df.loc[:, columns=lr].apply(lambda row : np.sum(row.to_numpy()))
Upvotes: 0
Reputation: 12503
Here's a way to do that:
df = pd.DataFrame({"restaurant_a": [1,2,3], "shop": [2,3,4], "restaurant_b": [4,5,6]})
df["sum_rest"] = df[[x for x in df.columns if "restaurant" in x]].sum(axis = "columns")
df
The result is:
restaurant_a shop restaurant_b sum_rest
0 1 2 4 5
1 2 3 5 7
2 3 4 6 9
Upvotes: 1