lyh198
lyh198

Reputation: 71

How to sum up all values in a row where the column contains a specific string?

So I have a dataframe of categories of venues in each neighbourhood. It looks like:

enter image description here

The values in each row represent the no. of each venue in the specific neighbourhood.

I want to find out the total number of restaurants in each neighbourhood. To do so, I know I have to sum up the values in a row where the column contains the string "Restaurant".

I've tried using str.contains function but that sums up True cases - how many times a column containing the string restaurant has a value >0 in that row. But instead, what I'd like is, to sum up, the total no. of restaurants in the neighbourhood instead.

Upvotes: 3

Views: 4629

Answers (3)

Ch3steR
Ch3steR

Reputation: 20669

You can use pd.Index.str.contains with df.loc here.

df['sum_rest'] = df.loc[:,df.columns.str.contains('Restaurant')].sum(axis=1)

Upvotes: 5

Catalina Chircu
Catalina Chircu

Reputation: 1572

Define a list of columns containing "Restaurant" :

lr = ["Afgan Restaurant", "American Restaurant", "Argentinian Restaurant"]

Then parse the result and put it in a column :

df["sum_restaurant"] = df.loc[:, columns=lr].apply(lambda row : np.sum(row.to_numpy()))

Upvotes: 0

Roy2012
Roy2012

Reputation: 12503

Here's a way to do that:

df = pd.DataFrame({"restaurant_a": [1,2,3], "shop": [2,3,4], "restaurant_b": [4,5,6]})
df["sum_rest"] = df[[x for x in df.columns if "restaurant" in x]].sum(axis = "columns")
df

The result is:

   restaurant_a  shop  restaurant_b  sum_rest
0             1     2             4         5
1             2     3             5         7
2             3     4             6         9

Upvotes: 1

Related Questions