Sher

Reputation: 415

Grouping rows based on the True value Pandas

I have a point GeoDataFrame with True and False values. It has 48 True/False columns, and each row is True in exactly one column and False in all the others. I would like to group the points by the column in which they are True, as in the example below with columns foo and bar. Points p1 and p4 are True in the bar column, so they group together; points p2 and p3 are True in the foo column, so they group together. The point geometry should be preserved, as it may be used in later analysis.

from shapely.geometry import Point, Polygon
import geopandas

polygons = geopandas.GeoSeries({
    'foo': Polygon([(6, 6), (7, 16), (13, 14), (13, 9)]),
    'bar': Polygon([(15, 18), (14, 15), (17, 10), (21, 9)]),
})

points = [Point(16, 13), Point(7, 8), Point(10, 12), Point(16,14)]
pnts = geopandas.GeoDataFrame(geometry=points, index=['p1', 'p2', 'p3', 'p4'])
pnts = pnts.assign(**{key: pnts.within(geom) for key, geom in polygons.items()})

Output:

       geometry    foo    bar
p1  POINT (16 13)  False   True
p2    POINT (7 8)   True  False
p3  POINT (10 12)   True  False
p4  POINT (16 14)  False   True

Expected output should be:

     geometry
foo  POINT (7 8)
     POINT (10 12)
bar  POINT (16 13)
     POINT (16 14)

Can anyone help with this?

Upvotes: 1

Views: 541

Answers (2)

MNA

Reputation: 323

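You can do it this way. Here df is your GeoDataFrame (pnts in the question). Put the names of all the True/False columns in a list called columns, then collect the rows that are True for each column.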

import pandas as pd

columns = df.columns.tolist()  # get all the column names in a list
columns.remove("geometry")     # keep only the True/False columns (48 in your case)

subsets = []  # one subset per True/False column
for col in columns:
    # rows where this column is True, keeping only the geometry
    df_subset = df.loc[df[col], ["geometry"]].copy()
    df_subset["new_col"] = col  # record which column the rows matched
    subsets.append(df_subset)

df1 = pd.concat(subsets)        # stack the subsets into one frame
df1 = df1.set_index("new_col")  # use the column name as the index

The result will look something like this.

              geometry
new_col
foo        POINT (7 8)
foo      POINT (10 12)
bar      POINT (16 13)
bar      POINT (16 14)
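
Since each row is True in exactly one column, the loop can probably be replaced by a single idxmax call across the True/False columns. The following is only a sketch under a few assumptions: the geometry column holds WKT strings as a stand-in for real shapely points so that it runs with pandas alone, and the names flag_cols, group and result are made up for illustration. The same calls should carry over to a GeoDataFrame, but that is not tested here.

```python
import pandas as pd

# Stand-in for the GeoDataFrame in the question: WKT strings replace real
# shapely geometries so the sketch runs with pandas alone (an assumption).
df = pd.DataFrame(
    {
        "geometry": ["POINT (16 13)", "POINT (7 8)", "POINT (10 12)", "POINT (16 14)"],
        "foo": [False, True, True, False],
        "bar": [True, False, False, True],
    },
    index=["p1", "p2", "p3", "p4"],
)

# All columns except geometry are the True/False flags.
flag_cols = df.columns.drop("geometry")

# With exactly one True per row, idxmax(axis=1) returns the name of that column.
group = df[flag_cols].idxmax(axis=1)

# Attach the label, use it as the index, and sort so equal labels sit together.
# mergesort is stable, so points keep their original order within a group.
result = (
    df[["geometry"]]
    .assign(group=group)
    .set_index("group")
    .sort_index(kind="mergesort")
)
print(result)
```

Note that the groups come out in alphabetical order (bar before foo) because of the sort; drop sort_index if the original row order matters more than having the groups contiguous.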

Upvotes: 1

Ub2r

Reputation: 31

You might want to use DataFrame.where:

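import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.choice([True, False], 10).reshape(-1, 2), columns=['A', 'B'])

# where() replaces rows that fail the condition with NaN; dropna() then removes them
A_result = df.where(df["A"] == True).dropna()
B_result = df.where(df["B"] == True).dropna()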
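
A possible variation, given only as a sketch: plain boolean indexing selects the same rows without passing through NaN, so the original dtypes are kept, and a dict comprehension covers every True/False column at once. The data below mirrors the question's example, with WKT strings standing in for real geometries (an assumption so that it runs with pandas alone); flag_cols and groups are illustrative names.

```python
import pandas as pd

# Stand-in for the question's GeoDataFrame; WKT strings replace shapely points.
df = pd.DataFrame(
    {
        "geometry": ["POINT (16 13)", "POINT (7 8)", "POINT (10 12)", "POINT (16 14)"],
        "foo": [False, True, True, False],
        "bar": [True, False, False, True],
    },
    index=["p1", "p2", "p3", "p4"],
)

# Every column other than geometry is treated as a True/False flag.
flag_cols = [col for col in df.columns if col != "geometry"]

# One subset per flag column: rows where the flag is True, geometry only.
groups = {col: df.loc[df[col], ["geometry"]] for col in flag_cols}

for name, subset in groups.items():
    print(name)
    print(subset)
```

Each entry of groups can then be used separately, for example groups["foo"] for the points that fall inside the foo polygon.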

Upvotes: 0
