user1718097

Reputation: 4292

Using GeoPandas to plot groups of points on a map produces a blank image

I have recently started using GeoPandas to produce maps and find it extremely useful. I have used Pandas for a while and I have found the move to GeoPandas to be relatively painless. However, I'm having problems plotting points on a map after grouping the points using the .dissolve() method.

Basically, I have a selection of UK postcode data with associated longitude and latitude values downloaded from the Office for National Statistics Postcode Directory (ONSPD). The longitude and latitude values are based on CRS WGS84. I can convert to CRS OSGB36 and plot all the points on a map with no problem. However, if I group the points based on other variables (e.g. 'group1' and 'group2') using the .dissolve() method, I can no longer plot the points.

Here is the code I have so far to plot all the points together:

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import shapely

# Define a Pandas dataframe containing postcodes, their coordinates and group labels
postcodeDF = pd.DataFrame({'pcd': ['RM175AG', 'NP181PH', 'LS8 1EN', 'HG1 1XQ', 'G11 6YB', 'TN218AB', 'GU138AL', 'CV344BD', 'YO126PH', 'SO172WT', 'PR2 8HN', 'TF1 2HD', 'M31 4FR', 'CH460UB', 'EX111LN', 'TS214DX', 'BN4 2LS', 'FY8 1XL', 'KA256BP', 'DA1 1QR'],
                           'ctry': ['E92000001', 'W92000004', 'E92000001', 'E92000001', 'S92000003', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'E92000001', 'S92000003', 'E92000001'],
                           'long': [0.320423, -2.968257, -1.51314, -1.522386, -4.309171, 0.255878, -0.8502959999999999, -1.588886, -0.41805299999999995, -1.382926, -2.699804, -2.531778, -2.425061, -3.110276, -3.317868, -1.429442, -0.22178699999999998, -3.019257, -4.686457, 0.224912],
                           'lat': [51.491329, 51.628333000000005, 53.840114, 54.002427000000004, 55.870383, 50.966097999999995, 51.275081, 52.280791, 54.296222, 50.921857, 53.780733999999995, 52.692078, 53.415147, 53.394830000000006, 50.758146, 54.680499, 50.84042, 53.753078, 55.749168999999995, 51.441911],
                           'group1':['A']*10 + ['B']*10,
                           'group2':[True,False]*10})

# Set up geodataframe, initially with CRS = WGS84 (since that matches the long and lat co-ordinates)
crs = {'init':'epsg:4326'}
geometry = [shapely.geometry.Point(xy) for xy in zip(postcodeDF['long'], postcodeDF['lat'])]

postcodeGDF = gpd.GeoDataFrame(postcodeDF,
                               crs = crs,
                               geometry = geometry)

# Convert geometry to OSGB36
postcodeGDF = postcodeGDF.to_crs(epsg = 27700)

print(postcodeGDF)

The geodataframe contains the following information:

         ctry group1  group2        lat      long      pcd  \
0   E92000001      A    True  51.491329  0.320423  RM175AG   
1   W92000004      A   False  51.628333 -2.968257  NP181PH   
2   E92000001      A    True  53.840114 -1.513140  LS8 1EN   
3   E92000001      A   False  54.002427 -1.522386  HG1 1XQ   
4   S92000003      A    True  55.870383 -4.309171  G11 6YB   
5   E92000001      A   False  50.966098  0.255878  TN218AB   
6   E92000001      A    True  51.275081 -0.850296  GU138AL   
7   E92000001      A   False  52.280791 -1.588886  CV344BD   
8   E92000001      A    True  54.296222 -0.418053  YO126PH   
9   E92000001      A   False  50.921857 -1.382926  SO172WT   
10  E92000001      B    True  53.780734 -2.699804  PR2 8HN   
11  E92000001      B   False  52.692078 -2.531778  TF1 2HD   
12  E92000001      B    True  53.415147 -2.425061  M31 4FR   
13  E92000001      B   False  53.394830 -3.110276  CH460UB   
14  E92000001      B    True  50.758146 -3.317868  EX111LN   
15  E92000001      B   False  54.680499 -1.429442  TS214DX   
16  E92000001      B    True  50.840420 -0.221787  BN4 2LS   
17  E92000001      B   False  53.753078 -3.019257  FY8 1XL   
18  S92000003      B    True  55.749169 -4.686457  KA256BP   
19  E92000001      B   False  51.441911  0.224912  DA1 1QR   

                                       geometry  
0   POINT (561188.9840165515 179484.0452796911)  
1   POINT (333075.0000681121 192612.9537310874)  
2    POINT (432134.031689987 438316.9950631865)  
3    POINT (431404.026064762 456371.9915770486)  
4   POINT (255609.0244790429 666546.0027781442)  
5   POINT (558502.0104336547 120942.0242662177)  
6   POINT (480294.0287511199 153509.0281225364)  
7   POINT (428143.9880141384 264818.0084207859)  
8   POINT (503054.9928648224 490110.0245871434)  
9    POINT (443470.0222579727 113781.050039533)  
10  POINT (353983.9862037547 431827.9554603411)  
11   POINT (364154.9903684837 310620.967712217)  
12  POINT (371845.0195746602 391010.9943895991)  
13   POINT (326267.022012488 389240.9700794323)  
14   POINT (307141.054758316 96222.92213930591)  
15  POINT (436886.0245473493 531865.0198253235)  
16  POINT (525299.9996502266 106049.9600256799)  
17  POINT (332890.0335889475 429005.9677596154)  
18  POINT (231483.9972917534 653913.0284422053)  
19  POINT (554726.0039838878 173783.0302315076)  

This can be used to plot the map:

# Plot map
fig, ax = plt.subplots(1,
                       figsize = (4,5),
                       dpi = 72,
                       facecolor = 'lightblue')

ax.set_position([0,0,1,1])   # Puts axis to edge of figure
ax.set_axis_off()            # Turns axis off so facecolour applies to axis area as well as bit around the outside
ax.get_xaxis().set_visible(False)   # Turns the x axis off so that 'invisible' axis labels don't take up space
ax.get_yaxis().set_visible(False)

lims = plt.axis('equal')

# N.B. Code to plot shapefile has been deleted for clarity

postcodeGDF.plot(ax=ax)

plt.show()

The map (including the shapefile outline) looks like:

Map showing postcode points

However, I would like to group the postcodes based on other variables in the geodataframe (in this case variables 'group1' and 'group2') (and ultimately plot different colours and markers for each group – although I haven't got that far). I have grouped the points using the .dissolve() method.

postcodesGroupby = postcodeGDF.dissolve(by = ['group1','group2'])
print(postcodesGroupby)

The dissolved geodataframe looks like:

                                                        geometry       ctry
group1 group2                                                                 
A      False   (POINT (333075.0000681121 192612.9537310874), ...  W92000004   
       True    (POINT (255609.0244790429 666546.0027781442), ...  E92000001   
B      False   (POINT (326267.022012488 389240.9700794323), P...  E92000001   
       True    (POINT (231483.9972917534 653913.0284422053), ...  E92000001   

                     lat      long      pcd  
group1 group2                                
A      False   51.628333 -2.968257  NP181PH  
       True    51.491329  0.320423  RM175AG  
B      False   52.692078 -2.531778  TF1 2HD  
       True    53.780734 -2.699804  PR2 8HN  

However, when I try to plot the points using:

postcodesGroupby.plot(ax=ax)

...no points appear on the map.

I suspect I'm missing something obvious but I have stared at the code for a while and can no longer see the wood for the trees. Any suggestions how I can fix this issue would be gratefully received.

Upvotes: 3

Views: 6724

Answers (1)

joris

Reputation: 139222

The problem is that geopandas does not yet support plotting MultiPoints (and the dissolve method grouped the Points into MultiPoints). The fact that you get a blank image instead of a good error message is somewhat unfortunate.
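
To see why the geometry type changes, here is a minimal sketch using shapely alone. It assumes that dissolve merges each group's geometries with a union along the lines of `unary_union`; the two coordinates are rounded values taken from the question's output and are only for illustration.

```python
from shapely.geometry import Point
from shapely.ops import unary_union

# Two of the projected postcode points from the question (OSGB36 metres, rounded).
points = [Point(333075.0, 192612.95), Point(558502.01, 120942.02)]

# Assumption: dissolve merges each group's geometries with a union like this one.
merged = unary_union(points)

print(points[0].geom_type)  # Point
print(merged.geom_type)     # MultiPoint
print(len(merged.geoms))    # 2
```

On the dissolved geodataframe itself, printing `postcodesGroupby.geometry.geom_type` should show the same thing, one MultiPoint per group.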

But, there has just been a PR merged to add support for plotting MultiPoints: https://github.com/geopandas/geopandas/pull/683. So this will work in the next geopandas release.

A workaround for now is to plot the individual points and, to get the appropriate group colors, add a column reflecting those groups:

# add a new column with an integer indicating the group number
postcodeGDF['group'] = postcodeGDF.groupby(['group1','group2']).ngroup()
postcodeGDF.plot(column='group', categorical=True, legend=True)

This gives a plot of the individual points colored by group, with a legend.
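
Since the legend only shows the integer group numbers, it can help to check which number stands for which combination of 'group1' and 'group2'. A small pandas-only sketch, using the grouping columns from the question without the geometry (the names `df` and `legend_key` are just for illustration):

```python
import pandas as pd

# Same grouping columns as in the question, without the geometry.
df = pd.DataFrame({
    "group1": ["A"] * 10 + ["B"] * 10,
    "group2": [True, False] * 10,
})

# ngroup() numbers the groups in sorted key order, starting at 0.
df["group"] = df.groupby(["group1", "group2"]).ngroup()

# One row per group, showing which integer stands for which combination.
legend_key = df.drop_duplicates().sort_values("group").reset_index(drop=True)
print(legend_key)
```

With this data, 0 is (A, False), 1 is (A, True), 2 is (B, False) and 3 is (B, True).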

Upvotes: 5
