user2033412

Reputation: 2119

Compressed representation of polygons

I have a lot (millions) of polygons from OpenStreetMap data, most of which (more than 99%) have exactly four coordinates and represent houses.


I currently store the four coordinates of each house explicitly as tuples of floats (latitude and longitude), which takes 32 bytes of memory.

Is there a way to store this information in compressed form (fewer than 32 bytes), given that the four coordinates differ only slightly, in the last few decimals?

Upvotes: 0

Views: 263

Answers (3)

user1196549

Reputation:

You are giving no information on the resolution you want to keep.

Assuming 1 m accuracy is enough, 24 bits per coordinate can cover up to about 16,000 km (2^24 m). Then 8 bits per dimension should also be enough to represent the size information (up to 256 m).

This would make 8 bytes per house: two 24-bit coordinates plus two 8-bit sizes.
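
A minimal sketch of that 8-byte layout, assuming the position and size have already been converted to whole metres in some local grid, and assuming "size" means an axis-aligned width and height (the answer does not say how rotation would be handled). The names `pack_house` and `unpack_house` are made up for illustration.

```python
def pack_house(x_m, y_m, width_m, height_m):
    """Pack two 24-bit coordinates and two 8-bit sizes into 8 bytes."""
    if not (0 <= x_m < 1 << 24 and 0 <= y_m < 1 << 24):
        raise ValueError("coordinate does not fit in 24 bits")
    if not (0 <= width_m < 1 << 8 and 0 <= height_m < 1 << 8):
        raise ValueError("size does not fit in 8 bits")
    # Layout (assumed): x in the top 24 bits, then y, then width, then height.
    value = (x_m << 40) | (y_m << 16) | (width_m << 8) | height_m
    return value.to_bytes(8, "big")


def unpack_house(data):
    """Inverse of pack_house: returns (x_m, y_m, width_m, height_m)."""
    value = int.from_bytes(data, "big")
    return (
        (value >> 40) & 0xFFFFFF,
        (value >> 16) & 0xFFFFFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )
```

Packing into a single 64-bit integer keeps the record at exactly 8 bytes, and the range checks make it obvious when a house falls outside what 24 and 8 bits can hold.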

More aggressive compression for instance with Huffman coding will probably not work on the locations (relatively uniform distribution); a little better on the sizes, but the benefit is marginal.

Upvotes: 0

Ripi2

Reputation: 7198

As @MBo said, you can store one corner of each house in full and store the other three corners relative to the first corner.

Also, if the buildings are that similar, you can set up a "dictionary" of buildings. For each building you then store its index in the dictionary plus a few features, such as its first corner coordinates and its rotation.
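
One way to sketch the relative-corner idea, under assumptions that are mine rather than the answer's: coordinates are converted to fixed-point integers in units of 1e-7 degree, the first corner is kept as two 32-bit integers, and the remaining three corners are kept as 16-bit offsets. That gives 20 bytes for a four-corner polygon. `SCALE`, `encode_polygon` and `decode_polygon` are hypothetical names.

```python
import struct

SCALE = 10_000_000  # assumed fixed-point scale: 1e-7 degree per unit


def encode_polygon(corners):
    """Encode four (lat, lon) corners as one full corner plus three int16 offsets."""
    fixed = [(round(lat * SCALE), round(lon * SCALE)) for lat, lon in corners]
    lat0, lon0 = fixed[0]
    deltas = []
    for lat, lon in fixed[1:]:
        deltas += [lat - lat0, lon - lon0]
    # struct.pack raises struct.error if an offset does not fit in 16 bits.
    return struct.pack("<2i6h", lat0, lon0, *deltas)


def decode_polygon(data):
    """Inverse of encode_polygon: returns four (lat, lon) tuples in degrees."""
    lat0, lon0, *deltas = struct.unpack("<2i6h", data)
    fixed = [(lat0, lon0)]
    for i in range(0, len(deltas), 2):
        fixed.append((lat0 + deltas[i], lon0 + deltas[i + 1]))
    return [(lat / SCALE, lon / SCALE) for lat, lon in fixed]
```

With this scale a 16-bit offset spans roughly ±0.0033 degrees, which is a few hundred metres, so ordinary houses fit; larger polygons would need a fallback to the uncompressed form.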

Upvotes: 1

MBo

Reputation: 80197

If your map patch is not too large, you can store coordinates relative to some base point (for example, the bottom left corner). Compute these differences and normalize them by the map size like this:

   uint16_diff  = (uint16) (65535 * (lat - latbottom) / (lattop - latbottom))

This way each coordinate can be stored as a 16-bit integer value.
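
The normalization above could look like this in Python. This is only a sketch: `quantize` and `dequantize` are illustrative names, and the rounding mode and range check are my assumptions.

```python
def quantize(value, low, high):
    """Map value in [low, high] onto an integer in 0..65535."""
    if not low <= value <= high:
        raise ValueError("value lies outside the map patch")
    return round(65535 * (value - low) / (high - low))


def dequantize(q, low, high):
    """Approximate inverse of quantize; the error is at most half a step."""
    return low + q * (high - low) / 65535
```

The step size is `(high - low) / 65535`, so the precision you keep depends directly on how large the patch is: a patch of 0.1 degrees of latitude (about 11 km) gives steps of roughly 0.17 m.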

For rectangles (you can store them in a separate list) there is a way to store five 16-bit values instead of eight: the coordinates of the top left corner, the width, the height, and the angle of rotation (other sets of values are possible too, for example ones that include the second corner).

Combining both methods, one can reduce the data size by a factor of up to 3.2 (from 32 bytes to 10 bytes per house).
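
A rough sketch of the five-value rectangle record, assuming the corner, width and height are already quantized to 16-bit units in a flat local grid with the y axis pointing up, and assuming the angle is a counterclockwise rotation stored as a fraction of a full turn. These conventions and the function names are mine, not part of the answer.

```python
import math
import struct


def pack_rect(x, y, width, height, angle):
    """Pack corner, width, height (uint16 units) and angle (radians) into 10 bytes."""
    turn = 2 * math.pi
    # Assumed convention: 0..65535 covers one full turn.
    angle_q = round((angle % turn) / turn * 65536) % 65536
    return struct.pack("<5H", x, y, width, height, angle_q)


def unpack_rect(data):
    """Inverse of pack_rect; the angle comes back in radians."""
    x, y, width, height, angle_q = struct.unpack("<5H", data)
    return x, y, width, height, angle_q * 2 * math.pi / 65536


def rect_corners(x, y, width, height, angle):
    """Rebuild the four corners from the reference corner, side lengths and rotation."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (x, y),
        (x + width * c, y + width * s),
        (x + width * c - height * s, y + width * s + height * c),
        (x - height * s, y + height * c),
    ]
```

Five unsigned 16-bit values take 10 bytes, which is where the factor of 3.2 relative to 32 bytes comes from. Note that this only reproduces the original polygon if the house really is a rectangle; slightly skewed quadrilaterals would be changed by the round trip.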

Upvotes: 1
