Reputation: 1
I am attempting to load WKT representations of polygons and multipolygons into a Snowflake GEOGRAPHY-type column. Each polygon has thousands of points, each of which is defined by lat/long points defined with upwards of 10 points of precision after the decimal.
As I attempt to cast my WKT strings to the proper format for insertion, ie TO_GEOGRAPHY(), I get variations of this error:
100217 (P0000): Geography validation failed: Edge (-77.495072,35.477751) -- (-77.495247,35.477899) crosses edge (-77.495247,35.477899) -- (-77.495247,35.477899)
This particular error was thrown when I tried to process the following WKT: (its very long, so hosted here; my apologies)
For convenience, here's the section of that string with the crossing edges:
...
-77.49480800034235 35.47764599906043, -77.49480800036956 35.477645999070795,
-77.49507200036956 35.47775099907079, -77.49507200051436 35.47775099914243,
-77.49507200064573 35.47775099923645, -77.49524700064573 35.47789899923645,
-77.49524700064573 35.477898999236444, -77.49524700072536 35.47789899931164,
-77.49552600072536 35.47819299931164,-77.4955260008524 35.4781929994771,
...
Obviously I can just ignore the invalid edges by setting allow_invalid = TRUE
in the TO_GEOGRAPHY()
fn, forcing the load, but that doesn't meet my needs - I need some way to make these polygons valid, ideally without truncating every single coordinate value to make it less precise.
I did find this thread somewhat useful, but the recommended solution doesn't seem to apply to all my polygons. Applied to the example string above, a new edge is revealed as a problem:
100217 (P0000): Geography validation failed: Edge (-77.173830,35.470360) -- (-77.173830,35.469260) crosses edge (-77.173830,35.469266) -- (-77.173831,35.468991)
Does anyone have any advice for next steps - solutions or even debugging steps? Like I said, this error occurs on many of my thousands of polygons, and I'm stumped.
Happy to provide more details wherever needed, or additional examples.
Upvotes: 0
Views: 79
Reputation: 26078
So I grab the huge block of points, put them into VS Code, regex'ed it a valid SQL value list:
,
-> ),\n(
the \n
should be a shift+enter
to put a new line token in. Then
-> ,
Now you can slap that into a insert statement like:
create table lat_lons as
select
$1::number(30,16) as lat,
$2::number(30,16) as lon,
row_number() over (order by true) as rn,
from values
(-77.6865440007049,35.6471289992901),
(-77.68671800056762,35.6472919991615),
(-77.68744500042364,35.647631999094166),
(-77.68744500042364,35.64763199909417),
albeit 7K lines long.
Then you can truncate those values to 5dp, and only keep the first of each set.
select * from
(
select *,
floor(lat,5) as lat_s,
floor(lon,5) as lon_s,
conditional_change_event(array_construct(lat_s,lon_s)) over (order by rn) as change_group
from lat_lons
)
qualify row_number() over (partition by change_group order by rn) = 1
then turn those back into a text list:
select listagg(lat_s||' '||lon_s, ',') within group (order by rn) as l,
count(*) as c
from d1
then turn that into a GEOGRAPTHY:
with d1 as (
select * from
(
select *,
floor(lat,5) as lat_s,
floor(lon,5) as lon_s,
conditional_change_event(array_construct(lat_s,lon_s)) over (order by rn) as change_group
from lat_lons
)
qualify row_number() over (partition by change_group order by rn) = 1
), to_list as (
select listagg(lat_s||' '||lon_s, ',') within group (order by rn) as l,
count(*) as c
from d1
)
select c,
length(l),
to_geography('MULTIPOLYGON((('|| l ||')))')
from to_list;
for 7 DP you get 7700 points instead of 7273. I found the value up to 14 worked.
Upvotes: 0