Impossible to create indices on expressions involving some h3 functions in PostgreSQL 17 #165
Comments
@spaghettiguru: Hey! It's my understanding that GiST support is pending. I'm having a hard time understanding the use case for converting a cell back into geometry and then building a geospatial index on that... I appreciate that you've slimmed this down to a minimal reproducible example; however, it's a little unclear to me why you'd create an index on that particular expression. Could you explain your use case? (Or was it just to illustrate this issue 🤔) |
Like @jmealo, this is not something I have any experience with. Can you elaborate on how to "qualify the references to extension objects inside h3 functions with the right schema"? |
Hi @jmealo, the index creation command from my original post is the real thing we are doing, not merely an example. @zachasme, hi man. I do not have experience with writing or maintaining PostgreSQL extensions, so I am not sure I can provide the complete solution. However, I did a little digging into how the PostGIS extension dealt with this breaking PostgreSQL change: they qualified the references to extension objects in functions using a reserved symbol
Hope it will help. |
This is super helpful. What are you trying to do at a higher level, and how many records are you dealing with? That would help us see if there's a better way. For context: using a bounding box as a coarse filter to reduce the number of distance calculations is something we'd do to speed up queries in PostGIS sans H3. To help you find the best way to do something, we'll need to start before applying the bounding box and focus on your end goal(s). There are often H3-native ways to do things as well, depending on your use case. If you're new to geospatial or H3: understanding the different zoom levels and how H3 works can be very helpful. I'll admit, the examples of how to do some of the more advanced things in H3 (particularly using h3-pg) are lacking. The bright side is that the H3 documentation is pretty excellent, and you'll want to skim that. What's really missing is a cookbook/set of recipes/patterns. I'd be interested in helping to compile that. |
@jmealo, thanks man. What I would ideally expect is that H3 supports a function that accepts a polygon and an H3 resolution and returns the list of cell indexes at that resolution which intersect with that polygon. From what I understand, there is an ongoing effort to support a new H3 function that returns the list of cells which intersect with a given polygon, but IMO that is not ideal when dealing with a large polygon and a high H3 resolution, since the number of returned cells can be huge in such a case. |
@spaghettiguru The good news is that, for your use case, you shouldn't need to convert between H3 and geometry at all, which completely sidesteps the issue reported in this ticket. It was a little unclear to me how to use h3-pg properly, and nobody has written extensive usage examples for it yet. Let me know if the following is helpful; perhaps we'll adapt it into the official documentation, or I'll make a blog post.
Here's a contrived example that demonstrates how to efficiently handle spatial + temporal queries in PostgreSQL using H3 indexes. It shows a pattern that works well for complex use cases (multiple geometry types, high volume, temporal data) while noting where simplifications can be made for simpler scenarios. The example uses a parking lot management system to illustrate the concepts, but the patterns can be applied to any spatial + temporal data.
Helper functions
These convert geography/geometry to a set of H3 cells at a given resolution. @zachasme and I are discussing adding these (or something similar) to
create function public.h3_geography_to_cells(geog geography, resolution integer) returns SETOF h3index
    immutable
    strict
    parallel safe
    language plpgsql
as
$$
DECLARE
    geom_type text;
BEGIN
    -- Get the geometry type
    geom_type := ST_GeometryType(geog::geometry);
    -- Handle different geometry types
    CASE
        WHEN geom_type = 'ST_Point' THEN
            -- For points, use h3_lat_lng_to_cell
            RETURN QUERY SELECT h3_lat_lng_to_cell(geog, resolution);
        WHEN geom_type = 'ST_LineString' THEN
            -- For linestrings, buffer into a thin polygon and use h3_polygon_to_cells
            RETURN QUERY SELECT h3_polygon_to_cells(ST_Buffer(geog, 0.00001), resolution);
        WHEN geom_type IN ('ST_Polygon', 'ST_MultiPolygon') THEN
            -- For polygons and multipolygons, use h3_polygon_to_cells directly
            RETURN QUERY SELECT h3_polygon_to_cells(geog, resolution);
        ELSE
            RAISE EXCEPTION 'Unsupported geometry type: %', geom_type;
    END CASE;
END;
$$;
create function public.h3_geography_to_cells_buffered(geog geography, resolution integer, buffer_meters double precision DEFAULT 0) returns SETOF h3index
    immutable
    strict
    parallel safe
    language plpgsql
as
$$
DECLARE
    buffered_geog geography;
BEGIN
    -- Apply buffer if specified
    IF buffer_meters > 0 THEN
        buffered_geog := ST_Buffer(geog, buffer_meters)::geography;
    ELSE
        buffered_geog := geog;
    END IF;
    -- Call the existing function
    RETURN QUERY SELECT * FROM h3_geography_to_cells(buffered_geog, resolution);
END;
$$;
create function public.h3_geometry_to_cells(geom geometry, resolution integer) returns SETOF h3index
    immutable
    strict
    parallel safe
    language plpgsql
as
$$
DECLARE
    geom_type text;
BEGIN
    -- Get the geometry type
    geom_type := ST_GeometryType(geom);
    -- Handle different geometry types
    CASE
        WHEN geom_type = 'ST_Point' THEN
            -- For points, use h3_lat_lng_to_cell
            RETURN QUERY SELECT h3_lat_lng_to_cell(geom, resolution);
        WHEN geom_type = 'ST_LineString' THEN
            -- For linestrings, buffer into a thin polygon and use h3_polygon_to_cells
            RETURN QUERY SELECT h3_polygon_to_cells(ST_Buffer(geom, 0.00001), resolution);
        WHEN geom_type IN ('ST_Polygon', 'ST_MultiPolygon') THEN
            -- For polygons and multipolygons, use h3_polygon_to_cells directly
            RETURN QUERY SELECT h3_polygon_to_cells(geom, resolution);
        ELSE
            RAISE EXCEPTION 'Unsupported geometry type: %', geom_type;
    END CASE;
END;
$$;
Example schema
I didn't test these queries exactly, but they're adapted from a working schema.
-- Core tables
CREATE TABLE lots (
    lot_id uuid PRIMARY KEY,
    lot_name text NOT NULL,
    location geography(POINT) NOT NULL
);
CREATE TABLE spots (
    spot_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    lot_id uuid REFERENCES lots,
    spot_number text NOT NULL,
    created timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL,
    deleted timestamptz
);
-- H3 spatial index table
CREATE TABLE spot_hexagons (
    spot_id bigint NOT NULL REFERENCES spots,
    hex_id h3index NOT NULL,
    PRIMARY KEY (spot_id, hex_id)
);
CREATE INDEX ON spot_hexagons(hex_id);
-- Example occupancy history partitioned by hash
CREATE TABLE occupancy_history (
    spot_id bigint NOT NULL REFERENCES spots,
    status text NOT NULL, -- 'occupied', 'available', 'reserved', etc
    created timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL,
    expires_at timestamptz,
    PRIMARY KEY (spot_id, created)
) PARTITION BY HASH (spot_id);
Example query
This query finds parking spots and their current occupancy.
WITH input_point AS (
    -- Convert input point to geography
    -- (%(geojson)s is a hypothetical parameter name, matching the %(name)s placeholders below)
    SELECT ST_GeomFromGeoJSON(%(geojson)s)::geography AS geog
),
nearby_hexes AS (
    -- Get H3 cells within buffer of point
    SELECT h3_geography_to_cells_buffered(
        (SELECT geog FROM input_point),
        %(resolution)s, -- e.g. 9
        %(radius_m)s -- e.g. 1000
    ) AS hex_id
),
matching_spots AS (
    -- Find spots in those H3 cells
    SELECT DISTINCT s.spot_id,
        s.lot_id,
        s.spot_number,
        l.lot_name,
        s.created,
        s.deleted
    FROM spots s
    JOIN spot_hexagons h ON s.spot_id = h.spot_id
    JOIN lots l ON s.lot_id = l.lot_id
    WHERE h.hex_id IN (SELECT hex_id FROM nearby_hexes)
    AND (s.deleted IS NULL OR s.deleted > NOW())
)
-- Get latest occupancy for each matching spot
SELECT s.*,
    o.status AS occupancy_status,
    o.created AS status_updated,
    o.expires_at
FROM matching_spots s
LEFT JOIN LATERAL (
    SELECT spot_id, status, created, expires_at
    FROM occupancy_history
    WHERE spot_id = s.spot_id
    AND created <= NOW()
    ORDER BY created DESC
    LIMIT 1
) o ON true;
Explanation
This example demonstrates an efficient pattern for spatial + temporal queries using H3 indexes in PostgreSQL with h3-pg and PostGIS.
Schema Design Pattern
The schema follows these principles:
Why This Pattern Works Well
Efficient Spatial Queries:
Complex Geometries Support:
Temporal Data Management:
Partitioning matters
When to Use This Pattern
Best suited for:
Consider simplifying if:
Implementation Notes
Resolution Choice:
Helper Functions:
Query Structure:
Performance Considerations:
Maintaining the Hexagon Table
When storing both raw geometries and H3 indexes, you'll want to keep them in sync. Here's a pattern using triggers:
-- Add geography column to spots table
ALTER TABLE spots
    ADD COLUMN location geography(POINT);
-- Function to calculate hexagons for a spot
CREATE OR REPLACE FUNCTION update_spot_hexagons()
    RETURNS TRIGGER
    LANGUAGE plpgsql
AS $$
BEGIN
    -- Delete existing hexagons for this spot
    DELETE FROM spot_hexagons
    WHERE spot_id = NEW.spot_id;
    -- Insert new hexagons
    -- Note: Resolution (9) should be configured based on your needs
    INSERT INTO spot_hexagons (spot_id, hex_id)
    SELECT NEW.spot_id, h3_index
    FROM h3_geography_to_cells(NEW.location, 9) AS h3_index;
    RETURN NEW;
END;
$$;
-- Trigger to maintain hexagons on insert/update
CREATE TRIGGER maintain_spot_hexagons
    AFTER INSERT OR UPDATE OF location
    ON spots
    FOR EACH ROW
    WHEN (NEW.location IS NOT NULL)
EXECUTE FUNCTION update_spot_hexagons();
-- Optional: Trigger to clean up hexagons on delete
-- EXECUTE FUNCTION needs a trigger function, so the DELETE is wrapped in one
-- (delete_spot_hexagons is a hypothetical name). It runs BEFORE DELETE so the
-- foreign key from spot_hexagons to spots does not block removing the spot.
CREATE OR REPLACE FUNCTION delete_spot_hexagons()
    RETURNS TRIGGER
    LANGUAGE plpgsql
AS $$
BEGIN
    DELETE FROM spot_hexagons WHERE spot_id = OLD.spot_id;
    RETURN OLD;
END;
$$;
CREATE TRIGGER cleanup_spot_hexagons
    BEFORE DELETE
    ON spots
    FOR EACH ROW
EXECUTE FUNCTION delete_spot_hexagons();
Benefits of the trigger approach
Note that for bulk operations, you might want to disable triggers temporarily and rebuild the hexagon table using a batch process for better performance.
If your geometries are complex or you're indexing at multiple resolutions, consider adding appropriate indexes and potentially partitioning the hexagon table as well. The exact strategy will depend on your query patterns and data volume.
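As a rough, untested sketch of the bulk-load idea above: it assumes the `spots.location` column, the `spot_hexagons` table, the `maintain_spot_hexagons` trigger, and the `h3_geography_to_cells` helper exactly as defined earlier in this comment, with resolution 9 as a placeholder value.

```sql
-- Sketch only: rebuild spot_hexagons in one batch instead of row by row.
-- Resolution 9 is a placeholder; use whatever resolution the trigger uses.
BEGIN;

-- Stop the per-row trigger from firing during the bulk load
ALTER TABLE spots DISABLE TRIGGER maintain_spot_hexagons;

-- ... bulk insert/update of spots goes here ...

-- Recompute every spot's cells in a single set-based statement
TRUNCATE spot_hexagons;
INSERT INTO spot_hexagons (spot_id, hex_id)
SELECT DISTINCT s.spot_id, c.hex_id
FROM spots s
CROSS JOIN LATERAL h3_geography_to_cells(s.location, 9) AS c(hex_id)
WHERE s.location IS NOT NULL;

ALTER TABLE spots ENABLE TRIGGER maintain_spot_hexagons;

COMMIT;
```

Both `ALTER TABLE` and `TRUNCATE` take heavy locks, so on a busy system this probably belongs in a maintenance window.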
|
@jmealo thanks for such an elaborate response, man, this is very nice of you. While I was familiar with some of this stuff, there is a lot of useful info here and some great tips!
In any case, we will give it a try and see if it solves the original issue. Thanks again for your help! |
@spaghettiguru: No problem! Glad I could help. You raise good points about the trade-offs, but let me clarify something important about operation costs. While a million H3 comparisons might sound expensive, they're actually much cheaper than even a few hundred geospatial operations, especially with complex geometries. This is because H3 comparisons are basically integer operations with good memory locality, while geospatial operations involve complex floating-point math and potentially scattered memory access.
A few key points to consider:
Query Optimization:
Mental Model:
Best Practices:
The key insight is that H3 trades exact calculations for fast approximations. Use it to quickly narrow down your dataset, then apply precise geospatial operations only on that smaller set. When properly indexed, these queries will perform more like standard SQL queries rather than expensive geospatial ones. This strategy works best when you can standardize on a single resolution per table/column, as it simplifies both indexing and query patterns. |
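A minimal sketch of that "narrow down with H3, then refine with PostGIS" idea. It assumes the `spots.location` column, the `spot_hexagons` table, and the `h3_geography_to_cells_buffered` helper from the earlier comment. The coordinates, resolution 9, and the 1000 m radius are illustrative values only, and the query has not been run against a real dataset.

```sql
-- Sketch only: coarse H3 filter followed by an exact distance check.
WITH params AS (
    -- Illustrative search origin and radius
    SELECT ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::geography AS origin,
           1000::double precision AS radius_m
),
candidate_spots AS (
    -- Coarse step: equality lookups against the index on hex_id
    SELECT DISTINCT h.spot_id
    FROM spot_hexagons h
    WHERE h.hex_id IN (
        SELECT h3_geography_to_cells_buffered(p.origin, 9, p.radius_m)
        FROM params p
    )
)
-- Precise step: exact distance test, run only on the surviving candidates
SELECT s.spot_id, s.spot_number
FROM candidate_spots c
JOIN spots s ON s.spot_id = c.spot_id
CROSS JOIN params p
WHERE ST_DWithin(s.location, p.origin, p.radius_m);
```

If the approximate answer from the cell match is good enough, the final `ST_DWithin` filter can simply be dropped.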
Thanks a lot, @jmealo , these are important insights that will surely be useful not just to our team but to others as well. Really grateful for your help! Have a great week! |
First of all, thanks for maintaining this, to everyone involved!
The issue
Given the table defined like this:
An attempt to create the following index on it
CREATE INDEX ON my_table USING GIST (h3_cell_to_geometry(h3cell))
fails in Postgres >= 17. This seems to be caused by the changes in Postgres 17 around index creation: PostgreSQL now temporarily modifies the search_path when running some operations, so schema-unqualified references to objects outside the standard schemas cannot be resolved during index creation, and the command fails.
Here is the quote from the PostgreSQL 17 release notes:
They also updated the docs for the CREATE INDEX command to contain this:
So I guess the solution would be to qualify the references to extension objects inside h3 functions with the right schema, where needed.
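Until that is fixed in the extension itself, a user-side workaround might look like the sketch below. It is untested and rests on assumptions: that the h3, h3_postgis, and postgis extensions are all installed in the `public` schema, and that a wrapper function is acceptable. The name `h3_cell_to_geometry_pinned` is hypothetical. The idea is that PostgreSQL lets a function pin its own `search_path` with a `SET` clause, so the unqualified references inside the call can be resolved again while the index is built.

```sql
-- Sketch only. Assumes h3, h3_postgis and postgis live in "public";
-- h3_cell_to_geometry_pinned is a hypothetical wrapper name.
CREATE FUNCTION public.h3_cell_to_geometry_pinned(cell h3index)
    RETURNS geometry
    LANGUAGE sql
    IMMUTABLE STRICT PARALLEL SAFE
    SET search_path = public, pg_catalog  -- applies for the duration of each call
AS $$
    SELECT h3_cell_to_geometry(cell);
$$;

-- Index on the schema-qualified wrapper instead of the original function
CREATE INDEX ON my_table USING GIST (public.h3_cell_to_geometry_pinned(h3cell));
```

Queries would then have to use the same wrapper expression for the planner to consider this index.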