Geospatial indexing on Hilbert curves

By poudro - a day ago

Showing first level comment(s)

> Instead of projecting the Earth directly onto a single square, it projects the Earth onto the 6 faces of a cube enclosing the Earth and then applies an extra non-linear transformations to reduce even more the deformations. Each cell in s2 is in fact part of one of six quadtrees that describe the whole planet.

That's a super cool detail! I once implemented a 2D index using a Z-Order curve that directly translated lat/lon coordinates to a linear ordering. It works well enough because nobody really lives at the poles--the search regions with a single projection get really obtuse there. Projecting the earth onto a 6-sided die is a really elegant solution to that problem! Go go Google engineering!

digsmahler - 15 hours ago

Tinder's engineering team wrote a post about using Hilbert Curves for their geolocation-based recommendations engine[1]:

Part 2 of Tinder post is still in my bookmarks to read, but already I am loving Zenly's in-depth analysis of their findings!

It seems that using Google's S2 library is pretty much standard for this problem, I'm curious if other companies are doing this too?

[1]https://tech.gotinder.com/geosharded-recommendations-part-1-...

kgraves - 15 hours ago

Any particular reason why you'd use a Hilbert curve instead of, say, a Z curve (https://en.wikipedia.org/wiki/Z-order_curve)? Conversion to and from actual coordinates seems much more straightforward for that one (just bit [de]interleaving).

I mean, any Quad/Octree/N-dimensional equivalent can have its cells numbered by giving each quadrant/octant/each of the 2^N sub-cells a certain bit combination and then chaining those together as you descend the tree. The Hilbert curve version is just a special case of this with complicated rules for the "sub cell" <-> "bit sequence" mapping. If you were to use a Z curve, the resulting data format and querying algorithm would be exactly identical to the one in the article, just a lot less complicated in the (not presented) details of "where is this child" than the Hilbert version...

MauranKilom - 14 hours ago

I'm interested in their reasons for not using an R-Tree or an R*-Tree for the index. I know they mentioned the debate in this post but I'm quite curious how they arrived at their decision. Have they done performance tests with both methods? Many other high performance applications use R-Tree based structures, and I've always been under the impression that R-Trees usually outperform Quadtree style indexes in PIP queries.

dylrich - 15 hours ago

Shameless plug: why not use H3 (https://github.com/uber/h3)?

isaachier - 15 hours ago

Didn't Randal Munroe invent this technique?

https://xkcd.com/195/

speleo - 15 hours ago