Logo
Modern Indexing: GIN and Spatial Indexes

Modern Indexing: GIN and Spatial Indexes

Nov 20, 2025
6 min read

Beyond Simple Key-Value Lookups

So far, we’ve explored indexes optimized for simple data types: integers, strings, dates. But modern applications store rich, complex data:

  • JSON documents with nested fields
  • Arrays of tags or categories
  • Geographic coordinates for location-based queries
  • Full-text documents for search

B+ Trees, LSM Trees, and Hash indexes struggle with this kind of data. Enter specialized indexes designed for these exact use cases.

GIN: Generalized Inverted Index

When you search Google for “database indexing”, Google doesn’t scan every webpage word-by-word. It uses an inverted index - a mapping from words to the documents containing them.

PostgreSQL’s GIN (Generalized Inverted Index) brings this same power to your database.

How Inverted Indexes Work

Instead of mapping Row → Data, an inverted index maps Value → List of Rows.

Example: Blog posts with tags

Post 1: ['databases', 'performance', 'indexing']
Post 2: ['databases', 'sql']
Post 3: ['performance', 'caching']

Traditional B+ Tree: Would index the entire array as a single value (useless for finding posts by individual tag)

GIN Inverted Index:

'databases' → [Post 1, Post 2]
'performance' → [Post 1, Post 3]
'indexing' → [Post 1]
'sql' → [Post 2]
'caching' → [Post 3]

Now you can instantly find all posts tagged with ‘databases’!

Tip

Query Speed: Finding all documents containing a specific word is O(1) with an inverted index, versus O(N) with a full table scan.

Use Cases for GIN

1. Full-Text Search

CREATE INDEX articles_content_gin ON articles
USING GIN (to_tsvector('english', content));
SELECT * FROM articles
WHERE to_tsvector('english', content) @@ to_tsquery('database & indexing');

2. JSONB Indexing

CREATE INDEX user_metadata_gin ON users USING GIN (metadata);
SELECT * FROM users
WHERE metadata @> '{"premium": true}';

3. Array Containment

CREATE INDEX post_tags_gin ON posts USING GIN (tags);
SELECT * FROM posts WHERE tags @> ARRAY['databases'];

GIN Trade-Offs

Pros:

  • Extremely fast for containment queries (@>, @@)
  • Supports complex data types (arrays, JSON, full-text)
  • Compresses posting lists for space efficiency

Cons:

  • Slower bulk inserts (must update many index entries per row)
  • Larger index size than B+ Trees
  • Not suitable for equality or range queries on simple types
Did You Know?

GIN uses posting list compression techniques like VarByte encoding to reduce index size by 50-70% compared to naive implementations.

Spatial Indexes: R-Trees and Geohashes

“Find all gas stations within 5km of my current location.”

This is a proximity query - fundamentally different from equality or range queries. Standard indexes fail because:

  • Latitude/longitude are 2D coordinates (B+ Trees are 1D)
  • “Closeness” doesn’t have a simple sort order

R-Trees (Rectangle Trees)

R-Trees organize spatial data using Minimum Bounding Rectangles (MBRs).

How It Works:

  1. Group nearby objects into rectangles
  2. Group these rectangles into larger rectangles
  3. Build a tree hierarchy of nested rectangles
  4. To search: traverse rectangles that overlap with your query region

Example: Finding coffee shops near you

CREATE INDEX locations_gist ON places USING GIST (location);
SELECT * FROM places
WHERE ST_DWithin(
location,
ST_MakePoint(-122.4194, 37.7749), -- San Francisco
5000 -- 5km radius in meters
);

Used By: PostGIS (PostgreSQL extension), Oracle Spatial, MySQL Spatial

Geohashes

Geohashes convert 2D coordinates into a single string by interleaving latitude and longitude bits.

Example:

(lat: 37.7749, lon: -122.4194) → "9q8yy"

Key Property: Points that are geographically close have similar geohash prefixes!

San Francisco: 9q8yy9mf
Oakland (nearby): 9q9p1b9h (shares "9q" prefix)
New York (far): dr5ru7p4 (completely different)

This allows using a standard B+ Tree for proximity queries by searching for geohash prefixes.

Tip

Real-World Scale: Uber uses H3 (hexagonal hierarchical geospatial indexing system) to handle 100+ billion trips and location queries.

Spatial Index Trade-Offs

Pros:

  • Enable fast proximity/containment queries
  • Handle complex shapes (polygons, lines, not just points)
  • Essential for GIS applications

Cons:

  • More complex than B+ Trees
  • Higher memory overhead for MBRs
  • Update performance degrades with high overlap

Real-World Examples

PostgreSQL GIN for JSON

-- E-commerce: product catalog with varying attributes
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
attributes JSONB
);
CREATE INDEX attributes_gin ON products USING GIN (attributes);
-- Find all laptops with 16GB RAM
SELECT * FROM products
WHERE attributes @> '{"category": "laptop", "ram_gb": 16}';

Result: Millisecond queries on millions of products with complex, schema-less attributes.

PostGIS for Delivery Routing

-- Find all delivery drivers within 10km of an order
SELECT driver_id, ST_Distance(location, order_location) AS distance
FROM drivers
WHERE ST_DWithin(location, order_location, 10000)
ORDER BY distance
LIMIT 5;

Used By: Uber, DoorDash, Lyft for real-time driver matching.

MongoDB Geospatial Queries

MongoDB includes built-in support for 2D and 2D sphere indexes:

db.places.createIndex({ location: "2dsphere" });
db.places.find({
location: {
$near: {
$geometry: { type: "Point", coordinates: [-122.4194, 37.7749] },
$maxDistance: 5000
}
}
});

Comparison Table

Index TypeBest ForNot Suitable For
B+ TreeEquality, ranges, sortingComplex types, containment
GINFull-text, JSON, arraysSimple equality, ranges
R-TreeProximity, containmentExact matches, 1D data
HashExact equalityEverything else
LSMWrite-heavy workloadsRandom point lookups

When to Use Modern Indexes

Use GIN When:

  • Searching within JSON documents
  • Full-text search requirements
  • Array containment queries
  • Tag-based filtering

Use Spatial Indexes When:

  • Location-based queries (nearby restaurants, drivers, etc.)
  • GIS applications (mapping, routing)
  • Geometry containment (is this point in this polygon?)
  • Any 2D/3D proximity search
Warning

Don’t Over-Index: GIN and spatial indexes are larger and slower to maintain than B+ Trees. Only use them for queries that actually need their specialized capabilities.

Conclusion

Modern indexes expand databases beyond simple tables and rows. They enable:

  • Searching unstructured data (JSON, documents)
  • Building location-aware applications
  • Implementing full-text search without external services

Understanding these specialized indexes helps you:

  • Build schema-less features with JSONB + GIN
  • Implement proximity search for ride-sharing or delivery apps
  • Add full-text search without Elasticsearch
  • Choose PostgreSQL over NoSQL for complex query needs

Series Conclusion

We’ve journeyed from the fundamentals of indexing through WAL, B+ Trees, LSM Trees, Hash indexes, and finally specialized modern indexes. Each has its place:

  • B+ Trees: The reliable default for general OLTP
  • WAL: The foundation ensuring durability
  • LSM Trees: The write-optimized choice for ingestion
  • Hash Indexes: The specialist for exact lookups
  • GIN/Spatial: The specialists for complex data

Choose wisely, and your database will reward you with excellent performance!