Databases

A consolidated reference covering the full data stack — relational fundamentals, SQL practice, dimensional modeling, NoSQL and graph stores, ingestion and streaming pipelines, open lakehouse table formats, and modern vector databases for RAG and semantic search.

SQL

Language reference (SELECT, JOIN, window functions), query performance and EXPLAIN plans, plus interview-style worked exercises.

Relational & Modeling

RDBMS fundamentals, star/snowflake/galaxy schemas, and Kimball dimensional modeling — bus matrix, conformed dimensions, and a worked sales model.

NoSQL & Graph

The non-relational landscape: in-memory (Redis), document (MongoDB), wide-column (Cassandra), key-value (etcd, RocksDB), graph (Neo4j, Neptune), time-series, and search stores.

Pipelines

ETL and ELT patterns, large-scale ingestion, Apache NiFi flows, Kafka streaming, and Parquet columnar storage — the data movement layer.

Lakehouse

Open table formats (Hudi, Iceberg, Delta), catalogs (Polaris, Unity, Nessie), and query engines (Trino, StarRocks) — the open lakehouse stack.

Vector Databases

pgvector, Chroma, Weaviate, FAISS — embeddings, ANN indexes (HNSW, IVF, PQ), and the retrieval layer behind RAG and semantic search.


Quick Reference — Kimball & RDBMS Schemas

The rest of this page is a one-screen quick reference. For depth, follow the cards above into the section landing pages and their per-topic deep dives.

Kimball Bottom-Up Data Warehouse Architecture

Common RDBMS Schemas

  1. Star Schema. A central fact table joins directly to denormalized dimension tables. Queries need fewer joins than in a snowflake schema, which usually makes them simpler and faster.
  2. Snowflake Schema. Star schema with normalized dimensions. Reduces redundancy at the cost of more joins.
  3. Galaxy Schema (Fact Constellation). Multiple fact tables share conformed dimension tables — for warehouses spanning multiple business processes.
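  4. Hierarchical Schema. Tree-structured data with parent–child relationships, where each child has exactly one parent. In a relational database this is typically modeled with a self-referencing foreign key. Useful for organizational charts and nested catalogs.
  5. Network Schema. Like hierarchical, but a record can have more than one parent, so it supports many-to-many relationships (modeled relationally with junction tables). For non-hierarchical, interconnected entities.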
  6. Flat Schema. A single table with no relationships to other tables. Suitable for small, self-contained datasets.

Figure: Star schema with a central fact table and surrounding dimension tables.
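
As a rough illustration of the star schema described above, the sketch below builds a tiny in-memory example with Python's standard sqlite3 module. The table and column names (fact_sales, dim_date, dim_product) and the sample rows are hypothetical, chosen only for this example, and are not taken from the worked sales model in the Relational & Modeling section.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table, two denormalized dimensions.
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,   -- surrogate key, e.g. 20240101
    full_date TEXT,
    year      INTEGER
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT                -- kept inline (denormalized), not in its own table
);
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    amount      REAL
);

INSERT INTO dim_date VALUES
    (20240101, '2024-01-01', 2024),
    (20240102, '2024-01-02', 2024);
INSERT INTO dim_product VALUES
    (1, 'Widget', 'Hardware'),
    (2, 'Gadget', 'Hardware'),
    (3, 'Manual', 'Books');
INSERT INTO fact_sales VALUES
    (20240101, 1, 2, 20.0),
    (20240101, 3, 1, 15.0),
    (20240102, 2, 1, 30.0),
    (20240102, 1, 1, 10.0);
""")

# Revenue by category needs a single join from the fact table to one dimension.
rows = conn.execute("""
    SELECT p.category, SUM(f.amount) AS revenue
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(rows)  # [('Books', 15.0), ('Hardware', 60.0)]
```

In a snowflake variant of the same model, category would move into its own table referenced from dim_product, and the query above would need one more join to reach it.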