Star Schema
A data warehouse modeling pattern that organizes data into a central fact table containing measurable events surrounded by dimension tables containing descriptive attributes, resembling a star shape.
The star schema is the most common data warehouse modeling pattern. The fact table at the center contains the quantitative data you want to analyze: sales amounts, click counts, session durations. Surrounding dimension tables provide the context: who (customer dimension), what (product dimension), when (date dimension), and where (location dimension). Foreign keys in the fact table reference each dimension.
This denormalized structure is optimized for analytical queries. A query such as "total revenue by product category by month by region" joins the fact table to three dimension tables (product, date, and location) using simple key lookups. Queries stay intuitive and fast because every dimension is exactly one join away from the fact table, and most analytical questions follow the pattern "measure X sliced by dimensions Y and Z."
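The join pattern above can be sketched with SQLite. This is a minimal illustration, not a production schema: the table and column names (`fact_sales`, `dim_product`, `dim_date`, `dim_location`) and the sample rows are invented for the example.

```python
import sqlite3

# Minimal star schema: one fact table, three dimension tables.
# All names and data here are illustrative.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    revenue      REAL
);
""")
con.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
con.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                [(1, "2024-01-15", "2024-01"), (2, "2024-02-10", "2024-02")])
con.executemany("INSERT INTO dim_location VALUES (?, ?, ?)",
                [(1, "Berlin", "EU"), (2, "Austin", "US")])
con.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                [(1, 1, 1, 100.0), (2, 1, 2, 250.0), (1, 2, 1, 75.0)])

# "Total revenue by product category by month by region": the fact table
# joins to each dimension through a single foreign key.
rows = con.execute("""
    SELECT p.category, d.month, l.region, SUM(f.revenue) AS total_revenue
    FROM fact_sales f
    JOIN dim_product  p ON f.product_key  = p.product_key
    JOIN dim_date     d ON f.date_key     = d.date_key
    JOIN dim_location l ON f.location_key = l.location_key
    GROUP BY p.category, d.month, l.region
    ORDER BY d.month, l.region
""").fetchall()
for row in rows:
    print(row)
```

Note that no join ever goes more than one hop from the fact table; adding another slice (say, customer segment) just adds one more single-key join.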
For AI teams building feature pipelines, star schemas provide a clean structure for aggregating features. User-level features are computed by grouping the fact table on the customer dimension key. Time-based features use the date dimension for windowed aggregations. The clear separation of facts and dimensions keeps feature engineering queries straightforward and maintainable.
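A feature-engineering pass over a star schema might look like the following sketch, again using SQLite with invented names (`fact_orders`, `dim_customer`, `dim_date`) and a hypothetical date window. User-level features come from grouping the fact table by the customer key; the windowed feature resolves dates through the date dimension.

```python
import sqlite3

# Illustrative star schema for feature engineering; all names and data
# are assumptions for the sketch, not a real pipeline.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, signup_date TEXT);
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE fact_orders (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    amount       REAL
);
""")
con.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                [(1, "2023-11-01"), (2, "2024-01-05")])
con.executemany("INSERT INTO dim_date VALUES (?, ?)",
                [(1, "2024-01-10"), (2, "2024-02-20"), (3, "2024-03-01")])
con.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)",
                [(1, 1, 40.0), (1, 2, 60.0), (2, 2, 20.0), (1, 3, 10.0)])

# Per-user features: lifetime order count and total spend, plus spend
# inside a fixed window (here 2024-02-01 to 2024-03-31) computed by
# filtering through the date dimension.
features = con.execute("""
    SELECT c.customer_key,
           COUNT(f.amount)                           AS order_count,
           SUM(f.amount)                             AS total_spend,
           SUM(CASE WHEN d.full_date >= '2024-02-01'
                     AND d.full_date <  '2024-04-01'
                    THEN f.amount ELSE 0 END)        AS spend_in_window
    FROM fact_orders f
    JOIN dim_customer c ON f.customer_key = c.customer_key
    JOIN dim_date     d ON f.date_key     = d.date_key
    GROUP BY c.customer_key
    ORDER BY c.customer_key
""").fetchall()
for row in features:
    print(row)
```

Each output row is a ready-made feature vector per customer, which is why the fact/dimension split maps so directly onto feature pipelines: facts supply the measures to aggregate, dimensions supply the grouping keys and filters.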
Related Terms
Cosine Similarity
A measure of similarity between two vectors based on the cosine of the angle between them, ranging from -1 (opposite) to 1 (identical), commonly used to compare embeddings.
Dimensionality Reduction
Techniques that reduce the number of dimensions in high-dimensional data while preserving meaningful structure, used for visualization, compression, and noise removal.
Batch Inference
Processing multiple ML predictions as a group at scheduled intervals rather than one-at-a-time on demand, optimizing for throughput and cost over latency.
Real-Time Inference
Generating ML predictions on-demand as requests arrive, typically with latency requirements under 200ms for user-facing features.
Data Pipeline
An automated sequence of data processing steps that moves data from source systems through transformations to destination systems, enabling reliable and repeatable data flows across an organization.
ETL (Extract, Transform, Load)
A data integration pattern that extracts data from source systems, transforms it into a structured format suitable for analysis, and loads it into a target data warehouse or database.