This project was created as a midterm for Sakarya University’s Data Visualization course (ISE314).

This document outlines the end-to-end development of the data visualization project in three phases, grouped into two stages: Data Generation & Preprocessing (assignment stage) and Data Modeling & Visualization (project stage).

Phase 1: Procedural Dirty Data Generation

This Python script procedurally generates the project’s raw dataset. Its primary purpose is to intentionally introduce real-world data quality issues—such as anomalies, missing values, and formatting inconsistencies—to meet the assignment’s preprocessing requirements.

1. Helper Functions (Intentional Obfuscation)

The script uses several helper functions to produce intentionally messy fields.
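
A minimal sketch of what such helpers might look like is given here. The function names (`messy_date`, `messy_text`, `maybe_missing`, `maybe_outlier`), the list of date formats, the missing-value markers, the probabilities, and the sample row are all assumptions made for illustration; they are not taken from the project's script.

```python
import random
from datetime import date

# Hypothetical helpers: names, formats, and probabilities are illustrative
# assumptions, not the project's actual implementation.

DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y", "%d.%m.%Y"]
MISSING_MARKERS = ["", "N/A", "null", None]


def messy_date(rng: random.Random, d: date) -> str:
    """Render a date in one of several inconsistent formats."""
    return d.strftime(rng.choice(DATE_FORMATS))


def messy_text(rng: random.Random, value: str) -> str:
    """Randomly change the casing and pad the value with stray spaces."""
    variant = rng.choice([value, value.upper(), value.lower(), value.title()])
    return " " * rng.randint(0, 2) + variant + " " * rng.randint(0, 2)


def maybe_missing(rng: random.Random, value, p: float = 0.1):
    """Replace a value with a missing-value marker with probability p."""
    if rng.random() < p:
        return rng.choice(MISSING_MARKERS)
    return value


def maybe_outlier(rng: random.Random, amount: float, p: float = 0.05) -> float:
    """Occasionally negate or inflate a numeric value to create an anomaly."""
    if rng.random() < p:
        return amount * rng.choice([-1, 100])
    return amount


# Build one illustrative dirty row (field names are assumed).
rng = random.Random(42)  # fixed seed so the generated mess is reproducible
row = {
    "order_date": messy_date(rng, date(2024, 3, 15)),
    "category": maybe_missing(rng, messy_text(rng, "Electronics")),
    "amount": maybe_outlier(rng, 199.90),
}
print(row)
```

Passing a seeded `random.Random` instance to each helper keeps the dirty output reproducible, which makes it easier to check later that the preprocessing step recovers the clean values.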

2. Dimension Table (products_ref.json)

3. Fact Table (orders_raw.csv)