# Tooling Overview

This project combines several tools to maintain code quality, automate deployments and process data efficiently.

## Data Processing

- **PySpark & Delta Lake** – the ETL jobs are written in PySpark and persist data in Delta tables for reliable, ACID-compliant storage.
- **Databricks Labs DQX** – expectation-based data quality checks stop bad data from progressing through the pipeline.

## Infrastructure & Deployment

- **Databricks Asset Bundles (DABs)** – define clusters, jobs and other workspace assets as code. The `databricks.yml` bundle is validated and deployed through CI/CD.
- **GitHub Actions** – run tests, linting and bundle validation on every pull request and push to `main`.

## Testing & Code Quality

- **Pytest** – covers unit and integration tests under the `tests/` directory.
- **Ruff** – enforces style and formatting rules and runs via the [`lint.sh`](../lint.sh) script.
- **MyPy** – performs static type checking, also invoked from `lint.sh`.

## Documentation

- **Sphinx** – renders the Markdown files in `docs/` via the MyST parser and builds the public documentation site.
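
## Example Snippets

The sketches below are illustrative, not definitive: table names, paths and project metadata are assumptions, and only the tool invocations themselves come from the sections above.

A minimal PySpark job persisting to a Delta table might look like this (the source path and the table name `events_bronze` are hypothetical):

```python
from pyspark.sql import SparkSession

# On Databricks a SparkSession is provided; this builder is for local runs.
spark = SparkSession.builder.appName("etl").getOrCreate()

df = spark.read.json("/mnt/raw/events/")  # hypothetical source path

# Delta provides ACID guarantees on write; mode("append") adds to existing data.
df.write.format("delta").mode("append").saveAsTable("events_bronze")
```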
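
DQX defines its own rule format; as a generic stand-in (not the DQX API), expectation-based splitting of valid and invalid rows can be expressed in plain PySpark, reusing `df` from the previous sketch (column names are hypothetical):

```python
from pyspark.sql import functions as F

# Expectation: user_id must be present and amount non-negative.
expectation = F.col("user_id").isNotNull() & (F.col("amount") >= 0)

valid_df = df.filter(expectation)        # continues through the pipeline
quarantine_df = df.filter(~expectation)  # held back for inspection
```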
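
A pared-down `databricks.yml` showing the overall bundle shape; the bundle name, workspace host and job details are placeholders, and the real file will carry more configuration (cluster specs, additional targets):

```yaml
bundle:
  name: my_project  # placeholder

targets:
  dev:
    mode: development
    workspace:
      host: https://example.cloud.databricks.com  # placeholder

resources:
  jobs:
    etl_job:
      name: etl-job
      tasks:
        - task_key: main
          spark_python_task:
            python_file: ./src/etl.py  # hypothetical entry point
```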
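
A minimal GitHub Actions workflow matching the triggers described above (the file path, Python version and dependency file are assumptions):

```yaml
# .github/workflows/ci.yml — hypothetical workflow file
name: CI

on:
  pull_request:
  push:
    branches: [main]

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt  # hypothetical dependency file
      - run: ./lint.sh                        # Ruff + MyPy
      - run: pytest tests/
      - uses: databricks/setup-cli@main       # installs the Databricks CLI
      - run: databricks bundle validate
```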
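
The repository's actual `lint.sh` may differ; a typical shape for a script that runs Ruff and MyPy as described above (`src/` is an assumed layout):

```bash
#!/usr/bin/env bash
set -euo pipefail

ruff check .           # lint rules
ruff format --check .  # formatting (non-mutating check)
mypy src/              # static type checking
```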
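
A unit test under `tests/` might look like this; the module name, fixture and assertion are illustrative:

```python
# tests/test_cleaning.py — hypothetical test module
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # A small local session keeps unit tests independent of a workspace.
    return SparkSession.builder.master("local[1]").getOrCreate()


def test_dropna_removes_null_ids(spark):
    df = spark.createDataFrame([(1,), (None,)], ["user_id"])
    assert df.dropna().count() == 1
```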
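
Enabling MyST so Sphinx renders Markdown is a one-line extension entry; a minimal `docs/conf.py` sketch (the project name is a placeholder):

```python
# docs/conf.py
project = "my-project"  # placeholder

extensions = ["myst_parser"]  # lets Sphinx parse the Markdown sources

# Accept both reStructuredText and Markdown source files.
source_suffix = {".rst": "restructuredtext", ".md": "markdown"}
```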