Finance Data OS

🚀 Project Goal

To build a modular financial data platform that takes raw equities, options, and macroeconomic datasets and transforms them into analytics-ready features for trading and investment research.

🏗️ Architecture (Phase 1)

[Figure: Phase 1 architecture diagram (Mermaid chart)]

📂 Project Layout

docs/ # architecture diagrams, notes

artifacts/ # Power BI files, exported charts

Build Logs/ # weekly build logs

notebooks/ # Jupyter notebooks (Week 1, Week 2, etc.)

🛣 Project Roadmap

This project is built and shipped as weekly artifacts. Each week delivers a small but meaningful piece of the pipeline.

Week 1 – Single-Ticker Prototype ✅

Week 2 – Multi-Ticker Ingest & Finance Chart ✅

Week 3 – Expanding History & Feature Store ✅

Week 4 – Signals, Backtest & 3-Page Power BI ✅

Week 5 – Costs, Controls & Tuning ✅

Week 6 – Rolling Metrics, Cost Modeling & Cleaner Pipeline ✅

Week 7 – Push-Button Backtest, Parameter Sweeps & One-Command Workflows ✅

Week 8 – Deterministic Parameter Tuning & Clean Analytics ✅

Week 9 – Performance Simulation & Validation ✅

Objective: Extend the Finance Data OS pipeline to simulate trade-level execution, validate equity reconciliation, and visualize performance metrics in Power BI.

Pipeline Summary:

Simulate: Load tuned parameters and signals; apply execution logic (slippage, commission, fees).

Validate: Check PK uniqueness, null policies, and reconciliation between trades and equity.

Visualize: Publish Power BI dashboards (Trade Blotter, Equity vs. Drawdown, KPI cards).
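
As an illustration of the Simulate step's cost handling, here is a minimal sketch of a long round trip net of slippage, commission, and fees. The cost conventions (slippage in basis points applied against each fill, a flat commission per fill, a per-share fee) and every name here (`CostModel`, `fill_price`, `net_pnl_long`) are assumptions for the example, not the project's actual execution model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Hypothetical cost parameters; the project's real model may differ."""
    slippage_bps: float = 5.0     # adverse price move per fill, in basis points
    commission: float = 1.0       # flat commission per fill (entry and exit)
    fee_per_share: float = 0.005  # per-share fee charged on each fill


def fill_price(quote: float, side: str, slippage_bps: float) -> float:
    """Buys fill above the quote, sells fill below it."""
    adj = quote * slippage_bps / 10_000
    if side == "buy":
        return quote + adj
    if side == "sell":
        return quote - adj
    raise ValueError(f"unknown side: {side!r}")


def net_pnl_long(entry: float, exit_: float, qty: int, costs: CostModel) -> dict:
    """Break a long round trip into gross PnL, each cost, and net PnL."""
    entry_fill = fill_price(entry, "buy", costs.slippage_bps)
    exit_fill = fill_price(exit_, "sell", costs.slippage_bps)
    gross = (exit_ - entry) * qty
    slippage = (entry_fill - entry) * qty + (exit_ - exit_fill) * qty
    commission = 2 * costs.commission          # one entry fill, one exit fill
    fees = 2 * costs.fee_per_share * qty
    return {
        "gross_pnl": gross,
        "slippage": slippage,
        "commission": commission,
        "fees": fees,
        "net_pnl": gross - slippage - commission - fees,
    }
```

Keeping each cost as its own field, rather than only the net figure, is what would let a blotter show slippage and fees per trade.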

Artifacts Created:

/lake/trade_mart_v3/trade-test_2025w09.parquet

/lake/equity_curve_daily_v3/eq-test_2025w09.parquet

/lake/signals_mart_v3/combined_week9.parquet

/lake/tuning_mart_v3/combined_week9.parquet

Power BI Deliverables:

Trade Blotter (PnL, Slippage, Fees, Entry/Exit Reason)

Equity (NAV) vs. Drawdown (%) Chart

KPIs: Sharpe (252d), CAGR, Win Rate

Slicers: Run ID, Symbol, Entry Reason

Results:

| Metric | Value |
| --- | --- |
| Sharpe (252d) | 1.34 |
| CAGR (%) | 21.96 |
| Win Rate (%) | 59.43 |
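
For readers who want to see how such KPIs are commonly defined, here is a minimal sketch. It assumes "252d" means annualizing daily returns with 252 trading days (it could equally refer to a rolling window), that CAGR is computed from a daily NAV series, and that a win is a trade with strictly positive PnL. The function names are hypothetical, and the project's own Power BI measures may be defined differently.

```python
import math
import statistics


def sharpe_252(daily_returns, risk_free_daily=0.0):
    """Annualized Sharpe ratio from daily returns, using sample std dev."""
    excess = [r - risk_free_daily for r in daily_returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        return float("nan")  # undefined when returns never vary
    return statistics.mean(excess) / sd * math.sqrt(252)


def cagr(nav, periods_per_year=252):
    """Compound annual growth rate from an evenly spaced NAV series."""
    if len(nav) < 2:
        raise ValueError("need at least two NAV points")
    years = (len(nav) - 1) / periods_per_year
    return (nav[-1] / nav[0]) ** (1 / years) - 1


def win_rate(trade_pnls):
    """Share of trades with strictly positive PnL."""
    if not trade_pnls:
        raise ValueError("no trades")
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)
```

Multiply `cagr(...)` and `win_rate(...)` by 100 to express them as percentages, as in the table above.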

Validation Summary:

✅ PK uniqueness

✅ Null policy

✅ Drawdown ≤ 0

✅ Reconciliation (trades ↔ equity)
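
The four checks above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's `validate_run()`: the column names (`run_id`, `trade_id`, `symbol`, `net_pnl`, `nav`, `drawdown`) and the function name `run_checks` are hypothetical, and reconciliation is assumed to mean that summed trade PnL equals the change in NAV over a run where the first equity row is the starting capital and all trades are closed by the last row.

```python
def run_checks(trades, equity, pk=("run_id", "trade_id"),
               required=("symbol", "net_pnl"), tol=1e-6):
    """Return {check_name: passed} for a list of trade and equity rows."""
    keys = [tuple(t[k] for k in pk) for t in trades]
    checks = {
        # No two trades may share the same primary key.
        "pk_unique": len(keys) == len(set(keys)),
        # Required columns must be present and non-null on every trade.
        "null_policy": all(t.get(c) is not None
                           for t in trades for c in required),
        # Drawdown is measured from the running peak, so it is never positive.
        "drawdown_le_zero": all(row["drawdown"] <= 0 for row in equity),
    }
    # Treat missing PnL as 0 here; the null-policy check reports it separately.
    realized = sum((t.get("net_pnl") or 0.0) for t in trades)
    nav_change = equity[-1]["nav"] - equity[0]["nav"]
    checks["reconciliation"] = abs(realized - nav_change) <= tol
    return checks
```

Returning a dict of named results, instead of raising on the first failure, makes it straightforward to log a pass/fail state for every check in one summary.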

⚡ Quick Start (Follow along with me!)

Clone the repo:

Set up virtual environment:

Install Dependencies:

Run the notebooks:

📝 Build Logs

Build Log – Week 1 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk1

Build Log – Week 2 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk2

Build Log – Week 3 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk3

Build Log – Week 4 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk4

Build Log – Week 5 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk5

Build Log – Week 6 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk6

Build Log – Week 7 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk7

Build Log – Week 8 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk8

Build Log – Week 9 https://github.com/Si944-byte/Finance-Data-OS/blob/main/Build%20Logs/Build%20Log%20wk9

🗺️ Week 9 — What's New

Pipeline Enhancements

Added full simulation-validation loop (simulate(), validate_run(), reconciliation parity).
Introduced trade_mart_v3 and equity_curve_daily_v3 with deterministic structure.
Implemented timestamp normalization (UTC-safe ordering).
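
To show what UTC-safe ordering can look like, here is a minimal sketch that converts every timestamp to timezone-aware UTC before sorting. It assumes naive timestamps are already in UTC (a policy choice for this example only), and the names `to_utc`, `sort_utc`, and the `ts` key are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts: datetime, naive_is=timezone.utc) -> datetime:
    """Return a timezone-aware UTC datetime; naive inputs get `naive_is`."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=naive_is)
    return ts.astimezone(timezone.utc)


def sort_utc(rows, key="ts"):
    """Normalize each row's timestamp to UTC, then sort chronologically."""
    normalized = [{**r, key: to_utc(r[key])} for r in rows]
    return sorted(normalized, key=lambda r: r[key])


rows = [
    {"id": "A", "ts": datetime(2025, 3, 3, 9, 30,
                               tzinfo=timezone(timedelta(hours=-5)))},
    {"id": "B", "ts": datetime(2025, 3, 3, 13, 0)},  # naive, assumed UTC
    {"id": "C", "ts": datetime(2025, 3, 3, 15, 15,
                               tzinfo=timezone(timedelta(hours=2)))},
]
ordered = [r["id"] for r in sort_utc(rows)]  # B (13:00Z), C (13:15Z), A (14:30Z)
```

Sorting by wall-clock time would have put A first; normalizing first makes the order depend only on the actual instant.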

Testing & Reliability

Expanded pytest coverage for validation, PK uniqueness, and reconciliation.
Enforced schema-on-append verification.
Validation summary now logs ✅ pass/fail states for all checks.
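
Schema-on-append verification can be illustrated with a small in-memory example: every incoming batch is checked against the expected columns and types before any row is added. The schema, the column names, and the helpers `check_schema` and `append_verified` are hypothetical; the project's actual checks run against its Parquet marts and may work differently.

```python
EXPECTED_SCHEMA = {"run_id": str, "symbol": str, "qty": int, "net_pnl": float}


def check_schema(batch, expected=EXPECTED_SCHEMA):
    """Raise if any row has unexpected columns or a wrongly typed value."""
    for i, row in enumerate(batch):
        if set(row) != set(expected):
            missing = sorted(set(expected) - set(row))
            extra = sorted(set(row) - set(expected))
            raise ValueError(f"row {i}: missing={missing} extra={extra}")
        for col, typ in expected.items():
            value = row[col]
            # Nulls are left to a separate null-policy check.
            if value is not None and not isinstance(value, typ):
                raise TypeError(
                    f"row {i}: {col!r} is {type(value).__name__}, "
                    f"expected {typ.__name__}"
                )


def append_verified(table, batch, expected=EXPECTED_SCHEMA):
    """Append the batch only if every row passes; return the new row count."""
    check_schema(batch, expected)  # verify first so a bad batch adds nothing
    table.extend(batch)
    return len(table)
```

Checking the whole batch before extending the table keeps appends all-or-nothing, so a failed write never leaves a partially appended mart.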

Visualization & Reporting

Built Trade Blotter table (PnL, Commission, Fees, Entry/Exit Reason).
Added Equity (NAV) vs Drawdown (%) chart with dual axis.
Created KPI cards: Sharpe (252d), CAGR %, Win Rate %.
Added slicers for Run ID, Symbol, Entry Reason for filtering and analysis.
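
The drawdown series plotted against NAV is conventionally the percentage distance from the running peak. A minimal sketch, assuming a strictly positive NAV series and a hypothetical function name `drawdown_pct`:

```python
def drawdown_pct(nav):
    """Percent drawdown from the running peak; every value is <= 0."""
    peak = float("-inf")
    out = []
    for value in nav:
        peak = max(peak, value)          # highest NAV seen so far
        out.append((value / peak - 1.0) * 100.0)
    return out


series = drawdown_pct([100.0, 110.0, 99.0, 121.0])
max_drawdown = min(series)  # the deepest trough, about -10 (%)
```

Because the value is measured against the running peak, it is zero at every new high and negative elsewhere, which is why a "Drawdown ≤ 0" check is a useful sanity test.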

Performance & Usability

Introduced --max-combos, --allow-large safety flags in simulation.
Batched Parquet writes for faster runs and cleaner logs.
Seed-controlled simulation for reproducibility.
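
Here is a minimal sketch of how safety flags and a seed could guard a parameter sweep. The flag names `--max-combos` and `--allow-large` come from this README, but their exact semantics, the default values, the `--seed` flag, and the helpers `build_parser` and `plan_sweep` are assumptions made for illustration.

```python
import argparse
import itertools
import random


def build_parser():
    p = argparse.ArgumentParser(description="Hypothetical sweep runner")
    p.add_argument("--max-combos", type=int, default=500,
                   help="refuse to run more combinations than this")
    p.add_argument("--allow-large", action="store_true",
                   help="override the --max-combos guard")
    p.add_argument("--seed", type=int, default=42,
                   help="seed for any randomized step (assumed flag)")
    return p


def plan_sweep(grid, args):
    """Expand a parameter grid, enforce the size guard, order deterministically."""
    combos = [dict(zip(grid, values))
              for values in itertools.product(*grid.values())]
    if len(combos) > args.max_combos and not args.allow_large:
        raise SystemExit(
            f"{len(combos)} combos exceeds --max-combos={args.max_combos}; "
            "pass --allow-large to run anyway"
        )
    rng = random.Random(args.seed)  # local RNG: same seed, same order
    rng.shuffle(combos)
    return combos
```

For example, `plan_sweep({"fast": [5, 10, 20], "slow": [50, 100]}, build_parser().parse_args(["--max-combos", "4"]))` exits with a message, while adding `--allow-large` returns all six combinations in an order fixed by the seed. Using a local `random.Random` instead of the global generator keeps the run reproducible even if other code also draws random numbers.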

🧠 What I Learned

Week 9 – Performance Simulation & Validation

End-to-End System Thinking: I learned how each mart (signals → tuning → trades → equity) connects as a complete system. Every step now produces validated outputs that feed the next stage, transforming raw data into a reliable simulation.
Deterministic Design Matters: Reproducibility isn’t optional. Setting seeds, controlling timezones, and enforcing schema validation ensured that identical inputs always yield identical outputs. It made debugging predictable and CI-safe.
Validation is the Final Guardrail: Having validation scripts that check PK uniqueness, null policies, and equity-trade reconciliation gave confidence that the pipeline’s math actually holds up. It shifted the mindset from “does it run” to “is it right.”
Power BI as an Analysis Surface: Building the Trade Blotter and Equity vs. Drawdown views clarified how to communicate system performance visually. Every KPI (Sharpe, CAGR, Win Rate) now ties directly to verified data, not estimates.
Clean Models → Clear Insights: Simplifying relationships to a star schema (Date → Facts, Symbol → Facts) made the visuals snap into place. Data lineage now feels intuitive rather than tangled.

🗂️ Artifacts (Week 9: current week)

Dashboard Pages:

Page 1 - Signals

Page 2 - Back-test

Page 3 - Tuning Results

Page 4 - Performance Overview

Page 5 - About Page

🤝 Contributing

This is an open project for learning and sharing best practices in data engineering for financial markets. Suggestions, issues, and PRs are welcome.

📜 License

MIT License — see LICENSE for details.
