# TPC Financial System — Project Handoff

## Who Is Ted

Ted Peters, 60, retired software engineer. CS degree from UW (1988). Worked at Microsoft in the 1990s building UI systems — his company Cooper and Peters made Smalltalk UI tools, pivoted to Java browser tools, was acquired by Microsoft in 1997. After tech, spent 15 years building 20 spec houses. Based in Ferndale/Blaine, Washington.

**Technical profile**: Expert in Smalltalk, Java, C#, competent in JavaScript/TypeScript. Prefers GUI over command line. Values clean data structures over clever algorithms. Fascinated with automation. Has a builder's practical instinct — wants things that work, not theoretical perfection.

## The Project

Replace personal use of Quicken Mac for tracking finances across multiple businesses and personal spending. Quicken handles bank transaction downloads well but everything else is painful — categorization, reporting, multi-entity views, inflexible data model.

### Businesses
- **TPC (Ted Peters Construction)** — spec home building. Buys lots, builds houses, sells them. Currently finishing 1351 Leighton and 2520 Bora Bora, which go on sale soon.
- **DB (Ted Peters DB)** — rental property holdings and management. Multiple properties, some multi-unit.
- **Personal** spending also tracked in the same Quicken file.

### Why Quicken Stays (For Now)
Quicken connects to ~20 financial institutions and downloads transactions. This is genuinely hard to replace (Plaid/MX APIs are company-oriented, scraping is fragile). Plan: keep Quicken as the ingestion layer, read its SQLite database directly for everything else.

---

## Quicken Database Access

### Location and Format
Quicken Mac stores data in a `.qmobile` package folder. Inside is a file called `data` which is a **SQLite database using Apple Core Data**. ~160MB for Ted's full history (20+ years).

### Core Data Conventions
- All table/column names prefixed with `Z` (Core Data ORM convention)
- `Z_PK` = primary key, `Z_ENT` = entity type, `Z_OPT` = optimistic locking counter
- **Timestamps are seconds since January 1, 2001** (not Unix epoch 1970). Offset: add 978307200 to get Unix time.
- `ZDELETIONCOUNT` — filter on `= 0` to exclude soft-deleted records

### Key Tables

**ZACCOUNT** — All accounts
- `ZNAME` — account name (e.g., "TPC-Washington Federal")
- `ZTYPENAME` — CHECKING, CREDITCARD, CREDITLINE, ASSET, LIABILITY, REALESTATE, COMMERCIAL
- `ZACTIVE`, `ZCLOSED` — status flags

**ZTRANSACTION** — Transaction headers
- `ZACCOUNT` — FK to ZACCOUNT.Z_PK
- `ZPOSTEDDATE` — bank-provided date (NULL for manually entered transactions!)
- `ZENTEREDDATE` — user-entered date (always populated)
- **CRITICAL: Use `COALESCE(ZPOSTEDDATE, ZENTEREDDATE)` for dates.** Credit lines, asset accounts, and liability accounts have NULL posted dates. Only bank-downloaded transactions (checking, credit cards) get posted dates.
- `ZAMOUNT` — transaction total
- `ZRECONCILESTATUS` — reconciliation state
- `ZCHECKNUMBER`, `ZNOTE` — check number and memo
- `ZUSERPAYEE` — FK to ZUSERPAYEE.Z_PK

**ZCASHFLOWTRANSACTIONENTRY** — Line items (the important table!)
- `ZPARENT` — FK to ZTRANSACTION.Z_PK
- `ZAMOUNT` — line item amount
- `ZNOTE` — line item memo
- `ZCATEGORYTAG` — FK to ZTAG.Z_PK (the category)
- `ZTRANSFER` — transfer target (numeric FK when transfer, empty otherwise)
- `ZSEQUENCENUMBER` — ordering within splits
- Every transaction has at least one entry. Splits = multiple entries per transaction. No special split flag needed — just count entries per parent.

**ZTAG** — Categories AND user tags (both in same table)
- `ZNAME` — display name
- `ZPARENTCATEGORY` — FK to ZTAG.Z_PK (for hierarchy). Walk this recursively to build full paths like `TPC:Projects:Expenses:P4:Drywall`
- `ZTYPE` — distinguishes category types from user tags

**Z_18USERTAGS** — Junction table linking line items to user tags (many-to-many)
- `Z_18CASHFLOWTRANSACTIONENTRIES` — FK to ZCASHFLOWTRANSACTIONENTRY.Z_PK
- `Z_70USERTAGS` — FK to ZTAG.Z_PK

**ZUSERPAYEE** — Payee master list
- `ZNAME` — display name

**ZRENAMINGRULE** — Auto-categorization rules
- `ZORIGINALSTATEMENTNAME` → `ZTARGETPAYEENAME`

### Unused/Vestigial Tables
- **ZBUSINESS** — Has schema for business entities but is empty. Not wired into transactions.
- **ZPROJECT** — Has schema for projects but is empty. Not exposed in Quicken Mac UI.

### Working Extraction Query

This query was validated against CSV exports and produces correct results:

```sql
SELECT 
    a.ZNAME as account,
    datetime(COALESCE(t.ZPOSTEDDATE, t.ZENTEREDDATE) + 978307200, 'unixepoch') as tx_date,
    t.Z_PK as tx_id,
    t.ZAMOUNT as tx_amount,
    t.ZRECONCILESTATUS as reconcile_status,
    t.ZCHECKNUMBER as check_num,
    t.ZNOTE as tx_memo,
    e.Z_PK as entry_id,
    e.ZAMOUNT as entry_amount,
    e.ZNOTE as entry_memo,
    e.ZSEQUENCENUMBER as seq_num,
    e.ZTRANSFER as transfer_target,
    e.ZCATEGORYTAG as category_id,
    COALESCE(up.ZNAME, '') as payee
FROM ZTRANSACTION t
JOIN ZACCOUNT a ON t.ZACCOUNT = a.Z_PK
LEFT JOIN ZCASHFLOWTRANSACTIONENTRY e ON e.ZPARENT = t.Z_PK
LEFT JOIN ZUSERPAYEE up ON t.ZUSERPAYEE = up.Z_PK
WHERE a.ZNAME LIKE 'TPC-%'
AND t.ZDELETIONCOUNT = 0
AND COALESCE(t.ZPOSTEDDATE, t.ZENTEREDDATE) >= 725846400   -- 2024-01-01
AND COALESCE(t.ZPOSTEDDATE, t.ZENTEREDDATE) < 757382400    -- 2025-01-01
ORDER BY a.ZNAME, COALESCE(t.ZPOSTEDDATE, t.ZENTEREDDATE), t.Z_PK, e.ZSEQUENCENUMBER;
```

**Epoch reference for date filtering:**
- Core Data epoch offset from Unix: 978307200
- To convert a calendar date: `unix_timestamp - 978307200 = core_data_timestamp`
- 2024-01-01 core data = 725846400
- 2025-01-01 core data = 757382400

Category paths need reconstruction: query ZTAG separately and walk ZPARENTCATEGORY recursively in code.

User tags need a separate join through Z_18USERTAGS.

### Additional Validated Queries

**Category tree** (for path reconstruction):
```sql
SELECT Z_PK, ZNAME, ZPARENTCATEGORY, ZTYPE
FROM ZTAG WHERE ZDELETIONCOUNT = 0 ORDER BY Z_PK;
```

**User tags** (property names on line items):
```sql
SELECT j.Z_18CASHFLOWTRANSACTIONENTRIES as entry_id, ut.ZNAME as tag_name
FROM Z_18USERTAGS j
JOIN ZTAG ut ON j.Z_70USERTAGS = ut.Z_PK
WHERE ut.ZDELETIONCOUNT = 0;
```

**Account list with types**:
```sql
SELECT Z_PK, ZNAME, ZTYPENAME, ZACTIVE, ZCLOSED
FROM ZACCOUNT WHERE ZDELETIONCOUNT = 0 ORDER BY ZNAME;
```

### Validation Results (2024 TPC accounts vs CSV export)

| Account | Txns | Net | Status |
|---|---|---|---|
| TPC-BofA Alaska Airlines Visa | 492 | $19,941.07 | MATCH |
| TPC-Builders Alliance Credit Line | 68 | -$1,413.42 | MATCH |
| TPC-Lowes LCA | 122 | -$5,816.83 | MATCH |
| TPC-Savibank | 53 | — | MATCH (count) |
| TPC-Drayton Reach | 48 | -$129,652.87 | MATCH |
| TPC-Washington Federal | 208 vs 206 | -$6,415 diff | EXPLAINED: 2 Gusto payroll txns on 12/31 have entered date in 2024 but posted date in 2025 |

### How Quicken Stores Transfers

**Transfers are NOT explicitly linked in the database.** A transfer creates two independent transactions (one per account), each with its own `ZTRANSACTION` and `ZCASHFLOWTRANSACTIONENTRY`. Both entries are tagged with the generic "Transfer" category. There is no foreign key connecting the two sides.

**What ZTRANSFER is NOT:**
- `ZTRANSFER` on `ZCASHFLOWTRANSACTIONENTRY` is a VARCHAR field that stores numeric values, but these are NOT foreign keys to the counterparty entry, transaction, or account. Joining on them produces nonsensical results. The field appears to be a Core Data internal bookkeeping value.
- `ZTARGETACCOUNT` and `ZSENDACCOUNT` on `ZTRANSACTION` exist in the schema but are always NULL (72,075 transactions checked, zero populated). Ghost columns from an unimplemented feature.

**How to match transfer pairs:**
The only reliable method is heuristic matching: same date, same absolute amount, opposite signs, different accounts. Use two passes:
1. Exact date + exact amount match
2. Fuzzy date (±3 days) + exact amount match (handles posted vs entered date offsets)

This approach successfully paired 100% of TPC transfer entries in 2024 testing.

**Why it works this way:** Each bank connection delivers transactions independently. WaFed reports a debit and BofA reports a credit at different times with different descriptions. Quicken stores them independently and matches on-demand for display (the CSV export resolves "Transfer:TPC-Washington Federal" as a label at export time using the same heuristic matching).

**Implications for the replacement system:**
- Transfers should be first-class objects that explicitly reference both sides
- Deleting one side should flag/warn, not silently orphan the other
- Import from Quicken needs the heuristic matching logic to reconstruct pairs
- TPC-internal transfers must net to exactly zero (validated)

### Transfer Classification for Reporting

Transfer entries from TPC accounts go to four categories:
- **TPC Internal** — between the 6 operating TPC accounts (nets to $0)
- **TPC Other Accounts** — to TPC asset/liability accounts not in the main extract (e.g., Drayton Reach - Jimmy Wong Mortgages for construction loan draws)
- **DB** — to DB-WECU, DB-Venmo, DB-Paypal, DB-Fidelity Visa
- **Personal** — to WashFed Money Market, Southwest Visa, Schwab accounts, First Tech, Mary accounts, etc.

The transfer matching query pulls ALL accounts (not just TPC) to find counterparties:

```sql
-- Pull all transfer-categorized entries across ALL accounts for a year
SELECT e.Z_PK as entry_id, e.ZAMOUNT, e.ZTRANSFER,
    t.Z_PK as tx_id,
    datetime(COALESCE(t.ZPOSTEDDATE,t.ZENTEREDDATE) + 978307200, 'unixepoch') as tx_date,
    a.ZNAME as account,
    tag.ZNAME as category
FROM ZCASHFLOWTRANSACTIONENTRY e
JOIN ZTRANSACTION t ON e.ZPARENT = t.Z_PK
JOIN ZACCOUNT a ON t.ZACCOUNT = a.Z_PK
LEFT JOIN ZTAG tag ON e.ZCATEGORYTAG = tag.Z_PK
WHERE tag.ZNAME = 'Transfer'
AND t.ZDELETIONCOUNT = 0
AND COALESCE(t.ZPOSTEDDATE,t.ZENTEREDDATE) >= 725846400  -- 2024-01-01
AND COALESCE(t.ZPOSTEDDATE,t.ZENTEREDDATE) < 757382400   -- 2025-01-01
ORDER BY tx_date, e.ZAMOUNT;
```

Then match pairs in code using the two-pass heuristic described above.

---

## Ted's TPC Account Structure

### Operating Accounts
- **TPC-Washington Federal** — main checking, operating account. Starting balance 2024: $23,014.22
- **TPC-Savibank** — secondary checking, primarily payroll. Starting balance 2024: $5,988.18
- **TPC-BofA Alaska Airlines Visa** — primary credit card. Shared across TPC, DB, and personal (needs reimbursement tracking)
- **TPC-Builders Alliance Credit Line** — materials credit line
- **TPC-Lowes LCA** — Lowe's commercial account credit line

### Asset/Liability Accounts
- **TPC-Drayton Reach** — tracks cost basis of 9 purchased lots from developer Jimmy Wong. Started ~$1.3M (purchase price). Balance depletes as houses sell and basis is allocated to cost-of-goods-sold. The "Land" category entry = 1/9 of purchase price per lot. "ShareOfPropTaxes-Interest-HOA-Etc" = accumulated carrying costs allocated at sale.
- **Drayton Reach - Jimmy Wong Mortgages** — the actual owner-financed loan from Wong (separate from basis tracking)
- **TPC-2547 Dodd**, **TPC-2521 Dodd**, **TPC-2520 Bora Bora**, **TPC-Drayton Lot 1** — per-property asset accounts
- **TPC-Kubota Excavator** / **Kubota Excavator Loan** — equipment
- **TPC-Kagari ADU** / **Kagari ADU Receivable** — ADU project

### Drayton Reach Lot Basis Mechanics
- Carrying costs (Wong interest, property taxes, HOA, AppFolio fees) flow from operating accounts INTO the Drayton Reach asset account as transfers
- At sale time, basis is moved OUT via category entries: `TPC:Projects:Expenses:Land` (1/9 of purchase ≈ $144,941.51) and `TPC:Projects:Expenses:Land:ShareOfPropTaxes-Interest-HOA-Etc`
- After all 9 properties sell, this account should zero out

---

## Category/Tag Structure (Current Quicken Scheme)

### How It Works Now
- **Category** = hierarchical, colon-delimited. Describes what KIND of spend/income.
- **Tag** = which PROPERTY (house address for TPC, rental property for DB)
- Quicken forces categories to be declared as Income OR Expense type, which creates hacks

### TPC Category Tree
```
TPC (expense root)
├── Projects
│   └── Expenses
│       ├── P0: Permits, Design
│       ├── P1: Foundation, Sitework
│       ├── P2: Framing, Trusses
│       ├── P3: Windows-Ext Doors, Roof-Gutters, Siding, Exterior Trim
│       ├── P4: Plumbing, Electrical, HVAC, Insulation, Drywall
│       ├── P5: Cabinets, Interior Trim, Interior Doors, Flooring, Painting
│       ├── P6: Countertops, Tile, Mirrors-Closet-Bath, Appliances
│       ├── P7: Landscaping, Flatwork
│       ├── P99: Cleaning-Garbage, Utility Expenses, Tools, Labor, Misc, 
│       │        Property Tax, Interest-Bank Fees, Selling Fees, UNCATEGORIZED-FIX
│       └── Land, Land:ShareOfPropTaxes-Interest-HOA-Etc
├── Accounting
├── Banking (Interest, Fees, Monthly Service Fee, Insufficient Funds, Late Fee, etc.)
├── Business
├── License-Bond-Insurance (and sub :Insurance)
├── Payroll (Wages, Taxes, Processing)
├── Research
├── Tools
├── Warranty Repairs
└── 547 E (vestigial old-style with address in category — 7 transactions for 2831 Duplex)

TPC-Income (separate root because Quicken forces income/expense separation)
└── (house sale proceeds, tagged by property)
```

### Phase Codes (P0-P7, P99)
Map directly to construction sequence. Used for cost comparison across houses at the same phase. Phases happen roughly in numerical order.

### Property Tags (Current 2024)
- **2547 Dodd** — built and sold mid-2024 ($813K sale via Chicago Title)
- **1375 Leighton** — lot sale early 2024 ($186K)
- **2521 Dodd** — under construction 2024, sold 2025
- **2520 Bora Bora** — under construction 2024, finishing now (early 2025)
- **1351 Leighton** — permitted late 2024, finishing now
- **2831 Duplex** — rental rehab (DB property, not TPC sale)

### Non-TPC Spending in TPC Accounts
The BofA card is shared across TPC, DB, and personal. Non-TPC categories found:
- **Travel:Airfare** — Alaska Air (personal)
- **Food & Dining:Restaurants:Coffee** — Starbucks (personal)
- **Automobile:Gasoline**, **Healthcare**, **Automobile:Licensing** — personal
- **DB-Expenses:Rentals** and subcategories — rental property business

### "from X to Y" Tags
These are reallocation hacks — materials bought for one house but used at another. E.g., "from 2521 to 2547" means the expense was originally tagged to 2521 Dodd but should be allocated to 2547 Dodd.

---

## Ideal Data Model (Designed During This Session)

### Core Insight
Quicken jams too much into two fields (Category and Tag). The ideal model has orthogonal dimensions:

### Transaction Record
```
Transaction (what the bank sees)
├── ID
├── Date
├── Account
├── Payee
├── Total Amount
├── Status (cleared, reconciled)
└── Line Items[] (what YOU see)
    ├── Amount (must sum to transaction total)
    ├── Entity (TPC, DB, Personal)
    ├── Nature (hierarchical — what kind of income/expense)
    ├── Property (hierarchical — which building/unit)
    └── Memo
```

### Key Design Principles
1. **Fields are flat columns, values are hierarchical.** Entity is a field; `TPC` is its value. Nature is a field; `Project Expenses:P5:Flooring` is its value. Property is a field; `2831 Nevada:North` is its value.
2. **"Nature" not "Category"** — describes the kind of money movement. Less overloaded than "category."
3. **Entity unifies income and expense** — no more `TPC` (expense root) + `TPC-Income` (income root) hack. Both live under the same entity.
4. **Property is orthogonal to Nature** — you can query "all P5:Flooring across all houses" (nature filter) or "everything for 2521 Dodd" (property filter) or both.
5. **Every transaction has line items**, even if just one. No special "split" mode. Splits are just transactions with multiple line items.
6. **Transfers are explicit** — link between two line items in different accounts, not encoded in the category string.

### Nature Tree (TPC)
```
TPC
├── Expenses (overhead, not house-specific)
│   ├── Accounting
│   ├── Banking
│   ├── Payroll (Wages, Taxes, Processing)
│   ├── Tools
│   ├── License-Bond-Insurance
│   └── ...
├── Project Expenses (direct construction costs)
│   ├── P0: Permits, Design
│   ├── P1-P7: (construction phases)
│   ├── P99: (cross-phase overhead)
│   └── Land, Land:SharedCarryingCosts
└── Income
    ├── Property Sales
    └── (future: Consulting, etc.)
```

### Same Pattern for DB
```
DB
├── Expenses
│   └── Repairs (Plumbing, etc.)
├── Income
│   └── Rent
```
Property tag: `2831 Nevada:North`, `2831 Nevada:South`, `789 Apt Avenue:Apt 3`

---

## Desired Features / Use Cases

### Smart Categorization
- LLM-based, using payee history, current project phases, and context. "Choice Concrete" = always P7:Flatwork. If house X is in P4 and house Y is in P7, a flatwork charge obviously goes to Y.
- Flag ambiguous cases for human review instead of forcing manual classification of everything.

### Cost Templates & Tracking
- 20 houses of historical data → derive expected costs per phase/trade, adjusted by house parameters (square footage, floors, lot grade, target market, finish level).
- Pre-build: plug in spec sheet → projected budget and timeline.
- During build: actuals replace projections phase by phase, remaining phases auto-adjust.
- Ted has heuristics (e.g., "framing $/sqft goes up 15% for second story") that can be encoded as rules.

### Schedule Awareness
- Phases happen in order with roughly predictable durations.
- System should infer current phase from expense patterns and project forward.
- Track what comes next and what's reasonable for each trade.

### Sub(contractor) Tracking
- Repeat vendors: same plumber, electrician, framing crew across houses.
- Track what they do, what they typically charge, flag anomalies.

### Procurement Automation (Future)
- System knows what materials/services are needed for next phase.
- Generate orders, potentially auto-place with vendors.
- Different vendors have different ordering methods (email, website, text).
- This requires line-item data from vendors (not just bank transaction totals).

### Line Item Data (The Hard Problem)
- Bank transactions show "LOWES $847" — no detail on what was purchased.
- Vendor portals have the detail but are inconsistent: some allow download, some don't, formats vary wildly.
- Matching vendor line items to bank transactions is very hard (dates don't align, amounts don't match exactly due to returns/adjustments).
- LLM fuzzy matching may be the breakthrough here.
- Design for graceful degradation: full line items when available, bank-level when not.

### Algorithmic Splits
- Define allocation rules (e.g., "split evenly across 9 properties") and auto-generate line items.
- Eliminates the manual split tedium that made Ted use the Drayton Reach accumulator approach.

### Ongoing Auditing
- Flag unusual amounts, potential miscategorizations, missing transactions.
- Requires more frequent data access than annual export — SQLite direct read enables this.

### Multi-Entity Reporting
- P&L per entity (TPC, DB, Personal)
- Cross-entity transfer tracking (business ↔ personal reimbursements)
- Per-property cost basis and profitability

---

## Technical Context for Next Steps

### What's Been Proven
1. Quicken's SQLite database is readable and well-structured enough to extract cleanly.
2. The extraction query (with COALESCE date fix) matches CSV exports exactly.
3. Category hierarchy can be reconstructed by walking ZTAG.ZPARENTCATEGORY.
4. User tags (property names) are accessible through Z_18USERTAGS junction table.
5. The full 20+ year history is available without manual export.

### Recommended Approach
- Keep Quicken as bank transaction ingestion layer.
- Read SQLite database directly (copy file first to avoid locking).
- Build new system with clean data model on top.
- LLM interface for queries, categorization, and analysis alongside traditional UI.

### Ted's Preferences
- GUI over command line (but warming up to terminal via this experience).
- Practical over theoretical — wants things that work for his specific use case.
- Iterative exploration before committing to architecture.
- Interested in Tauri or Wails for eventual desktop app.
- Wants LLM interaction as part of the runtime experience, not just as development tool.
