# Small City Schema — v0.1

*Derived from the Blaine, WA specimen. April 2026.*
*Goal: a generalizable data model for understanding, comparing, and simulating small American cities.*

---

## Design Philosophy

This schema emerged bottom-up from a single city rather than top-down from theory. The principle is that good data structures matter more than clever algorithms — get the model right and the analysis follows naturally.

Every entity includes a `notable` freetext field for observations that don't fit existing columns. Once many cities have been processed, recurring patterns in the `notable` fields should point to candidates for new schema columns. The schema is meant to grow from the data, not from assumptions.

### What We Mean by "Small City"
- Population roughly 1,000 to 25,000
- Incorporated municipality (not a census-designated place, or CDP, and not an unincorporated area)
- Has its own government, budget, and at least some of its own services
- Exists in relationship to a larger regional center (Blaine → Bellingham)

---

## Core Entities

### 1. City (the root object)

This is the top-level entity. Everything else hangs off it.

```
City {
  // Identity
  id                    : string        // unique key (e.g., "blaine-wa")
  name                  : string        // "Blaine"
  state                 : string        // "WA"
  county                : string        // "Whatcom"
  fips_state            : string        // "53"
  fips_county           : string        // "073"
  fips_place            : string        // "06505"
  incorporated_date     : date          // 1890-05-20
  named_for             : string        // "James G. Blaine, U.S. Senator"
  motto                 : string        // "Where America Begins"
  website_url           : string        // "ci.blaine.wa.us"
  website_platform      : string        // "CivicPlus" (useful for scraping patterns)

  // Geography
  latitude              : float         // 48.9937
  longitude             : float         // -122.7471
  elevation_ft          : int           // 49
  land_area_sq_mi       : float         // 5.63
  water_area_sq_mi      : float         // 2.80
  climate_zone          : string        // "Marine west coast"

  // Relationships
  regional_center       : string        // "Bellingham" (nearest larger city acting as the regional hub)
  regional_center_dist  : float         // 20 (miles)
  adjacent_communities  : string[]      // ["Birch Bay", "Semiahmoo", "Point Roberts"]
  special_position      : string[]      // ["international border", "coastal", "I-5 corridor"]

  notable               : string        // freetext for anything that doesn't fit
}
```

**Observations from Blaine:** The `special_position` field emerged because Blaine's identity is inseparable from being a border town. Other cities might have "college town", "military base", "tribal lands adjacent", "port city", "ski resort", etc. These position tags shape most of what follows downstream: economy, culture, and governance priorities.
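
The three FIPS fields in `City` are typed as strings, which matters because place codes such as "06505" carry a leading zero. As a minimal sketch, they can be joined into the concatenated identifiers that Census products use (state + place, state + county). The helper names `place_geoid`, `county_geoid`, and `_check_digits` are hypothetical and not part of the schema.

```python
def place_geoid(fips_state: str, fips_place: str) -> str:
    """Join 2-digit state and 5-digit place FIPS codes into a 7-digit place GEOID."""
    _check_digits(fips_state, 2, "fips_state")
    _check_digits(fips_place, 5, "fips_place")
    return fips_state + fips_place


def county_geoid(fips_state: str, fips_county: str) -> str:
    """Join 2-digit state and 3-digit county FIPS codes into a 5-digit county GEOID."""
    _check_digits(fips_state, 2, "fips_state")
    _check_digits(fips_county, 3, "fips_county")
    return fips_state + fips_county


def _check_digits(code: str, width: int, label: str) -> None:
    # Reject ints and wrong-width strings: storing 6505 instead of "06505"
    # would silently lose the leading zero.
    if not isinstance(code, str) or len(code) != width or not code.isdigit():
        raise ValueError(f"{label} must be a {width}-digit string, got {code!r}")


# Values taken from the City entity above.
print(place_geoid("53", "06505"))  # 5306505
print(county_geoid("53", "073"))   # 53073
```

Validating the width up front is a design choice: it catches codes that were round-tripped through a spreadsheet or an integer column before they reach a join key.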

---

### 2. Demographics (time-series)

```
DemographicSnapshot {
  city_id               : string
  year                  : int
  source                : string        // "ACS 5-year", "Decennial Census", "estimate"

  // Population
  population            : int
  population_density    : float         // per sq mi
  median_age            : float
  pct_under_18          : float
  pct_18_to_64          : float
  pct_65_plus           : float

  // Race/Ethnicity (pct)
  pct_white             : float
  pct_hispanic          : float
  pct_black             : float
  pct_asian             : float
  pct_native            : float
  pct_pacific_islander  : float
  pct_two_plus_races    : float

  // Nativity
  pct_foreign_born      : float
  pct_english_only      : float

  // Education (25+)
  pct_hs_or_higher      : float
  pct_bachelors_plus    : float

  // Income
  median_household_income : int
  per_capita_income     : int
  pct_below_poverty     : float

  // Housing
  total_housing_units   : int
  total_households      : int
  avg_household_size    : float
  pct_owner_occupied    : float
  pct_renter_occupied   : float
  vacancy_rate          : float
  median_home_value     : int
  median_rent           : int
  pct_single_family     : float

  // Commute
  pct_drive_alone       : float
  pct_work_from_home    : float
  pct_carpool           : float
  avg_commute_minutes   : float
  pct_work_in_county    : float

  // Connectivity
  pct_broadband         : float
  avg_download_mbps     : float

  notable               : string
}
```

**Observations from Blaine:** The 26.2% over-65 population immediately flags "retirement destination." The 17.8% foreign-born share is notable for a small rural city. The 22.2% work-from-home share is very high and suggests remote workers who chose the location for lifestyle reasons. Patterns like these become the basis for city archetype classification.
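
One way to turn observations like these into archetype signals is a small rule table over a `DemographicSnapshot`. The sketch below is only illustrative: the cutoffs are hypothetical placeholders (real ones would come from the peer baselines planned for v0.3), and every label except "retirement destination" is invented here.

```python
from dataclasses import dataclass


@dataclass
class DemographicSignals:
    """The subset of DemographicSnapshot fields used by the rules below."""
    pct_65_plus: float
    pct_foreign_born: float
    pct_work_from_home: float


# label -> (field name, cutoff). Cutoffs are HYPOTHETICAL placeholders,
# not values derived from any dataset.
RULES = {
    "retirement destination": ("pct_65_plus", 20.0),
    "high foreign-born share": ("pct_foreign_born", 15.0),
    "remote-work destination": ("pct_work_from_home", 15.0),
}


def archetype_signals(snapshot: DemographicSignals) -> list[str]:
    """Return every label whose field meets or exceeds its cutoff."""
    return [
        label
        for label, (field, cutoff) in RULES.items()
        if getattr(snapshot, field) >= cutoff
    ]


# Blaine's percentages as quoted in the observations above.
blaine = DemographicSignals(
    pct_65_plus=26.2, pct_foreign_born=17.8, pct_work_from_home=22.2
)
print(archetype_signals(blaine))
```

Keeping the rules in a table rather than in code makes it easy to swap in data-driven cutoffs once more cities have been processed.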

---

### 3. Government Structure

```
GovernmentStructure {
  city_id               : string
  as_of_date            : date

  // Form
  form_of_government    : enum          // "council-manager", "mayor-council", "commission"
  charter_status        : string        // "noncharter code city"
  legal_classification  : string        // state-specific classification

  // Council
  council_size          : int           // 7
  council_districts     : int           // 3 wards
  at_large_seats        : int           // 1
  meeting_schedule      : string        // "2nd and 4th Monday"
  meeting_time          : string        // "6 PM"

  // Staff
  has_city_manager      : bool          // true
  departments           : string[]      // ["Finance", "Public Works", ...]

  // Boards & Commissions
  boards                : Board[]       // see sub-entity

  // Regional
  regional_memberships  : string[]      // ["WCOG", "WTA", "IMTC"]

  notable               : string
}

Board {
  name                  : string        // "Planning Commission"
  size                  : int           // 7
  appointment_method    : string        // "council-appointed" | "manager-appointed"
  meeting_schedule      : string
  purpose               : string
}
```

**Observations from Blaine:** The hybrid structure (legally mayor-council, operationally council-manager) is common in small WA cities and would be missed by the `form_of_government` enum alone, so that field should be read together with `has_city_manager` and `legal_classification`. The number and type of boards reveal city priorities: an Arts Commission (est. 2021) and Tourism Advisory suggest a community investing in identity, and a Downtown Advisory Committee signals active development tension.

---

### 4. Elected Officials (time-series)

```
ElectedOfficial {
  city_id               : string
  name                  : string
  position              : string        // "Mayor", "Council Position 5"
  ward                  : string        // "Ward 3" or "At-Large"
  term_start            : date
  term_end              : date
  is_mayor_pro_tem      : bool
  external_boards       : string[]      // ["WTA Board", "WCOG"]
  notable               : string
}
```

---

### 5. Finances (annual time-series)

```
CityBudget {
  city_id               : string
  fiscal_year           : int
  ordinance_number      : string        // "25-3039"

  total_budget          : int           // 44,300,000
  total_revenues        : int
  fund_balance_usage    : int

  general_fund_budget   : int           // 10,000,000
  general_fund_ending_balance : int
  all_funds_ending_balance    : int

  notable               : string
}

RevenueSource {
  city_id               : string
  fiscal_year           : int
  source_name           : string        // "Retail Sales Tax", "Property Tax"
  fund                  : string        // "General Fund", "Electric Fund"
  amount                : int
  yoy_change_pct        : float
}

TaxRate {
  city_id               : string
  as_of_date            : date
  tax_type              : string        // "sales", "property", "B&O", "utility"
  rate                  : float
  notes                 : string        // "Combined state+city"
}

UtilityRate {
  city_id               : string
  effective_date        : date
  utility_type          : string        // "electric", "water", "sewer", "stormwater"
  rate_increase_pct     : float
  residential_base      : float
  residential_per_unit  : float
  unit                  : string        // "kWh", "gallon", "flat"
}

FundBalance {
  city_id               : string
  fiscal_year           : int
  fund_name             : string        // "Electric Fund", "Hotel/Motel Fund"
  beginning_balance     : int
  planned_drawdown      : int
  ending_balance        : int
  pct_depleted          : float
  risk_flag             : string        // "critical", "warning", "healthy"
}

DebtObligation {
  city_id               : string
  as_of_date            : date
  instrument            : string        // "Water/Sewer Refunding Bonds 2020"
  original_amount       : int
  outstanding_amount    : int
  purpose               : string
  maturity_date         : date
}

CapitalPlan {
  city_id               : string
  plan_period           : string        // "2026-2030"
  category              : string        // "Transportation", "Utilities", "Parks"
  total_amount          : int
}

AuditResult {
  city_id               : string
  audit_period          : string        // "2022-2023"
  result                : string        // "clean", "findings", "material weakness"
  findings_summary      : string
  auditor               : string        // "WA State Auditor"
}
```

**Observations from Blaine:** The financial data is the most immediately comparable across cities, at least within one state. Washington State standardizes municipal accounting via the BARS manual, so the same fund names and categories appear across Washington cities. The `FundBalance` entity with `risk_flag` emerged because Blaine's electric fund crisis is the kind of thing that distinguishes cities in trouble from healthy ones. The `RevenueSource` entity captures the crucial insight that Blaine is sales-tax-dependent (border shopping) while other cities may be property-tax-dependent; this single structural difference produces very different vulnerability profiles.
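
Both ideas in this section can be made computable. The sketch below derives a revenue concentration score from `RevenueSource.amount` values (a Herfindahl-style sum of squared shares) and a `risk_flag` from `FundBalance`. Several things are assumptions rather than part of the schema: that `pct_depleted` means planned drawdown divided by beginning balance, the 25% and 75% cutoffs, and the function names. The sample amounts are made-up ratios, not Blaine's figures.

```python
def revenue_concentration(amounts: dict[str, float]) -> float:
    """Sum of squared revenue shares: 1/n for an even split, 1.0 for one source."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((amount / total) ** 2 for amount in amounts.values())


def pct_depleted(beginning_balance: float, planned_drawdown: float) -> float:
    """ASSUMED definition: share of the beginning balance consumed by the drawdown."""
    if beginning_balance <= 0:
        raise ValueError("beginning_balance must be positive")
    return 100.0 * planned_drawdown / beginning_balance


def risk_flag(pct: float, warning_at: float = 25.0, critical_at: float = 75.0) -> str:
    """Map a depletion percentage to a flag. Cutoffs are hypothetical defaults."""
    if pct >= critical_at:
        return "critical"
    if pct >= warning_at:
        return "warning"
    return "healthy"


# Illustrative placeholder amounts only (a 3:1 split between two sources).
sample = {"Retail Sales Tax": 3.0, "Property Tax": 1.0}
print(revenue_concentration(sample))                   # 0.625
print(risk_flag(pct_depleted(1_000_000, 800_000)))     # critical
```

A higher concentration score means fewer sources carry more of the budget, which is one possible input to the vulnerability index proposed for v0.3.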

---

### 6. Infrastructure Assets

```
InfrastructureAsset {
  city_id               : string
  asset_type            : enum          // see below
  name                  : string
  description           : string
  owner                 : string        // "City", "County", "State", "Federal", "Private"
  year_built            : int
  condition             : string        // "good", "fair", "poor", "critical"
  capacity              : string        // freetext: "3.1M gal/day", "350 slips"
  current_utilization   : string        // "500K gal/day", "85%"
  replacement_cost      : int
  notable               : string
}

// Asset types discovered from Blaine:
// WATER: wells, treatment_plant, distribution_mains, storage
// SEWER: collection_mains, lift_stations, treatment_plant
// STORMWATER: ponds, basins, collection_system
// ELECTRIC: substation, distribution, interconnection
// TRANSPORTATION: road, bridge, sidewalk, trail
// MARINE: marina, pier, harbor, ferry
// RECREATION: park, trail, sports_facility, community_center
// PUBLIC_SAFETY: fire_station, police_station
// CIVIC: city_hall, library, school
```

**Observations from Blaine:** The capacity vs. utilization pattern is gold. Blaine's wastewater plant runs at 16% of capacity (500K/3.1M gal/day) — that's either massively overbuilt or ready for growth. Every city has these ratios hiding in their infrastructure and they tell you more about the city's future than any planning document.
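
Because `capacity` and `current_utilization` are freetext, the ratio has to be parsed before it can be compared across cities. The following is a rough sketch, assuming values look like the examples in the schema ("3.1M gal/day", "500K gal/day", "85%") and that both strings use the same unit. The function names and the K/M suffix handling are assumptions, not part of the schema.

```python
import re

# A number, an optional K or M suffix, then optional unit text after whitespace.
_QUANTITY = re.compile(r"^\s*(\d[\d,]*(?:\.\d+)?)\s*([KM])?(?:\s+.*)?$")
_MULTIPLIER = {None: 1.0, "K": 1e3, "M": 1e6}


def parse_quantity(text: str) -> float:
    """Parse strings like '3.1M gal/day' or '350 slips' into a plain number."""
    match = _QUANTITY.match(text)
    if match is None:
        raise ValueError(f"cannot parse quantity: {text!r}")
    number = float(match.group(1).replace(",", ""))
    return number * _MULTIPLIER[match.group(2)]


def utilization_pct(capacity: str, current_utilization: str) -> float:
    """Percent of capacity in use. Units are NOT checked; the caller must
    make sure both strings describe the same unit."""
    current = current_utilization.strip()
    if current.endswith("%"):
        # Already a percentage, e.g. "85%".
        return float(current[:-1])
    return 100.0 * parse_quantity(current) / parse_quantity(capacity)


# Blaine's wastewater plant, using the figures quoted above.
print(round(utilization_pct("3.1M gal/day", "500K gal/day"), 1))  # 16.1
```

Strings the pattern does not recognize raise an error instead of returning a guess, so unusual entries surface for manual review.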

---

### 7. Economy

```
EconomicProfile {
  city_id               : string
  year                  : int

  total_employed        : int
  unemployment_rate     : float
  labor_force_participation : float

  // Industry mix (sub-entities)
  industries            : IndustryEmployment[]
  occupations           : OccupationEmployment[]  // sub-entity not yet defined in v0.1

  // Pillars — qualitative but critical
  economic_pillars      : EconomicPillar[]

  cost_of_living_index  : float         // 128 = 1.28× national avg
  notable               : string
}

IndustryEmployment {
  industry_name         : string        // "Manufacturing", "Accommodation & Food"
  naics_code            : string
  employment_count      : int
  pct_of_workforce      : float
}

EconomicPillar {
  name                  : string        // "Cross-border trade"
  description           : string        // "Warehousing, freight, customs brokerage along I-5"
  major_employers       : string[]      // ["Pacific Customs Brokers", ...]
  vulnerability         : string        // "Directly exposed to US-Canada trade policy"
  estimated_pct_of_economy : float      // rough, qualitative OK
}
```

**Observations from Blaine:** The `EconomicPillar` entity is where the real insight lives. Census NAICS codes can't tell you that Blaine's economy is fundamentally structured around being a border crossing — you need narrative understanding to identify the pillars and their vulnerabilities. A "college town" has "university" as a pillar; a "military town" has "base" as a pillar. These pillars are the first thing to identify for any new city.

---

### 8. Issues & Controversies (living document)

```
CityIssue {
  city_id               : string
  issue_name            : string        // "Electric fund crisis"
  category              : enum          // "fiscal", "development", "infrastructure",
                                        // "governance", "environment", "public_safety",
                                        // "community_services", "intergovernmental"
  severity              : enum          // "existential", "major", "significant", "minor"
  status                : enum          // "active", "resolved", "dormant", "escalating"
  first_mentioned       : date
  last_mentioned        : date
  summary               : string
  key_actors            : string[]
  dollar_amount         : int           // if applicable
  related_issues        : string[]      // links to other issues
  timeline              : TimelineEvent[]
  notable               : string
}

TimelineEvent {
  date                  : date
  event                 : string
  source                : string
}
```

**Observations from Blaine:** Issues are the heartbeat of a city. They reveal what the community is actually fighting about, spending money on, and losing sleep over. The `severity` field (existential → minor) is subjective but crucial — Blaine's border traffic collapse is existential because it threatens the entire fiscal model, while the Birch Bay ZIP code issue is minor. The `related_issues` field matters because issues cluster: the border traffic decline → sales tax loss → electric fund crisis → utility rate increases → cost of living concerns. You can't understand one without the others.
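
The clustering described above can be followed mechanically by treating `related_issues` as edges in a graph. The sketch below walks the chain from the paragraph with a breadth-first search. Treating the links as directed (cause to effect) is an assumption, since the schema only says they are links, and apart from "Electric fund crisis" the issue names are paraphrased from the text.

```python
from collections import deque

# issue_name -> related_issues, following the chain described above.
RELATED = {
    "Border traffic decline": ["Sales tax loss"],
    "Sales tax loss": ["Electric fund crisis"],
    "Electric fund crisis": ["Utility rate increases"],
    "Utility rate increases": ["Cost of living concerns"],
    "Cost of living concerns": [],
}


def issue_cluster(start: str, related: dict[str, list[str]]) -> list[str]:
    """Return every issue reachable from `start`, in breadth-first order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for linked in related.get(current, []):
            if linked not in seen:  # guards against cycles in the links
                seen.add(linked)
                order.append(linked)
                queue.append(linked)
    return order


print(issue_cluster("Border traffic decline", RELATED))
```

The `seen` set matters because real issue links can be circular, as the feedback loops later in this document suggest.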

---

### 9. Comprehensive Plan Goals

```
PlanGoal {
  city_id               : string
  plan_name             : string        // "Blaine 2036"
  plan_horizon          : string        // "2016-2036"
  element               : string        // "Housing", "Transportation", "Economic Development"
  goal_text             : string
  target_metric         : string        // "2,301 new single-family units"
  target_year           : int
  progress_status       : string        // "on track", "behind", "exceeded", "abandoned"
  notable               : string
}
```

---

### 10. Data Sources Registry

```
DataSource {
  city_id               : string        // or null if source covers all cities
  source_name           : string        // "The Northern Light"
  source_type           : enum          // "newspaper", "government_website", "state_portal",
                                        // "federal_api", "gis", "audit_report"
  url                   : string
  platform              : string        // "CivicPlus", "eCode360", "Socrata", "WordPress"
  format                : string        // "HTML", "PDF", "JSON API", "shapefile"
  update_frequency      : string        // "weekly", "annual", "continuous"
  coverage_start        : date          // earliest available data
  article_count         : int           // if applicable (Northern Light: 9,616)
  parse_difficulty      : enum          // "easy", "moderate", "hard", "requires_browser"
  parse_notes           : string
  transferable          : bool          // does this source pattern work for other cities?
  transfer_notes        : string        // "CivicPlus used by hundreds of small cities"
}
```

**Observations from Blaine:** The `transferable` flag is critical for scaling. CivicPlus websites, State Auditor FIT data, Census APIs, and even The Northern Light's WordPress-based article format are patterns that repeat across cities. Every time we crack a source for one city, we should tag whether the approach transfers.

---

## Cross-City Comparison Dimensions

Based on what we learned from Blaine, here are the axes along which cities can be meaningfully compared:

### Quantitative (direct comparison)
1. **Size** — population, area, density
2. **Wealth** — median income, home values, poverty rate
3. **Age** — median age, % elderly, % children
4. **Diversity** — racial/ethnic composition, foreign-born %
5. **Education** — attainment levels, school performance
6. **Fiscal health** — fund balances, debt ratios, revenue diversity
7. **Tax burden** — property, sales, utility, B&O rates
8. **Infrastructure utilization** — capacity vs. actual for water/sewer/electric
9. **Safety** — crime rates by type
10. **Connectivity** — broadband %, commute patterns, work-from-home %

### Qualitative (pattern matching)
1. **Economic archetype** — what are the pillars? (border town, college town, resort, bedroom community, agricultural center, etc.)
2. **Governance style** — council-manager vs. mayor-council; transparency culture; level of civic engagement
3. **Development tension** — NIMBY vs. growth; density vs. character preservation
4. **Fiscal vulnerability** — what's the single point of failure? (Blaine: Canadian traffic. Others: a single employer, a military base, a seasonal industry)
5. **Environmental burden** — contamination legacy, water quality, climate exposure
6. **Identity** — what's the story the city tells about itself? Motto, branding, festivals
7. **Inflection point** — is the city stable, growing, declining, or at a turning point?

---

## Simulation Considerations

If you wanted to simulate a small city (or model what a new one would need), the Blaine data suggests these are the critical systems:

### Minimum Viable City
1. **Water supply** — source, treatment, distribution capacity
2. **Wastewater** — collection, treatment, discharge permit
3. **Roads** — connectivity to regional network
4. **Public safety** — police + fire/EMS (can be contracted, as Blaine does with NWFR)
5. **Governance** — council + professional manager + finance + community development + public works
6. **Revenue model** — what combination of property tax, sales tax, utility revenue, fees?
7. **Land use code** — zoning, permits, development standards
8. **Schools** — often a separate district but inseparable from city identity

### What Makes a City "Work" (beyond infrastructure)
- A **local newspaper** that covers council meetings and holds government accountable
- **Civic engagement infrastructure** — boards, commissions, public comment processes
- **Identity anchors** — the things that make residents say "this is why I live here" (Peace Arch, marina, Semiahmoo, the border crossing itself)
- **Economic diversity** — Blaine's dependence on border traffic is a cautionary tale
- **Revenue resilience** — cities that depend too heavily on one revenue source are fragile
- **Regional relationships** — no small city is an island; WCOG, WTA, county services all matter

### Feedback Loops Discovered in Blaine
1. **Border traffic ↓ → Sales tax ↓ → Utility fund stress → Rate increases → Cost of living ↑ → Business closures → More revenue loss** (vicious cycle, currently active)
2. **Zoning restrictions → No new development → Housing shortage → Workers leave → Businesses can't hire → Economic stagnation** (the loop the downtown moratorium is trying to break)
3. **Aging infrastructure → Higher maintenance costs → Rate increases → Resistance to investment → Deferred maintenance → Worse infrastructure** (classic small-city trap)
4. **Population growth → Demand for services → Need for revenue → Annexation → More infrastructure obligations → Higher costs** (the Grandis Pond story, which ended with the city approving a de-annexation)

---

## Schema Evolution Plan

### v0.2 (after 2-3 more cities)
- Refine entity fields based on what's actually available/comparable
- Add `CityArchetype` classification (border, college, resort, bedroom, agricultural, industrial)
- Add inter-city relationship modeling (e.g., Blaine-Bellingham dependency)
- Normalize industry codes to NAICS for cross-city comparison
- Add school district as a first-class entity

### v0.3 (after 5-10 cities)
- Statistical baselines (what's "normal" for a 6K-population city?)
- Anomaly detection (where does Blaine deviate from peer cities?)
- Factor analysis on what drives city "type"
- Vulnerability index based on revenue concentration, infrastructure age, demographic trends

### v1.0 (scaling target)
- Database implementation (SQLite initially, Postgres if we need spatial queries)
- Automated ingestion pipelines for transferable sources
- Dashboard for cross-city comparison
- "New city template" — what does a municipality need on day one?
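
As a starting point for the SQLite step, the sketch below maps a trimmed `City` and `DemographicSnapshot` onto two tables using Python's standard `sqlite3` module. The table names, the column subset, and the composite key on (`city_id`, `year`, `source`) are assumptions for illustration. The snapshot year is a placeholder because this document does not state which vintage the Blaine percentages come from.

```python
import sqlite3

DDL = """
CREATE TABLE city (
    id          TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    state       TEXT NOT NULL,
    county      TEXT,
    fips_state  TEXT,   -- TEXT keeps leading zeros
    fips_place  TEXT,
    notable     TEXT
);
CREATE TABLE demographic_snapshot (
    city_id      TEXT NOT NULL REFERENCES city(id),
    year         INTEGER NOT NULL,
    source       TEXT NOT NULL,
    population   INTEGER,
    pct_65_plus  REAL,
    notable      TEXT,
    PRIMARY KEY (city_id, year, source)
);
"""

PLACEHOLDER_YEAR = 2024  # hypothetical; the real data vintage is not given here

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(DDL)
conn.execute(
    "INSERT INTO city (id, name, state, county, fips_state, fips_place) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("blaine-wa", "Blaine", "WA", "Whatcom", "53", "06505"),
)
conn.execute(
    "INSERT INTO demographic_snapshot (city_id, year, source, pct_65_plus) "
    "VALUES (?, ?, ?, ?)",
    ("blaine-wa", PLACEHOLDER_YEAR, "ACS 5-year", 26.2),
)
conn.commit()

row = conn.execute(
    "SELECT c.name, c.fips_place, d.pct_65_plus "
    "FROM city c JOIN demographic_snapshot d ON d.city_id = c.id"
).fetchone()
print(row)
```

Keying snapshots on city, year, and source lets an ACS estimate and a decennial count for the same year coexist, which fits the time-series intent of the entity.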

---

## Appendix: Blaine as Instance

To validate the schema, here's how Blaine populates the top-level City entity:

```json
{
  "id": "blaine-wa",
  "name": "Blaine",
  "state": "WA",
  "county": "Whatcom",
  "fips_state": "53",
  "fips_county": "073",
  "fips_place": "06505",
  "incorporated_date": "1890-05-20",
  "named_for": "James G. Blaine, U.S. Senator from Maine",
  "motto": "Where America Begins",
  "website_url": "ci.blaine.wa.us",
  "website_platform": "CivicPlus",
  "latitude": 48.9937,
  "longitude": -122.7471,
  "elevation_ft": 49,
  "land_area_sq_mi": 5.63,
  "water_area_sq_mi": 2.80,
  "climate_zone": "Marine west coast",
  "regional_center": "Bellingham",
  "regional_center_dist": 20,
  "adjacent_communities": ["Birch Bay", "Semiahmoo", "Point Roberts"],
  "special_position": ["international border", "coastal", "I-5 corridor", "retirement destination"],
  "notable": "At a major inflection point as of 2026: border traffic collapse threatening fiscal model, comprehensive zoning rewrite underway, electric fund near depletion, rare de-annexation approved. City's identity as gateway to Canada being tested by geopolitics."
}
```

The `notable` field here captures what no structured field can: Blaine is a city in crisis-mode transition, and that context colors every other data point.
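
The validation step itself can be automated. The sketch below checks a record against the field names and types of the `City` entity and returns a list of problems. The checker name `check_city`, the message wording, and the choice to accept integers for float fields (the JSON stores `regional_center_dist` as `20`) are assumptions. The `notable` text is abbreviated to keep the example short.

```python
from datetime import date

# Field names and types transcribed from the City entity.
CITY_FIELDS = {
    "id": str, "name": str, "state": str, "county": str,
    "fips_state": str, "fips_county": str, "fips_place": str,
    "incorporated_date": date, "named_for": str, "motto": str,
    "website_url": str, "website_platform": str,
    "latitude": float, "longitude": float, "elevation_ft": int,
    "land_area_sq_mi": float, "water_area_sq_mi": float, "climate_zone": str,
    "regional_center": str, "regional_center_dist": float,
    "adjacent_communities": list, "special_position": list,
    "notable": str,
}


def check_city(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means the record fits."""
    problems = []
    for field, expected in CITY_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        if expected is date:
            try:
                date.fromisoformat(value)  # dates arrive as ISO strings in JSON
                ok = True
            except (TypeError, ValueError):
                ok = False
        elif expected is float:
            # JSON may write 20 rather than 20.0, so ints are accepted here.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        elif expected is int:
            ok = isinstance(value, int) and not isinstance(value, bool)
        elif expected is list:
            ok = isinstance(value, list) and all(isinstance(v, str) for v in value)
        else:
            ok = isinstance(value, str)
        if not ok:
            problems.append(f"bad value for {field}: {value!r}")
    problems.extend(f"unexpected field: {name}" for name in record if name not in CITY_FIELDS)
    return problems


blaine = {
    "id": "blaine-wa", "name": "Blaine", "state": "WA", "county": "Whatcom",
    "fips_state": "53", "fips_county": "073", "fips_place": "06505",
    "incorporated_date": "1890-05-20",
    "named_for": "James G. Blaine, U.S. Senator from Maine",
    "motto": "Where America Begins",
    "website_url": "ci.blaine.wa.us", "website_platform": "CivicPlus",
    "latitude": 48.9937, "longitude": -122.7471, "elevation_ft": 49,
    "land_area_sq_mi": 5.63, "water_area_sq_mi": 2.80,
    "climate_zone": "Marine west coast",
    "regional_center": "Bellingham", "regional_center_dist": 20,
    "adjacent_communities": ["Birch Bay", "Semiahmoo", "Point Roberts"],
    "special_position": ["international border", "coastal", "I-5 corridor",
                         "retirement destination"],
    "notable": "At a major inflection point as of 2026 (abbreviated).",
}

print(check_city(blaine))  # []
```

Returning a list of problems rather than raising on the first one makes it easier to review a whole batch of newly ingested cities at once.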
