# PRD: finn-mcp v2 ## Current State (from codebase + DB inspection) ### What already works - **SQLite database** (`data/finn.sqlite`) with row counts: 222 finn_ads, 149 eiendom_units, 56 similar_units - **Hash-aware caching architecture** is designed (see `cache.py` docstring) - **Transport scoring** is implemented (`score_transport` uses lat/lng from Eiendom.no) - **`listing_description`** is stored in the `FinnAd` model - **`finn_analyze_unit_images`** downloads, resizes to 1024px, returns as `ImageContent` — Claude sees images directly ### Critical bugs discovered - **Analysis cache is dead.** `analysis_cache` table has **0 rows**. Every search recomputes scoring from scratch. - **`content_hash` is NULL on every row** in `finn_ads`, `eiendom_units`, `similar_units` — 100% NULL across 427 rows. The `_compute_deps_hash` function therefore returns a deterministic hash of empty strings on every call. - Schema dump shows `, content_hash TEXT)` appended — column was added via `ALTER TABLE` after data already existed. Either the running deployment doesn't populate it on writes, or no backfill migration was run. - **Only 36 of 222 ads** have `eiendom_unit_code` populated in the stored payload. Enrichment is failing or the resolved unit code isn't being persisted back to the ad row. - **Search page cache** (`cache_meta`) all rows expired May 16 — 60-min TTL is far too short. ### Known design problems - **`feedback.py` is a stub** — all three functions are `# TODO`, nothing is persisted. No `user_feedback` table. - No `price_history` table. - No `search_runs` table with finnkodes per search. - **`listing_description` is actively stripped** in `_slim_listing()` in `mcp_server.py`. - **`detail_limit`** means only N listings get full Eiendom.no analysis — the rest are unscored. - **No batch analysis** — analyzing 46 listings requires 46 sequential MCP calls. - **12 tools**, 7 of which are internal plumbing. - **Cache TTLs are far too short** — 24h on listing data forces full re-fetch on day-2 repeat searches. --- ## Goals 1. **Fix the broken cache first** — current cache promises nothing and delivers nothing 2. **Long-lived caching** with smart freshness checks — listing structural data doesn't change, treat it accordingly 3. **6 tools** — one per user intent 4. **Batch analysis** — analyze many listings in one call 5. **Persistent enrichment** — missing tables, feedback implementation 6. **Output matches intent** — each tool returns only what is relevant 7. **`listing_description` available** for AI interpretation in `finn_analyze_ad` --- ## Architecture ### Caching strategy (revised) Listings don't fundamentally change on FINN once posted. Address, area, year, property type, description, eiendom_unit_code mapping — all stable. What changes: price, sale status, DOM. Treat structural data as effectively immutable; check price/status separately and cheaply. **Two-tier model:** ``` ┌────────────────────────────────────────────────────────────────┐ │ STRUCTURAL DATA (long TTL, full refetch only when invalidated)│ │ - finn_ads.payload (description, area, year, etc.) │ │ - eiendom_units.payload (lat, lng, property_type, etc.) │ │ - similar_units.payload (completed sales — immutable) │ └────────────────────────────────────────────────────────────────┘ ┌────────────────────────────────────────────────────────────────┐ │ VOLATILE DATA (short TTL, cheap refresh) │ │ - price, status, days_on_market │ │ - eiendom_units.estimated_selling_price │ └────────────────────────────────────────────────────────────────┘ ``` ### Cache TTLs (revised) | Data | TTL | Refresh strategy | |------|-----|-----------------| | FINN ad structural | **30 days** | Full refetch only | | FINN ad price/status | **6 hours** | Lightweight check, falls back to full refetch if status changed | | Eiendom.no unit structural | **30 days** | Full refetch only | | Eiendom.no estimate | **7 days** | Refresh on access | | Similar units (sold comps) | **60 days** | Immutable rows; new rows appear over time | | Search pages | **6 hours** | Content-hash check, only re-scrape if list actually changed | | Analysis result | **Never expires** | Invalidated by `deps_hash` change | **Lightweight price/status check:** A FINN ad page has a stable URL. Fetch headers only (HEAD) or scrape the small `price_widget` block — much cheaper than the full ad page. If price unchanged, bump `last_verified_at`; if changed, full refetch. ### Database schema changes ```sql -- Add to finn_ads ALTER TABLE finn_ads ADD COLUMN last_verified_at TEXT; -- Tracks when we last confirmed price/status, separate from fetched_at -- which tracks when we last did a full refetch. -- New: user feedback (replaces feedback.py stubs) CREATE TABLE user_feedback ( finnkode TEXT PRIMARY KEY, verdict TEXT NOT NULL, -- 'liked' | 'disliked' | 'maybe' | 'visited' notes TEXT, created_at TEXT NOT NULL, updated_at TEXT NOT NULL ); -- New: price history (append-only) CREATE TABLE price_history ( id INTEGER PRIMARY KEY AUTOINCREMENT, finnkode TEXT NOT NULL, total_price INTEGER, asking_price INTEGER, sale_status TEXT, recorded_at TEXT NOT NULL ); CREATE INDEX idx_price_history_finnkode_recorded ON price_history(finnkode, recorded_at); -- New: search runs (for finn_get_new_ads_since_last_run) CREATE TABLE search_runs ( id INTEGER PRIMARY KEY AUTOINCREMENT, search_url TEXT NOT NULL, finnkodes TEXT NOT NULL, -- JSON array created_at TEXT NOT NULL ); CREATE INDEX idx_search_runs_url_created ON search_runs(search_url, created_at); -- Indexes for stale-detection scans CREATE INDEX idx_finn_ads_verified ON finn_ads(last_verified_at); CREATE INDEX idx_eiendom_units_fetched ON eiendom_units(fetched_at); ``` --- ## Tools (v2) — 6 total ### 1. `finn_analyze_search` **Intent:** Ranked list of all listings in this search. ```typescript Input: search_url: string refresh?: boolean // force re-fetch even if cache is valid max_pages?: number // default 5 Output: total: number cache_status: { listings_from_cache: number listings_refreshed: number listings_freshly_scraped: number } listings: Array<{ finnkode, rank, score, url, address, district, area_m2, bedrooms, floor, construction_year, total_price, common_costs, shared_debt, sqm_price, price_vs_estimate, // negative = below estimate market_placement, dom, categories, risks }> ``` **Behaviour:** Returns ALL scraped listings, not limited by `detail_limit`. Listings without enrichment get `score: null`. Lazy enrichment is triggered by `finn_analyze_ad`. ### 2. `finn_analyze_ad` **Intent:** Deep-dive into one or more specific listings. ```typescript Input: finnkode: string | string[] // single or batch refresh?: boolean // bypass cache Output: // Single string input → single object // Array input → array of objects in same order finnkode: string url: string address: string listing_description: string // ← INCLUDED for AI interpretation score: { total: number breakdown: Record nearby_transit: { tbane: [...], trikk: [...] } } price: { total, asking, shared_debt, common_costs, sqm_price, estimate, estimate_lower, estimate_upper, vs_estimate, market_placement } property: { type, ownership, area_m2, bedrooms, floor, construction_year, has_balcony, has_elevator, has_garage } market: { dom, sale_status, avg_comp_sqm_price, comp_count, comps: Array<{address, usable_area, floor, construction_year, selling_price, sqm_price, days_on_market, finalized_at}> // top 15 } price_history: Array<{ total_price, asking_price, recorded_at }> categories: string[] risks: string[] cache_age: { structural_days: number // age of last full refetch price_hours: number // age of last price verification } ``` **Batch behaviour:** Up to 50 finnkodes per call. Internal parallelism, single MCP round-trip. Returns array in input order; failed lookups have `{finnkode, error: "..."}` shape. ### 3. `finn_analyze_unit_images` **Intent:** Visual assessment — condition, views, room feel. Unchanged from current implementation. Returns `ImageContent` blocks, not URLs. ```typescript Input: unit_code: string max_images?: number // default 8 ``` ### 4. `finn_get_new_ads_since_last_run` **Intent:** What has changed since I last checked this search? ```typescript Input: search_url: string Output: new_ads: Array<{finnkode, address, score, total_price, categories, url}> removed_ads: Array<{finnkode, address}> changed_ads: Array<{ finnkode, address, changes: Array<{field, from, to}> // typically price/status }> since: string // ISO timestamp of previous run ``` ### 5. `finn_save_feedback` **Intent:** Save my verdict on a listing. ```typescript Input: finnkode: string verdict: 'liked' | 'disliked' | 'maybe' | 'visited' notes?: string Output: ok: boolean finnkode: string verdict: string ``` ### 6. `finn_get_shortlist` **Intent:** Show me reviewed listings, or find similar to one I liked. ```typescript Input: verdict?: 'liked' | 'disliked' | 'maybe' | 'visited' find_similar_to?: string // finnkode — return listings similar to this min_score?: number limit?: number // default 10 Output: listings: Array<{ finnkode, address, score, total_price, verdict?, notes?, categories, url }> ``` --- ## Tools removed | Tool | Reason | |------|--------| | `finn_build_unit_vector` | Internal impl detail | | `finn_decode_unit_vector` | Debug utility, no user value | | `finn_resolve_eiendom_unit` | Internal mapping, runs automatically in `analyze_ad` | | `finn_get_ad` | Raw fetch without scoring — `analyze_ad` covers it | | `finn_get_eiendom_unit` | Raw Eiendom.no fetch, internal | | `finn_get_similar_units` | Takes unit_vector directly, internal | | `finn_analyze_ad_against_comps` | Absorbed into `analyze_ad` (comps always included) | | `finn_compare_ads` | Absorbed into `analyze_ad(finnkode: string[])` | | `finn_find_similar_to_liked_ad` | Absorbed into `get_shortlist(find_similar_to=finnkode)` | 12 → 6 tools. No user intent is lost. Batch use case now native via `analyze_ad`. --- ## Workflows & optimizations ### Lazy enrichment on demand `analyze_search` returns all scraped listings immediately with whatever data is cached. Listings without Eiendom.no enrichment have `score: null`. First `analyze_ad(finnkode)` call enriches and caches. Next `analyze_search` shows the now-cached score. Eliminates `detail_limit` as a user-facing parameter. ### Background freshness check On `analyze_search` cache hit, kick off async refresh of any items older than the volatile-data TTL (6h price check). User gets immediate response from cache; next call benefits from refreshed data. ### Re-score without refetch Scoring weights are configurable. If the user changes weights, re-score from cached `finn_ads` + `eiendom_units` + `similar_units` without any network calls. Invalidates `analysis_cache` only, not raw data. ### Price drop detection `price_history` table enables `finn_get_shortlist(price_dropped_since: timestamp)` — surface listings that dropped price recently. Built on existing append-only writes. ### Cache warming on save_feedback When `verdict='liked'`, pre-fetch similar units in background. Next `find_similar_to=finnkode` call is instant. ### Batch enrichment via parallel Eiendom.no Current enrichment is sequential per ad. Parallel-batch up to N at a time via `asyncio.gather` already exists in `analyze_search` — use the same pattern in `analyze_ad(finnkode: string[])`. ### Cache inspection Internal-only — useful for debugging. Add a `--cache-status` CLI command (not an MCP tool) that reports row counts, oldest/newest fetched_at, NULL-hash rows, missing eiendom_unit_codes. --- ## Output principles **Never in any tool response:** - `unit_vector` / raw Eiendom.no vector - `unit_images` URL lists (use `finn_analyze_unit_images`) - Internal timestamps (`fetched_at`, `detail_fetched_at`, `computed_at`) - `lat` / `lng` coordinates **`listing_description`:** - **Not** in `finn_analyze_search` — too long, 77 × 500 words = noise - **Yes** in `finn_analyze_ad` — AI needs it to interpret risk flags, clauses, edge cases --- ## Migration plan ### Phase 0 — Fix the broken cache (BLOCKER) Nothing else delivers value until this is fixed. The current cache stores nothing reusable across sessions. - [ ] **Audit the running deployment.** Compare the deployed `cache.py` to the source we have. Hashes are NULL in DB despite source code populating them — find the divergence. - [ ] **Backfill content_hash for existing rows.** Compute from stored payloads. - [ ] **Fix `ensure_eiendom_unit_code` persistence.** Only 36/222 ads have `eiendom_unit_code` in their payload — verify the mutation reaches `save_finn_ad` before serialisation. - [ ] **Verify `save_analysis` actually fires.** Add unit test confirming analysis_cache row count increases after `analyze_ad` call. Currently 0 rows after 222 ad fetches. - [ ] **Add CLI cache-status command** for ongoing visibility. **Success criteria:** - `analysis_cache` populated after any `analyze_search` run - Repeat `analyze_search` within TTL window: zero network calls, sub-second response - All `content_hash` columns populated across `finn_ads`, `eiendom_units`, `similar_units` ### Phase 1 — Longer cache TTLs + freshness model - [ ] Update `config.py` TTLs (see table above) - [ ] Add `last_verified_at` column to `finn_ads` - [ ] Implement lightweight price/status check (HEAD or `price_widget` scrape) - [ ] On cache hit, kick off async refresh if `last_verified_at` is stale - [ ] Update `_is_fresh` logic to use TTL only on `last_verified_at`, not `fetched_at` **Success criteria:** - Listing fetched 28 days ago, never re-verified: returns from cache, triggers async verify - Same listing fetched today: returns from cache, no network call - Price changed since last fetch: detected by lightweight check, triggers full refetch + invalidates analysis ### Phase 2 — Missing tables and stub implementations - [ ] Create `user_feedback`, `price_history`, `search_runs` tables - [ ] Implement `feedback.py` — replace all TODO stubs with DB writes - [ ] Populate `price_history` on every `save_finn_ad` call (append-only) - [ ] Populate `search_runs` on every `analyze_search` call **Success criteria:** - `finn_save_feedback` writes to DB; `finn_get_shortlist(verdict=...)` returns it - `finn_get_new_ads_since_last_run` returns real diff from last run - `price_history` populated when a re-fetched ad has changed price ### Phase 3 — Output payload cleanup (no breaking tool changes) - [ ] Stop stripping `listing_description` in `_slim_listing()` for `analyze_ad` - [ ] Remove `unit_images`, `unit_vector`, internal timestamps from `analyze_ad` response - [ ] Add `price_history` and `cache_age` to `analyze_ad` response - [ ] Add `price_vs_estimate` and `cache_status` to `analyze_search` response **Success criteria:** - `finn_analyze_search` on 30 listings: < 50KB - `finn_analyze_ad` per listing: < 8KB excluding description, < 12KB including ### Phase 4 — Consolidate to 6 tools + batch (breaking change) - [ ] Remove the 9 redundant tools from `mcp_server.py` - [ ] Update `finn_analyze_ad` to accept `string | string[]` — single or batch - [ ] Add `find_similar_to` parameter to `finn_get_shortlist` - [ ] Always include comps in `analyze_ad` — drop `include_eiendom_no` / `include_similar_units` flags - [ ] Migrate all `test_mcp_integration.py` tests to new tool surface **Success criteria:** - `finn_analyze_ad(["a", "b", "c"])`: one round trip, parallel internal fetch - All existing use cases covered by 6 tools ### Phase 5 — Lazy enrichment + workflow additions - [ ] `analyze_search` returns all scraped listings, not just `detail_limit` count - [ ] Listings without enrichment get `score: null`, enriched on first `analyze_ad` call - [ ] Background warm-up on `save_feedback(liked)` → pre-fetch similar units - [ ] Re-score endpoint (or flag) that rebuilds scores from cached raw data **Success criteria:** - `analyze_search` on 77-result search: all 77 returned, no `detail_limit` truncation - Subsequent `analyze_ad` on a previously-unenriched listing: enriches + caches + returns - Scoring weight change re-runs analysis without re-fetching FINN or Eiendom.no --- ## Success metrics | Metric | Now | Target | |--------|-----|--------| | Number of tools | 12 | 6 | | `content_hash` populated rows | 0% | 100% | | `analysis_cache` row count after search | 0 | matches analyzed_listings | | `eiendom_unit_code` populated in stored ads | 36/222 (16%) | ~95% (resale only) | | `listing_description` available to AI | No | Yes (in `finn_analyze_ad`) | | Feedback actually persisted | No (stub) | Yes | | `finn_analyze_search` payload (30 ads) | ~215KB | < 50KB | | `finn_analyze_ad` payload per ad | ~40KB | < 12KB | | Repeat search within 1 week | Full recompute | 0 network calls, < 1s | | Listings unscored due to `detail_limit` | 47 of 77 | 0 (lazy enrichment) | | Batch analyze 10 ads | 10 round-trips | 1 round-trip | | FINN ad structural TTL | 24h | 30 days |