# Copilot instructions for finn-eiendom-mcp

This project is a private, self-hosted Python platform for analyzing FINN real-estate listings. It exposes the same code through three coordinated front ends:

1. A **Python library** (`finn_eiendom`), the source of truth.
2. An **MCP server** (FastMCP, stdio + optional HTTP) over `finn_eiendom/mcp_server.py`.
3. A **CLI** (`finn-eiendom`) over `finn_eiendom/cli.py`.

All three share the same `service.py`, `formatting.py`, `cache.py`, and `models.py`. Code lives in exactly one place and is called from both the MCP server and the CLI. See `PRD.md` §17 for the full ownership rules; that section is the constitution.

---

## Source of truth

Read in this order:

1. `PRD.md`: product and architecture, especially §17.
2. `PROJECT.md`: module map.
3. `AGENTS.md`: workflow.
4. `.github/instructions/*.md`: per-topic rules.

---

## Module layout

```
finn_eiendom/
  config.py        # env vars, defaults, TTLs
  models.py        # Pydantic v2 models
  parser.py        # number/area/date/URL/finnkode normalization
  http.py          # async HTTP (httpx) with delay + retry + user-agent
  cache.py         # SQLite (sqlite3) schema + persistence
  search.py        # FINN search HTML parsing + pagination
  ad.py            # FINN listing HTML parsing
  eiendom_no.py    # Eiendom.no unit search/detail, unit_vector, similar-units
  scoring.py       # score model + classifications
  feedback.py      # verdicts + soft preference signal
  analysis.py      # orchestration + shortlist + summary
  service.py       # get_or_fetch_* + thin facade for MCP and CLI
  formatting.py    # render_* helpers shared by MCP and CLI
  mcp_server.py    # FastMCP wrappers around service.py
  cli.py           # typer-based CLI wrappers around service.py
  __main__.py      # python -m finn_eiendom → CLI entry
```

---

## The five hard rules

Enforced by `tests/test_architecture.py`:

1. **`mcp_server.py` and `cli.py` are siblings.** They never import from each other. Both import only from `service`, `formatting`, `models`, `config`, stdlib, and their own framework (`mcp` / `typer`).
2. **`service.py` is the only orchestrator.** Nothing above it touches HTTP or SQLite directly.
3. **`httpx` lives only in `http.py`.**
4. **`sqlite3` lives only in `cache.py`.**
5. **Output formatting lives only in `formatting.py`.** Never inline it in CLI or MCP tool bodies.

---

## Development workflow: local venv

The default runtime is a project-local virtualenv. Docker is supported for packaging but optional for development.

```bash
uv venv                       # or: python3.12 -m venv .venv
source .venv/bin/activate
uv pip install -e ".[dev]"    # or: pip install -e ".[dev]"

# from now on:
pytest
ruff check .
ruff format .
mypy finn_eiendom
finn-eiendom --help
finn-eiendom-mcp              # stdio MCP server
```

**Never** install packages globally. **Never** add a dependency without updating `pyproject.toml`.

---

## Coding rules

* Python 3.12+.
* Pydantic v2 with `model_config = ConfigDict(...)`. No v1 `class Config:` blocks.
* Type hints on every function signature.
* Async I/O for all network and DB code paths through `service.py`.
* Dependency injection for HTTP/cache clients in tests.
* Small, focused functions. One job per function. See `clean-code.instructions.md`.
* Errors raise with actionable messages; the MCP boundary translates them to `{"error": True, "code": ..., "message": ...}`.
* stdio MCP servers log to **stderr only**.

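The last two coding rules (error translation at the MCP boundary and stderr-only logging) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `ServiceError`, `translate_errors`, `get_listing`, and the `invalid_finnkode` code are hypothetical names, and the FastMCP tool registration is left out so the example runs with the standard library alone.

```python
import asyncio
import functools
import logging
import sys
from collections.abc import Awaitable, Callable
from typing import Any

# stdio MCP servers use stdout for the protocol, so logs go to stderr.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("finn_eiendom.mcp")

ToolFn = Callable[..., Awaitable[dict[str, Any]]]


class ServiceError(Exception):
    """Hypothetical service-layer error carrying a machine-readable code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code
        self.message = message


def translate_errors(tool: ToolFn) -> ToolFn:
    """Turn a raised ServiceError into the error dict named in the coding rules."""

    @functools.wraps(tool)
    async def wrapper(*args: Any, **kwargs: Any) -> dict[str, Any]:
        try:
            return await tool(*args, **kwargs)
        except ServiceError as exc:
            logger.warning("tool %s failed: %s", tool.__name__, exc.message)
            return {"error": True, "code": exc.code, "message": exc.message}

    return wrapper


@translate_errors
async def get_listing(finnkode: str) -> dict[str, Any]:
    """Hypothetical tool body; a real one would delegate to service.py."""
    if not finnkode.isdigit():
        raise ServiceError(
            "invalid_finnkode",
            f"finnkode must be numeric, got {finnkode!r}; copy it from the listing URL",
        )
    return {"finnkode": finnkode}


print(asyncio.run(get_listing("abc")))
```

Keeping the translation in one decorator means tool bodies stay thin and the service layer can raise freely without knowing about the MCP response shape.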
---

## Code ownership: the short version

| Concern                                | Lives in                        |
| -------------------------------------- | ------------------------------- |
| FINN search HTML parsing               | `search.py`                     |
| FINN listing HTML parsing              | `ad.py`                         |
| Norwegian number / area / URL regexes  | `parser.py`                     |
| HTTP fetching + retry + delay          | `http.py`                       |
| SQLite reads / writes                  | `cache.py`                      |
| Eiendom.no unit search/detail/comps    | `eiendom_no.py`                 |
| `unit_vector` encode/decode (msgpack)  | `eiendom_no.py`                 |
| Scoring + classification               | `scoring.py`                    |
| Feedback storage                       | `feedback.py`                   |
| Cache-aware orchestration              | `service.py` (`get_or_fetch_*`) |
| Shortlist + summary assembly           | `analysis.py`                   |
| End-to-end runs                        | `service.py` (`analyze_search`) |
| MCP tool definitions                   | `mcp_server.py`                 |
| CLI command definitions                | `cli.py`                        |
| Output rendering                       | `formatting.py`                 |
| Env-var defaults                       | `config.py`                     |
| Pydantic models                        | `models.py`                     |

The full table, including a "never lives in" column, is in `PRD.md` §17.2.

---

## Adding a feature

1. Decide the home using the table above (and `PRD.md` §17.2).
2. Implement in `service.py` (or `analysis.py` if pure orchestration).
3. Add a service-level test.
4. Add a thin MCP tool that is `response_format`-aware.
5. Add a thin CLI command that is `--format`-aware.
6. Add a renderer in `formatting.py`.
7. Test MCP and CLI registration.
8. Update PRD and instruction docs.

If the MCP tool body or CLI command body grows past ~20 lines, push logic down to `service.py`.

---

## Documentation lookups: use context7

When uncertain about an external library API (FastMCP, Pydantic v2, Typer, httpx, msgpack, pytest-asyncio, respx, BeautifulSoup), call the **`context7` MCP server** *before* writing code. Don't rely on training-data memory.

```
context7:resolve-library-id            → library_id
context7:query-docs(library_id, topic) → authoritative snippets
```

Details in `.github/instructions/docs.instructions.md`.

---

## Clean code is a hard requirement

See `clean-code.instructions.md`. DRY, single-responsibility, descriptive names, type hints, no dead code, and comments that explain why, not what. If duplication slips in, the right answer is to extract it, not to copy the second instance.

---

## Product behavior

The MVP does one thing well:

```
FINN search URL in
  → relevant property candidates out
  → enriched with Eiendom.no estimates
  → similar-units / comps
  → explanations
  → risks
  → next steps
  → broker questions
```

Always explain:

* why a property is interesting,
* price vs estimate,
* price vs comparable sales,
* renovation upside,
* hybel / rental potential,
* technical / legal risks,
* uncertainty / confidence,
* next questions for the broker.

Scores and estimates are decision support, not advice. Surface uncertainty, never hide it.
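
One way to make "surface uncertainty, never hide it" concrete is to carry a confidence value and caveats on every explanation and always render them. The sketch below is only an illustration: `Explanation`, `render_explanation`, the field names, and the 0.4 / 0.75 thresholds are hypothetical. It uses a stdlib dataclass to stay self-contained, whereas the project's own models would be Pydantic v2 classes in `models.py` with rendering in `formatting.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explanation:
    """One decision-support statement plus how much to trust it (hypothetical shape)."""

    topic: str                      # e.g. "price_vs_estimate"
    text: str                       # human-readable explanation
    confidence: float               # 0.0 (guess) to 1.0 (well supported)
    caveats: tuple[str, ...] = ()   # known reasons the statement may be wrong


def confidence_label(confidence: float) -> str:
    """Map a numeric confidence to a coarse label; thresholds are assumptions."""
    if confidence >= 0.75:
        return "high"
    if confidence >= 0.4:
        return "medium"
    return "low"


def render_explanation(item: Explanation) -> str:
    """Render one line; low-confidence items are labeled, never dropped."""
    line = f"{item.topic}: {item.text} (confidence: {confidence_label(item.confidence)})"
    if item.caveats:
        line += " [caveats: " + "; ".join(item.caveats) + "]"
    return line


example = Explanation(
    topic="price_vs_estimate",
    text="Asking price is below the estimate",
    confidence=0.3,
    caveats=("few comparable sales",),
)
print(render_explanation(example))
```

Because the confidence label is part of every rendered line, a weakly supported statement still reaches the reader, just with its uncertainty attached.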