finn-mcp/.github/copilot-instructions.md
2026-05-16 06:54:17 +00:00

Copilot instructions for finn-eiendom-mcp

This project is a private, self-hosted Python platform for analyzing FINN real-estate listings. It exposes the same code through three coordinated front ends:

  1. A Python library (finn_eiendom) — source of truth.
  2. An MCP server (FastMCP, stdio + optional HTTP) over finn_eiendom/mcp_server.py.
  3. A CLI (finn-eiendom) over finn_eiendom/cli.py.

All three share the same service.py, formatting.py, cache.py, and models.py. Code lives in exactly one place and is called from both the MCP server and the CLI. See PRD.md §17 for the full ownership rules; that section is the constitution.


Source of truth

Read in this order:

  1. PRD.md — product and architecture, especially §17.
  2. PROJECT.md — module map.
  3. AGENTS.md — workflow.
  4. .github/instructions/*.md — per-topic rules.

Module layout

finn_eiendom/
  config.py         # env vars, defaults, TTLs
  models.py         # Pydantic v2 models
  parser.py         # number/area/date/URL/finnkode normalization
  http.py           # async HTTP (httpx) with delay + retry + user-agent
  cache.py          # SQLite (sqlite3) schema + persistence
  search.py         # FINN search HTML parsing + pagination
  ad.py             # FINN listing HTML parsing
  eiendom_no.py     # Eiendom.no unit search/detail, unit_vector, similar-units
  scoring.py        # score model + classifications
  feedback.py       # verdicts + soft preference signal
  analysis.py       # orchestration + shortlist + summary
  service.py        # get_or_fetch_* + thin facade for MCP and CLI
  formatting.py     # render_* helpers shared by MCP and CLI
  mcp_server.py     # FastMCP wrappers around service.py
  cli.py            # typer-based CLI wrappers around service.py
  __main__.py       # python -m finn_eiendom → CLI entry

The five hard rules

Enforced by tests/test_architecture.py:

  1. mcp_server.py and cli.py are siblings. They never import from each other. Both import only from service, formatting, models, config, stdlib, and their own framework (mcp / typer).
  2. service.py is the only orchestrator. Nothing above it touches HTTP or SQLite directly.
  3. httpx lives only in http.py.
  4. sqlite3 lives only in cache.py.
  5. Output formatting lives only in formatting.py. Never inline in CLI or MCP tool bodies.
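
The contents of tests/test_architecture.py are not shown here, so the following is only a minimal sketch of how import rules like these could be checked with the standard-library ast module. The helper names imported_names and forbidden_imports are hypothetical.

```python
import ast


def imported_names(source: str) -> set[str]:
    """Collect every dotted name that a piece of Python source imports."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""  # empty for `from . import x`
            if base:
                names.add(base)
            for alias in node.names:
                names.add(f"{base}.{alias.name}" if base else alias.name)
    return names


def forbidden_imports(source: str, forbidden: set[str]) -> set[str]:
    """Return the forbidden module names that appear in any import path."""
    hits: set[str] = set()
    for name in imported_names(source):
        hits.update(set(name.split(".")) & forbidden)
    return hits


# Hypothetical sources standing in for cli.py contents.
clean_cli = "import typer\nfrom finn_eiendom import service, formatting\n"
leaky_cli = "from finn_eiendom.mcp_server import mcp\nimport httpx\n"

# Rules 1, 3 and 4 seen from cli.py: no mcp_server, no httpx, no sqlite3.
cli_forbidden = {"mcp_server", "httpx", "sqlite3"}
clean_hits = forbidden_imports(clean_cli, cli_forbidden)
leaky_hits = forbidden_imports(leaky_cli, cli_forbidden)
```

In a real test the source string would come from reading the module file, for example with pathlib.Path("finn_eiendom/cli.py").read_text(), and the assertion would be that the returned set is empty.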

Development workflow — local venv

Default runtime is a project-local virtualenv. Docker is supported for packaging but optional for development.

uv venv                          # or: python3.12 -m venv .venv
source .venv/bin/activate
uv pip install -e ".[dev]"       # or: pip install -e ".[dev]"

# from now on:
pytest
ruff check .
ruff format .
mypy finn_eiendom
finn-eiendom --help
finn-eiendom-mcp                 # stdio MCP server

Never install packages globally. Never add a dependency without updating pyproject.toml.


Coding rules

  • Python 3.12+.
  • Pydantic v2 with model_config = ConfigDict(...). No v1 class Config: blocks.
  • Type hints on every function signature.
  • Async I/O for every network and DB code path that runs through service.py.
  • Dependency injection for HTTP/cache clients in tests.
  • Small, focused functions. One job per function. See clean-code.instructions.md.
  • Raise errors with actionable messages; the MCP boundary translates them to {"error": True, "code": ..., "message": ...}.
  • stdio MCP servers log to stderr only.
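
As an illustration of the error rule above, here is a minimal sketch of translating exceptions into the {"error": True, "code": ..., "message": ...} shape at the MCP boundary. The ServiceError class, the to_error_payload helper, and the code strings are assumptions for illustration, not names taken from the project.

```python
from typing import Any


class ServiceError(Exception):
    """Hypothetical base error carrying a stable, machine-readable code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code
        self.message = message


def to_error_payload(exc: Exception) -> dict[str, Any]:
    """Translate an exception into the error shape returned by MCP tools."""
    if isinstance(exc, ServiceError):
        return {"error": True, "code": exc.code, "message": exc.message}
    # Unknown failures still get a payload, under a generic (assumed) code.
    message = str(exc) or type(exc).__name__
    return {"error": True, "code": "internal_error", "message": message}


known = to_error_payload(
    ServiceError("invalid_search_url", "Not a FINN search URL; pass the full URL.")
)
unknown = to_error_payload(ValueError("boom"))
```

Keeping the translation in one helper means tool bodies can stay thin: they catch, convert, and return, without formatting messages themselves.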

Code ownership — the short version

| Concern | Lives in |
| --- | --- |
| FINN search HTML parsing | search.py |
| FINN listing HTML parsing | ad.py |
| Norwegian number / area / URL regexes | parser.py |
| HTTP fetching + retry + delay | http.py |
| SQLite reads / writes | cache.py |
| Eiendom.no unit search/detail/comps | eiendom_no.py |
| unit_vector encode/decode (msgpack) | eiendom_no.py |
| Scoring + classification | scoring.py |
| Feedback storage | feedback.py |
| Cache-aware orchestration | service.py (get_or_fetch_*) |
| Shortlist + summary assembly | analysis.py |
| End-to-end runs | service.py (analyze_search) |
| MCP tool definitions | mcp_server.py |
| CLI command definitions | cli.py |
| Output rendering | formatting.py |
| Env-var defaults | config.py |
| Pydantic models | models.py |

The full table, including a "never lives in" column, is in PRD.md §17.2.


Adding a feature

  1. Decide the home using the table above (and PRD.md §17.2).
  2. Implement in service.py (or analysis.py if pure orchestration).
  3. Add a service-level test.
  4. Add a thin MCP tool — response_format-aware.
  5. Add a thin CLI command — --format-aware.
  6. Add a renderer in formatting.py.
  7. Test MCP and CLI registration.
  8. Update PRD and instruction docs.

If the MCP tool body or CLI command body grows past ~20 lines, push logic down to service.py.
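
The steps above can be sketched with plain functions. This is only an illustration of the layering: it leaves out the FastMCP and Typer decorators, uses a dataclass where the project would use a Pydantic model, and every name (ListingSummary, summarize_listing, render_listing_summary, the "markdown" and "json" format values) is hypothetical.

```python
import asyncio
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ListingSummary:
    """Stand-in for a model that would live in models.py."""

    finnkode: str
    title: str


# service.py: the only layer that does real work.
async def summarize_listing(finnkode: str) -> ListingSummary:
    # Placeholder data; real code would go through a get_or_fetch_* helper.
    return ListingSummary(finnkode=finnkode, title="Placeholder title")


# formatting.py: the only layer that renders output.
def render_listing_summary(summary: ListingSummary, fmt: str = "markdown") -> str:
    if fmt == "json":
        return json.dumps(asdict(summary))
    return f"{summary.title} ({summary.finnkode})"


# mcp_server.py: thin wrapper, response_format-aware.
async def listing_summary_tool(finnkode: str, response_format: str = "markdown") -> str:
    return render_listing_summary(await summarize_listing(finnkode), response_format)


# cli.py: thin wrapper, format-aware; returns the text it would print.
def listing_summary_command(finnkode: str, output_format: str = "markdown") -> str:
    return render_listing_summary(asyncio.run(summarize_listing(finnkode)), output_format)


tool_output = asyncio.run(listing_summary_tool("123", "json"))
cli_output = listing_summary_command("123")
```

Both wrappers are one line each, so neither needs to know about the other, and a change in rendering touches only the renderer.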


Documentation lookups — use context7

When uncertain about an external library API (FastMCP, Pydantic v2, Typer, httpx, msgpack, pytest-asyncio, respx, BeautifulSoup), call the context7 MCP server before writing code. Don't rely on training-data memory.

context7:resolve-library-id   →  library_id
context7:query-docs(library_id, topic)  →  authoritative snippets

Details in .github/instructions/docs.instructions.md.


Clean code is a hard requirement

See clean-code.instructions.md. DRY, single-responsibility, descriptive names, type hints, no dead code, and comments that explain why, not what. If duplication slips in, extract the shared code instead of keeping a second copy.


Product behavior

The MVP does one thing well:

FINN search URL in
  → relevant property candidates out
  → enriched with Eiendom.no estimates
  → similar-units / comps
  → explanations
  → risks
  → next steps
  → broker questions

Always explain:

  • why a property is interesting,
  • price vs estimate,
  • price vs comparable sales,
  • renovation upside,
  • hybel / rental potential,
  • technical / legal risks,
  • uncertainty / confidence,
  • next questions for the broker.

Scores and estimates are decision support, not advice. Surface uncertainty, never hide it.
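
As a worked example of "price vs estimate" with uncertainty surfaced, the sketch below compares an asking price against an estimate range. The input shape (a low and a high bound), the PriceComparison fields, and the numbers are assumptions for illustration; the project's actual scoring model is defined in scoring.py and PRD.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceComparison:
    """Hypothetical result type for a price-vs-estimate check."""

    delta_pct: float        # asking price relative to the estimate midpoint
    half_range_pct: float   # half the estimate range, as a share of the midpoint
    inside_range: bool      # whether the asking price falls within the range


def compare_price(asking_price: int, estimate_low: int, estimate_high: int) -> PriceComparison:
    """Express the asking price relative to an estimate range."""
    if not 0 < estimate_low <= estimate_high:
        raise ValueError("Estimate range must satisfy 0 < low <= high.")
    midpoint = (estimate_low + estimate_high) / 2
    return PriceComparison(
        delta_pct=(asking_price - midpoint) / midpoint * 100,
        half_range_pct=(estimate_high - estimate_low) / 2 / midpoint * 100,
        inside_range=estimate_low <= asking_price <= estimate_high,
    )


# Illustrative numbers only, not real listing data.
comparison = compare_price(4_500_000, 4_600_000, 5_000_000)
```

Reporting the range width next to the delta lets a reader see when a price gap is smaller than the estimate's own uncertainty.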