# Copilot instructions for finn-eiendom-mcp

This project is a private, self-hosted Python platform for analyzing FINN real-estate listings. It exposes the same code through three coordinated front ends:

1. A **Python library** (`finn_eiendom`) — source of truth.
2. An **MCP server** (FastMCP, stdio + optional HTTP) over `finn_eiendom/mcp_server.py`.
3. A **CLI** (`finn-eiendom`) over `finn_eiendom/cli.py`.

All three share the same `service.py`, `formatting.py`, `cache.py`, and `models.py`. Each piece of logic lives in exactly one place and is called from both the MCP server and the CLI. See `PRD.md` §17 for the full ownership rules; that section is the constitution.

---

## Source of truth

Read in this order:

1. `PRD.md` — product and architecture, especially §17.
2. `PROJECT.md` — module map.
3. `AGENTS.md` — workflow.
4. `.github/instructions/*.md` — per-topic rules.

---

## Module layout

```
finn_eiendom/
    config.py        # env vars, defaults, TTLs
    models.py        # Pydantic v2 models
    parser.py        # number/area/date/URL/finnkode normalization
    http.py          # async HTTP (httpx) with delay + retry + user-agent
    cache.py         # SQLite (sqlite3) schema + persistence
    search.py        # FINN search HTML parsing + pagination
    ad.py            # FINN listing HTML parsing
    eiendom_no.py    # Eiendom.no unit search/detail, unit_vector, similar-units
    scoring.py       # score model + classifications
    feedback.py      # verdicts + soft preference signal
    analysis.py      # orchestration + shortlist + summary
    service.py       # get_or_fetch_* + thin facade for MCP and CLI
    formatting.py    # render_* helpers shared by MCP and CLI
    mcp_server.py    # FastMCP wrappers around service.py
    cli.py           # typer-based CLI wrappers around service.py
    __main__.py      # python -m finn_eiendom → CLI entry
```

---

## The five hard rules

Enforced by `tests/test_architecture.py`:

1. **`mcp_server.py` and `cli.py` are siblings.** They never import from each other. Both import only from `service`, `formatting`, `models`, `config`, stdlib, and their own framework (`mcp` / `typer`).
2. **`service.py` is the only orchestrator.** Nothing above it touches HTTP or SQLite directly.
3. **`httpx` lives only in `http.py`.**
4. **`sqlite3` lives only in `cache.py`.**
5. **Output formatting lives only in `formatting.py`.** Never inline in CLI or MCP tool bodies.

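
A minimal sketch of how rules 3 and 4 could be checked, assuming a static scan of import statements with the stdlib `ast` module. The helper names (`imported_modules`, `find_violations`) and the `ALLOWED` map are hypothetical, and the real `tests/test_architecture.py` may work differently. The sources are in-memory strings here so the example runs on its own.

```python
import ast

# Hypothetical map: restricted top-level module -> the only file allowed to import it.
ALLOWED = {"httpx": "http.py", "sqlite3": "cache.py"}


def imported_modules(source: str) -> set[str]:
    """Return the top-level names of all absolute imports in a source string."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def find_violations(sources: dict[str, str]) -> list[tuple[str, str]]:
    """List (filename, module) pairs where a restricted module is imported outside its home."""
    violations: list[tuple[str, str]] = []
    for filename, source in sorted(sources.items()):
        for module in sorted(imported_modules(source)):
            home = ALLOWED.get(module)
            if home is not None and home != filename:
                violations.append((filename, module))
    return violations


# In-memory stand-ins for real files; a real test would read finn_eiendom/*.py.
example = {
    "http.py": "import httpx\n",
    "cache.py": "import sqlite3\n",
    "service.py": "from . import cache\nimport httpx\n",
}
print(find_violations(example))  # service.py importing httpx is reported
```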
---

## Development workflow — local venv

Default runtime is a project-local virtualenv. Docker is supported for packaging but optional for development.

```bash
uv venv                        # or: python3.12 -m venv .venv
source .venv/bin/activate
uv pip install -e ".[dev]"     # or: pip install -e ".[dev]"

# from now on:
pytest
ruff check .
ruff format .
mypy finn_eiendom
finn-eiendom --help
finn-eiendom-mcp               # stdio MCP server
```

**Never** install packages globally. **Never** add a dependency without updating `pyproject.toml`.

---

## Coding rules

* Python 3.12+.
* Pydantic v2 with `model_config = ConfigDict(...)`. No v1 `class Config:` blocks.
* Type hints on every function signature.
* Async I/O for all network and DB code paths through `service.py`.
* Dependency injection for HTTP/cache clients in tests.
* Small, focused functions. One job per function. See `clean-code.instructions.md`.
* Errors raise with actionable messages; the MCP boundary translates them to `{"error": True, "code": ..., "message": ...}`.
* stdio MCP servers log to **stderr only**.

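
The error-translation and stderr-logging rules could look roughly like the sketch below. `ServiceError`, `error_boundary`, and `get_listing` are hypothetical names, and the `code` value is invented for illustration; only the shape of the returned dict comes from the rule above. The FastMCP decorator is left out so the example stays stdlib-only.

```python
import asyncio
import functools
import logging
import sys
from collections.abc import Awaitable, Callable
from typing import Any

# stdio MCP servers must keep stdout clean for the protocol, so log to stderr.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("finn_eiendom")


class ServiceError(Exception):
    """Hypothetical service-layer error carrying a machine-readable code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def error_boundary(
    tool: Callable[..., Awaitable[dict[str, Any]]],
) -> Callable[..., Awaitable[dict[str, Any]]]:
    """Translate ServiceError into the error dict at the tool boundary."""

    @functools.wraps(tool)
    async def wrapper(*args: Any, **kwargs: Any) -> dict[str, Any]:
        try:
            return await tool(*args, **kwargs)
        except ServiceError as exc:
            logger.warning("tool %s failed: %s", tool.__name__, exc)
            return {"error": True, "code": exc.code, "message": str(exc)}

    return wrapper


@error_boundary
async def get_listing(finnkode: str) -> dict[str, Any]:
    """Hypothetical tool body: validate input, then delegate to the service layer."""
    if not finnkode.isdigit():
        raise ServiceError(
            "invalid_finnkode", f"finnkode must be numeric, got {finnkode!r}"
        )
    return {"finnkode": finnkode}


print(asyncio.run(get_listing("abc")))
```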
---

## Code ownership — the short version

| Concern                                | Lives in                        |
| -------------------------------------- | ------------------------------- |
| FINN search HTML parsing               | `search.py`                     |
| FINN listing HTML parsing              | `ad.py`                         |
| Norwegian number / area / URL regexes  | `parser.py`                     |
| HTTP fetching + retry + delay          | `http.py`                       |
| SQLite reads / writes                  | `cache.py`                      |
| Eiendom.no unit search/detail/comps    | `eiendom_no.py`                 |
| `unit_vector` encode/decode (msgpack)  | `eiendom_no.py`                 |
| Scoring + classification               | `scoring.py`                    |
| Feedback storage                       | `feedback.py`                   |
| Cache-aware orchestration              | `service.py` (`get_or_fetch_*`) |
| Shortlist + summary assembly           | `analysis.py`                   |
| End-to-end runs                        | `service.py` (`analyze_search`) |
| MCP tool definitions                   | `mcp_server.py`                 |
| CLI command definitions                | `cli.py`                        |
| Output rendering                       | `formatting.py`                 |
| Env-var defaults                       | `config.py`                     |
| Pydantic models                        | `models.py`                     |

Full table with "never lives in" column is in `PRD.md` §17.2.

---

## Adding a feature

1. Decide the home using the table above (and `PRD.md` §17.2).
2. Implement in `service.py` (or `analysis.py` if pure orchestration).
3. Add a service-level test.
4. Add a thin MCP tool — `response_format`-aware.
5. Add a thin CLI command — `--format`-aware.
6. Add a renderer in `formatting.py`.
7. Test MCP and CLI registration.
8. Update PRD and instruction docs.

If the MCP tool body or CLI command body grows past ~20 lines, push logic down to `service.py`.

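
The steps above can be sketched end to end for an invented "price gap" feature. Every name here (`PriceGap`, `compute_price_gap`, `render_price_gap`, `price_gap_tool`, `price_gap_command`) is hypothetical, a dataclass stands in for a Pydantic model, and the FastMCP and Typer decorators are omitted so the example stays stdlib-only. The point is the shape: logic in the service layer, one shared renderer, and wrapper bodies of a line or two.

```python
import asyncio
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PriceGap:
    """Stand-in for a Pydantic model in models.py (hypothetical)."""

    finnkode: str
    asking_price: int
    estimate: int

    @property
    def gap_percent(self) -> float:
        return round(100 * (self.asking_price - self.estimate) / self.estimate, 1)


# service.py: the logic lives here.
async def compute_price_gap(finnkode: str, asking_price: int, estimate: int) -> PriceGap:
    if estimate <= 0:
        raise ValueError("estimate must be positive to compute a price gap")
    return PriceGap(finnkode, asking_price, estimate)


# formatting.py: one renderer shared by both front ends.
def render_price_gap(gap: PriceGap, fmt: str = "markdown") -> str:
    if fmt == "json":
        return json.dumps({**asdict(gap), "gap_percent": gap.gap_percent})
    return f"**{gap.finnkode}**: asking is {gap.gap_percent:+.1f}% vs estimate"


# mcp_server.py: thin tool body, response_format-aware.
async def price_gap_tool(
    finnkode: str, asking_price: int, estimate: int, response_format: str = "markdown"
) -> str:
    gap = await compute_price_gap(finnkode, asking_price, estimate)
    return render_price_gap(gap, response_format)


# cli.py: thin command body; the --format option would map to fmt.
def price_gap_command(
    finnkode: str, asking_price: int, estimate: int, fmt: str = "markdown"
) -> None:
    gap = asyncio.run(compute_price_gap(finnkode, asking_price, estimate))
    print(render_price_gap(gap, fmt))


print(asyncio.run(price_gap_tool("123", 4_400_000, 4_000_000)))
```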
---

## Documentation lookups — use context7

When uncertain about an external library API (FastMCP, Pydantic v2, Typer, httpx, msgpack, pytest-asyncio, respx, BeautifulSoup), call the **`context7` MCP server** *before* writing code. Don't rely on training-data memory.

```
context7:resolve-library-id            → library_id
context7:query-docs(library_id, topic) → authoritative snippets
```

Details in `.github/instructions/docs.instructions.md`.

---

## Clean code is a hard requirement

See `clean-code.instructions.md`. DRY, single-responsibility, descriptive names, type hints, no dead code, comments explain why not what. If duplication slips in, the right answer is to extract it — not to copy the second instance.

---

## Product behavior

The MVP does one thing well:

```
FINN search URL in
→ relevant property candidates out
→ enriched with Eiendom.no estimates
→ similar-units / comps
→ explanations
→ risks
→ next steps
→ broker questions
```

Always explain:

* why a property is interesting,
* price vs estimate,
* price vs comparable sales,
* renovation upside,
* hybel / rental potential,
* technical / legal risks,
* uncertainty / confidence,
* next questions for the broker.

Scores and estimates are decision support, not advice. Surface uncertainty, never hide it.