initial
@@ -0,0 +1,18 @@
|
|||||||
|
FINN_CACHE_PATH=/data/finn.sqlite
|
||||||
|
FINN_MAX_SEARCH_PAGES=3
|
||||||
|
FINN_DETAIL_LIMIT=20
|
||||||
|
FINN_REQUEST_DELAY_SECONDS=2
|
||||||
|
FINN_CACHE_TTL_SEARCH_MINUTES=60
|
||||||
|
FINN_CACHE_TTL_AD_HOURS=24
|
||||||
|
FINN_USER_AGENT=personal-finn-eiendom-analyzer/0.1
|
||||||
|
|
||||||
|
EIENDOM_NO_ENABLED=true
|
||||||
|
EIENDOM_NO_BASE_URL=https://api.eiendom.no/api/v1
|
||||||
|
EIENDOM_NO_CACHE_TTL_HOURS=24
|
||||||
|
EIENDOM_NO_REQUEST_DELAY_SECONDS=1
|
||||||
|
EIENDOM_NO_SIMILAR_UNITS_ENABLED=true
|
||||||
|
EIENDOM_NO_SIMILAR_UNITS_DEFAULT_STATUS=RECENTLY_SOLD
|
||||||
|
|
||||||
|
LOG_LEVEL=DEBUG
|
||||||
|
MCP_HOST=0.0.0.0
|
||||||
|
MCP_PORT=8000
|
||||||
@@ -0,0 +1,181 @@
|
|||||||
|
# Copilot instructions for finn-eiendom-mcp
|
||||||
|
|
||||||
|
This project is a private, self-hosted Python platform for analyzing FINN real-estate listings. It exposes the same code through three coordinated front ends:
|
||||||
|
|
||||||
|
1. A **Python library** (`finn_eiendom`) — source of truth.
|
||||||
|
2. An **MCP server** (FastMCP, stdio + optional HTTP) over `finn_eiendom/mcp_server.py`.
|
||||||
|
3. A **CLI** (`finn-eiendom`) over `finn_eiendom/cli.py`.
|
||||||
|
|
||||||
|
All three share the same `service.py`, `formatting.py`, `cache.py`, and `models.py`. Each piece of logic lives in exactly one place and is called from both the MCP server and the CLI. See `PRD.md` §17 for the full ownership rules; that section is the constitution.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Source of truth
|
||||||
|
|
||||||
|
Read in this order:
|
||||||
|
|
||||||
|
1. `PRD.md` — product and architecture, especially §17.
|
||||||
|
2. `PROJECT.md` — module map.
|
||||||
|
3. `AGENTS.md` — workflow.
|
||||||
|
4. `.github/instructions/*.md` — per-topic rules.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Module layout
|
||||||
|
|
||||||
|
```
finn_eiendom/
  config.py        # env vars, defaults, TTLs
  models.py        # Pydantic v2 models
  parser.py        # number/area/date/URL/finnkode normalization
  http.py          # async HTTP (httpx) with delay + retry + user-agent
  cache.py         # SQLite (sqlite3) schema + persistence
  search.py        # FINN search HTML parsing + pagination
  ad.py            # FINN listing HTML parsing
  eiendom_no.py    # Eiendom.no unit search/detail, unit_vector, similar-units
  scoring.py       # score model + classifications
  feedback.py      # verdicts + soft preference signal
  analysis.py      # orchestration + shortlist + summary
  service.py       # get_or_fetch_* + thin facade for MCP and CLI
  formatting.py    # render_* helpers shared by MCP and CLI
  mcp_server.py    # FastMCP wrappers around service.py
  cli.py           # typer-based CLI wrappers around service.py
  __main__.py      # python -m finn_eiendom → CLI entry
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The five hard rules
|
||||||
|
|
||||||
|
Enforced by `tests/test_architecture.py`:
|
||||||
|
|
||||||
|
1. **`mcp_server.py` and `cli.py` are siblings.** They never import from each other. Both import only from `service`, `formatting`, `models`, `config`, stdlib, and their own framework (`mcp` / `typer`).
|
||||||
|
2. **`service.py` is the only orchestrator.** Nothing above it touches HTTP or SQLite directly.
|
||||||
|
3. **`httpx` lives only in `http.py`.**
|
||||||
|
4. **`sqlite3` lives only in `cache.py`.**
|
||||||
|
5. **Output formatting lives only in `formatting.py`.** Never inline in CLI or MCP tool bodies.
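A rule like "`httpx` lives only in `http.py`" can be checked mechanically by walking a module's import statements. The following is a minimal sketch of that idea, not the actual contents of `tests/test_architecture.py`; the helper name `find_forbidden_imports` and the sample source are assumptions made for illustration.

```python
import ast


def find_forbidden_imports(source: str, forbidden: set[str]) -> list[str]:
    """Return the forbidden module names that `source` imports, sorted."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # `from .cache import x` has a module; `from . import cache` does not,
            # so fall back to the imported names in that case.
            names = [node.module] if node.module else [alias.name for alias in node.names]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in forbidden:
                found.add(root)
    return sorted(found)


# Hypothetical cli.py source that breaks rules 1 and 3.
CLI_SOURCE = """
import asyncio
import httpx
from . import service, formatting
from .mcp_server import main
"""

violations = find_forbidden_imports(CLI_SOURCE, {"httpx", "sqlite3", "mcp_server"})
print(violations)
```

A real architecture test would presumably read each module's source from disk and assert that the returned list is empty.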
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Development workflow — local venv
|
||||||
|
|
||||||
|
Default runtime is a project-local virtualenv. Docker is supported for packaging but optional for development.
|
||||||
|
|
||||||
|
```bash
uv venv                          # or: python3.12 -m venv .venv
source .venv/bin/activate
uv pip install -e ".[dev]"       # or: pip install -e ".[dev]"

# from now on:
pytest
ruff check .
ruff format .
mypy finn_eiendom
finn-eiendom --help
finn-eiendom-mcp                 # stdio MCP server
```
|
||||||
|
|
||||||
|
**Never** install packages globally. **Never** add a dependency without updating `pyproject.toml`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Coding rules
|
||||||
|
|
||||||
|
* Python 3.12+.
|
||||||
|
* Pydantic v2 with `model_config = ConfigDict(...)`. No v1 `class Config:` blocks.
|
||||||
|
* Type hints on every function signature.
|
||||||
|
* Async I/O for all network and DB code paths through `service.py`.
|
||||||
|
* Dependency injection for HTTP/cache clients in tests.
|
||||||
|
* Small, focused functions. One job per function. See `clean-code.instructions.md`.
|
||||||
|
* Errors raise with actionable messages; the MCP boundary translates them to `{"error": True, "code": ..., "message": ...}`.
|
||||||
|
* stdio MCP servers log to **stderr only**.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Code ownership — the short version
|
||||||
|
|
||||||
|
| Concern                                | Lives in                        |
| -------------------------------------- | ------------------------------- |
| FINN search HTML parsing               | `search.py`                     |
| FINN listing HTML parsing              | `ad.py`                         |
| Norwegian number / area / URL regexes  | `parser.py`                     |
| HTTP fetching + retry + delay          | `http.py`                       |
| SQLite reads / writes                  | `cache.py`                      |
| Eiendom.no unit search/detail/comps    | `eiendom_no.py`                 |
| `unit_vector` encode/decode (msgpack)  | `eiendom_no.py`                 |
| Scoring + classification               | `scoring.py`                    |
| Feedback storage                       | `feedback.py`                   |
| Cache-aware orchestration              | `service.py` (`get_or_fetch_*`) |
| Shortlist + summary assembly           | `analysis.py`                   |
| End-to-end runs                        | `service.py` (`analyze_search`) |
| MCP tool definitions                   | `mcp_server.py`                 |
| CLI command definitions                | `cli.py`                        |
| Output rendering                       | `formatting.py`                 |
| Env-var defaults                       | `config.py`                     |
| Pydantic models                        | `models.py`                     |
|
||||||
|
|
||||||
|
Full table with "never lives in" column is in `PRD.md` §17.2.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Adding a feature
|
||||||
|
|
||||||
|
1. Decide the home using the table above (and `PRD.md` §17.2).
|
||||||
|
2. Implement in `service.py` (or `analysis.py` if pure orchestration).
|
||||||
|
3. Add a service-level test.
|
||||||
|
4. Add a thin MCP tool — `response_format`-aware.
|
||||||
|
5. Add a thin CLI command — `--format`-aware.
|
||||||
|
6. Add a renderer in `formatting.py`.
|
||||||
|
7. Test MCP and CLI registration.
|
||||||
|
8. Update PRD and instruction docs.
|
||||||
|
|
||||||
|
If the MCP tool body or CLI command body grows past ~20 lines, push logic down to `service.py`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Documentation lookups — use context7
|
||||||
|
|
||||||
|
When uncertain about an external library API (FastMCP, Pydantic v2, Typer, httpx, msgpack, pytest-asyncio, respx, BeautifulSoup), call the **`context7` MCP server** *before* writing code. Don't rely on training-data memory.
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id → library_id
|
||||||
|
context7:query-docs(library_id, topic) → authoritative snippets
|
||||||
|
```
|
||||||
|
|
||||||
|
Details in `.github/instructions/docs.instructions.md`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Clean code is a hard requirement
|
||||||
|
|
||||||
|
See `clean-code.instructions.md`. DRY, single-responsibility, descriptive names, type hints, no dead code, comments explain why not what. If duplication slips in, the right answer is to extract it — not to copy the second instance.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Product behavior
|
||||||
|
|
||||||
|
The MVP does one thing well:
|
||||||
|
|
||||||
|
```
|
||||||
|
FINN search URL in
|
||||||
|
→ relevant property candidates out
|
||||||
|
→ enriched with Eiendom.no estimates
|
||||||
|
→ similar-units / comps
|
||||||
|
→ explanations
|
||||||
|
→ risks
|
||||||
|
→ next steps
|
||||||
|
→ broker questions
|
||||||
|
```
|
||||||
|
|
||||||
|
Always explain:
|
||||||
|
|
||||||
|
* why a property is interesting,
|
||||||
|
* price vs estimate,
|
||||||
|
* price vs comparable sales,
|
||||||
|
* renovation upside,
|
||||||
|
* hybel / rental potential,
|
||||||
|
* technical / legal risks,
|
||||||
|
* uncertainty / confidence,
|
||||||
|
* next questions for the broker.
|
||||||
|
|
||||||
|
Scores and estimates are decision support, not advice. Surface uncertainty, never hide it.
|
||||||
@@ -0,0 +1,150 @@
|
|||||||
|
---
|
||||||
|
name: Clean code rules
|
||||||
|
description: Best-practice standards for all production and test code
|
||||||
|
applyTo: "**/*.py"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Clean code rules
|
||||||
|
|
||||||
|
These rules apply everywhere — every module, every function, every test. They are intentionally opinionated. If a rule conflicts with the architecture rules in `PRD.md` §17, the architecture rules win. If it conflicts with another best practice here, pick the one that produces the simpler, more readable result.
|
||||||
|
|
||||||
|
## Single responsibility
|
||||||
|
|
||||||
|
* One job per function. If a function name needs "and" to describe it, it's two functions.
|
||||||
|
* One job per module. `parser.py` parses. `cache.py` caches. `formatting.py` formats. Don't mix.
|
||||||
|
* One job per class. We rarely need classes outside Pydantic models, dataclasses, and the `HTTPClient`. Avoid OO for OO's sake.
|
||||||
|
|
||||||
|
## Function size
|
||||||
|
|
||||||
|
* Aim for under **30 lines** of body.
|
||||||
|
* Past **50 lines** it's a code smell — extract helpers.
|
||||||
|
* If you've got more than **3 levels of nesting**, the function wants splitting (extract the inner block into a helper named after what it does).
|
||||||
|
|
||||||
|
## Naming
|
||||||
|
|
||||||
|
* Names describe **intent**, not implementation. `get_or_fetch_ad`, not `process_ad`. `render_shortlist_markdown`, not `format2`.
|
||||||
|
* Verbs for actions (`fetch_`, `parse_`, `score_`, `render_`).
|
||||||
|
* Nouns for data (`FinnAd`, `EiendomUnit`, `shortlist`).
|
||||||
|
* Boolean variables / parameters read as predicates: `force_refresh`, `include_eiendom_no`, `is_recently_sold`. Not `flag`, not `do_thing`.
|
||||||
|
* Avoid abbreviations except those well-established in the domain (`url`, `ad`, `nok`, `bra`, `sqm`).
|
||||||
|
* Norwegian terms stay Norwegian when they're domain vocabulary (`hybel`, `fellesgjeld`, `finnkode`). Don't translate `finnkode` to `finn_code` — it's a proper noun.
|
||||||
|
|
||||||
|
## Type hints
|
||||||
|
|
||||||
|
Required on every function signature, including private helpers. Mypy in strict mode is the goal.
|
||||||
|
|
||||||
|
```python
# ❌
def parse(html, base_url=None):
    ...

# ✅
def parse(html: str, base_url: str | None = None) -> FinnAd | None:
    ...
```
|
||||||
|
|
||||||
|
Use modern syntax: `X | None` over `Optional[X]`, `list[int]` over `List[int]`, `dict[str, Any]` over `Dict[str, Any]`.
|
||||||
|
|
||||||
|
## Comments
|
||||||
|
|
||||||
|
* Comments explain **WHY**, never **WHAT**. The code already says what.
|
||||||
|
* If a comment is needed to explain *what* a line does, the line wants renaming or extracting.
|
||||||
|
* Use docstrings for public functions, classes, and modules. One-line summary, blank line, optional details and examples.
|
||||||
|
* No commented-out code. Delete it. Git remembers.
|
||||||
|
* No `# TODO` without a date or issue reference. `# TODO(2026-05): replace once Eiendom.no confirms ...` is fine.
|
||||||
|
|
||||||
|
## DRY — Don't Repeat Yourself
|
||||||
|
|
||||||
|
If you write the same logic, regex, SQL, or format string **twice**, extract it. The decision table in `PRD.md` §17.2 tells you where it belongs.
|
||||||
|
|
||||||
|
The pre-merge anti-duplication checklist (from `PRD.md` §17.4):
|
||||||
|
|
||||||
|
1. Is this logic already implemented somewhere? (`grep` the function name and obvious keywords.)
|
||||||
|
2. If I'm copy-pasting from another file, am I about to duplicate behavior that should live in one shared function?
|
||||||
|
3. Can a new caller use an existing `service.py` function instead of writing its own orchestration?
|
||||||
|
4. Is the same Pydantic field defined in two models? Factor out a base model.
|
||||||
|
5. Am I formatting output in two places (CLI + MCP)? Move it to `formatting.py`.
|
||||||
|
6. Am I opening a SQLite connection outside `cache.py`? Move it.
|
||||||
|
7. Am I building an httpx call outside `http.py`? Move it.
|
||||||
|
8. Am I writing a Norwegian-number / area / finnkode regex outside `parser.py`? Move it.
|
||||||
|
9. Am I adding an env-var lookup outside `config.py`? Move it.
|
||||||
|
10. Did I add a new behavior with only one front end (MCP or CLI)? If it should exist in both, the service function is missing.
|
||||||
|
|
||||||
|
A small amount of duplication is acceptable to keep boundaries clean — see `PRD.md` §17.8. Past a handful of lines, extract.
|
||||||
|
|
||||||
|
## Errors
|
||||||
|
|
||||||
|
* **Fail loudly** with actionable messages.
|
||||||
|
|
||||||
|
```python
# ❌
raise ValueError("bad input")

# ✅
raise ValueError(f"Unknown listing_status {status!r}; expected one of {VALID_LISTING_STATUSES}")
```
|
||||||
|
|
||||||
|
* **No silent failures.** `except Exception: pass` is forbidden. Catch the specific exception, log it, and either recover or re-raise.
|
||||||
|
|
||||||
|
* **Service raises; MCP wraps.** Service functions raise normal exceptions. The MCP tool boundary translates them into `{"error": True, "code": ..., "message": ...}`. CLI lets typer handle non-zero exits.
|
||||||
|
|
||||||
|
* **Graceful degradation is explicit.** If Eiendom.no enrichment fails, return a result with `eiendom_unit=None` and a warning, not a silently-missing field.
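The "service raises; MCP wraps" rule can be illustrated with a small boundary helper. This is only a sketch under stated assumptions: `UnitNotFoundError`, `to_error_payload`, `call_tool`, and the choice of the exception class name as the `code` value are all hypothetical and not taken from the project.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable
from typing import Any

logger = logging.getLogger("finn_eiendom.sketch.errors")


class UnitNotFoundError(LookupError):
    """Hypothetical service-level error, defined only for this sketch."""


def to_error_payload(exc: Exception) -> dict[str, Any]:
    # Assumption: the exception class name doubles as the error code.
    return {"error": True, "code": type(exc).__name__, "message": str(exc)}


async def call_tool(service_call: Callable[[], Awaitable[dict[str, Any]]]) -> dict[str, Any]:
    """Run a service coroutine and translate known failures at the boundary."""
    try:
        return await service_call()
    except (LookupError, ValueError) as exc:
        # Specific exceptions only; the failure is logged, never swallowed.
        logger.warning("tool call failed: %s", exc)
        return to_error_payload(exc)


async def failing_service_call() -> dict[str, Any]:
    raise UnitNotFoundError("No Eiendom.no unit matched finnkode '123456789'")


payload = asyncio.run(call_tool(failing_service_call))
print(payload)
```

The service function stays free of protocol concerns, so the same exception can surface as a non-zero exit in the CLI.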
|
||||||
|
|
||||||
|
## State
|
||||||
|
|
||||||
|
* No global mutable state. Module-level names are immutable constants only; values read from the environment are loaded in `config.py` and nowhere else.
|
||||||
|
* No module-level caches (dicts, lists) that mutate. Use `cache.py` if you need persistence.
|
||||||
|
* Pass dependencies in (HTTP clients, DB connections) for testability.
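Passing dependencies in might look like the sketch below, where a function receives its fetcher as a parameter and a test substitutes an in-memory fake. The names `HtmlFetcher`, `FakeFetcher`, and `fetch_title` are assumptions for illustration; the project's real `HTTPClient` interface is not shown in this document.

```python
import asyncio
from typing import Protocol


class HtmlFetcher(Protocol):
    """Structural type for anything that can fetch a page body."""

    async def get_text(self, url: str) -> str: ...


class FakeFetcher:
    """In-memory stand-in used by tests; records every requested URL."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages
        self.requested: list[str] = []

    async def get_text(self, url: str) -> str:
        self.requested.append(url)
        return self._pages[url]


async def fetch_title(url: str, fetcher: HtmlFetcher) -> str:
    """Return the text between <title> tags, using the injected fetcher."""
    html = await fetcher.get_text(url)
    start = html.index("<title>") + len("<title>")
    return html[start : html.index("</title>")].strip()


fake = FakeFetcher({"https://example.test/ad/1": "<title> Enebolig, Oslo </title>"})
title = asyncio.run(fetch_title("https://example.test/ad/1", fake))
print(title)
```

Because the fake satisfies the protocol structurally, no network access and no module-level state are needed in the test.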
|
||||||
|
|
||||||
|
## Dead code
|
||||||
|
|
||||||
|
* No commented-out code.
|
||||||
|
* No unused imports (ruff catches these — fix them, don't add `# noqa`).
|
||||||
|
* No unused parameters (use `_` or remove).
|
||||||
|
* No `if False:` blocks "for later".
|
||||||
|
* Functions and classes that aren't called anywhere — delete them. Git keeps history.
|
||||||
|
|
||||||
|
## Magic numbers and strings
|
||||||
|
|
||||||
|
Anything that influences behavior and isn't self-explanatory belongs in `config.py` (env-controlled) or as a named module-level constant near the top of the file.
|
||||||
|
|
||||||
|
```python
# ❌
if days > 90:
    confidence = "low"

# ✅
COMPS_STALE_AFTER_DAYS = 90

if days > COMPS_STALE_AFTER_DAYS:
    confidence = "low"
```
|
||||||
|
|
||||||
|
URLs, timeouts, retries, TTLs, status codes — never inline.
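One way to keep such values out of call sites is a single settings loader. The sketch below assumes a frozen dataclass named `Settings` and a `load_settings` function, neither of which is confirmed by the project; the environment variable names and default values mirror the example env file in this commit.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

# Defaults mirror the example env file; each tunable is named in exactly one place.
DEFAULT_MAX_SEARCH_PAGES = 3
DEFAULT_REQUEST_DELAY_SECONDS = 2.0
DEFAULT_CACHE_TTL_SEARCH_MINUTES = 60


@dataclass(frozen=True)
class Settings:
    max_search_pages: int
    request_delay_seconds: float
    cache_ttl_search_minutes: int


def _read_int(env: Mapping[str, str], name: str, default: int) -> int:
    raw = env.get(name)
    return default if raw is None else int(raw)


def _read_float(env: Mapping[str, str], name: str, default: float) -> float:
    raw = env.get(name)
    return default if raw is None else float(raw)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build immutable settings from an environment mapping."""
    return Settings(
        max_search_pages=_read_int(env, "FINN_MAX_SEARCH_PAGES", DEFAULT_MAX_SEARCH_PAGES),
        request_delay_seconds=_read_float(
            env, "FINN_REQUEST_DELAY_SECONDS", DEFAULT_REQUEST_DELAY_SECONDS
        ),
        cache_ttl_search_minutes=_read_int(
            env, "FINN_CACHE_TTL_SEARCH_MINUTES", DEFAULT_CACHE_TTL_SEARCH_MINUTES
        ),
    )


settings = load_settings({"FINN_MAX_SEARCH_PAGES": "5"})
print(settings)
```

Accepting the mapping as a parameter lets tests pass a plain dict instead of patching `os.environ`.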
|
||||||
|
|
||||||
|
## Imports
|
||||||
|
|
||||||
|
* Standard library first, third-party second, local last, separated by blank lines.
|
||||||
|
* Ruff's `I` rules sort and group these — run `ruff check . --fix`.
|
||||||
|
* No wildcard imports.
|
||||||
|
* No relative imports above one level (`from ..thing import x` is a smell; refactor).
|
||||||
|
* Each module's allowed import set is enforced by `tests/test_architecture.py`.
|
||||||
|
|
||||||
|
## Tests are first-class code
|
||||||
|
|
||||||
|
Same rules. Same type hints. Same naming. Same DRY. If a fixture is used in three test files, it goes in `conftest.py`. If three tests share a setup, factor it into a fixture.
|
||||||
|
|
||||||
|
## Reviewing your own change before commit
|
||||||
|
|
||||||
|
A 60-second self-review:
|
||||||
|
|
||||||
|
1. Did I add a function that already exists somewhere? (`grep` it.)
|
||||||
|
2. Did I bypass `service.py`, `http.py`, `cache.py`, or `formatting.py`?
|
||||||
|
3. Is everything typed?
|
||||||
|
4. Did I leave a `print()`, `breakpoint()`, or commented-out block behind?
|
||||||
|
5. Does the test for this change actually fail without the change?
|
||||||
|
6. Did I update `PRD.md` or the relevant instruction file if I changed an architectural rule?
|
||||||
|
|
||||||
|
## When in doubt about a library API
|
||||||
|
|
||||||
|
Use the `context7` MCP server instead of guessing. See `docs.instructions.md`. Training-data memory of `pydantic.field_validator`, `typer.Option`, `mcp.tool` annotations, or `httpx.AsyncClient` is unreliable — they all change between versions.
|
||||||
@@ -0,0 +1,158 @@
|
|||||||
|
---
|
||||||
|
name: CLI rules
|
||||||
|
description: Rules for the typer-based finn-eiendom CLI
|
||||||
|
applyTo: "finn_eiendom/cli.py,finn_eiendom/__main__.py"
|
||||||
|
---
|
||||||
|
|
||||||
|
# CLI rules
|
||||||
|
|
||||||
|
The CLI is a **thin wrapper** over `service.py`. It is a sibling of `mcp_server.py` — they never call each other and they share the same underlying service functions. Every CLI command maps 1:1 to a service function with the same parameters and defaults.
|
||||||
|
|
||||||
|
## Framework
|
||||||
|
|
||||||
|
Built with [`typer`](https://typer.tiangolo.com/). One `typer.Typer` app:
|
||||||
|
|
||||||
|
```python
# finn_eiendom/cli.py
import asyncio

import typer

from . import formatting, service

app = typer.Typer(no_args_is_help=True, add_completion=False)
```
|
||||||
|
|
||||||
|
Entry points in `pyproject.toml`:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[project.scripts]
|
||||||
|
finn-eiendom-mcp = "finn_eiendom.mcp_server:main"
|
||||||
|
finn-eiendom = "finn_eiendom.cli:app"
|
||||||
|
```
|
||||||
|
|
||||||
|
Plus `finn_eiendom/__main__.py`:
|
||||||
|
|
||||||
|
```python
from .cli import app

if __name__ == "__main__":
    app()
```
|
||||||
|
|
||||||
|
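So `python -m finn_eiendom ...` works even when the `finn-eiendom` console script is not on `PATH`, as long as the package itself is importable (for example, when run from the repository root).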
|
||||||
|
|
||||||
|
## Command body shape
|
||||||
|
|
||||||
|
```python
@app.command()
def analyze_search(
    url: str,
    max_pages: int = 3,
    detail_limit: int = 20,
    no_details: bool = typer.Option(False, "--no-details"),
    no_eiendom: bool = typer.Option(False, "--no-eiendom"),
    with_similar: bool = typer.Option(False, "--with-similar"),
    format: str = typer.Option("json", "--format"),
) -> None:
    """Analyze a FINN search URL and return a ranked shortlist."""
    result = asyncio.run(service.analyze_search(
        search_url=url,
        max_pages=max_pages,
        detail_limit=detail_limit,
        include_details=not no_details,
        include_eiendom_no=not no_eiendom,
        include_similar_units_for_shortlist=with_similar,
    ))
    typer.echo(formatting.render_shortlist(result, format))
```
|
||||||
|
|
||||||
|
Rules:
|
||||||
|
|
||||||
|
* The command body has at most three sections: option parsing (handled by typer), one `service.<function>` call, one `typer.echo(formatting.render_<thing>(result, format))`.
|
||||||
|
* If the body has more than ~20 lines, the logic belongs in `service.py`.
|
||||||
|
* No `print()` — use `typer.echo()` for stdout, `typer.echo(..., err=True)` for stderr.
|
||||||
|
* No business logic, no rendering, no SQLite, no HTTP, no parsing.
|
||||||
|
|
||||||
|
## Formats
|
||||||
|
|
||||||
|
Every command that produces structured output accepts `--format`:
|
||||||
|
|
||||||
|
* `--format json` (default) — full structured output, pipeable into `jq`.
|
||||||
|
* `--format markdown` — human-readable.
|
||||||
|
* `--format table` — terminal table (only where it makes sense: `analyze-search`, `compare`, `shortlist`, `diff`).
|
||||||
|
|
||||||
|
All three render paths are produced by `formatting.py`. Never format inline in `cli.py`. Unsupported values raise `ValueError` with a list of supported formats — typer surfaces this as a non-zero exit.
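The format handling described above could be sketched as a dispatch table inside `formatting.py`. Everything here is illustrative: the row shape (a list of dicts with `finnkode` and `title`), the private renderer names, and the exact error wording are assumptions, and only two of the three formats are shown.

```python
import json
from collections.abc import Callable
from typing import Any

Rows = list[dict[str, Any]]


def _render_json(rows: Rows) -> str:
    return json.dumps(rows, ensure_ascii=False, indent=2)


def _render_markdown(rows: Rows) -> str:
    return "\n".join(f"* **{row['finnkode']}**: {row['title']}" for row in rows)


# One registry: adding a format means adding one entry here, nowhere else.
RENDERERS: dict[str, Callable[[Rows], str]] = {
    "json": _render_json,
    "markdown": _render_markdown,
}


def render_shortlist(rows: Rows, output_format: str = "json") -> str:
    """Render rows in the requested format or fail with the supported list."""
    try:
        renderer = RENDERERS[output_format]
    except KeyError:
        supported = ", ".join(sorted(RENDERERS))
        raise ValueError(
            f"Unsupported format {output_format!r}; expected one of: {supported}"
        ) from None
    return renderer(rows)


sample_rows: Rows = [{"finnkode": "123", "title": "Enebolig"}]
markdown_output = render_shortlist(sample_rows, "markdown")
print(markdown_output)
```

Both the CLI and the MCP tool can then call the same function and inherit the same error message.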
|
||||||
|
|
||||||
|
## Commands
|
||||||
|
|
||||||
|
```text
|
||||||
|
finn-eiendom analyze-search <url> [--max-pages 3] [--detail-limit 20] [--no-details] [--no-eiendom] [--with-similar] [--format ...]
|
||||||
|
finn-eiendom get-ad <finnkode> [--force-refresh] [--no-eiendom] [--with-similar] [--format ...]
|
||||||
|
finn-eiendom compare <finnkode...> [--no-eiendom] [--no-comps] [--format ...]
|
||||||
|
finn-eiendom save-feedback <finnkode> <verdict> [--notes "..."]
|
||||||
|
finn-eiendom shortlist [--run-id ID] [--limit 10] [--format ...]
|
||||||
|
finn-eiendom diff <url> [--format ...]
|
||||||
|
finn-eiendom resolve-unit <finn_url>
|
||||||
|
finn-eiendom get-unit <unit_code> [--force-refresh]
|
||||||
|
finn-eiendom enrich-ad <finnkode> [--with-similar]
|
||||||
|
finn-eiendom build-vector <unit_code>
|
||||||
|
finn-eiendom decode-vector <unit_vector>
|
||||||
|
finn-eiendom similar-units <unit_vector> [--status RECENTLY_SOLD|FOR_SALE|CURRENT]
|
||||||
|
finn-eiendom similar-to-liked <finnkode> [--mode recommendations|comps] [--status ...]
|
||||||
|
finn-eiendom analyze-against-comps <finnkode>
|
||||||
|
finn-eiendom cache stats | clear | clear-html | clear-json
|
||||||
|
finn-eiendom serve [--transport stdio|http] [--host 127.0.0.1] [--port 8010]
|
||||||
|
finn-eiendom config show | path
|
||||||
|
finn-eiendom doctor
|
||||||
|
finn-eiendom version
|
||||||
|
```
|
||||||
|
|
||||||
|
Sub-command groups (`cache`, `config`) use `typer.Typer` sub-apps:
|
||||||
|
|
||||||
|
```python
cache_app = typer.Typer(help="Cache management")
app.add_typer(cache_app, name="cache")


@cache_app.command("stats")
def cache_stats() -> None:
    # Service functions are async, so bridge with asyncio.run at the boundary.
    stats = asyncio.run(service.get_cache_stats())
    typer.echo(formatting.render_cache_stats(stats, "json"))
```
|
||||||
|
|
||||||
|
## Async glue
|
||||||
|
|
||||||
|
Service functions are async; CLI commands are sync. Always use `asyncio.run(service.<function>(...))` at the call boundary. Don't sprinkle `async def` across CLI commands — typer expects sync handlers.
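The boundary pattern is shown below using only the standard library, so it runs without typer installed. The `analyze_search` body is a stand-in and `analyze_search_command` is a hypothetical name; a real command would be decorated with `@app.command()` and would echo rendered output instead of returning it.

```python
import asyncio


async def analyze_search(search_url: str, max_pages: int = 3) -> dict[str, object]:
    """Stand-in for an async service function (hypothetical body)."""
    await asyncio.sleep(0)  # placeholder for real network / cache I/O
    return {"search_url": search_url, "pages_scanned": max_pages}


def analyze_search_command(url: str, max_pages: int = 3) -> dict[str, object]:
    """Sync handler: exactly one asyncio.run call at the boundary."""
    return asyncio.run(analyze_search(search_url=url, max_pages=max_pages))


result = analyze_search_command("https://example.test/search", max_pages=2)
print(result)
```

Keeping the handler synchronous means the event loop is created and torn down once per command invocation.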
|
||||||
|
|
||||||
|
## Exit codes
|
||||||
|
|
||||||
|
* `0` — success.
|
||||||
|
* `1` — runtime error (raised exception in service).
|
||||||
|
* `2` — usage error (typer's default for bad options).
|
||||||
|
|
||||||
|
Let exceptions propagate from `service.py` and rely on typer's default handling. Only catch where you want a more specific exit code or message.
|
||||||
|
|
||||||
|
## What stays out of cli.py
|
||||||
|
|
||||||
|
* `import httpx`, `import sqlite3`, `import msgpack` — never.
|
||||||
|
* `from .ad import ...`, `from .search import ...`, `from .eiendom_no import ...`, `from .scoring import ...`, `from .cache import ...`, `from .http import ...` — never.
|
||||||
|
* Inline formatting logic — goes in `formatting.py`.
|
||||||
|
* MCP imports (no `from .mcp_server import ...`).
|
||||||
|
|
||||||
|
Allowed imports in `cli.py`:
|
||||||
|
|
||||||
|
```python
import asyncio
import json
import sys

import typer

from . import config, formatting, service
from .models import EiendomUnit, FinnAd, SimilarUnit  # only for type hints
```
|
||||||
|
|
||||||
|
`tests/test_architecture.py` enforces this.
|
||||||
|
|
||||||
|
## When uncertain about typer
|
||||||
|
|
||||||
|
Use `context7` instead of guessing:
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id → "fastapi/typer"
|
||||||
|
context7:query-docs(id, "Typer sub-apps and option groups")
|
||||||
|
```
|
||||||
|
|
||||||
|
See `docs.instructions.md`.
|
||||||
@@ -0,0 +1,118 @@
|
|||||||
|
---
|
||||||
|
name: Documentation lookups via context7 MCP
|
||||||
|
description: How and when to use the context7 MCP server for library documentation
|
||||||
|
applyTo: "**/*.py,**/*.md,**/*.toml,**/*.yaml,**/*.yml"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Documentation lookups — use context7
|
||||||
|
|
||||||
|
When you are uncertain about a library's API, **call the `context7` MCP server before writing code**. Do not rely on training-data memory. Pydantic, FastMCP, Typer, httpx, and pytest all evolve quickly; what was true two releases ago is often wrong now.
|
||||||
|
|
||||||
|
## When to use context7
|
||||||
|
|
||||||
|
Use it **before** writing code involving any of these:
|
||||||
|
|
||||||
|
* **FastMCP / MCP Python SDK** — `@mcp.tool()` signatures, `ToolAnnotations`, `mcp.run(transport=...)`, resource and prompt decorators, server lifecycle, streamable-HTTP setup.
|
||||||
|
* **Pydantic v2** — `BaseModel`, `Field`, `ConfigDict`, `model_validator`, `field_validator`, `model_dump` / `model_dump_json`, discriminated unions, `Annotated[...]` with validators.
|
||||||
|
* **Typer** — `Typer()` apps, `typer.Option`, `typer.Argument`, sub-apps via `add_typer`, callbacks, exit codes, testing with `CliRunner`.
|
||||||
|
* **httpx** — `AsyncClient`, timeouts, transports, retries, `Response` API.
|
||||||
|
* **respx** — mocking httpx, `respx.mock`, `route.mock`, match patterns.
|
||||||
|
* **msgpack** — packing/unpacking, type extensions, raw vs string mode.
|
||||||
|
* **base64** — `urlsafe_b64encode`, padding handling.
|
||||||
|
* **pytest** / **pytest-asyncio** — fixtures, parametrize, async tests, markers, `tmp_path`, `monkeypatch`.
|
||||||
|
* **BeautifulSoup** / **lxml** — selectors, parser flavors, element traversal.
|
||||||
|
* **typer.testing.CliRunner** — invoking apps, asserting on stdout/stderr/exit codes.
|
||||||
|
|
||||||
|
Use it **also** when:
|
||||||
|
|
||||||
|
* A test fails with an error like `AttributeError: 'FinnAd' object has no attribute 'model_dump'` (Pydantic v1 vs v2 confusion).
|
||||||
|
* You see a `DeprecationWarning` from a third-party library and aren't sure of the modern replacement.
|
||||||
|
* You're about to copy a code pattern from memory that feels "old".
|
||||||
|
|
||||||
|
## When NOT to use it
|
||||||
|
|
||||||
|
* Pure Python stdlib (`json`, `pathlib`, `dataclasses`, `typing`) — these are stable and well-known.
|
||||||
|
* Project-internal modules — read the source.
|
||||||
|
* Generic programming questions ("what's a list comprehension") — use your own knowledge.
|
||||||
|
* FINN / Eiendom.no API behavior — these are not in context7. Use fixtures from prior runs in `tests/fixtures/` and the endpoint notes in `PRD.md` §9.
|
||||||
|
|
||||||
|
## How to use it
|
||||||
|
|
||||||
|
Two-step pattern:
|
||||||
|
|
||||||
|
### 1. Resolve the library ID
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id(query="fastmcp")
|
||||||
|
context7:resolve-library-id(query="pydantic")
|
||||||
|
context7:resolve-library-id(query="typer")
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns the canonical library ID (e.g. `pydantic/pydantic`, `fastapi/typer`). Pick the most-starred / official-looking match.
|
||||||
|
|
||||||
|
### 2. Query the docs
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:query-docs(
|
||||||
|
context7CompatibleLibraryID="pydantic/pydantic",
|
||||||
|
topic="field validators v2 mode after",
|
||||||
|
tokens=3000,
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
* **Keep the topic focused.** "Pydantic v2 field validators with mode=after on Optional[str]" beats "Pydantic validation".
|
||||||
|
* **Cap tokens** to roughly what you need (1500–4000 is usually plenty). The default is fine for most calls.
|
||||||
|
* **Use library-specific terminology** in the topic — "discriminator field" for Pydantic, "tool annotations" for FastMCP, "sub-apps" for Typer.
|
||||||
|
|
||||||
|
### Worked examples
|
||||||
|
|
||||||
|
**Q: How do I declare a FastMCP tool with read-only annotations?**
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id(query="modelcontextprotocol python sdk")
|
||||||
|
context7:query-docs(context7CompatibleLibraryID="<resolved id>",
|
||||||
|
topic="FastMCP @mcp.tool ToolAnnotations readOnlyHint")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Q: How do I write a Pydantic v2 model_validator that runs after field validation?**
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id(query="pydantic")
|
||||||
|
context7:query-docs(context7CompatibleLibraryID="pydantic/pydantic",
|
||||||
|
topic="model_validator mode='after' v2")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Q: How do I mock an async httpx POST with respx?**
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id(query="respx")
|
||||||
|
context7:query-docs(context7CompatibleLibraryID="<resolved id>",
|
||||||
|
topic="respx mock async httpx POST json body")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Q: How do I add a Typer sub-app for `cache` commands?**
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id(query="typer")
|
||||||
|
context7:query-docs(context7CompatibleLibraryID="<resolved id>",
|
||||||
|
topic="Typer add_typer sub-application command groups")
|
||||||
|
```
|
||||||
|
|
||||||
|
## After the lookup
|
||||||
|
|
||||||
|
* Cite or summarize what you found in a code comment **only when** the snippet documents a non-obvious API choice — otherwise the code is enough.
|
||||||
|
* If context7 returns nothing useful, fall back to:
|
||||||
|
1. The library's official docs site.
|
||||||
|
2. The library's repo `README` / `examples/`.
|
||||||
|
3. The smallest possible spike (a 5-line script in the venv) to verify behavior.
|
||||||
|
|
||||||
|
## Anti-patterns
|
||||||
|
|
||||||
|
* **Don't** invent a method signature from memory and hope. If you're not 100% sure of an API, look it up.
|
||||||
|
* **Don't** copy patterns from old Stack Overflow answers without verifying — Pydantic, FastMCP, and Typer all had breaking changes recently.
|
||||||
|
* **Don't** silence a warning instead of fixing the deprecation. Look up the modern API.
|
||||||
|
* **Don't** query context7 for FINN or Eiendom.no — those endpoints aren't in any public docs index. Use `tests/fixtures/` and `PRD.md` §9.
|
||||||
|
|
||||||
|
## Network configuration note
|
||||||
|
|
||||||
|
`context7` is configured as a connected MCP server in this environment. If a call fails with a connection error, surface it clearly — don't fall back to guessing.
|
||||||
@@ -0,0 +1,192 @@
|
|||||||
|
---
|
||||||
|
name: MCP rules
|
||||||
|
description: Rules for FastMCP tools, resources, and prompts
|
||||||
|
applyTo: "finn_eiendom/mcp_server.py,finn_eiendom/**/*mcp*.py"
|
||||||
|
---
|
||||||
|
|
||||||
|
# MCP server rules
|
||||||
|
|
||||||
|
The MCP server is a **thin wrapper** over `service.py`. It owns:
|
||||||
|
|
||||||
|
* Tool registration with `@mcp.tool()` and annotations.
|
||||||
|
* Pydantic input schemas (these double as tool documentation).
|
||||||
|
* Error wrapping at the protocol boundary.
|
||||||
|
* JSON / markdown response formatting via `formatting.py`.
|
||||||
|
|
||||||
|
It does **not** own:
|
||||||
|
|
||||||
|
* Parsing, scraping, scoring, cache, or HTTP fetching logic.
|
||||||
|
* SQLite or `httpx` access.
|
||||||
|
* Any orchestration of "check cache, else fetch, else save" — that's `service.py`.
|
||||||
|
|
||||||
|
## Server bootstrap
|
||||||
|
|
||||||
|
```python
# finn_eiendom/mcp_server.py
import logging
import sys

from mcp.server.fastmcp import FastMCP

logging.basicConfig(
    stream=sys.stderr,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

mcp = FastMCP("finn_eiendom_mcp")

# ... tools registered here ...


def main() -> None:
    mcp.run(transport="stdio")


if __name__ == "__main__":
    main()
```
|
||||||
|
|
||||||
|
stdio servers **must** log to stderr only — anything on stdout breaks the JSON-RPC frame.
|
||||||
|
|
||||||
|
## Tool naming
|
||||||
|
|
||||||
|
All tools use the `finn_` prefix so they don't collide with tools from other MCP servers registered in the same Claude Desktop configuration:
|
||||||
|
|
||||||
|
* `finn_analyze_search`
|
||||||
|
* `finn_get_ad`
|
||||||
|
* `finn_compare_ads`
|
||||||
|
* `finn_save_feedback`
|
||||||
|
* `finn_get_shortlist`
|
||||||
|
* `finn_get_new_ads_since_last_run`
|
||||||
|
* `finn_resolve_eiendom_unit`
|
||||||
|
* `finn_get_eiendom_unit`
|
||||||
|
* `finn_enrich_ad`
|
||||||
|
* `finn_build_unit_vector`
|
||||||
|
* `finn_decode_unit_vector`
|
||||||
|
* `finn_get_similar_units`
|
||||||
|
* `finn_find_similar_to_liked_ad`
|
||||||
|
* `finn_analyze_ad_against_comps`
|
||||||
|
|
||||||
|
## Tool body shape
|
||||||
|
|
||||||
|
Every tool body looks like this:
|
||||||
|
|
||||||
|
```python
@mcp.tool(
    annotations=ToolAnnotations(
        title="Analyze a FINN search URL",
        readOnlyHint=True,
        destructiveHint=False,
        openWorldHint=True,
    )
)
async def finn_analyze_search(input: AnalyzeSearchInput) -> str:
    """Analyze a FINN search URL and return a ranked shortlist."""
    try:
        result = await service.analyze_search(
            search_url=input.search_url,
            max_pages=input.max_pages,
            detail_limit=input.detail_limit,
            include_details=input.include_details,
            include_eiendom_no=input.include_eiendom_no,
            include_similar_units_for_shortlist=input.include_similar_units_for_shortlist,
        )
        return formatting.render_shortlist(result, input.response_format)
    except Exception as e:
        log.exception("finn_analyze_search failed")
        return json.dumps({
            "error": True,
            "code": type(e).__name__,
            "message": str(e),
        })
```
|
||||||
|
|
||||||
|
Notes:
|
||||||
|
|
||||||
|
* Every tool delegates to `service.<function>` in one call.
|
||||||
|
* Every tool wraps in try/except and returns the error envelope as a JSON string.
|
||||||
|
* Output rendering goes through `formatting.py`, never inline.
|
||||||
|
* If the tool body needs more than ~20 lines, logic has leaked out of the service layer — push it back down.
|
||||||
|
|
||||||
|
## Input schemas
|
||||||
|
|
||||||
|
Every tool has a Pydantic v2 input model. Schemas live with the tool in `mcp_server.py` (they document the tool to LLM clients). Reuse from `models.py` only when the same shape is also a domain object — otherwise keep them as tool-local input types.
|
||||||
|
|
||||||
|
```python
class AnalyzeSearchInput(BaseModel):
    search_url: str = Field(..., description="Full FINN search URL")
    max_pages: int = Field(default=3, ge=1, le=10)
    detail_limit: int = Field(default=20, ge=1, le=100)
    include_details: bool = True
    include_eiendom_no: bool = True
    include_similar_units_for_shortlist: bool = False
    response_format: Literal["json", "markdown"] = "json"
```
|
||||||
|
|
||||||
|
## Annotations
|
||||||
|
|
||||||
|
Set the right hints:
|
||||||
|
|
||||||
|
* Read-only tools (most of them): `readOnlyHint=True, destructiveHint=False, openWorldHint=True`.
|
||||||
|
* `finn_save_feedback`: `readOnlyHint=False, destructiveHint=False, idempotentHint=False`.
|
||||||
|
|
||||||
|
## Response format
|
||||||
|
|
||||||
|
Tools accept a `response_format` parameter (`"json"` or `"markdown"`):
|
||||||
|
|
||||||
|
* `"json"`: return `formatting.render_<thing>(result, "json")`.
* `"markdown"`: return `formatting.render_<thing>(result, "markdown")`.
|
||||||
|
|
||||||
|
Errors are always returned as the JSON error envelope regardless of `response_format`.
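
A minimal sketch of that envelope, assuming a hypothetical helper named `error_envelope` (the name is illustrative and not something this repo is known to define). It mirrors the three keys used in the tool body above:

```python
import json


def error_envelope(exc: Exception) -> str:
    """Serialize an exception into the JSON error envelope (hypothetical helper)."""
    return json.dumps(
        {
            "error": True,
            "code": type(exc).__name__,  # exception class name, e.g. "ValueError"
            "message": str(exc),
        }
    )
```

A helper like this would keep the envelope shape in one place, so every tool's `except` branch returns the same keys whatever `response_format` the caller asked for.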
|
||||||
|
|
||||||
|
## What stays out of mcp_server.py
|
||||||
|
|
||||||
|
* `import httpx` — never.
|
||||||
|
* `import sqlite3` — never.
|
||||||
|
* `from .ad import ...`, `from .search import ...`, `from .eiendom_no import ...`, `from .scoring import ...`, `from .cache import ...`, `from .http import ...` — never. Go through `service`.
|
||||||
|
* Output formatting logic — goes in `formatting.py`.
|
||||||
|
* Cache management — goes in `service.py`.
|
||||||
|
|
||||||
|
Allowed imports in `mcp_server.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
import json, logging, sys
|
||||||
|
from typing import Literal, Optional
|
||||||
|
from mcp.server.fastmcp import FastMCP
|
||||||
|
from mcp.types import ToolAnnotations
|
||||||
|
from pydantic import BaseModel, Field
|
||||||
|
from . import service, formatting
|
||||||
|
from .models import FinnAd, EiendomUnit, SimilarUnit # only if needed for type hints
|
||||||
|
from . import config
|
||||||
|
```
|
||||||
|
|
||||||
|
`tests/test_architecture.py` enforces this.
|
||||||
|
|
||||||
|
## Resources and prompts
|
||||||
|
|
||||||
|
When you add resources or prompts, they follow the same rule: thin wrappers over `service.py` and `formatting.py`. Resources:
|
||||||
|
|
||||||
|
```
|
||||||
|
finn://preferences/current
|
||||||
|
finn://search-runs/latest
|
||||||
|
finn://search-runs/{id}
|
||||||
|
finn://ads/{finnkode}
|
||||||
|
finn://ads/{finnkode}/enriched
|
||||||
|
finn://shortlist/latest
|
||||||
|
finn://feedback/{finnkode}
|
||||||
|
finn://eiendom-units/{unitCode}
|
||||||
|
finn://eiendom-units/{unitCode}/similar/{listingStatus}
|
||||||
|
```
|
||||||
|
|
||||||
|
Prompts: `evaluate_property_for_user`, `compare_properties_for_user`, `refine_search_from_feedback`, `find_more_like_this`.
|
||||||
|
|
||||||
|
## When uncertain about FastMCP
|
||||||
|
|
||||||
|
Use `context7` for FastMCP / MCP SDK questions instead of guessing:
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id → "modelcontextprotocol/python-sdk" or similar
|
||||||
|
context7:query-docs(id, "FastMCP tool annotations") → snippets
|
||||||
|
```
|
||||||
|
|
||||||
|
See `docs.instructions.md`.
|
||||||
|
|
||||||
|
## Transports
|
||||||
|
|
||||||
|
* Default: stdio. `finn-eiendom-mcp` is the entry point.
|
||||||
|
* Optional: Streamable HTTP via `finn-eiendom serve --transport http --port 8010`. Path: `POST /mcp`. Operational endpoints: `GET /health`, `GET /version`, `GET /debug/config`.
|
||||||
|
* Keep tools transport-agnostic. No request/response shape depends on the transport.
|
||||||
@@ -0,0 +1,80 @@
|
|||||||
|
---
|
||||||
|
name: Python project rules
|
||||||
|
description: Python conventions for the FINN/Eiendom MCP server
|
||||||
|
applyTo: "**/*.py"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Python conventions
|
||||||
|
|
||||||
|
## Runtime
|
||||||
|
|
||||||
|
* Python **3.12+**.
|
||||||
|
* Project-local virtualenv at `.venv/` (created by `uv venv` or `python3.12 -m venv .venv`).
|
||||||
|
* All commands run inside the activated venv.
|
||||||
|
* Editable install: `uv pip install -e ".[dev]"` (or `pip install -e ".[dev]"`).
|
||||||
|
* Never install packages globally; never use `sudo pip`; never mutate host Python.
|
||||||
|
* Add new dependencies to `pyproject.toml` in the same change that uses them.
|
||||||
|
|
||||||
|
## Language
|
||||||
|
|
||||||
|
* Use Python 3.12 syntax. Prefer `X | None` over `Optional[X]`, `list[int]` over `List[int]`, structural pattern matching where it actually helps.
|
||||||
|
* **Type hints on every function signature**, including private helpers. `mypy --strict finn_eiendom` is the target.
|
||||||
|
* Async-first for I/O. Sync code is fine for parsing, scoring, and cache access (SQLite).
|
||||||
|
* Pydantic v2 for all structured domain models, with `model_config = ConfigDict(...)`. No v1 `class Config:` blocks.
|
||||||
|
|
||||||
|
## Prefer
|
||||||
|
|
||||||
|
* Small, pure functions for parsing, normalization, and scoring.
|
||||||
|
* Explicit return types and explicit exceptions.
|
||||||
|
* Dependency injection for HTTP clients and DB connections in tests (pass `client` / `conn` as args; let services own the defaults).
|
||||||
|
* Domain names from the PRD (`FinnAd`, `EiendomUnit`, `SimilarUnit`, `analyze_search`, `get_or_fetch_ad`).
|
||||||
|
* `dataclass` for internal value objects that don't cross the API boundary; Pydantic for anything serialized or validated.
|
||||||
|
|
||||||
|
## Avoid
|
||||||
|
|
||||||
|
* Global mutable state (module-level dicts as caches, etc.). The only allowed module-level state is configuration loaded from env in `config.py`.
|
||||||
|
* Hardcoded URLs, credentials, paths, or magic numbers anywhere outside `config.py`.
|
||||||
|
* `httpx` imports anywhere except `finn_eiendom/http.py`.
|
||||||
|
* `sqlite3` imports anywhere except `finn_eiendom/cache.py`.
|
||||||
|
* `BeautifulSoup` imports anywhere except `finn_eiendom/search.py` and `finn_eiendom/ad.py`.
|
||||||
|
* `msgpack` imports anywhere except `finn_eiendom/eiendom_no.py`.
|
||||||
|
* Scraping, scoring, cache, or HTTP fetching logic inside MCP tool or CLI command bodies.
|
||||||
|
* Direct network calls in unit tests — use `respx` and fixtures.
|
||||||
|
* `print()` for logging — use the `logging` module. stdio MCP server logs go to **stderr only**.
|
||||||
|
* Bare `except:` or `except Exception: pass` — catch the specific exception or let it propagate.
|
||||||
|
|
||||||
|
## External fetches
|
||||||
|
|
||||||
|
All external fetches must support:
|
||||||
|
|
||||||
|
* Configurable request delay (`FINN_REQUEST_DELAY_SECONDS`, `EIENDOM_NO_REQUEST_DELAY_SECONDS`).
|
||||||
|
* Cache lookup before fetch.
|
||||||
|
* Retry on 5xx with exponential backoff (`1s, 2s, 4s`).
|
||||||
|
* Graceful failure that returns `None` or empty rather than raising, when the caller can degrade.
|
||||||
|
* Structured logging at INFO for success, WARNING for retry, ERROR for final failure.
|
||||||
|
|
||||||
|
## Best practices
|
||||||
|
|
||||||
|
* **Single responsibility per function.** If a function name needs "and" to describe it, it's two functions.
|
||||||
|
* **Function length:** aim for under 30 lines. Past 50 lines it's a code smell — extract helpers.
|
||||||
|
* **Nesting depth:** if you've got more than 3 levels of nesting, the function wants splitting.
|
||||||
|
* **Naming:** `get_or_fetch_ad`, not `process_ad`. Verbs for actions, nouns for data. Avoid abbreviations except those well-known in the domain (`url`, `ad`, `nok`).
|
||||||
|
* **DRY:** if you write the same logic, regex, SQL, or format string twice, extract it. The decision table in `PRD.md` §17.2 tells you where it belongs.
|
||||||
|
* **Comments explain WHY**, not WHAT. The code already says what.
|
||||||
|
* **Errors are loud:** raise with actionable messages (`f"Unknown listing_status {status!r}; expected one of {VALID_STATUSES}"`). The MCP boundary wraps them as `{"error": True, ...}`.
|
||||||
|
|
||||||
|
## When uncertain about a library API
|
||||||
|
|
||||||
|
Use the `context7` MCP server **before** writing code:
|
||||||
|
|
||||||
|
1. `context7:resolve-library-id` with the package name → canonical library ID.
|
||||||
|
2. `context7:query-docs` with that ID + focused topic.
|
||||||
|
|
||||||
|
See `docs.instructions.md`. Don't guess from training memory — Pydantic, FastMCP, and Typer all change.
|
||||||
|
|
||||||
|
## Tooling
|
||||||
|
|
||||||
|
* `ruff check .` — lint. Target Python 3.12. Active rules: `E F I UP B SIM`.
|
||||||
|
* `ruff format .` — format. Line length 100.
|
||||||
|
* `mypy --strict finn_eiendom` — type-check.
|
||||||
|
* `pytest` — run the full suite.
|
||||||
@@ -0,0 +1,199 @@
|
|||||||
|
---
|
||||||
|
name: Test rules
|
||||||
|
description: Testing conventions for parser, cache, scoring, service, MCP, CLI, and architecture
|
||||||
|
applyTo: "tests/**/*.py"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Test rules
|
||||||
|
|
||||||
|
## Runtime
|
||||||
|
|
||||||
|
Tests run in the project-local `.venv`. From the project root with the venv activated:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest # full suite
|
||||||
|
pytest tests/test_service.py -v # one file
|
||||||
|
pytest -k "shortlist" # one keyword
|
||||||
|
pytest --lf # rerun last failures
|
||||||
|
```
|
||||||
|
|
||||||
|
`pytest-asyncio` is configured under `[tool.pytest.ini_options]` with `asyncio_mode = "auto"`, so `async def` tests run without an `@pytest.mark.asyncio` decorator.
|
||||||
|
|
||||||
|
## Never do live network calls
|
||||||
|
|
||||||
|
No real HTTP in unit tests. Mock with `respx` (sits in front of `httpx.AsyncClient`):
|
||||||
|
|
||||||
|
```python
import httpx
import respx

from finn_eiendom import http as http_module


@respx.mock
async def test_finn_search_fetch_uses_user_agent():
    route = respx.get("https://www.finn.no/realestate/homes/search.html").mock(
        return_value=httpx.Response(200, html=SAMPLE_FINN_SEARCH_HTML)
    )
    client = http_module.HTTPClient(user_agent="test-agent")
    resp = await client.get("https://www.finn.no/realestate/homes/search.html")
    assert resp.status_code == 200
    assert route.calls.last.request.headers["user-agent"] == "test-agent"
```
|
||||||
|
|
||||||
|
## Fixtures
|
||||||
|
|
||||||
|
Fixture-driven testing for parsers and APIs:
|
||||||
|
|
||||||
|
* FINN search HTML → `tests/fixtures/finn_search.html`.
|
||||||
|
* FINN listing HTML → `tests/fixtures/finn_ad_*.html`.
|
||||||
|
* Eiendom.no unit search JSON → `tests/fixtures/eiendom_unit_search.json`.
|
||||||
|
* Eiendom.no unit detail JSON → `tests/fixtures/eiendom_unit_detail.json`.
|
||||||
|
* Eiendom.no similar-units JSON → `tests/fixtures/eiendom_similar.json`.
|
||||||
|
|
||||||
|
Loader helpers live in `tests/fixtures.py` (e.g. `SAMPLE_FINN_SEARCH_HTML`, `SAMPLE_EIENDOM_UNIT_JSON`). Add new fixtures the same way; don't inline large strings in test files.
|
||||||
|
|
||||||
|
## Test layout
|
||||||
|
|
||||||
|
```
tests/
  fixtures/             # raw HTML / JSON inputs
  fixtures.py           # loader helpers
  conftest.py           # shared pytest fixtures (tmp DB, http client, etc.)
  test_parser.py        # number/area/date/URL/finnkode normalization
  test_search.py        # FINN search HTML → cards
  test_ad.py            # FINN listing HTML → FinnAd
  test_eiendom_no.py    # unit search/detail/similar JSON, unit_vector encode/decode
  test_scoring.py       # all scoring components + classifier
  test_cache.py         # SQLite read/write/TTL
  test_http.py          # retry on 5xx, raise on 4xx, delay applied (new)
  test_service.py       # get_or_fetch_*, analyze_* (new)
  test_formatting.py    # render_* json/markdown/table (new)
  test_mcp_server.py    # tool registration + error envelope (expanded)
  test_cli.py           # typer CliRunner (new)
  test_architecture.py  # import-graph invariants (new)
```
|
||||||
|
|
||||||
|
## What to test per category
|
||||||
|
|
||||||
|
### Parsers (`test_parser`, `test_search`, `test_ad`, `test_eiendom_no`)
|
||||||
|
|
||||||
|
* Missing fields → `None`, not exception.
|
||||||
|
* Norwegian number formats: `7 200 991 kr`, `kr 7 200 991`, `7.200.991`.
|
||||||
|
* URL normalization (relative → absolute).
|
||||||
|
* Finnkode extraction from various URL shapes.
|
||||||
|
* Area parsing: `77 m²`, `77m2`, `77 kvm`.
|
||||||
|
* Price parsing (asking vs total vs shared debt).
|
||||||
|
* Eiendom.no JSON edge cases: empty `units`, missing `valuation`, missing `latestMarketData`.
|
||||||
|
|
||||||
|
### Unit vectors (`test_eiendom_no`)
|
||||||
|
|
||||||
|
* msgpack encoding + base64url without padding.
|
||||||
|
* Decode roundtrip.
|
||||||
|
* Missing optional fields (floor, rooms, built).
|
||||||
|
* Both lon/lat orderings handled.
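
The "base64url without padding" part can be sketched with the standard library alone. The msgpack step is left out here, so the raw bytes stand in for an already-packed payload, and the helper names `b64url_encode` / `b64url_decode` are illustrative assumptions.

```python
import base64


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as URL-safe base64 and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(token: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)
```

A roundtrip test over payloads of different lengths exercises every padding case (zero, one, or two `=` characters stripped).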
|
||||||
|
|
||||||
|
### Scoring (`test_scoring`)
|
||||||
|
|
||||||
|
* Each component in isolation.
|
||||||
|
* Total clamped to 0–100.
|
||||||
|
* Risk penalties applied (negative range).
|
||||||
|
* Bargain classification triggers on the expected signal mix.
|
||||||
|
* Hybel classification: documented / possible / unclear / not relevant.
|
||||||
|
* Explainability: explanation list non-empty when score is non-trivial.
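
A rough sketch of the clamping and penalty behaviour these tests would pin down. The function name `clamp_total` and the dict-of-floats inputs are assumptions for illustration; the project's scoring model is defined in the PRD, not here.

```python
def clamp_total(components: dict[str, float], risk_penalties: dict[str, float]) -> float:
    """Add component scores and (negative) risk penalties, then clamp to the 0-100 range."""
    raw = sum(components.values()) + sum(risk_penalties.values())
    return max(0.0, min(100.0, raw))
```

Testing both ends of the clamp, plus one value in the middle, covers the "total clamped" and "risk penalties applied" bullets with three short assertions.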
|
||||||
|
|
||||||
|
### Cache (`test_cache`)
|
||||||
|
|
||||||
|
* Read after write returns same object.
|
||||||
|
* TTL expiry returns `None`.
|
||||||
|
* JSON roundtrip preserves all fields.
|
||||||
|
* `init_db` is idempotent on existing DBs.
|
||||||
|
|
||||||
|
### HTTP (`test_http`)
|
||||||
|
|
||||||
|
* Retries on 500/502/503/504 with backoff (count exactly N retries).
|
||||||
|
* Raises immediately on 404 / 4xx.
|
||||||
|
* Applies `request_delay` between calls.
|
||||||
|
* Honors `user_agent`.
|
||||||
|
|
||||||
|
### Service (`test_service`)
|
||||||
|
|
||||||
|
The service tests are the heart of the suite. They cover orchestration end-to-end against fixtures.
|
||||||
|
|
||||||
|
* `test_get_or_fetch_ad_uses_cache` — second call hits cache, no HTTP.
|
||||||
|
* `test_get_or_fetch_ad_fetches_when_cache_miss` — first call hits HTTP, then writes cache.
|
||||||
|
* `test_get_or_fetch_ad_force_refresh` — `force_refresh=True` bypasses cache.
|
||||||
|
* `test_analyze_search_with_fixtures` — full run from search HTML → shortlist.
|
||||||
|
* `test_find_similar_to_liked_uses_liked_feedback` — only seeds from `liked` verdicts.
|
||||||
|
|
||||||
|
Use a tmp SQLite DB via the `tmp_path` pytest fixture:
|
||||||
|
|
||||||
|
```python
import pytest


@pytest.fixture
def tmp_db(tmp_path, monkeypatch):
    db_path = tmp_path / "finn.sqlite"
    monkeypatch.setenv("FINN_CACHE_PATH", str(db_path))
    return db_path
```
|
||||||
|
|
||||||
|
### Formatting (`test_formatting`)
|
||||||
|
|
||||||
|
* `render_shortlist(result, "json")` is parseable JSON and roundtrips.
|
||||||
|
* `render_shortlist(result, "markdown")` contains the score and at least one risk.
|
||||||
|
* `render_<thing>(result, "xml")` raises `ValueError` listing supported formats.
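
A sketch of a renderer that would satisfy the three checks above. The result shape (`ads`, `title`, `score`) is invented for the example, and only two formats are handled; the real `formatting.py` also renders `table` for the CLI.

```python
import json
from typing import Any

SUPPORTED_FORMATS = ("json", "markdown")


def render_shortlist(result: dict[str, Any], response_format: str) -> str:
    """Render a shortlist dict as JSON or markdown; reject any other format loudly."""
    if response_format == "json":
        return json.dumps(result, ensure_ascii=False)
    if response_format == "markdown":
        lines = [f"- **{ad['title']}**: score {ad['score']}" for ad in result["ads"]]
        return "\n".join(lines)
    raise ValueError(
        f"Unknown response_format {response_format!r}; expected one of {SUPPORTED_FORMATS}"
    )
```

Listing the supported formats in the error message gives the caller something to act on, in line with the "errors are loud" rule.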
|
||||||
|
|
||||||
|
### MCP (`test_mcp_server`)
|
||||||
|
|
||||||
|
* `test_mcp_server_has_correct_tools` — all 14 `finn_*` tool names registered.
|
||||||
|
* `test_finn_decode_unit_vector_returns_json` — happy path.
|
||||||
|
* `test_finn_analyze_search_handles_error` — error envelope shape: `{"error": True, "code": ..., "message": ...}`.
|
||||||
|
|
||||||
|
Use the `mcp` SDK's testing helpers; don't spawn a subprocess.
|
||||||
|
|
||||||
|
### CLI (`test_cli`)
|
||||||
|
|
||||||
|
Use Typer's `CliRunner`:
|
||||||
|
|
||||||
|
```python
from typer.testing import CliRunner

from finn_eiendom.cli import app

runner = CliRunner()


def test_cli_help():
    result = runner.invoke(app, ["--help"])
    assert result.exit_code == 0
    assert "analyze-search" in result.stdout
```
|
||||||
|
|
||||||
|
Patch `service.<function>` with `monkeypatch` so CLI tests don't exercise the full stack — that's covered by `test_service.py`.
|
||||||
|
|
||||||
|
### Architecture (`test_architecture`)
|
||||||
|
|
||||||
|
Static checks of the module dependency graph:
|
||||||
|
|
||||||
|
* No `import httpx` outside `finn_eiendom/http.py`.
|
||||||
|
* No `import sqlite3` outside `finn_eiendom/cache.py`.
|
||||||
|
* No `BeautifulSoup` import outside `search.py` and `ad.py`.
|
||||||
|
* No `msgpack` import outside `eiendom_no.py`.
|
||||||
|
* `mcp_server.py` only imports from `service`, `formatting`, `models`, `config`, `mcp`, stdlib, `pydantic`.
|
||||||
|
* `cli.py` only imports from `service`, `formatting`, `models`, `config`, `typer`, stdlib.
|
||||||
|
* `service.py` does not import from `mcp_server` or `cli`.
|
||||||
|
|
||||||
|
Implementation: walk `.py` files under `finn_eiendom/` with `ast`, collect imports, assert allowed sets per module.
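
The `ast` walk described above might start from a helper like this one. `imported_modules` is a hypothetical name, and the choice to prefix relative imports with a dot is a convention picked for this sketch, not something the repo prescribes.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0:
                # Relative import: "from . import service" or "from .models import X".
                names = [node.module] if node.module else [a.name for a in node.names]
                found.update(f".{name.split('.')[0]}" for name in names)
            elif node.module:
                found.add(node.module.split(".")[0])
    return found
```

An architecture test could read each file under `finn_eiendom/` with `pathlib.Path.read_text()` and assert, for example, that `"httpx"` appears in the result only for `http.py`.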
|
||||||
|
|
||||||
|
## Best practices
|
||||||
|
|
||||||
|
* One assertion per test (or per closely related group). Long tests die in painful ways.
|
||||||
|
* Test names describe the behavior: `test_get_or_fetch_ad_uses_cache_within_ttl`.
|
||||||
|
* Use `monkeypatch` for env vars and `tmp_path` for files. No `os.environ` mutation.
|
||||||
|
* No `time.sleep` — use `freezegun` if a test depends on time, or refactor the code under test to take a `now` parameter.
|
||||||
|
* No "smoke tests" that ping real servers — those go under a separately-marked `pytest -m live` suite and are not part of CI.
|
||||||
|
|
||||||
|
## When uncertain about test tooling
|
||||||
|
|
||||||
|
Use `context7` for pytest, respx, freezegun, or Typer testing:
|
||||||
|
|
||||||
|
```
|
||||||
|
context7:resolve-library-id → "pytest-dev/pytest" / "lundberg/respx"
|
||||||
|
context7:query-docs(id, "respx mock httpx async post")
|
||||||
|
```
|
||||||
|
|
||||||
|
See `docs.instructions.md`.
|
||||||
@@ -0,0 +1,33 @@
|
|||||||
|
# Python
|
||||||
|
__pycache__/
|
||||||
|
*.py[cod]
|
||||||
|
*.egg-info/
|
||||||
|
.pytest_cache/
|
||||||
|
.mypy_cache/
|
||||||
|
.ruff_cache/
|
||||||
|
.coverage
|
||||||
|
htmlcov/
|
||||||
|
|
||||||
|
# Virtualenvs
|
||||||
|
.venv/
|
||||||
|
venv/
|
||||||
|
|
||||||
|
# uv
|
||||||
|
# uv.lock
|
||||||
|
|
||||||
|
# Env
|
||||||
|
.env
|
||||||
|
.env.local
|
||||||
|
|
||||||
|
# Data/cache
|
||||||
|
data/*.sqlite
|
||||||
|
data/*.sqlite-*
|
||||||
|
data/*.db
|
||||||
|
data/*.db-*
|
||||||
|
|
||||||
|
# Editor
|
||||||
|
.DS_Store
|
||||||
|
.idea/
|
||||||
|
|
||||||
|
# Logs
|
||||||
|
*.log
|
||||||
@@ -0,0 +1,10 @@
|
|||||||
|
{
|
||||||
|
"recommendations": [
|
||||||
|
"github.copilot",
|
||||||
|
"github.copilot-chat",
|
||||||
|
"ms-python.python",
|
||||||
|
"charliermarsh.ruff",
|
||||||
|
"ms-azuretools.vscode-docker",
|
||||||
|
"tamasfe.even-better-toml"
|
||||||
|
]
|
||||||
|
}
|
||||||
@@ -0,0 +1,8 @@
|
|||||||
|
{
  "servers": {
    "context7": {
      "type": "http",
      "url": "https://mcp.context7.com/mcp"
    }
  }
}
|
||||||
@@ -0,0 +1,23 @@
|
|||||||
|
{
|
||||||
|
"python.defaultInterpreterPath": ".venv/bin/python",
|
||||||
|
"python.testing.pytestEnabled": true,
|
||||||
|
"python.testing.unittestEnabled": false,
|
||||||
|
"python.testing.pytestArgs": [
|
||||||
|
"tests"
|
||||||
|
],
|
||||||
|
"editor.formatOnSave": true,
|
||||||
|
"[python]": {
|
||||||
|
"editor.defaultFormatter": "charliermarsh.ruff"
|
||||||
|
},
|
||||||
|
"ruff.enable": true,
|
||||||
|
"chat.instructionsFilesLocations": {
|
||||||
|
".github/instructions": true
|
||||||
|
},
|
||||||
|
"github.copilot.chat.codeGeneration.useInstructionFiles": true,
|
||||||
|
"files.exclude": {
|
||||||
|
"**/__pycache__": true,
|
||||||
|
"**/.pytest_cache": true,
|
||||||
|
"**/.mypy_cache": true,
|
||||||
|
"**/.ruff_cache": true
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,178 @@
|
|||||||
|
# AGENTS.md — Workflow for AI agents on finn-eiendom-mcp
|
||||||
|
|
||||||
|
This is the master doc for any AI agent (Claude, Copilot, Cursor, etc.) working in this repo. Read this first, then the more specific files it references.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Read order
|
||||||
|
|
||||||
|
Before changing code, read:
|
||||||
|
|
||||||
|
1. **`PRD.md`** — what we're building and why. Especially §17 ("Code ownership and anti-duplication") — that section is the constitution.
|
||||||
|
2. **`PROJECT.md`** — module map.
|
||||||
|
3. This file — workflow.
|
||||||
|
4. The relevant `.github/instructions/*.md`:
|
||||||
|
* `python.instructions.md` — Python conventions.
|
||||||
|
* `mcp.instructions.md` — MCP tool rules.
|
||||||
|
* `cli.instructions.md` — CLI command rules.
|
||||||
|
* `tests.instructions.md` — testing conventions.
|
||||||
|
* `clean-code.instructions.md` — best practices and DRY enforcement.
|
||||||
|
* `docs.instructions.md` — when and how to use the **context7** MCP server for library documentation.
|
||||||
|
|
||||||
|
If something in code contradicts the PRD, the PRD wins. If you change behavior, update both the PRD and the relevant instruction file in the same change.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Runtime — local venv (default)
|
||||||
|
|
||||||
|
This project runs in a project-local virtualenv. Docker is supported for packaging but is not required for development.
|
||||||
|
|
||||||
|
### One-time setup
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# from the project root
|
||||||
|
uv venv # or: python3.12 -m venv .venv
|
||||||
|
source .venv/bin/activate
|
||||||
|
uv pip install -e ".[dev]" # or: pip install -e ".[dev]"
|
||||||
|
```
|
||||||
|
|
||||||
|
Python **3.12+** is required.
|
||||||
|
|
||||||
|
### Daily commands
|
||||||
|
|
||||||
|
All commands are run inside the activated `.venv`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest # tests
|
||||||
|
ruff check . # lint
|
||||||
|
ruff format . # format
|
||||||
|
mypy --strict finn_eiendom # type-check
|
||||||
|
finn-eiendom --help # CLI entrypoint
|
||||||
|
finn-eiendom-mcp # MCP server (stdio)
|
||||||
|
finn-eiendom serve --transport http --port 8010 # MCP server (HTTP)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Never
|
||||||
|
|
||||||
|
* Install packages globally (`pip install ...` outside a venv).
|
||||||
|
* Use `sudo pip`.
|
||||||
|
* Mutate the host Python.
|
||||||
|
* Add dependencies without updating `pyproject.toml`.
|
||||||
|
|
||||||
|
### Adding a dependency
|
||||||
|
|
||||||
|
```bash
|
||||||
|
uv pip install <package> # ad-hoc, then:
|
||||||
|
# edit pyproject.toml to record it
|
||||||
|
uv pip install -e ".[dev]" # reinstall in editable mode
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture in one screen
|
||||||
|
|
||||||
|
```
cli.py (typer)        mcp_server.py (FastMCP)    ← thin, parallel front ends
        \                  /
         \                /
              service.py                         ← orchestration: get_or_fetch, analyze_*
                  ↓
              analysis.py                        ← shortlist + summary
                  ↓
   search / ad / eiendom_no / scoring / feedback
                  ↓
          parser / http / cache
                  ↓
   FINN HTML + Eiendom.no JSON + SQLite
```
|
||||||
|
|
||||||
|
`formatting.py` sits next to `service.py` and is shared by both CLI and MCP for `json`, `markdown`, and `table` rendering.
|
||||||
|
|
||||||
|
**The single-home rule:** every piece of logic has exactly one home. If you're tempted to add it in two places, you're wrong about one — push it down a layer and call it from both. See `PRD.md` §17.2 for the full ownership table.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The five hard rules
|
||||||
|
|
||||||
|
These are non-negotiable. Architecture tests in `tests/test_architecture.py` enforce them.
|
||||||
|
|
||||||
|
1. **`mcp_server.py` and `cli.py` are siblings.** They never call each other. Both call only `service`, `formatting`, `models`, and `config`.
|
||||||
|
2. **`service.py` is the only place that combines cache + fetch.** Nothing above it touches HTTP or SQLite directly.
|
||||||
|
3. **`httpx` lives in `http.py`. Nowhere else.**
|
||||||
|
4. **`sqlite3` lives in `cache.py`. Nowhere else.**
|
||||||
|
5. **Output formatting lives in `formatting.py`.** No inline rendering in CLI or MCP tool bodies.
|
||||||
|
|
||||||
|
If you have to break one of these to ship a feature, the feature is wrong — fix the design first.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Adding a feature — the checklist
|
||||||
|
|
||||||
|
For any new tool / command / behavior:
|
||||||
|
|
||||||
|
1. Decide the home using the table in `PRD.md` §17.2.
|
||||||
|
2. Write the function in `service.py` (or extend `analysis.py` if it's pure orchestration).
|
||||||
|
3. Add a test in `tests/test_service.py`.
|
||||||
|
4. Add a thin MCP tool in `mcp_server.py` — `response_format` aware.
|
||||||
|
5. Add a thin CLI command in `cli.py` — `--format` aware.
|
||||||
|
6. Add the renderer in `formatting.py` if output is non-trivial.
|
||||||
|
7. Add tests in `tests/test_mcp_server.py` and `tests/test_cli.py`.
|
||||||
|
8. Update `PRD.md` and any affected `.github/instructions/*.md`.
|
||||||
|
|
||||||
|
If steps 4 or 5 need more than ~20 lines, logic has leaked out of the service layer. Push it back down.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Clean code
|
||||||
|
|
||||||
|
See `.github/instructions/clean-code.instructions.md`. Highlights:
|
||||||
|
|
||||||
|
* Type hints everywhere.
|
||||||
|
* Functions stay small; one job per function.
|
||||||
|
* Names describe intent (`get_or_fetch_ad`, not `process`).
|
||||||
|
* Comments explain **why**, never **what** the code already says.
|
||||||
|
* DRY: if you write the same regex / SQL / format string twice, extract it.
|
||||||
|
* Errors fail loudly with actionable messages. No silent `except: pass`.
|
||||||
|
* No dead code, no commented-out blocks left in the tree.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Documentation lookups — use context7
|
||||||
|
|
||||||
|
When uncertain about a library's API (FastMCP decorators, Pydantic v2 validators, Typer command patterns, httpx async, msgpack, pytest-asyncio, respx, BeautifulSoup selectors, etc.), **use the `context7` MCP server**. Do not guess from training-data memory.
|
||||||
|
|
||||||
|
Pattern (full details in `.github/instructions/docs.instructions.md`):
|
||||||
|
|
||||||
|
1. `context7:resolve-library-id` with the library name → get the canonical ID.
|
||||||
|
2. `context7:query-docs` with that ID + a focused topic.
|
||||||
|
|
||||||
|
Use context7 *before* writing the code, not after a test fails. If context7 returns nothing useful, search the library's official docs, then write the smallest possible spike to verify.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Safety and compliance
|
||||||
|
|
||||||
|
* Private, low-frequency use only.
|
||||||
|
* Respect FINN / Eiendom.no rate limits and bot protection.
|
||||||
|
* Cache aggressively; never bulk-harvest.
|
||||||
|
* stdio MCP servers log to **stderr only** — anything on stdout breaks the JSON-RPC frame.
|
||||||
|
* Scores and estimates are decision support, never legal / technical / financial advice.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation order (Phase 2)
|
||||||
|
|
||||||
|
Follow `PRD.md` §29 step-by-step. Each step is independently mergeable:
|
||||||
|
|
||||||
|
1. Switch dev workflow to local venv + update instruction files (this change).
|
||||||
|
2. Pydantic v2 cleanup.
|
||||||
|
3. Service layer + tests.
|
||||||
|
4. Formatting layer + tests.
|
||||||
|
5. HTTP retry on 5xx + tests.
|
||||||
|
6. Replace FastAPI with FastMCP stdio server.
|
||||||
|
7. CLI with typer.
|
||||||
|
8. Diff workflow.
|
||||||
|
9. Compare workflow.
|
||||||
|
10. Similar-to-liked.
|
||||||
|
11. Architecture tests.
|
||||||
|
12. README + Claude Desktop config.
|
||||||
@@ -0,0 +1,384 @@
|
|||||||
|
# IMPLEMENTATION.md — Phase 2 build runbook
|
||||||
|
|
||||||
|
How to drive Phase 2 (the 12 steps in `PRD.md` §29) to completion using an AI agent. Each step has its own kickoff prompt, files affected, and "done" criteria. Run them in order. Each step is independently mergeable.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 0. Pre-flight
|
||||||
|
|
||||||
|
Before starting step 1:
|
||||||
|
|
||||||
|
1. **Working tree is where you expect.** Run `ls -la` from the project root and confirm the repo layout.
|
||||||
|
|
||||||
|
2. **Venv is healthy.** From the project root:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
source .venv/bin/activate
|
||||||
|
pytest -x # green except for any pre-existing FastMCP-related skips
|
||||||
|
ruff check . # zero issues
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Docs are in place.** Re-confirm `PRD.md` §17 (code ownership) is current — every step below references it.
|
||||||
|
|
||||||
|
If any of these fail, stop and fix before proceeding.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## How to use this runbook
|
||||||
|
|
||||||
|
For each step:
|
||||||
|
|
||||||
|
1. Create a feature branch: `git checkout -b feat/phase2-step-<N>-<slug>` off `chore/cleanup-phase-2-prep`.
|
||||||
|
2. Open a fresh agent chat with repo access. Paste the kickoff prompt verbatim.
|
||||||
|
3. Let the agent propose, implement, and test. Push back where it skips tests or violates §17.
|
||||||
|
4. When all "done" boxes are checked, merge into `chore/cleanup-phase-2-prep`.
|
||||||
|
5. Move to the next step.
|
||||||
|
|
||||||
|
Each kickoff prompt assumes the agent reads PRD.md, AGENTS.md, and the relevant instruction files first — that's encoded in the prompt.
|
||||||
|
|
||||||
|
After step 12, merge `chore/cleanup-phase-2-prep` into `main`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 1 — Dev workflow already switched to local venv
|
||||||
|
|
||||||
|
This step is **done** by the time `CLEANUP.md` is merged. The instruction files and `AGENTS.md` already use local venv. Sanity check:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
source .venv/bin/activate
|
||||||
|
which finn-eiendom 2>/dev/null || echo "expected: not yet installed; entry points come in steps 6 and 7"
|
||||||
|
ruff check . # zero issues
|
||||||
|
pytest -x # green (allow mcp_server failures)
|
||||||
|
```
|
||||||
|
|
||||||
|
Move on.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 2 — Pydantic v2 cleanup
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** (especially §17 code ownership and A8 acceptance criterion), **`.github/instructions/python.instructions.md`**, and **`.github/instructions/clean-code.instructions.md`**.
|
||||||
|
>
|
||||||
|
> Implement Phase 2 step 2: convert every Pydantic model in `finn_eiendom/models.py` from v1 (`class Config:`) to v2 (`model_config = ConfigDict(...)`). Use `context7:query-docs` on `pydantic/pydantic` if you're not sure of the v2 syntax — don't guess.
|
||||||
|
>
|
||||||
|
> Add `tests/test_models.py` with a JSON roundtrip test per model.
|
||||||
|
>
|
||||||
|
> Run `ruff check .`, `ruff format .`, and `pytest tests/test_models.py -v` before declaring done.
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `finn_eiendom/models.py` (edit)
|
||||||
|
* `tests/test_models.py` (new)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `grep -rn "class Config:" finn_eiendom/` produces zero output.
|
||||||
|
* `pytest tests/test_models.py` is green.
|
||||||
|
* Existing tests still pass.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 3 — Service layer
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** §16 (Service layer) and §17 (code ownership), **`.github/instructions/python.instructions.md`** and **`.github/instructions/clean-code.instructions.md`**.
|
||||||
|
>
|
||||||
|
> Create `finn_eiendom/service.py` with the public surface listed in PRD §16: `get_or_fetch_ad`, `get_or_fetch_eiendom_unit`, `get_or_fetch_similar_units`, `analyze_search`, `analyze_ad`, `analyze_ad_against_comps`, `find_similar_to_liked`, `compare_ads`, `resolve_eiendom_unit_from_finn_url`, `build_unit_vector_for_unit_code`, `decode_unit_vector_to_dict`, `save_feedback`, `get_shortlist`, `get_new_ads_since_last_run`.
|
||||||
|
>
|
||||||
|
> Each function:
|
||||||
|
> 1. Opens its own SQLite connection via `cache.init_db(FINN_CACHE_PATH)`.
|
||||||
|
> 2. Reads cache first with TTLs from `config.py`.
|
||||||
|
> 3. On miss or `force_refresh=True`, calls the fetcher in `ad.py` / `eiendom_no.py`.
|
||||||
|
> 4. Writes the fresh result back.
|
||||||
|
> 5. Returns a typed model or dict.
|
||||||
|
>
|
||||||
|
> Do not duplicate behavior from `analysis.py` — delegate to it. Add `tests/test_service.py` covering the five service tests listed in PRD §25.2.
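
The five-point function shape in the prompt above can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the `ads` table layout, the `fetcher` callable, and the `db_path` / `ttl_seconds` parameters are hypothetical stand-ins for `cache.init_db`, the fetchers in `ad.py`, and the TTLs in `config.py`.

```python
import json
import sqlite3
import time
from typing import Any, Callable, Dict


def init_db(path: str) -> sqlite3.Connection:
    """Stand-in for `cache.init_db`; the table layout is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ads "
        "(finnkode TEXT PRIMARY KEY, payload TEXT, fetched_at REAL)"
    )
    return conn


def get_or_fetch_ad(
    finnkode: str,
    fetcher: Callable[[str], Dict[str, Any]],
    *,
    db_path: str,
    ttl_seconds: float = 24 * 3600,
    force_refresh: bool = False,
) -> Dict[str, Any]:
    conn = init_db(db_path)  # 1. each call opens its own connection
    try:
        if not force_refresh:
            row = conn.execute(
                "SELECT payload, fetched_at FROM ads WHERE finnkode = ?",
                (finnkode,),
            ).fetchone()
            # 2. cache first, honouring the TTL
            if row is not None and time.time() - row[1] < ttl_seconds:
                return json.loads(row[0])
        fresh = fetcher(finnkode)  # 3. miss, stale entry, or forced refresh
        conn.execute(
            "INSERT OR REPLACE INTO ads (finnkode, payload, fetched_at) "
            "VALUES (?, ?, ?)",
            (finnkode, json.dumps(fresh), time.time()),
        )  # 4. write the fresh result back
        conn.commit()
        return fresh  # 5. return the payload
    finally:
        conn.close()
```

Passing the fetcher in as a callable keeps the sketch testable without network access; the real service functions would call the project's fetchers directly.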
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `finn_eiendom/service.py` (new)
|
||||||
|
* `tests/test_service.py` (new)
|
||||||
|
* `tests/conftest.py` (may need a `tmp_db` fixture if it doesn't exist)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `pytest tests/test_service.py` is green.
|
||||||
|
* `service.py` imports only from `models`, `config`, `cache`, `analysis`, `ad`, `eiendom_no`, `feedback`, `scoring`, stdlib.
|
||||||
|
* No `import httpx` or `import sqlite3` outside their owners.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 4 — Formatting layer
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** §17.6 (shared formatting module) and §19 (output formats), **`.github/instructions/clean-code.instructions.md`**.
|
||||||
|
>
|
||||||
|
> Create `finn_eiendom/formatting.py` with these renderers (signatures in PRD §17.6): `render_ad`, `render_shortlist`, `render_comparison`, `render_diff`, `render_similar_units`, `render_unit`, `render_score_breakdown`, plus `render_cache_stats` for the CLI cache subcommand.
|
||||||
|
>
|
||||||
|
> Each renderer accepts `(payload, fmt: Literal["json","markdown","table"]) -> str`. Unsupported formats raise `ValueError` listing supported options. Table rendering only applies where it makes sense (shortlist, comparison, diff, similar-units).
|
||||||
|
>
|
||||||
|
> Add `tests/test_formatting.py` covering the three tests listed in PRD §25.5.
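
A renderer with the `(payload, fmt) -> str` contract described above might look like the sketch below. The payload fields `finnkode` and `score` and the exact markdown and table layouts are assumptions; the real signatures live in PRD §17.6.

```python
import json
from typing import Any, Dict, List, Literal, get_args

Format = Literal["json", "markdown", "table"]


def render_shortlist(payload: List[Dict[str, Any]], fmt: str) -> str:
    """Render a shortlist; reject unknown formats with the supported options."""
    supported = get_args(Format)
    if fmt not in supported:
        raise ValueError(
            f"Unsupported format {fmt!r}; supported: {', '.join(supported)}"
        )
    if fmt == "json":
        return json.dumps(payload, ensure_ascii=False, indent=2)
    if fmt == "markdown":
        return "\n".join(
            f"- **{row['finnkode']}**: score {row['score']}" for row in payload
        )
    # fmt == "table": fixed-width columns, header first
    lines = [f"{'finnkode':<12}{'score':>6}"]
    lines += [f"{row['finnkode']:<12}{row['score']:>6}" for row in payload]
    return "\n".join(lines)
```

Deriving the supported list from the `Literal` type keeps the error message and the type hint from drifting apart.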
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `finn_eiendom/formatting.py` (new)
|
||||||
|
* `tests/test_formatting.py` (new)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `pytest tests/test_formatting.py` is green.
|
||||||
|
* `render_*` is the *only* place that formats output. No inline rendering anywhere else (verified by reading diffs of steps 6 and 7).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 5 — HTTP retry on 5xx
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** A9 (acceptance criterion), **`.github/instructions/python.instructions.md`**.
|
||||||
|
>
|
||||||
|
> Extend `HTTPClient.get()` in `finn_eiendom/http.py` to retry on 5xx responses (500/502/503/504) with exponential backoff `1s, 2s, 4s`, up to `retries` attempts (default 3). Surface 4xx as `httpx.HTTPStatusError` immediately. Apply the existing `request_delay` between any two calls.
|
||||||
|
>
|
||||||
|
> If you're unsure about `httpx` retry semantics or `respx` test patterns, use `context7`.
|
||||||
|
>
|
||||||
|
> Add `tests/test_http.py` covering the three tests listed in PRD §25.6 using `respx`.
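
The retry policy in the prompt above can be sketched without `httpx`. This is a transport-agnostic illustration, not the real `HTTPClient.get()`: `send`, `sleep`, and the local `HTTPStatusError` class are hypothetical stand-ins, the `request_delay` between calls is not modelled, and "up to `retries` attempts" is read here as three retries after the initial request.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {500, 502, 503, 504}


class HTTPStatusError(Exception):
    """Stand-in for `httpx.HTTPStatusError` so the sketch runs without httpx."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def get_with_retry(
    send: Callable[[], Tuple[int, str]],
    *,
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `send` until it succeeds, a non-retryable error occurs, or retries run out."""
    for attempt in range(retries + 1):
        status, body = send()
        if status < 400:
            return body
        # 4xx surfaces immediately; a retryable 5xx surfaces once retries are spent
        if status not in RETRYABLE or attempt == retries:
            raise HTTPStatusError(status)
        sleep(base_delay * 2**attempt)  # 1s, 2s, 4s with the defaults
    raise AssertionError("unreachable")
```

Injecting `sleep` lets a test assert the backoff sequence without actually waiting.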
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `finn_eiendom/http.py` (edit)
|
||||||
|
* `tests/test_http.py` (new)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `pytest tests/test_http.py` is green.
|
||||||
|
* `httpx` imports remain confined to `http.py`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 6 — Replace FastAPI with FastMCP
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** §14 (MCP design — every tool and input schema), §17 (code ownership), and **`.github/instructions/mcp.instructions.md`** end-to-end.
|
||||||
|
>
|
||||||
|
> Rewrite `finn_eiendom/mcp_server.py` from scratch:
|
||||||
|
> - Use `from mcp.server.fastmcp import FastMCP`.
|
||||||
|
> - Configure stderr-only logging.
|
||||||
|
> - Register all 14 tools listed in PRD §14.1 with the `finn_` prefix.
|
||||||
|
> - Each tool body has the shape in `mcp.instructions.md` §"Tool body shape": one `service.<function>` call, one `formatting.render_*` call, try/except returning the JSON error envelope.
|
||||||
|
> - Input schemas as in PRD §14.2.
|
||||||
|
> - Annotations: `readOnlyHint=True` for all except `finn_save_feedback`.
|
||||||
|
> - `main()` calls `mcp.run(transport="stdio")`.
|
||||||
|
> - Add `finn-eiendom-mcp = "finn_eiendom.mcp_server:main"` to `[project.scripts]` in `pyproject.toml`.
|
||||||
|
>
|
||||||
|
> If unsure about FastMCP annotations or transport options, use `context7:query-docs` on the MCP Python SDK.
|
||||||
|
>
|
||||||
|
> Rewrite `tests/test_mcp_server.py` to cover the three tests in PRD §25.3. Use the SDK's testing helpers — do not spawn a subprocess.
|
||||||
|
>
|
||||||
|
> Verify: `finn-eiendom-mcp` boots over stdio, `mcp dev finn_eiendom/mcp_server.py` lists all 14 tools.
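
The "one service call, one render call, try/except" tool body can be illustrated independently of FastMCP. This sketch only shows the control flow: the envelope shape `{"error": {"type", "message"}}` and the helper names are assumptions, and the authoritative shape is the one in `mcp.instructions.md`.

```python
import json
from typing import Any, Callable


def error_envelope(exc: Exception) -> str:
    """Hypothetical envelope shape; the real one is defined in mcp.instructions.md."""
    return json.dumps({"error": {"type": type(exc).__name__, "message": str(exc)}})


def tool_body(
    service_call: Callable[[], Any],
    render: Callable[[Any, str], str],
    fmt: str = "json",
) -> str:
    """Run one service call and one renderer; convert any failure into an envelope."""
    try:
        result = service_call()
        return render(result, fmt)
    except Exception as exc:  # tool boundary: report the error, never crash the server
        return error_envelope(exc)
```

In the real server each `finn_*` tool would inline this shape with its own `service.<function>` and `formatting.render_*` pair.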
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `finn_eiendom/mcp_server.py` (full rewrite)
|
||||||
|
* `tests/test_mcp_server.py` (full rewrite)
|
||||||
|
* `pyproject.toml` (edit `[project.scripts]`)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `mcp_server.py` imports only `service`, `formatting`, `models`, `config`, stdlib, `mcp`, `pydantic`.
|
||||||
|
* All 14 tools registered.
|
||||||
|
* `pytest tests/test_mcp_server.py` is green.
|
||||||
|
* `grep -rn "FastAPI" finn_eiendom/` is empty.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 7 — CLI
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** §15 (CLI design — every command and option) and **`.github/instructions/cli.instructions.md`** end-to-end.
|
||||||
|
>
|
||||||
|
> Create `finn_eiendom/cli.py` with a `typer.Typer` app exposing all commands in PRD §15.1, plus `finn_eiendom/__main__.py` that calls the app. Add to `pyproject.toml`:
|
||||||
|
> ```toml
|
||||||
|
> [project.scripts]
|
||||||
|
> finn-eiendom = "finn_eiendom.cli:app"
|
||||||
|
> ```
|
||||||
|
>
|
||||||
|
> Each command:
|
||||||
|
> - Translates options into a `service.<function>` call.
|
||||||
|
> - Calls `formatting.render_*(result, format)` and `typer.echo(...)`.
|
||||||
|
> - No business logic, no inline rendering.
|
||||||
|
> - Body under ~20 lines.
|
||||||
|
>
|
||||||
|
> Add sub-apps for `cache` (stats/clear/clear-html/clear-json) and `config` (show/path). `serve` accepts `--transport stdio|http` and dispatches to `mcp_server.main()` or the HTTP transport.
|
||||||
|
>
|
||||||
|
> If unsure about Typer sub-apps or `CliRunner`, use `context7`.
|
||||||
|
>
|
||||||
|
> Add `tests/test_cli.py` covering the five tests in PRD §25.4 using `typer.testing.CliRunner`. Mock `service.*` with `monkeypatch` — do not exercise the full stack here, that's `test_service.py`.
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `finn_eiendom/cli.py` (new)
|
||||||
|
* `finn_eiendom/__main__.py` (new)
|
||||||
|
* `tests/test_cli.py` (new)
|
||||||
|
* `pyproject.toml` (edit)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `finn-eiendom --help` lists every command in PRD §15.1.
|
||||||
|
* `cli.py` imports only `service`, `formatting`, `models`, `config`, stdlib, `typer`.
|
||||||
|
* `pytest tests/test_cli.py` is green.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 8 — Diff workflow (new / removed / changed)
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** §10.8, §13 (search_runs table), workflow I in §18, and **`.github/instructions/clean-code.instructions.md`**.
|
||||||
|
>
|
||||||
|
> Implement:
|
||||||
|
> 1. `search_runs` and `scores` tables in `cache.py` (use existing migration pattern).
|
||||||
|
> 2. `service.get_new_ads_since_last_run(search_url)` that compares against the previous run for the same `normalized_url` and returns `{new_ads, removed_ads, changed_ads}` with price/common_costs/status diffs on changed.
|
||||||
|
> 3. `finn_get_new_ads_since_last_run` MCP tool.
|
||||||
|
> 4. `finn-eiendom diff <url>` CLI command.
|
||||||
|
> 5. `formatting.render_diff(result, fmt)`.
|
||||||
|
>
|
||||||
|
> Add tests covering: empty previous-run case, all-new case, mixed new+removed+changed case.
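
The comparison at the heart of `get_new_ads_since_last_run` can be sketched as a pure function over two snapshots. This is an illustration under assumptions: each run is represented as a hypothetical mapping of finnkode to a dict of fields, and the `old` / `new` layout of a change entry is invented for the example.

```python
from typing import Any, Dict, List

TRACKED_FIELDS = ("price", "common_costs", "status")


def diff_runs(
    previous: Dict[str, Dict[str, Any]],
    current: Dict[str, Dict[str, Any]],
) -> Dict[str, List[Any]]:
    """Compare two search runs keyed by finnkode."""
    new_ads = sorted(current.keys() - previous.keys())
    removed_ads = sorted(previous.keys() - current.keys())

    changed_ads = []
    for finnkode in sorted(current.keys() & previous.keys()):
        before, after = previous[finnkode], current[finnkode]
        # keep only the tracked fields whose value actually moved
        changes = {
            field: {"old": before.get(field), "new": after.get(field)}
            for field in TRACKED_FIELDS
            if before.get(field) != after.get(field)
        }
        if changes:
            changed_ads.append({"finnkode": finnkode, "changes": changes})

    return {"new_ads": new_ads, "removed_ads": removed_ads, "changed_ads": changed_ads}
```

With an empty `previous` mapping every current listing is reported as new, which covers the first-run case.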
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* The three new tests pass.
|
||||||
|
* MCP and CLI both expose the same behavior with identical defaults.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 9 — Compare workflow
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** workflow K in §18 and §14.2 (`CompareAdsInput`).
|
||||||
|
>
|
||||||
|
> Implement `service.compare_ads(finnkoder, include_eiendom_no=True, include_comps=True)` returning a comparison table + winners by category (best value / lifestyle / hybel / bargain / safest / highest risk / most overpriced).
|
||||||
|
>
|
||||||
|
> Wire `finn_compare_ads` MCP tool and `finn-eiendom compare <finnkode...>` CLI command. Add `formatting.render_comparison`. Tests for service and CLI.
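
The winners-by-category part of the comparison can be sketched as below. This is a deliberately simple illustration, not the project's scoring logic: it assumes each ad carries a hypothetical `scores` mapping from category name to a number, where a higher number means a stronger match for that category.

```python
from typing import Any, Dict, List


def winners_by_category(
    ads: List[Dict[str, Any]],
    categories: List[str],
) -> Dict[str, str]:
    """Pick the finnkode with the highest score in each category.

    The `scores` field and the category names are illustrative assumptions.
    """
    if not ads:
        return {}
    return {
        category: max(ads, key=lambda ad: ad["scores"].get(category, 0.0))["finnkode"]
        for category in categories
    }
```

Ties go to the first ad in input order, because `max` returns the first maximal element.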
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `finn-eiendom compare 462400360 461153194 --format markdown` produces a readable comparison.
|
||||||
|
* Service test covers the winners-by-category logic.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 10 — Similar-to-liked
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** workflow G in §18 and `FindSimilarToLikedInput` in §14.2.
|
||||||
|
>
|
||||||
|
> Implement `service.find_similar_to_liked(finnkode, mode, listing_status)`:
|
||||||
|
> 1. Load FinnAd; verify `feedback` has `verdict=liked` for this finnkode.
|
||||||
|
> 2. Ensure Eiendom.no enrichment + unit_vector exist.
|
||||||
|
> 3. Fetch similar-units (prefer `FOR_SALE` for recommendations, `RECENTLY_SOLD` for comps).
|
||||||
|
> 4. Score candidates against user preferences.
|
||||||
|
> 5. Return ranked recommendations.
|
||||||
|
>
|
||||||
|
> Wire MCP tool and CLI command. Tests covering: no liked feedback raises clear error; happy path returns ranked list.
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `finn-eiendom similar-to-liked 462400360` returns ranked candidates when the listing has a liked verdict, and a clear error otherwise.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 11 — Architecture tests
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** A10 (architecture acceptance criterion) and §17.3 (layering invariants).
|
||||||
|
>
|
||||||
|
> Create `tests/test_architecture.py` that walks every `.py` file under `finn_eiendom/` with `ast`, collects all `import` and `from X import Y` statements, and asserts the layering invariants in PRD A10:
|
||||||
|
> - No `httpx` outside `http.py`.
|
||||||
|
> - No `sqlite3` outside `cache.py`.
|
||||||
|
> - No `BeautifulSoup` outside `search.py` / `ad.py`.
|
||||||
|
> - No `msgpack` outside `eiendom_no.py`.
|
||||||
|
> - `mcp_server.py` and `cli.py` import only from the allowed set.
|
||||||
|
> - `service.py` never imports `mcp_server` or `cli`.
|
||||||
|
>
|
||||||
|
> Add a parametrized test per invariant so that each failure names the violating module and rule and prints the offending import line.
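
The import-walking core of such a test can be sketched with `ast` alone. The helper names and the shape of the `forbidden` mapping are assumptions, and this sketch only covers third-party ownership rules (absolute imports); the intra-package rules for `mcp_server.py`, `cli.py`, and `service.py` would need relative imports handled as well.

```python
import ast
from typing import Dict, List, Set


def imported_modules(source: str) -> Set[str]:
    """Return the top-level names of all absolutely imported modules in `source`."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def violations(
    module_name: str,
    source: str,
    forbidden: Dict[str, Set[str]],
) -> List[str]:
    """`forbidden` maps an import name to the only files allowed to import it."""
    return sorted(
        f"{module_name} imports {name}"
        for name in imported_modules(source)
        if name in forbidden and module_name not in forbidden[name]
    )
```

A parametrized pytest case could then call `violations(path.name, path.read_text(), rules)` for every file under `finn_eiendom/` and assert that the list is empty.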
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* `pytest tests/test_architecture.py` is green.
|
||||||
|
* Deliberately introducing a violation (e.g. `import httpx` in `service.py`) makes a test fail with a clear message.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 12 — README + Claude Desktop config + final verification
|
||||||
|
|
||||||
|
### Kickoff prompt
|
||||||
|
|
||||||
|
> Read **PRD.md** §21 (deployment), §22 (MVP scope), §24 (all acceptance criteria), **README.md** and **USAGE.md**.
|
||||||
|
>
|
||||||
|
> Update `README.md` and `USAGE.md` so every command, env var, and Claude Desktop snippet matches what was actually built in steps 1–11. Verify with the user's exact paths.
|
||||||
|
>
|
||||||
|
> Run the full A1–A11 acceptance check:
|
||||||
|
>
|
||||||
|
> - A1: `finn-eiendom-mcp` boots over stdio; `mcp dev finn_eiendom/mcp_server.py` lists all 14 tools.
|
||||||
|
> - A2: `finn-eiendom --help` lists every §15.1 command; each command runs against fixtures.
|
||||||
|
> - A3 – A9: matching service tests pass.
|
||||||
|
> - A10: `pytest tests/test_architecture.py` is green.
|
||||||
|
> - A11: `ruff check .` is clean; `pytest` is fully green; `mypy --strict finn_eiendom` passes or is documented as a gap.
|
||||||
|
>
|
||||||
|
> Report any failures with specific file/line references — don't paper over them.
|
||||||
|
|
||||||
|
### Files
|
||||||
|
|
||||||
|
* `README.md` (edit to match reality)
|
||||||
|
* `USAGE.md` (edit to match reality)
|
||||||
|
|
||||||
|
### Done when
|
||||||
|
|
||||||
|
* All 11 acceptance criteria in PRD §24 pass.
|
||||||
|
* README + USAGE quickstart examples actually work end-to-end on a fresh clone.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Definition of done for the whole phase
|
||||||
|
|
||||||
|
Merge `chore/cleanup-phase-2-prep` into `main` when **every** box is checked:
|
||||||
|
|
||||||
|
* [ ] All 12 steps merged in order.
|
||||||
|
* [ ] `finn-eiendom-mcp` boots over stdio with all 14 tools.
|
||||||
|
* [ ] `finn-eiendom --help` lists every command in PRD §15.1.
|
||||||
|
* [ ] `pytest` is green, including the new `test_service.py`, `test_cli.py`, `test_http.py`, `test_formatting.py`, `test_models.py`, `test_architecture.py`.
|
||||||
|
* [ ] `ruff check .` is clean.
|
||||||
|
* [ ] `mypy --strict finn_eiendom` passes or has a documented exception list.
|
||||||
|
* [ ] `README.md` and `USAGE.md` quickstart examples work on a fresh clone in under 5 minutes.
|
||||||
|
* [ ] Claude Desktop config in USAGE.md is verified to work against your installation.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## When a step blocks
|
||||||
|
|
||||||
|
If a step blocks on an unclear requirement:
|
||||||
|
|
||||||
|
1. Re-read the relevant PRD section.
|
||||||
|
2. Check `PRD.md` §28 (open questions) — the answer may be a deferred decision.
|
||||||
|
3. If still unclear, write the question down, pick the simplest interpretation, mark it `# TODO(<date>): revisit <question>` in code, and move on.
|
||||||
|
|
||||||
|
If a step blocks on a library question (FastMCP, Pydantic v2, Typer, httpx, msgpack, respx):
|
||||||
|
|
||||||
|
1. Use `context7` — see `.github/instructions/docs.instructions.md`.
|
||||||
|
2. If context7 returns nothing useful, write the smallest possible spike in `scratch/` (gitignored) to verify behavior.
|
||||||
|
|
||||||
|
If a step blocks on §17 (code ownership) — i.e. it feels like the right answer requires putting logic in the "wrong" place:
|
||||||
|
|
||||||
|
1. Stop.
|
||||||
|
2. Re-read PRD §17.2 (decision table) and §17.3 (layering invariants).
|
||||||
|
3. Ask whether the service layer is actually missing a function. Usually it is.
|
||||||
|
4. Add the missing service function instead of bending the layering.
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
.PHONY: help venv install dev test test-fast lint format typecheck check clean serve mcp doctor
|
||||||
|
|
||||||
|
PYTHON ?= python3.12
|
||||||
|
VENV ?= .venv
|
||||||
|
BIN = $(VENV)/bin
|
||||||
|
|
||||||
|
help: ## Show this help
|
||||||
|
@grep -E '^[a-zA-Z_-]+:.*?## ' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-12s\033[0m %s\n", $$1, $$2}'
|
||||||
|
|
||||||
|
venv: ## Create the local virtualenv
|
||||||
|
uv venv $(VENV) 2>/dev/null || $(PYTHON) -m venv $(VENV)
|
||||||
|
@echo "Activate with: source $(BIN)/activate"
|
||||||
|
|
||||||
|
install: venv ## Install the package (editable) with dev extras
|
||||||
|
uv pip install --python $(BIN)/python -e ".[dev]" 2>/dev/null || $(BIN)/pip install -e ".[dev]"
|
||||||
|
|
||||||
|
dev: install ## Alias for install
|
||||||
|
|
||||||
|
test: ## Run the full test suite
|
||||||
|
$(BIN)/pytest
|
||||||
|
|
||||||
|
test-fast: ## Run tests, fail fast, verbose
|
||||||
|
$(BIN)/pytest -x -v
|
||||||
|
|
||||||
|
lint: ## Lint with ruff
|
||||||
|
$(BIN)/ruff check .
|
||||||
|
|
||||||
|
format: ## Auto-format with ruff
|
||||||
|
$(BIN)/ruff format .
|
||||||
|
|
||||||
|
typecheck: ## Static type-check with mypy
|
||||||
|
$(BIN)/mypy finn_eiendom
|
||||||
|
|
||||||
|
check: lint typecheck test ## Run lint + typecheck + tests
|
||||||
|
|
||||||
|
clean: ## Remove caches and build artifacts
|
||||||
|
rm -rf .pytest_cache .ruff_cache .mypy_cache build dist *.egg-info
|
||||||
|
find . -type d -name __pycache__ -prune -exec rm -rf {} +
|
||||||
|
|
||||||
|
serve: ## Start the MCP server over HTTP on port 8010
|
||||||
|
$(BIN)/finn-eiendom serve --transport http --port 8010
|
||||||
|
|
||||||
|
mcp: ## Start the MCP server over stdio
|
||||||
|
$(BIN)/finn-eiendom-mcp
|
||||||
|
|
||||||
|
doctor: ## Smoke-check the install
|
||||||
|
$(BIN)/finn-eiendom doctor
|
||||||
@@ -0,0 +1,162 @@
|
|||||||
|
# PROJECT.md — module map
|
||||||
|
|
||||||
|
The repo at a glance. For the why and the rules, read [`PRD.md`](PRD.md) §12 and §17. For the workflow, read [`AGENTS.md`](AGENTS.md).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Source tree
|
||||||
|
|
||||||
|
```
|
||||||
|
finn-mcp/
|
||||||
|
├── pyproject.toml
|
||||||
|
├── Makefile
|
||||||
|
├── README.md ← user-facing overview
|
||||||
|
├── USAGE.md ← full user guide
|
||||||
|
├── PRD.md ← product spec + architecture (§17 = constitution)
|
||||||
|
├── PROJECT.md ← this file
|
||||||
|
├── AGENTS.md ← workflow for AI agents and contributors
|
||||||
|
├── CLEANUP.md ← pre-Phase-2 cleanup runbook
|
||||||
|
├── IMPLEMENTATION.md ← Phase 2 build runbook (12 steps)
|
||||||
|
│
|
||||||
|
├── .github/
|
||||||
|
│ ├── copilot-instructions.md
|
||||||
|
│ └── instructions/
|
||||||
|
│ ├── python.instructions.md
|
||||||
|
│ ├── mcp.instructions.md
|
||||||
|
│ ├── cli.instructions.md
|
||||||
|
│ ├── tests.instructions.md
|
||||||
|
│ ├── clean-code.instructions.md
|
||||||
|
│ └── docs.instructions.md ← context7 lookup rules
|
||||||
|
│
|
||||||
|
├── finn_eiendom/ ← the package
|
||||||
|
│ ├── __init__.py
|
||||||
|
│ ├── __main__.py ← python -m finn_eiendom → CLI
|
||||||
|
│ ├── config.py ← env vars, defaults, TTLs
|
||||||
|
│ ├── models.py ← Pydantic v2 models
|
||||||
|
│ ├── parser.py ← Norwegian number/area/URL/finnkode normalization
|
||||||
|
│ ├── http.py ← async httpx client w/ retry + delay
|
||||||
|
│ ├── cache.py ← SQLite schema + persistence
|
||||||
|
│ ├── search.py ← FINN search HTML parsing
|
||||||
|
│ ├── ad.py ← FINN listing HTML parsing
|
||||||
|
│ ├── eiendom_no.py ← Eiendom.no unit search/detail, unit_vector, comps
|
||||||
|
│ ├── scoring.py ← score model + classifications
|
||||||
|
│ ├── feedback.py ← verdicts + soft preference signal
|
||||||
|
│ ├── analysis.py ← shortlist + summary assembly
|
||||||
|
│ ├── service.py ← get_or_fetch_* + thin facade for MCP and CLI
|
||||||
|
│ ├── formatting.py ← render_* helpers (json/markdown/table) — shared by MCP and CLI
|
||||||
|
│ ├── mcp_server.py ← FastMCP wrappers around service.py
|
||||||
|
│ └── cli.py ← typer wrappers around service.py
|
||||||
|
│
|
||||||
|
├── tests/
|
||||||
|
│ ├── conftest.py
|
||||||
|
│ ├── fixtures.py
|
||||||
|
│ ├── fixtures/ ← HTML + JSON samples
|
||||||
|
│ ├── test_parser.py
|
||||||
|
│ ├── test_search.py
|
||||||
|
│ ├── test_ad.py
|
||||||
|
│ ├── test_eiendom_no.py
|
||||||
|
│ ├── test_scoring.py
|
||||||
|
│ ├── test_cache.py
|
||||||
|
│ ├── test_http.py ← retry + delay behavior
|
||||||
|
│ ├── test_service.py ← get_or_fetch_* + analyze_*
|
||||||
|
│ ├── test_formatting.py ← render_* roundtrips
|
||||||
|
│ ├── test_models.py ← Pydantic v2 roundtrips
|
||||||
|
│ ├── test_mcp_server.py ← tool registration + error envelope
|
||||||
|
│ ├── test_cli.py ← Typer CliRunner
|
||||||
|
│ └── test_architecture.py ← import-graph invariants (PRD A10)
|
||||||
|
│
|
||||||
|
└── data/ ← gitignored; SQLite cache lives here
|
||||||
|
└── finn.sqlite
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Module responsibilities
|
||||||
|
|
||||||
|
Single-home rule: every concern lives in exactly one module. See `PRD.md` §17.2 for the full table.
|
||||||
|
|
||||||
|
| Module | Owns | Imports allowed |
|
||||||
|
| --------------- | --------------------------------------------------------------------- | ---------------------------------------------------------- |
|
||||||
|
| `config.py` | env-var loading, defaults, TTL constants | stdlib |
|
||||||
|
| `models.py` | Pydantic v2 models | stdlib, `pydantic` |
|
||||||
|
| `parser.py` | Norwegian text normalization (numbers, dates, URLs, finnkode) | stdlib |
|
||||||
|
| `http.py` | async `httpx.AsyncClient`, retry on 5xx, delay, user-agent | stdlib, `httpx` |
|
||||||
|
| `cache.py` | SQLite schema, reads, writes, TTL | stdlib, `sqlite3`, `models` |
|
||||||
|
| `search.py` | FINN search HTML → cards (BeautifulSoup) | stdlib, `bs4`, `parser`, `http`, `cache`, `models` |
|
||||||
|
| `ad.py` | FINN listing HTML → `FinnAd` (BeautifulSoup) | stdlib, `bs4`, `parser`, `http`, `cache`, `models` |
|
||||||
|
| `eiendom_no.py` | Eiendom.no unit search/detail, unit_vector, similar-units (msgpack) | stdlib, `msgpack`, `http`, `cache`, `models` |
|
||||||
|
| `scoring.py` | 9 score components, total clamping, category classifier | stdlib, `models` |
|
||||||
|
| `feedback.py` | feedback storage and retrieval | stdlib, `cache`, `models` |
|
||||||
|
| `analysis.py` | shortlist + summary assembly | stdlib, `search`, `ad`, `eiendom_no`, `scoring`, `feedback`|
|
||||||
|
| `service.py` | cache-aware orchestration; the only place that combines fetch + cache | stdlib, `config`, `cache`, `analysis`, `ad`, `eiendom_no`, `feedback`, `scoring`, `models` |
|
||||||
|
| `formatting.py` | render_* helpers (json/markdown/table) | stdlib, `models` |
|
||||||
|
| `mcp_server.py` | FastMCP tool definitions, error wrapping, stdio/HTTP entry | stdlib, `mcp`, `pydantic`, `service`, `formatting`, `config`, `models` |
|
||||||
|
| `cli.py` | typer command definitions, --format dispatch | stdlib, `typer`, `service`, `formatting`, `config`, `models` |
|
||||||
|
|
||||||
|
`mcp_server.py` and `cli.py` are siblings — they never import each other. `service.py` never imports `mcp_server` or `cli`. `tests/test_architecture.py` enforces all of this.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Entry points
|
||||||
|
|
||||||
|
Defined in `pyproject.toml`:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[project.scripts]
|
||||||
|
finn-eiendom-mcp = "finn_eiendom.mcp_server:main"
|
||||||
|
finn-eiendom = "finn_eiendom.cli:app"
|
||||||
|
```
|
||||||
|
|
||||||
|
So you have:
|
||||||
|
|
||||||
|
* `finn-eiendom-mcp` — MCP server over stdio (what Claude Desktop calls).
|
||||||
|
* `finn-eiendom` — CLI with all subcommands.
|
||||||
|
* `python -m finn_eiendom` — same as `finn-eiendom` (via `__main__.py`).
|
||||||
|
* `import finn_eiendom` — the library, for tests and notebooks.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependency graph
|
||||||
|
|
||||||
|
```
|
||||||
|
cli.py mcp_server.py
|
||||||
|
↓ ↓
|
||||||
|
└──> formatting.py <──┘
|
||||||
|
│
|
||||||
|
↓
|
||||||
|
service.py
|
||||||
|
↓
|
||||||
|
analysis.py
|
||||||
|
↓
|
||||||
|
┌───────────┼──────────────┐
|
||||||
|
↓ ↓ ↓
|
||||||
|
search.py ad.py eiendom_no.py scoring.py feedback.py
|
||||||
|
│ │ │ │ │
|
||||||
|
↓ ↓ ↓ ↓ ↓
|
||||||
|
parser.py parser.py cache.py models.py cache.py
|
||||||
|
│ │ │
|
||||||
|
↓ ↓ ↓
|
||||||
|
http.py http.py http.py
|
||||||
|
```
|
||||||
|
|
||||||
|
Bottom layer: `parser.py`, `http.py`, `cache.py`, `models.py`, `config.py`. They depend only on the stdlib plus at most one third-party library each (`cache.py` additionally imports `models`).
|
||||||
|
|
||||||
|
The graph is acyclic and points downward: a module may import from layers below it, never from a layer above.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Where to add things
|
||||||
|
|
||||||
|
| You want to… | Add it to… |
|
||||||
|
| ----------------------------------------- | --------------------------------------- |
|
||||||
|
| Parse a new FINN field | `ad.py` or `search.py` + `models.py` |
|
||||||
|
| Add a new score component | `scoring.py` |
|
||||||
|
| Add a new env var | `config.py` |
|
||||||
|
| Add a new MCP tool | `mcp_server.py` (after `service.py`) |
|
||||||
|
| Add a new CLI command | `cli.py` (after `service.py`) |
|
||||||
|
| Change how something renders | `formatting.py` |
|
||||||
|
| Add a new orchestration / workflow | `service.py` (then add MCP + CLI) |
|
||||||
|
| Speak to a new external API | new module next to `eiendom_no.py` |
|
||||||
|
| Add a new SQLite table | `cache.py` |
|
||||||
|
|
||||||
|
For anything else — read `PRD.md` §17.2 and §17.7.
|
||||||
@@ -0,0 +1,160 @@
|
|||||||
|
# finn-eiendom-mcp
|
||||||
|
|
||||||
|
> **Private, self-hosted property analysis platform for Norwegian real estate.** Analyzes FINN listings, enriches with Eiendom.no estimates, scores against personal preferences, and surfaces bargain candidates, hybel potential, renovation upside, and risk flags. Exposed through an MCP server, a CLI, and a Python library — all sharing one service layer.
|
||||||
|
|
||||||
|
This is a **personal tool**. Not a SaaS, not a crawler, not legal/financial advice. Run locally, low frequency, your own data.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What it does
|
||||||
|
|
||||||
|
```
|
||||||
|
FINN search URL → ranked shortlist of homes
|
||||||
|
with reasons, risks, comps, broker questions
|
||||||
|
```
|
||||||
|
|
||||||
|
Specifically:
|
||||||
|
|
||||||
|
* Parses FINN search and listing pages.
|
||||||
|
* Resolves each listing to an Eiendom.no `unitCode` for valuation and similar-units.
|
||||||
|
* Builds a `unit_vector` and fetches recently-sold comparables.
|
||||||
|
* Scores 9 components (economy, market position, comps, location, layout, outdoor, hybel, renovation, risk).
|
||||||
|
* Classifies as *bargain*, *safe*, *hybel*, *renovation*, *lifestyle*, or *risk*.
|
||||||
|
* Caches everything in SQLite; remembers what you've liked or rejected.
|
||||||
|
* Detects new / removed / changed listings between runs.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Three ways to use it
|
||||||
|
|
||||||
|
| Surface | When you want… | Entry point |
|
||||||
|
| --------------- | -------------------------------------------------------------- | ----------------------- |
|
||||||
|
| **CLI** | Quick triage in a terminal, scripting, cron | `finn-eiendom ...` |
|
||||||
|
| **MCP server** | Claude Desktop, n8n, AI agents — conversational analysis | `finn-eiendom-mcp` |
|
||||||
|
| **Python lib** | Tests, notebooks, custom scripts | `import finn_eiendom` |
|
||||||
|
|
||||||
|
All three call the same underlying `service.py` — same defaults, same semantics, same results.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quick start
|
||||||
|
|
||||||
|
### Requirements
|
||||||
|
|
||||||
|
* Python **3.12+**
|
||||||
|
* `uv` (recommended) or `pip`
|
||||||
|
|
||||||
|
### Install
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone <your-repo-url> finn-mcp
|
||||||
|
cd finn-mcp
|
||||||
|
|
||||||
|
uv venv # or: python3.12 -m venv .venv
|
||||||
|
source .venv/bin/activate
|
||||||
|
uv pip install -e ".[dev]" # or: pip install -e ".[dev]"
|
||||||
|
```
|
||||||
|
|
||||||
|
### First run (CLI)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Triage a FINN search
|
||||||
|
finn-eiendom analyze-search 'https://www.finn.no/realestate/homes/search.html?location=...' --format table
|
||||||
|
|
||||||
|
# Drill into one listing
|
||||||
|
finn-eiendom get-ad 462400360 --format markdown
|
||||||
|
|
||||||
|
# Mark a listing as liked
|
||||||
|
finn-eiendom save-feedback 462400360 liked --notes "great layout, check fellesgjeld"
|
||||||
|
|
||||||
|
# Find properties similar to a liked listing
|
||||||
|
finn-eiendom similar-to-liked 462400360
|
||||||
|
```
|
||||||
|
|
||||||
|
### First run (Claude Desktop)
|
||||||
|
|
||||||
|
Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or the equivalent on Linux:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"mcpServers": {
|
||||||
|
"finn-eiendom": {
|
||||||
|
"command": "/absolute/path/to/finn-mcp/.venv/bin/finn-eiendom-mcp",
|
||||||
|
"env": {
|
||||||
|
"FINN_CACHE_PATH": "/absolute/path/to/finn-mcp/data/finn.sqlite",
|
||||||
|
"EIENDOM_NO_ENABLED": "true"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Restart Claude Desktop. Then in any chat:
|
||||||
|
|
||||||
|
> Analyze this FINN search and shortlist the top 5 for a couple in Oslo with a 9–12 MNOK budget, willing to renovate, prefer hybel potential:
|
||||||
|
> `https://www.finn.no/realestate/homes/search.html?location=...`
|
||||||
|
|
||||||
|
For deep usage — every command, every MCP tool, every env var — see [`USAGE.md`](USAGE.md).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture in one screen
|
||||||
|
|
||||||
|
```
|
||||||
|
CLI (typer) MCP server (FastMCP) ← thin, parallel front ends
|
||||||
|
\ /
|
||||||
|
\ /
|
||||||
|
service.py ← cache + fetch orchestration
|
||||||
|
↓
|
||||||
|
analysis.py ← shortlist + summary
|
||||||
|
↓
|
||||||
|
search / ad / eiendom_no / scoring / feedback
|
||||||
|
↓
|
||||||
|
parser / http / cache (SQLite)
|
||||||
|
↓
|
||||||
|
FINN HTML + Eiendom.no JSON
|
||||||
|
```
|
||||||
|
|
||||||
|
`formatting.py` lives next to `service.py` and is shared by both CLI and MCP for JSON / markdown / table rendering.
|
||||||
|
|
||||||
|
**Key rule:** CLI and MCP are siblings. They never call each other. Both call the same `service.py` functions. See [`PRD.md`](PRD.md) §17 for the full code-ownership constitution.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Project documents
|
||||||
|
|
||||||
|
Read in this order depending on what you're doing:
|
||||||
|
|
||||||
|
| If you want to… | Read |
|
||||||
|
| ------------------------------------- | --------------------------------------------------- |
|
||||||
|
| Use the tool | This README, then [`USAGE.md`](USAGE.md) |
|
||||||
|
| Understand the design | [`PRD.md`](PRD.md), especially §1, §12, §17 |
|
||||||
|
| Contribute / extend / hack on it | [`AGENTS.md`](AGENTS.md), then [`PROJECT.md`](PROJECT.md), then `.github/instructions/*.md` |
|
||||||
|
| Run the cleanup pass on the repo | [`CLEANUP.md`](CLEANUP.md) |
|
||||||
|
| Build out unfinished features | [`IMPLEMENTATION.md`](IMPLEMENTATION.md) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Status
|
||||||
|
|
||||||
|
* **Phase 0 (spike):** done.
|
||||||
|
* **Phase 1 (core MVP):** mostly done.
|
||||||
|
* **Phase 2 (MCP + CLI):** in progress — driven by [`IMPLEMENTATION.md`](IMPLEMENTATION.md).
|
||||||
|
* **Phase 3+ (scoring v2, agent workflows, dashboard):** future.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Safety and compliance
|
||||||
|
|
||||||
|
* Private, low-frequency, user-triggered use only. No public deployment.
|
||||||
|
* Configurable request delays (`FINN_REQUEST_DELAY_SECONDS`, `EIENDOM_NO_REQUEST_DELAY_SECONDS`) — defaults are conservative.
|
||||||
|
* Aggressive caching to minimize external requests.
|
||||||
|
* No bypassing of rate limits, bot protection, authentication, or access controls.
|
||||||
|
* No public redistribution of FINN or Eiendom.no data.
|
||||||
|
* Scores, estimates, and comparable sales are **decision support, not advice**. Don't substitute this for a real broker, lawyer, or technical inspector.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## License / use
|
||||||
|
|
||||||
|
Personal project. Not for redistribution. Don't expose the MCP HTTP transport on a public interface — keep it on LAN, Tailscale, or behind auth.
|
||||||
@@ -0,0 +1,503 @@
|
|||||||
|
# USAGE.md — finn-eiendom user guide
|
||||||
|
|
||||||
|
How to use the tool day-to-day. Covers installation, every CLI command, every MCP tool, Claude Desktop integration, common workflows, environment variables, and troubleshooting.
|
||||||
|
|
||||||
|
For the why and the architecture, see [`README.md`](README.md) and [`PRD.md`](PRD.md).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. Installation
|
||||||
|
|
||||||
|
### Requirements
|
||||||
|
|
||||||
|
* Python **3.12 or newer** (check with `python3 --version`)
|
||||||
|
* `uv` (recommended) or `pip`
|
||||||
|
* macOS, Linux, or WSL2 on Windows
|
||||||
|
|
||||||
|
### Install
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone <your-repo-url> finn-mcp
|
||||||
|
cd finn-mcp
|
||||||
|
|
||||||
|
# Option A: uv (preferred — fast)
|
||||||
|
uv venv
|
||||||
|
source .venv/bin/activate
|
||||||
|
uv pip install -e ".[dev]"
|
||||||
|
|
||||||
|
# Option B: pip
|
||||||
|
python3.12 -m venv .venv
|
||||||
|
source .venv/bin/activate
|
||||||
|
pip install -e ".[dev]"
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom --help
|
||||||
|
finn-eiendom-mcp --help          # may exit immediately in stdio mode; that's fine
|
||||||
|
finn-eiendom doctor # smoke-checks cache, FINN, Eiendom.no reachability
|
||||||
|
```
|
||||||
|
|
||||||
|
### Updating
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git pull
|
||||||
|
source .venv/bin/activate
|
||||||
|
uv pip install -e ".[dev]"
|
||||||
|
```
|
||||||
|
|
||||||
|
If `pyproject.toml` added dependencies, the final `uv pip install` command picks them up.
|
||||||
|
|
||||||
|
### Global install (optional)
|
||||||
|
|
||||||
|
If you want `finn-eiendom` available system-wide without activating the venv:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
uv tool install .
|
||||||
|
# or
|
||||||
|
pipx install .
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. First-time setup
|
||||||
|
|
||||||
|
### Set up the data directory
|
||||||
|
|
||||||
|
```bash
|
||||||
|
mkdir -p data
|
||||||
|
```
|
||||||
|
|
||||||
|
SQLite cache lives there at `data/finn.sqlite` by default. Override with `FINN_CACHE_PATH` if you want it elsewhere.
|
||||||
|
|
||||||
|
### Optional: environment file
|
||||||
|
|
||||||
|
Create `.env` in the project root for your usual settings:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
FINN_CACHE_PATH=data/finn.sqlite
|
||||||
|
FINN_MAX_SEARCH_PAGES=3
|
||||||
|
FINN_DETAIL_LIMIT=20
|
||||||
|
EIENDOM_NO_ENABLED=true
|
||||||
|
EIENDOM_NO_SIMILAR_UNITS_ENABLED=true
|
||||||
|
LOG_LEVEL=INFO
|
||||||
|
```
|
||||||
|
|
||||||
|
See §7 for the full list of variables.
|
||||||
|
|
||||||
|
### Verify
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom doctor
|
||||||
|
```
|
||||||
|
|
||||||
|
This pings the cache, reaches FINN once, reaches Eiendom.no once, and reports any failures.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. CLI reference
|
||||||
|
|
||||||
|
Run every command from inside the activated venv, unless you used the optional global install from §1.
|
||||||
|
|
||||||
|
### 3.1 Analyze a FINN search
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom analyze-search '<finn-search-url>' [options]
|
||||||
|
```
|
||||||
|
|
||||||
|
| Option | Default | Purpose |
|
||||||
|
| ------------------- | ------- | ---------------------------------------------------------- |
|
||||||
|
| `--max-pages N` | `3` | Pages of search results to fetch. |
|
||||||
|
| `--detail-limit N` | `20` | How many listings to detail-fetch from the result set. |
|
||||||
|
| `--no-details` | off | Skip detail fetches; use only search-card data. |
|
||||||
|
| `--no-eiendom` | off | Skip Eiendom.no enrichment. |
|
||||||
|
| `--with-similar` | off | Fetch similar-units / comps for shortlisted listings. |
|
||||||
|
| `--format FMT` | `json` | `json`, `markdown`, or `table`. |
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Triage in the terminal
|
||||||
|
finn-eiendom analyze-search 'https://www.finn.no/realestate/homes/search.html?location=0.20061&min_bedrooms=2&price_collective_to=12000000' --format table
|
||||||
|
|
||||||
|
# Full JSON for piping into jq
|
||||||
|
finn-eiendom analyze-search '<url>' --format json | jq '.shortlist[].title'
|
||||||
|
|
||||||
|
# Detailed run with comps
|
||||||
|
finn-eiendom analyze-search '<url>' --detail-limit 30 --with-similar --format markdown
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3.2 Drill into one listing
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom get-ad <finnkode> [options]
|
||||||
|
```
|
||||||
|
|
||||||
|
| Option | Default | Purpose |
|
||||||
|
| ------------------- | ------- | -------------------------------------------------- |
|
||||||
|
| `--force-refresh` | off | Bypass the 24h cache and refetch. |
|
||||||
|
| `--no-eiendom` | off | Skip Eiendom.no enrichment. |
|
||||||
|
| `--with-similar` | off | Fetch similar-units / comps. |
|
||||||
|
| `--format FMT` | `json` | `json` or `markdown`. |
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom get-ad 462400360 --format markdown
|
||||||
|
finn-eiendom get-ad 462400360 --force-refresh --with-similar
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3.3 Compare listings
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom compare <finnkode> <finnkode> [<finnkode>...] [options]
|
||||||
|
```
|
||||||
|
|
||||||
|
| Option | Default | Purpose |
|
||||||
|
| ---------------- | ------- | -------------------------------------- |
|
||||||
|
| `--no-eiendom` | off | Skip Eiendom.no enrichment. |
|
||||||
|
| `--no-comps` | off | Skip similar-units / comps. |
|
||||||
|
| `--format FMT` | `json` | `json`, `markdown`, or `table`. |
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom compare 462400360 461153194 --format markdown
|
||||||
|
finn-eiendom compare 462400360 461153194 459333210 --format table
|
||||||
|
```
|
||||||
|
|
||||||
|
Up to 10 finnkoder per call.
|
||||||
|
|
||||||
|
### 3.4 Feedback
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom save-feedback <finnkode> <verdict> [--notes "..."]
|
||||||
|
```
|
||||||
|
|
||||||
|
Verdict vocabulary: `liked`, `rejected`, `interesting`, `bargain_candidate`, `risk_object`, `viewing_candidate`, `viewed`, `too_expensive`, `too_small`, `too_far_out`, `too_high_risk`, `likes_location`, `likes_layout`, `dislikes_area`.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom save-feedback 462400360 liked --notes "balcony, view, check wet rooms"
|
||||||
|
finn-eiendom save-feedback 461153194 rejected --notes "too far from city center"
|
||||||
|
```
|
||||||
|
|
||||||
|
`liked` verdicts feed the `similar-to-liked` command.
|
||||||
|
|
||||||
|
### 3.5 New / removed / changed listings
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom diff '<finn-search-url>' [--format FMT]
|
||||||
|
```
|
||||||
|
|
||||||
|
Compares the current search results against the previous run for the same normalized URL and reports new finnkoder, removed finnkoder, and changed listings (price, common costs, status).
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom diff '<url>' --format table
|
||||||
|
```
|
||||||
|
|
||||||
|
Useful as a daily cron:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
0 9 * * * cd /path/to/finn-mcp && .venv/bin/finn-eiendom diff 'https://www.finn.no/...' --format markdown >> diff.log
|
||||||
|
```
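
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the snapshot shape (finnkode mapped to a dict of fields), the `SearchDiff` class, and the `diff_snapshots` helper are hypothetical names, not the project's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical snapshot shape: finnkode -> the fields the diff tracks.
Snapshot = dict[str, dict[str, object]]
TRACKED_FIELDS = ("price", "common_costs", "status")


@dataclass
class SearchDiff:
    new: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)
    # finnkode -> {field name: (old value, new value)}
    changed: dict[str, dict[str, tuple[object, object]]] = field(default_factory=dict)


def diff_snapshots(previous: Snapshot, current: Snapshot) -> SearchDiff:
    """Compare two runs of the same normalized search URL."""
    result = SearchDiff(
        new=sorted(current.keys() - previous.keys()),
        removed=sorted(previous.keys() - current.keys()),
    )
    # Only listings present in both runs can count as "changed".
    for finnkode in sorted(current.keys() & previous.keys()):
        before, after = previous[finnkode], current[finnkode]
        deltas = {
            name: (before.get(name), after.get(name))
            for name in TRACKED_FIELDS
            if before.get(name) != after.get(name)
        }
        if deltas:
            result.changed[finnkode] = deltas
    return result


previous = {
    "462400360": {"price": 9_500_000, "common_costs": 4200, "status": "active"},
    "461153194": {"price": 11_000_000, "common_costs": 3900, "status": "active"},
}
current = {
    "462400360": {"price": 9_200_000, "common_costs": 4200, "status": "active"},
    "459333210": {"price": 10_400_000, "common_costs": 5100, "status": "active"},
}
report = diff_snapshots(previous, current)
```

Here `report.new` holds the finnkode that appeared, `report.removed` the one that vanished, and `report.changed` the price drop on the listing present in both runs.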
|
||||||
|
|
||||||
|
### 3.6 Shortlist history
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom shortlist [--run-id ID] [--limit N] [--format FMT]
|
||||||
|
```
|
||||||
|
|
||||||
|
Without `--run-id`, returns the latest saved shortlist.
|
||||||
|
|
||||||
|
### 3.7 Eiendom.no commands
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom resolve-unit '<finn-listing-url>' # find unitCode for a FINN listing
|
||||||
|
finn-eiendom get-unit <unit_code> [--force-refresh] # fetch unit detail
|
||||||
|
finn-eiendom enrich-ad <finnkode> [--with-similar] # FINN + Eiendom.no combined
|
||||||
|
finn-eiendom build-vector <unit_code> # build the base64url unit_vector
|
||||||
|
finn-eiendom decode-vector <unit_vector> # decode for inspection
|
||||||
|
finn-eiendom similar-units <unit_vector> [--status RECENTLY_SOLD|FOR_SALE|CURRENT]
|
||||||
|
```
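
For readers unfamiliar with base64url, the following sketch shows the general encode/decode round trip. The real layout of a `unit_vector` is not documented in this guide, so the JSON payload, the field names, and the helpers `encode_vector` / `decode_vector` are assumptions made purely for illustration.

```python
import base64
import json


def encode_vector(fields: dict[str, object]) -> str:
    """Serialize fields to compact JSON, then base64url without padding."""
    raw = json.dumps(fields, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_vector(unit_vector: str) -> dict[str, object]:
    """Restore the stripped padding, then decode back to a dict."""
    padded = unit_vector + "=" * (-len(unit_vector) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical payload; the actual vector fields may differ.
example_fields = {"area_m2": 72, "rooms": 3, "unit_code": "example-unit"}
encoded = encode_vector(example_fields)
decoded = decode_vector(encoded)
```

The URL-safe alphabet avoids `+` and `/`, which is why such a value can be passed around in query strings and CLI arguments without extra escaping.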
|
||||||
|
|
||||||
|
### 3.8 Find similar to liked
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom similar-to-liked <finnkode> [--mode recommendations|comps] [--status STATUS]
|
||||||
|
```
|
||||||
|
|
||||||
|
The listing must have a `liked` feedback row. Defaults to `mode=recommendations`, `status=FOR_SALE` — i.e. find active listings similar to this one. Use `--mode comps --status RECENTLY_SOLD` to get comparable sales instead.
|
||||||
|
|
||||||
|
### 3.9 Price analysis against comps
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom analyze-against-comps <finnkode>
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns `price_position` (`below_estimate` / `within_range` / `above_estimate`), `sqm_price_position` (`cheap` / `normal` / `expensive`), `comparable_score`, and a `confidence` label.
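
As a rough mental model, the two position labels could be derived as below. The labels come from the description above; the inputs (an estimate range and a list of comparable square-metre prices), the 10% tolerance, and the function names are assumptions for this sketch, not the tool's confirmed scoring logic.

```python
import statistics


def price_position(asking_price: int, estimate_low: int, estimate_high: int) -> str:
    """Place the asking price relative to an estimate range."""
    if estimate_low > estimate_high:
        raise ValueError("estimate_low must not exceed estimate_high")
    if asking_price < estimate_low:
        return "below_estimate"
    if asking_price > estimate_high:
        return "above_estimate"
    return "within_range"


def sqm_price_position(
    sqm_price: float, comp_sqm_prices: list[float], tolerance: float = 0.10
) -> str:
    """Compare a square-metre price with the median of the comps.

    The tolerance band around the median is a hypothetical choice.
    """
    median = statistics.median(comp_sqm_prices)
    if sqm_price < median * (1 - tolerance):
        return "cheap"
    if sqm_price > median * (1 + tolerance):
        return "expensive"
    return "normal"


comps = [95_000.0, 100_000.0, 105_000.0]
position = price_position(9_000_000, 9_500_000, 10_500_000)
sqm_position = sqm_price_position(80_000.0, comps)
```

A wider tolerance makes the `normal` band more forgiving, which may be reasonable when only a handful of comps are available.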
|
||||||
|
|
||||||
|
### 3.10 Cache management
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom cache stats # row counts and TTL summary
|
||||||
|
finn-eiendom cache clear # purge everything except feedback
|
||||||
|
finn-eiendom cache clear-html # only purge raw HTML
|
||||||
|
finn-eiendom cache clear-json # only purge raw JSON
|
||||||
|
```
|
||||||
|
|
||||||
|
`cache clear` never purges feedback; feedback rows stay until you delete them explicitly via SQL.
|
||||||
|
|
||||||
|
### 3.11 MCP server
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom serve # stdio (default)
|
||||||
|
finn-eiendom serve --transport http --port 8010 # HTTP for n8n / multi-client
|
||||||
|
```
|
||||||
|
|
||||||
|
In HTTP mode the server listens on `http://127.0.0.1:8010/mcp` with operational endpoints `GET /health`, `GET /version`, `GET /debug/config`.
|
||||||
|
|
||||||
|
There's also a shorthand `finn-eiendom-mcp` that starts stdio mode directly — that's what Claude Desktop calls.
|
||||||
|
|
||||||
|
### 3.12 Misc
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom config show # print resolved configuration
|
||||||
|
finn-eiendom config path # print SQLite cache path
|
||||||
|
finn-eiendom doctor # smoke checks
|
||||||
|
finn-eiendom version
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. MCP tools (for Claude Desktop / n8n / agents)
|
||||||
|
|
||||||
|
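All tools use the `finn_` prefix. Each tool mirrors a CLI command with the same defaults and the same semantics; housekeeping commands such as `cache`, `config`, and `doctor` have no counterpart in the table below.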
|
||||||
|
|
||||||
|
| Tool | Purpose |
|
||||||
|
| ------------------------------------- | ---------------------------------------------------------------- |
|
||||||
|
| `finn_analyze_search` | Analyze a FINN search URL and return a ranked shortlist. |
|
||||||
|
| `finn_get_ad` | Fetch structured data for one finnkode. |
|
||||||
|
| `finn_compare_ads` | Compare multiple listings side by side. |
|
||||||
|
| `finn_save_feedback` | Store feedback/verdict/notes. |
|
||||||
|
| `finn_get_shortlist` | Fetch a stored shortlist from a previous run. |
|
||||||
|
| `finn_get_new_ads_since_last_run` | Detect new / removed / changed listings. |
|
||||||
|
| `finn_resolve_eiendom_unit` | Map FINN URL → Eiendom.no `unitCode`. |
|
||||||
|
| `finn_get_eiendom_unit` | Fetch Eiendom.no unit detail by `unitCode`. |
|
||||||
|
| `finn_enrich_ad` | Combine FINN listing + Eiendom.no enrichment. |
|
||||||
|
| `finn_build_unit_vector` | Build a `unit_vector` from a `unitCode`. |
|
||||||
|
| `finn_decode_unit_vector` | Decode a `unit_vector` for inspection. |
|
||||||
|
| `finn_get_similar_units` | Fetch comps / recommendations. |
|
||||||
|
| `finn_find_similar_to_liked_ad` | Find properties similar to one you liked. |
|
||||||
|
| `finn_analyze_ad_against_comps` | Evaluate a listing against `RECENTLY_SOLD` comps. |
|
||||||
|
|
||||||
|
Every tool accepts a `response_format` parameter (`"json"` or `"markdown"`). Errors come back as `{"error": true, "code": "<ExceptionName>", "message": "..."}`.
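
A client can rely on that error shape when handling tool results. The sketch below shows one way such an envelope might be produced; `error_envelope` and `ListingNotFoundError` are hypothetical names used only for illustration.

```python
import json


class ListingNotFoundError(Exception):
    """Hypothetical exception type, used only for this example."""


def error_envelope(exc: Exception) -> dict[str, object]:
    """Map an exception to the documented error shape."""
    return {"error": True, "code": type(exc).__name__, "message": str(exc)}


try:
    raise ListingNotFoundError("finnkode 462400360 not found")
except ListingNotFoundError as exc:
    payload = json.dumps(error_envelope(exc))
```

Because `code` carries the exception class name, callers can branch on it without parsing the human-readable `message`.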
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. Claude Desktop setup
|
||||||
|
|
||||||
|
### Config file
|
||||||
|
|
||||||
|
* macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
|
||||||
|
* Linux: `~/.config/Claude/claude_desktop_config.json`
|
||||||
|
|
||||||
|
### Direct entry-point (recommended)
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"mcpServers": {
|
||||||
|
"finn-eiendom": {
|
||||||
|
"command": "/absolute/path/to/finn-mcp/.venv/bin/finn-eiendom-mcp",
|
||||||
|
"env": {
|
||||||
|
"FINN_CACHE_PATH": "/absolute/path/to/finn-mcp/data/finn.sqlite",
|
||||||
|
"EIENDOM_NO_ENABLED": "true",
|
||||||
|
"EIENDOM_NO_SIMILAR_UNITS_ENABLED": "true",
|
||||||
|
"LOG_LEVEL": "INFO"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `command` **must** be the absolute path to the venv's `finn-eiendom-mcp` binary. Don't rely on `$PATH` here — Claude Desktop doesn't inherit your shell environment.
|
||||||
|
|
||||||
|
### Alternative: via `uv`
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"mcpServers": {
|
||||||
|
"finn-eiendom": {
|
||||||
|
"command": "uv",
|
||||||
|
"args": ["run", "finn-eiendom-mcp"],
|
||||||
|
"cwd": "/absolute/path/to/finn-mcp"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Verify
|
||||||
|
|
||||||
|
1. Restart Claude Desktop.
|
||||||
|
2. Look for `finn-eiendom` in the MCP servers indicator (usually a hammer icon).
|
||||||
|
3. Ask in any chat: *"Use the finn-eiendom server to analyze this search: ..."*
|
||||||
|
|
||||||
|
If it doesn't show up, check the Claude Desktop logs:
|
||||||
|
|
||||||
|
* macOS: `~/Library/Logs/Claude/mcp-server-finn-eiendom.log`
|
||||||
|
* Linux: `~/.local/share/Claude/logs/mcp-server-finn-eiendom.log`
|
||||||
|
|
||||||
|
Anything the server writes to stdout is a fatal error in stdio mode, because stdout carries the JSON-RPC stream. The server must log only to stderr.
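
A minimal sketch of stderr-only logging with the standard `logging` module follows. The logger name `finn_eiendom` and the `configure_logging` helper are assumptions; the project's own setup may differ.

```python
import logging
import os
import sys


def configure_logging() -> logging.Logger:
    """Attach a single stderr handler so stdout stays free for JSON-RPC."""
    level = getattr(logging, os.environ.get("LOG_LEVEL", "INFO").upper(), None)
    if not isinstance(level, int):
        level = logging.INFO  # fall back on unknown level names

    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )

    logger = logging.getLogger("finn_eiendom")  # assumed logger name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


logger = configure_logging()
logger.info("server starting")
```

With this in place, `logger.info(...)` replaces any `print()` call, so nothing but protocol frames reaches stdout.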
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6. Common workflows
|
||||||
|
|
||||||
|
### 6.1 Daily triage
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Morning routine
|
||||||
|
finn-eiendom diff 'https://www.finn.no/...' --format table
|
||||||
|
# Re-analyze with a small detail budget; listings still in the 24h cache are not refetched
|
||||||
|
finn-eiendom analyze-search 'https://www.finn.no/...' --detail-limit 10 --format markdown
|
||||||
|
```
|
||||||
|
|
||||||
|
### 6.2 Weekly deep dive in Claude Desktop
|
||||||
|
|
||||||
|
> Read my latest finn-eiendom shortlist and group the top 10 by category (bargain / safe / hybel / lifestyle). For each, summarize the three most important risks and the three most important broker questions.
|
||||||
|
|
||||||
|
### 6.3 Pre-viewing prep
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Mark candidates for viewing
|
||||||
|
finn-eiendom save-feedback 462400360 viewing_candidate --notes "Saturday 14:00"
|
||||||
|
# Get the full data + comps
|
||||||
|
finn-eiendom get-ad 462400360 --with-similar --format markdown > viewing_prep_462400360.md
|
||||||
|
```
|
||||||
|
|
||||||
|
Then in Claude Desktop:
|
||||||
|
|
||||||
|
> Read the saved markdown for finnkode 462400360 and prepare a viewing checklist: wet rooms to inspect, common-costs questions, hybel-approval question, neighbor questions.
|
||||||
|
|
||||||
|
### 6.4 Comparing finalists
|
||||||
|
|
||||||
|
```bash
|
||||||
|
finn-eiendom compare 462400360 461153194 459333210 --format markdown > finalists.md
|
||||||
|
```
|
||||||
|
|
||||||
|
### 6.5 Build a recommendation set from liked properties
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# After you've liked a few
|
||||||
|
finn-eiendom save-feedback 462400360 liked
|
||||||
|
finn-eiendom save-feedback 461153194 liked
|
||||||
|
|
||||||
|
# Get recommendations similar to each
|
||||||
|
finn-eiendom similar-to-liked 462400360 --mode recommendations --status FOR_SALE
|
||||||
|
finn-eiendom similar-to-liked 461153194 --mode recommendations --status FOR_SALE
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7. Environment variables
|
||||||
|
|
||||||
|
| Variable | Default | Purpose |
|
||||||
|
| ----------------------------------------- | -------------------------------: | -------------------------------- |
|
||||||
|
| `FINN_CACHE_PATH` | `data/finn.sqlite` | SQLite DB path |
|
||||||
|
| `FINN_MAX_SEARCH_PAGES` | `3` | Max search pages per analyze |
|
||||||
|
| `FINN_DETAIL_LIMIT` | `20` | Max detail fetches per analyze |
|
||||||
|
| `FINN_REQUEST_DELAY_SECONDS` | `2` | Seconds between FINN requests |
|
||||||
|
| `FINN_USER_AGENT` | `personal-finn-eiendom-analyzer/0.1` | HTTP User-Agent |
|
||||||
|
| `FINN_CACHE_TTL_SEARCH_MINUTES` | `60` | Search cache TTL |
|
||||||
|
| `FINN_CACHE_TTL_AD_HOURS` | `24` | Listing cache TTL |
|
||||||
|
| `EIENDOM_NO_ENABLED` | `true` | Enable Eiendom.no enrichment |
|
||||||
|
| `EIENDOM_NO_BASE_URL` | `https://api.eiendom.no/api/v1` | API base URL |
|
||||||
|
| `EIENDOM_NO_CACHE_TTL_HOURS` | `24` | Unit/similar cache TTL |
|
||||||
|
| `EIENDOM_NO_REQUEST_DELAY_SECONDS` | `1` | Seconds between Eiendom.no calls |
|
||||||
|
| `EIENDOM_NO_SIMILAR_UNITS_ENABLED` | `true` | Enable similar-units |
|
||||||
|
| `EIENDOM_NO_SIMILAR_UNITS_DEFAULT_STATUS` | `RECENTLY_SOLD` | Default comps status |
|
||||||
|
| `HJEMLA_ENABLED` | `false` | Enable optional Hjemla API |
|
||||||
|
| `LOG_LEVEL` | `INFO` | Log level |
|
||||||
|
| `MCP_TRANSPORT` | `stdio` | `stdio` or `streamable_http` |
|
||||||
|
| `MCP_HTTP_HOST` | `127.0.0.1` | HTTP bind address |
|
||||||
|
| `MCP_HTTP_PORT` | `8010` | HTTP port |
|
||||||
|
|
||||||
|
Set them in `.env`, in your shell, or in the Claude Desktop `env` block per §5.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8. Troubleshooting
|
||||||
|
|
||||||
|
### Claude Desktop doesn't see the server
|
||||||
|
|
||||||
|
1. The `command` path must be absolute and point at the venv's binary.
|
||||||
|
2. Check `~/Library/Logs/Claude/mcp-server-finn-eiendom.log` (macOS) for a Python traceback.
|
||||||
|
3. The server **must not** write to stdout — any `print()` in the code breaks JSON-RPC. If you're hacking on it and see a frame parse error, that's the cause.
|
||||||
|
4. Restart Claude Desktop after config changes (`Cmd+Q`, not just close the window).
|
||||||
|
|
||||||
|
### "Module not found" when running CLI
|
||||||
|
|
||||||
|
The venv isn't activated, or the package isn't installed in editable mode.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
source .venv/bin/activate
|
||||||
|
uv pip install -e ".[dev]"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Eiendom.no enrichment is `unavailable`
|
||||||
|
|
||||||
|
This is graceful degradation when:
|
||||||
|
|
||||||
|
* The FINN URL can't be matched to a `unitCode` (rare, but happens for unusual addresses).
|
||||||
|
* Eiendom.no rate-limited or returned 5xx.
|
||||||
|
* The unit was deleted from Eiendom.no's index.
|
||||||
|
|
||||||
|
Check the log for the warning. The listing analysis continues without enrichment.
|
||||||
|
|
||||||
|
### Similar-units returns nothing
|
||||||
|
|
||||||
|
* Verify `EIENDOM_NO_SIMILAR_UNITS_ENABLED=true`.
|
||||||
|
* The `unit_vector` might be empty / malformed — check `finn-eiendom decode-vector <unit_vector>`.
|
||||||
|
* Try `--status FOR_SALE` if `RECENTLY_SOLD` is sparse, or vice versa.
|
||||||
|
|
||||||
|
### Slow first run
|
||||||
|
|
||||||
|
The first analyze fills the cache, so subsequent runs are much faster. You can lower `FINN_REQUEST_DELAY_SECONDS` and `EIENDOM_NO_REQUEST_DELAY_SECONDS` if you're impatient, but don't drop them too low: the delays exist to keep the tool polite to FINN and Eiendom.no.
|
||||||
|
|
||||||
|
### Stale results
|
||||||
|
|
||||||
|
Cache TTLs:
|
||||||
|
|
||||||
|
* Search: 60 minutes
|
||||||
|
* FINN listing: 24 hours
|
||||||
|
* Eiendom.no unit: 24 hours
|
||||||
|
* Similar-units: 24 hours
|
||||||
|
|
||||||
|
Force a refresh with `--force-refresh` on `get-ad` or `get-unit`, or wipe with `finn-eiendom cache clear`.
|
||||||
|
|
||||||
|
### `pytest` fails after pulling new changes
|
||||||
|
|
||||||
|
```bash
|
||||||
|
source .venv/bin/activate
|
||||||
|
uv pip install -e ".[dev]" # re-sync dependencies
|
||||||
|
pytest -x # find the first failure
|
||||||
|
```
|
||||||
|
|
||||||
|
If a test fails with a network-related error, that's a bug — tests should never hit the network. Report it.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 9. What this tool is not
|
||||||
|
|
||||||
|
* Not a public API. Don't expose the HTTP transport on the open internet.
|
||||||
|
* Not financial, legal, or valuation advice. Scores and estimates are decision support.
|
||||||
|
* Not a bidding agent. It will never contact a broker or place a bid for you.
|
||||||
|
* Not a crawler. Use it for the searches you'd be manually browsing anyway — at your own pace.
|
||||||
|
* Not a substitute for a real condition report (`tilstandsrapport`), a real lawyer, or a real broker.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 10. Getting help
|
||||||
|
|
||||||
|
* [`README.md`](README.md) — overview
|
||||||
|
* [`PRD.md`](PRD.md) — full product spec and architecture
|
||||||
|
* [`AGENTS.md`](AGENTS.md) — workflow rules for contributors
|
||||||
|
* [`.github/instructions/*.md`](.github/instructions/) — per-topic conventions
|
||||||
|
|
||||||
|
For bugs, open an issue in the repo with: the exact command run, the full traceback or unexpected output, the version (`finn-eiendom version`), and a redacted FINN URL if relevant.
|
||||||
@@ -0,0 +1,36 @@
|
|||||||
|
"""FINN Real Estate MCP Server - Private property analysis platform."""
|
||||||
|
|
||||||
|
__version__ = "0.1.0"
|
||||||
|
__author__ = "FINN Scout"
|
||||||
|
|
||||||
|
from . import ad, analysis, cache, config, eiendom_no, scoring, search
|
||||||
|
from .http import HTTPClient
|
||||||
|
from .models import EiendomUnit, FinnAd, FinnSearchCard, SimilarUnit, UnitVector
|
||||||
|
from .parser import (
|
||||||
|
extract_finnkode_from_url,
|
||||||
|
normalize_area,
|
||||||
|
normalize_finnkode,
|
||||||
|
normalize_number,
|
||||||
|
normalize_price,
|
||||||
|
)
|
||||||
|
|
||||||
|
__all__ = [
|
||||||
|
"config",
|
||||||
|
"FinnAd",
|
||||||
|
"FinnSearchCard",
|
||||||
|
"EiendomUnit",
|
||||||
|
"SimilarUnit",
|
||||||
|
"UnitVector",
|
||||||
|
"normalize_price",
|
||||||
|
"normalize_area",
|
||||||
|
"normalize_number",
|
||||||
|
"normalize_finnkode",
|
||||||
|
"extract_finnkode_from_url",
|
||||||
|
"HTTPClient",
|
||||||
|
"ad",
|
||||||
|
"analysis",
|
||||||
|
"cache",
|
||||||
|
"eiendom_no",
|
||||||
|
"scoring",
|
||||||
|
"search",
|
||||||
|
]
|
||||||
@@ -0,0 +1,193 @@
|
|||||||
|
"""FINN listing detail scraping and normalization."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import re
|
||||||
|
from datetime import UTC, datetime
|
||||||
|
|
||||||
|
from bs4 import BeautifulSoup
|
||||||
|
|
||||||
|
from .http import HTTPClient
|
||||||
|
from .models import FinnAd
|
||||||
|
from .parser import (
|
||||||
|
clean_text,
|
||||||
|
extract_finnkode_from_url,
|
||||||
|
normalize_area,
|
||||||
|
normalize_finnkode,
|
||||||
|
normalize_number,
|
||||||
|
normalize_price,
|
||||||
|
text_to_bool,
|
||||||
|
)
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
FINN_AD_URL_TEMPLATE = "https://www.finn.no/realestate/homes/ad.html?finnkode={}"
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_ad(finnkode: str, client: HTTPClient | None = None) -> str:
|
||||||
|
"""Fetch FINN listing HTML by finnkode."""
|
||||||
|
client = client or HTTPClient(request_delay_seconds=0.0)
|
||||||
|
url = FINN_AD_URL_TEMPLATE.format(finnkode)
|
||||||
|
response = await client.get(url)
|
||||||
|
return response.text
|
||||||
|
|
||||||
|
|
||||||
|
def _load_property_map(soup: BeautifulSoup) -> dict[str, str]:
|
||||||
|
properties: dict[str, str] = {}
|
||||||
|
for dt, dd in zip(soup.find_all("dt"), soup.find_all("dd"), strict=False):
|
||||||
|
key = clean_text(dt.get_text()) or ""
|
||||||
|
value = clean_text(dd.get_text()) or ""
|
||||||
|
properties[key.lower()] = value
|
||||||
|
return properties
|
||||||
|
|
||||||
|
|
||||||
|
def _get_data_testid_value(soup: BeautifulSoup, testid: str) -> str | None:
|
||||||
|
node = soup.select_one(f'[data-testid="{testid}"]')
|
||||||
|
if not node:
|
||||||
|
return None
|
||||||
|
return clean_text(node.get_text(" ", strip=True))
|
||||||
|
|
||||||
|
|
||||||
|
def _strip_labelled_text(text: str | None, labels: list[str]) -> str | None:
|
||||||
|
if not text:
|
||||||
|
return None
|
||||||
|
for label in labels:
|
||||||
|
if text.lower().startswith(label.lower()):
|
||||||
|
return clean_text(text[len(label) :])
|
||||||
|
return text
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_floor_from_text(text: str | None) -> str | None:
|
||||||
|
if not text:
|
||||||
|
return None
|
||||||
|
match = re.search(r"(\d+)\s*\.?\s*etasje", text, re.IGNORECASE)
|
||||||
|
if match:
|
||||||
|
return f"{match.group(1)}. etasje"
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _clean_description(text: str | None) -> str | None:
|
||||||
|
if not text:
|
||||||
|
return None
|
||||||
|
cleaned = re.sub(r"(?i)^om boligen", "", text).strip()
|
||||||
|
cleaned = re.sub(r"(?i)^beskrivelse", "", cleaned).strip()
|
||||||
|
return clean_text(cleaned)
|
||||||
|
|
||||||
|
|
||||||
|
def _load_feature_text(soup: BeautifulSoup) -> str:
|
||||||
|
return _get_data_testid_value(soup, "object-facilities") or ""
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_description(soup: BeautifulSoup) -> str | None:
|
||||||
|
node = soup.select_one('[data-testid="om boligen"]') or soup.select_one(".description")
|
||||||
|
if not node:
|
||||||
|
return None
|
||||||
|
paragraphs = [clean_text(p.get_text()) for p in node.select("p") if clean_text(p.get_text())]
|
||||||
|
if paragraphs:
|
||||||
|
return "\n".join(paragraphs)
|
||||||
|
return _clean_description(node.get_text(" ", strip=True))
|
||||||
|
|
||||||
|
|
||||||
|
def scrape_ad(html: str, url: str | None = None) -> FinnAd:
|
||||||
|
"""Scrape a FINN listing HTML page into a FinnAd model."""
|
||||||
|
soup = BeautifulSoup(html, "html.parser")
|
||||||
|
title_node = soup.select_one("h1")
|
||||||
|
broker_name = soup.select_one(".broker-name")
|
||||||
|
|
||||||
|
properties = _load_property_map(soup)
|
||||||
|
feature_text = _load_feature_text(soup).lower()
|
||||||
|
finnkode = normalize_finnkode(extract_finnkode_from_url(url or "")) or ""
|
||||||
|
address = _get_data_testid_value(soup, "object-address") or properties.get("adresse")
|
||||||
|
district = _get_data_testid_value(soup, "local-area-name") or properties.get("område")
|
||||||
|
ownership_type = _strip_labelled_text(
|
||||||
|
_get_data_testid_value(soup, "info-ownership-type"), ["Eieform", "Eiendomstype"]
|
||||||
|
    ) or properties.get("eieform")
|
||||||
|
property_type = _strip_labelled_text(
|
||||||
|
_get_data_testid_value(soup, "info-property-type"), ["Boligtype", "Eiendomstype"]
|
||||||
|
) or properties.get("eiendomstype")
|
||||||
|
|
||||||
|
asking_price = normalize_price(
|
||||||
|
properties.get("prisantydning") or _get_data_testid_value(soup, "pricing-incicative-price")
|
||||||
|
)
|
||||||
|
total_price_value = normalize_price(
|
||||||
|
properties.get("totalpris") or _get_data_testid_value(soup, "pricing-total-price")
|
||||||
|
)
|
||||||
|
shared_debt = normalize_price(
|
||||||
|
properties.get("fellesgjeld") or _get_data_testid_value(soup, "pricing-joint-debt")
|
||||||
|
)
|
||||||
|
common_costs = normalize_number(
|
||||||
|
properties.get("felles utgifter")
|
||||||
|
or _get_data_testid_value(soup, "pricing-common-monthly-cost")
|
||||||
|
)
|
||||||
|
area_m2 = normalize_area(
|
||||||
|
properties.get("boligareal")
|
||||||
|
or _get_data_testid_value(soup, "info-usable-i-area")
|
||||||
|
or _get_data_testid_value(soup, "info-usable-area")
|
||||||
|
)
|
||||||
|
rooms = normalize_number(properties.get("rom") or _get_data_testid_value(soup, "info-rooms"))
|
||||||
|
bedrooms = normalize_number(
|
||||||
|
properties.get("soverom") or _get_data_testid_value(soup, "info-bedrooms")
|
||||||
|
)
|
||||||
|
floor = (
|
||||||
|
properties.get("etasje")
|
||||||
|
or _extract_floor_from_text(title_node.get_text() if title_node else "")
|
||||||
|
or _get_data_testid_value(soup, "info-floor")
|
||||||
|
)
|
||||||
|
construction_year = normalize_number(
|
||||||
|
properties.get("byggeår") or _get_data_testid_value(soup, "info-construction-year")
|
||||||
|
)
|
||||||
|
energy_rating = properties.get("energimerking")
|
||||||
|
heating = properties.get("oppvarming")
|
||||||
|
has_balcony = text_to_bool(properties.get("balkonger/terrasser")) or "balkong" in feature_text
|
||||||
|
has_terrace = "terrasse" in feature_text
|
||||||
|
has_elevator = text_to_bool(properties.get("heis")) or "heis" in feature_text
|
||||||
|
has_parking = (
|
||||||
|
bool(properties.get("parkering/garasje"))
|
||||||
|
or "parkering" in feature_text
|
||||||
|
or "garasje" in feature_text
|
||||||
|
)
|
||||||
|
broker_company = None
|
||||||
|
if broker_name:
|
||||||
|
broker_company = clean_text(broker_name.get_text())
|
||||||
|
|
||||||
|
listing_description = _extract_description(soup)
|
||||||
|
|
||||||
|
    ad = FinnAd(
        finnkode=finnkode,
        url=url or "",
        title=clean_text(title_node.get_text()) if title_node else None,
        address=address,
        postal_area=properties.get("postnummer"),
        district=district,
        property_type=property_type,
        ownership_type=ownership_type,
        asking_price=asking_price,
        total_price=total_price_value,
        shared_debt=shared_debt,
        common_costs=common_costs,
        municipal_fee=normalize_number(properties.get("kommunale avgifter")),
        other_fees=normalize_number(properties.get("andre utgifter")),
        area_m2=area_m2,
        rooms=rooms,
        bedrooms=bedrooms,
        floor=floor,
        construction_year=construction_year,
        energy_rating=energy_rating,
        heating=heating,
        has_balcony=has_balcony,
        has_terrace=has_terrace,
        has_elevator=has_elevator,
        has_parking=has_parking,
        listing_description=listing_description,
        broker_name=None,
        broker_company=broker_company,
        detail_fetched_at=None,
    )
    return ad


async def fetch_ad_details(finnkode: str, client: HTTPClient | None = None) -> FinnAd:
    """Fetch FINN listing HTML and return a parsed FinnAd object."""
    html = await fetch_ad(finnkode, client=client)
    ad = scrape_ad(html, url=FINN_AD_URL_TEMPLATE.format(finnkode))
    ad.detail_fetched_at = datetime.now(UTC)
    return ad
@@ -0,0 +1,175 @@
"""Orchestration for FINN search + Eiendom.no enrichment + scoring."""

import logging

from . import ad as ad_module
from . import cache, eiendom_no, scoring, search
from .config import (
    FINN_CACHE_PATH,
    FINN_CACHE_TTL_AD_HOURS,
    FINN_DETAIL_LIMIT,
    FINN_MAX_SEARCH_PAGES,
)
from .models import EiendomUnit, FinnAd, SimilarUnit

logger = logging.getLogger(__name__)


def _normalize_description(text: str | None) -> str:
    return text.lower() if text else ""


def _build_ad_summary(
    ad: FinnAd,
    enriched: EiendomUnit | None,
    similar_units: list[SimilarUnit],
    scores: dict,
    categories: list[str],
) -> dict:
    description = _normalize_description(ad.listing_description)
    reasons = []
    risks = []
    next_steps = [
        "Open the FINN listing and condition report.",
        "Review the Eiendom.no estimate and comparable sales.",
        "Ask the broker about renovation status and approvals.",
    ]

    if enriched and enriched.estimated_selling_price and ad.total_price:
        # Fall back to the point estimate when no upper bound is available,
        # so the range comparison never runs against None.
        upper = enriched.estimated_selling_price_upper or enriched.estimated_selling_price
        if ad.total_price < enriched.estimated_selling_price:
            reasons.append("Listing price is below Eiendom.no estimate.")
        elif ad.total_price <= upper:
            reasons.append("Price sits within the local estimate range.")
        else:
            reasons.append("Listing price is above the estimate range.")
    else:
        reasons.append("Eiendom.no enrichment is unavailable or incomplete.")

    if "utsikt" in description or ad.has_balcony or ad.has_terrace:
        reasons.append("Outdoor space or view potential is positive.")
    if "hybel" in description or "leie" in description:
        reasons.append("Potential hybel/rental opportunity is mentioned.")
    if "potensial" in description or "renover" in description:
        reasons.append("Renovation or improvement potential is highlighted.")

    if scores.get("risk", 0.0) < 0:
        risks.append("Risk flags are detected in description or metadata.")
    if ad.common_costs and ad.common_costs > 5000:
        risks.append("Common costs are relatively high and should be reviewed.")
    # eiendom_no.py compares sale_status against "FORSALE"; accept both spellings here.
    if (
        enriched
        and enriched.sale_status
        and enriched.sale_status.upper().replace("_", "") != "FORSALE"
    ):
        risks.append("Eiendom.no sale status does not indicate an active sale.")
    if not enriched:
        risks.append("Missing Eiendom.no data increases uncertainty.")

    if not any("Eiendom.no" in step for step in next_steps):
        next_steps.append("Verify the property on Eiendom.no and reconcile any mismatches.")

    if similar_units:
        next_steps.append("Review the comparable units and average sqm prices.")
    else:
        next_steps.append("Comparable sales are unavailable; treat valuation with caution.")

    return {
        "why_interesting": reasons,
        "risks": risks,
        "next_steps": next_steps,
        "shortlist_reason": ", ".join(reasons[:3])
        if reasons
        else "Review details and seller disclosures.",
    }


async def analyze_ad(
    finn_ad: FinnAd,
    unit_code: str | None = None,
) -> dict:
    """Enrich a FinnAd and compute score summary."""
    conn = cache.init_db(FINN_CACHE_PATH)
    enriched: EiendomUnit | None = None
    similar_units: list[SimilarUnit] = []

    if unit_code:
        enriched = cache.get_eiendom_unit(conn, unit_code)
        if enriched is None:
            enriched = await eiendom_no.enrich_ad_with_eiendom_no(finn_ad, unit_code)
            if enriched is not None:
                cache.save_eiendom_unit(conn, enriched)

    if enriched and enriched.unit_vector:
        similar_units = cache.get_similar_units(conn, enriched.unit_code, "RECENTLY_SOLD")
        if not similar_units:
            similar_units = await eiendom_no.get_similar_units(enriched.unit_vector)
            if similar_units:
                cache.save_similar_units(conn, enriched.unit_code, "RECENTLY_SOLD", similar_units)

    scores = scoring.score_ad(finn_ad, enriched, similar_units)
    categories = scoring.classify_ad(scores)
    summary = _build_ad_summary(finn_ad, enriched, similar_units, scores, categories)

    result = {
        "finnkode": finn_ad.finnkode,
        "title": finn_ad.title,
        "address": finn_ad.address,
        "score": scores,
        "categories": categories,
        "summary": summary,
        "eiendom_unit": enriched.model_dump() if enriched else None,
        "similar_units": [unit.model_dump() for unit in similar_units],
    }
    cache.save_finn_ad(conn, finn_ad)
    return result


async def analyze_search(
    search_url: str,
    max_pages: int = FINN_MAX_SEARCH_PAGES,
    fetch_details: bool = True,
    detail_limit: int = FINN_DETAIL_LIMIT,
    include_eiendom_no: bool = True,
    client=None,
    use_cache: bool = True,
) -> dict:
    """Analyze a FINN search URL and enrich matching listings."""
    conn = cache.init_db(FINN_CACHE_PATH)
    cards = await search.fetch_search_pages(
        search_url,
        max_pages=max_pages,
        client=client,
        use_cache=use_cache,
    )
    results = []
    enriched_count = 0

    if fetch_details:
        for card in cards[:detail_limit]:
            finn_ad = cache.get_finn_ad(conn, card.finnkode, ttl_hours=FINN_CACHE_TTL_AD_HOURS)
            if finn_ad is None:
                finn_ad = await ad_module.fetch_ad_details(card.finnkode, client=client)
            unit_code = None
            if include_eiendom_no:
                try:
                    matched_unit = await eiendom_no.search_unit_from_finn_url(card.url)
                except Exception as exc:
                    logger.warning("Eiendom.no unit search failed: %s", exc)
                    matched_unit = None
                unit_code = (
                    matched_unit.unit_code
                    if matched_unit
                    else eiendom_no.resolve_unit_from_finn_url(card.url)
                )
            result = await analyze_ad(finn_ad, unit_code=unit_code)
            if result.get("eiendom_unit"):
                enriched_count += 1
            results.append(result)

    results.sort(key=lambda item: item["score"].get("total", 0.0), reverse=True)
    return {
        "search_url": search_url,
        "search_cards": [card.model_dump() for card in cards],
        "analysis": results,
        "summary": {
            "total_listings": len(cards),
            "analyzed_listings": len(results),
            "eiendom_enriched": enriched_count,
        },
    }
@@ -0,0 +1,243 @@
"""SQLite cache and persistence for FINN and Eiendom.no data."""

import json
import logging
import sqlite3
from datetime import UTC, datetime, timedelta
from typing import Any

from .config import FINN_CACHE_PATH
from .models import EiendomUnit, FinnAd, FinnSearchCard, SimilarUnit

logger = logging.getLogger(__name__)


def get_connection(path: str | None = None) -> sqlite3.Connection:
    db_path = path or FINN_CACHE_PATH
    conn = sqlite3.connect(str(db_path), detect_types=sqlite3.PARSE_DECLTYPES)
    conn.row_factory = sqlite3.Row
    return conn


def init_db(path: str | None = None) -> sqlite3.Connection:
    conn = get_connection(path)
    cursor = conn.cursor()
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS finn_ads (
            finnkode TEXT PRIMARY KEY,
            url TEXT,
            payload TEXT NOT NULL,
            fetched_at TEXT NOT NULL
        )
        """
    )
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS eiendom_units (
            unit_code TEXT PRIMARY KEY,
            payload TEXT NOT NULL,
            fetched_at TEXT NOT NULL
        )
        """
    )
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS similar_units (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            unit_code TEXT NOT NULL,
            listing_status TEXT NOT NULL,
            payload TEXT NOT NULL,
            fetched_at TEXT NOT NULL
        )
        """
    )
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS cache_meta (
            key TEXT PRIMARY KEY,
            value TEXT NOT NULL,
            expires_at TEXT
        )
        """
    )
    conn.commit()
    return conn


def cache_get(conn: sqlite3.Connection, key: str) -> Any:
    """Return the decoded JSON payload for key, or None if it is missing or expired."""
    cursor = conn.cursor()
    cursor.execute("SELECT value, expires_at FROM cache_meta WHERE key = ?", (key,))
    row = cursor.fetchone()
    if not row:
        return None

    expires_at = row["expires_at"]
    if expires_at and datetime.fromisoformat(expires_at) < datetime.now(UTC):
        cursor.execute("DELETE FROM cache_meta WHERE key = ?", (key,))
        conn.commit()
        return None

    return json.loads(row["value"])


def cache_set(
    conn: sqlite3.Connection,
    key: str,
    payload: Any,
    ttl_hours: int | None = None,
    ttl_minutes: int | None = None,
) -> None:
    """Store a JSON-serializable payload (callers pass both dicts and lists)."""
    expires_at = None
    if ttl_minutes is not None:
        expires_at = (datetime.now(UTC) + timedelta(minutes=ttl_minutes)).isoformat()
    elif ttl_hours is not None:
        expires_at = (datetime.now(UTC) + timedelta(hours=ttl_hours)).isoformat()
    cursor = conn.cursor()
    cursor.execute(
        "INSERT OR REPLACE INTO cache_meta (key, value, expires_at) VALUES (?, ?, ?)",
        (key, json.dumps(payload), expires_at),
    )
    conn.commit()


def _is_fresh(fetched_at: str, ttl_hours: int | None) -> bool:
    if ttl_hours is None:
        return True
    return datetime.fromisoformat(fetched_at) >= datetime.now(UTC) - timedelta(hours=ttl_hours)


def save_search_page(
    conn: sqlite3.Connection,
    url: str,
    html: str,
    ttl_minutes: int = 60,
) -> None:
    cache_set(conn, f"search_page:{url}", {"html": html}, ttl_minutes=ttl_minutes)


def get_search_page(conn: sqlite3.Connection, url: str) -> str | None:
    payload = cache_get(conn, f"search_page:{url}")
    if not payload:
        return None
    return payload.get("html")


def save_search_cards(
    conn: sqlite3.Connection,
    url: str,
    cards: list[FinnSearchCard],
    ttl_minutes: int = 60,
) -> None:
    cache_set(
        conn,
        f"search_cards:{url}",
        [card.model_dump(mode="json") for card in cards],
        ttl_minutes=ttl_minutes,
    )


def get_search_cards(conn: sqlite3.Connection, url: str) -> list[FinnSearchCard]:
    payload = cache_get(conn, f"search_cards:{url}")
    if not payload:
        return []
    return [FinnSearchCard.model_validate(item) for item in payload]


def save_finn_ad(conn: sqlite3.Connection, ad: FinnAd) -> None:
    cursor = conn.cursor()
    payload = ad.model_dump(mode="json")
    cursor.execute(
        "INSERT OR REPLACE INTO finn_ads (finnkode, url, payload, fetched_at) VALUES (?, ?, ?, ?)",
        (
            ad.finnkode,
            ad.url,
            json.dumps(payload),
            ad.detail_fetched_at.isoformat()
            if ad.detail_fetched_at
            else datetime.now(UTC).isoformat(),
        ),
    )
    conn.commit()


def get_finn_ad(
    conn: sqlite3.Connection, finnkode: str, ttl_hours: int | None = None
) -> FinnAd | None:
    cursor = conn.cursor()
    cursor.execute("SELECT payload, fetched_at FROM finn_ads WHERE finnkode = ?", (finnkode,))
    row = cursor.fetchone()
    if not row:
        return None
    if ttl_hours is not None and not _is_fresh(row["fetched_at"], ttl_hours):
        return None
    return FinnAd.model_validate(json.loads(row["payload"]))


def save_eiendom_unit(conn: sqlite3.Connection, unit: EiendomUnit) -> None:
    cursor = conn.cursor()
    cursor.execute(
        "INSERT OR REPLACE INTO eiendom_units (unit_code, payload, fetched_at) VALUES (?, ?, ?)",
        (unit.unit_code, json.dumps(unit.model_dump(mode="json")), unit.fetched_at.isoformat()),
    )
    conn.commit()


def get_eiendom_unit(
    conn: sqlite3.Connection,
    unit_code: str,
    ttl_hours: int | None = None,
) -> EiendomUnit | None:
    cursor = conn.cursor()
    cursor.execute(
        "SELECT payload, fetched_at FROM eiendom_units WHERE unit_code = ?",
        (unit_code,),
    )
    row = cursor.fetchone()
    if not row:
        return None
    if ttl_hours is not None and not _is_fresh(row["fetched_at"], ttl_hours):
        return None
    return EiendomUnit.model_validate(json.loads(row["payload"]))


def save_similar_units(
    conn: sqlite3.Connection,
    unit_code: str,
    listing_status: str,
    similar_units: list[SimilarUnit],
) -> None:
    cursor = conn.cursor()
    payload = json.dumps([item.model_dump(mode="json") for item in similar_units])
    cursor.execute(
        (
            "INSERT INTO similar_units"
            " (unit_code, listing_status, payload, fetched_at)"
            " VALUES (?, ?, ?, ?)"
        ),
        (unit_code, listing_status, payload, datetime.now(UTC).isoformat()),
    )
    conn.commit()


def get_similar_units(
    conn: sqlite3.Connection,
    unit_code: str,
    listing_status: str,
    ttl_hours: int | None = None,
) -> list[SimilarUnit]:
    cursor = conn.cursor()
    cursor.execute(
        (
            "SELECT payload, fetched_at FROM similar_units"
            " WHERE unit_code = ? AND listing_status = ?"
            " ORDER BY id DESC LIMIT 1"
        ),
        (unit_code, listing_status),
    )
    row = cursor.fetchone()
    if not row:
        return []
    if ttl_hours is not None and not _is_fresh(row["fetched_at"], ttl_hours):
        return []
    return [SimilarUnit.model_validate(item) for item in json.loads(row["payload"])]
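The `cache_meta` table implements a key/value cache whose entries carry an absolute expiry timestamp and are deleted lazily on read. A rough, self-contained sketch of that behaviour follows, using an in-memory database and an explicit `now` argument so the expiry can be exercised deterministically. The `put` and `get` helpers are hypothetical simplifications, not the module's functions.

```python
import json
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE cache_meta (key TEXT PRIMARY KEY, value TEXT NOT NULL, expires_at TEXT)"
)


def put(key: str, payload, ttl_minutes: int, now: datetime) -> None:
    """Store a JSON payload together with its absolute expiry time."""
    expires_at = (now + timedelta(minutes=ttl_minutes)).isoformat()
    conn.execute(
        "INSERT OR REPLACE INTO cache_meta (key, value, expires_at) VALUES (?, ?, ?)",
        (key, json.dumps(payload), expires_at),
    )
    conn.commit()


def get(key: str, now: datetime):
    """Return the payload, or None when the key is missing or has expired."""
    row = conn.execute(
        "SELECT value, expires_at FROM cache_meta WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    if datetime.fromisoformat(row["expires_at"]) < now:
        # Expired rows are removed on read, mirroring cache_get.
        conn.execute("DELETE FROM cache_meta WHERE key = ?", (key,))
        conn.commit()
        return None
    return json.loads(row["value"])


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
put("search_cards:demo", [{"finnkode": "123"}], ttl_minutes=60, now=t0)
fresh = get("search_cards:demo", now=t0 + timedelta(minutes=30))
stale = get("search_cards:demo", now=t0 + timedelta(minutes=61))
print(fresh, stale)
```

Timestamps are stored as timezone-aware ISO strings; comparing an aware value with a naive one raises `TypeError`, so every writer needs to use the same convention.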
@@ -0,0 +1,30 @@
"""Configuration and environment variables."""

import os
from pathlib import Path

# Cache and database
FINN_CACHE_PATH = os.getenv("FINN_CACHE_PATH", str(Path("data/finn.sqlite")))

# FINN API settings
FINN_MAX_SEARCH_PAGES = int(os.getenv("FINN_MAX_SEARCH_PAGES", "3"))
FINN_DETAIL_LIMIT = int(os.getenv("FINN_DETAIL_LIMIT", "20"))
FINN_REQUEST_DELAY_SECONDS = float(os.getenv("FINN_REQUEST_DELAY_SECONDS", "2"))
FINN_USER_AGENT = os.getenv("FINN_USER_AGENT", "personal-finn-eiendom-analyzer/0.1")
FINN_CACHE_TTL_SEARCH_MINUTES = int(os.getenv("FINN_CACHE_TTL_SEARCH_MINUTES", "60"))
FINN_CACHE_TTL_AD_HOURS = int(os.getenv("FINN_CACHE_TTL_AD_HOURS", "24"))

# Eiendom.no API settings
EIENDOM_NO_ENABLED = os.getenv("EIENDOM_NO_ENABLED", "true").lower() == "true"
EIENDOM_NO_BASE_URL = os.getenv("EIENDOM_NO_BASE_URL", "https://api.eiendom.no/api/v1")
EIENDOM_NO_REQUEST_DELAY_SECONDS = float(os.getenv("EIENDOM_NO_REQUEST_DELAY_SECONDS", "1"))
EIENDOM_NO_CACHE_TTL_HOURS = int(os.getenv("EIENDOM_NO_CACHE_TTL_HOURS", "24"))
EIENDOM_NO_SIMILAR_UNITS_ENABLED = (
    os.getenv("EIENDOM_NO_SIMILAR_UNITS_ENABLED", "true").lower() == "true"
)
EIENDOM_NO_SIMILAR_UNITS_DEFAULT_STATUS = os.getenv(
    "EIENDOM_NO_SIMILAR_UNITS_DEFAULT_STATUS", "RECENTLY_SOLD"
)

# Logging
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
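The boolean flags above are enabled only by the literal string `true` (case-insensitive), so a value such as `1` or `yes` silently disables the feature. If a more tolerant parser is wanted, one possible shape is sketched below. `env_bool` and the accepted spellings are assumptions for illustration, not part of the project.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def env_bool(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment, accepting common truthy spellings."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


os.environ["DEMO_FLAG_A"] = "Yes"
os.environ["DEMO_FLAG_B"] = "0"
print(env_bool("DEMO_FLAG_A"), env_bool("DEMO_FLAG_B"), env_bool("DEMO_FLAG_UNSET", default=True))
```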
@@ -0,0 +1,236 @@
"""Eiendom.no enrichment, unit vector, and similar units client."""

import base64
import logging
from typing import Any

import msgpack

from .config import (
    EIENDOM_NO_BASE_URL,
    EIENDOM_NO_ENABLED,
    EIENDOM_NO_REQUEST_DELAY_SECONDS,
    EIENDOM_NO_SIMILAR_UNITS_DEFAULT_STATUS,
)
from .http import HTTPClient
from .models import EiendomUnit, SimilarUnit, UnitVector
from .parser import extract_finnkode_from_url, normalize_finnkode

logger = logging.getLogger(__name__)


def _extract_coordinates(geometry: dict) -> tuple[float | None, float | None]:
    if not isinstance(geometry, dict):
        return None, None
    coords = geometry.get("coordinates") or []
    if isinstance(coords, (list, tuple)) and len(coords) >= 2:
        return coords[0], coords[1]
    return None, None


def parse_eiendom_unit_json(unit_data: dict) -> EiendomUnit:
    geometry = unit_data.get("geometry", {})
    lon, lat = _extract_coordinates(geometry)
    specification = unit_data.get("specification", {})
    valuation = unit_data.get("valuation", {})
    market = unit_data.get("latestMarketData", {})

    return EiendomUnit(
        unit_code=unit_data.get("unitCode", ""),
        address=unit_data.get("address") or unit_data.get("streetAddress"),
        lat=lat or unit_data.get("lat"),
        lng=lon or unit_data.get("lon"),
        property_type=specification.get("propertyType") or unit_data.get("propertyType"),
        floor=specification.get("floor") or unit_data.get("floor"),
        rooms=specification.get("rooms") or unit_data.get("rooms"),
        construction_year=specification.get("constructionYear")
        or unit_data.get("constructionYear"),
        usable_area=specification.get("usableArea") or unit_data.get("usableArea"),
        estimated_selling_price=valuation.get("estimatedSellingPrice")
        or unit_data.get("estimatedSellingPrice"),
        estimated_selling_price_lower=valuation.get("estimatedSellingPriceLower")
        or unit_data.get("estimatedSellingPriceLower"),
        estimated_selling_price_upper=valuation.get("estimatedSellingPriceUpper")
        or unit_data.get("estimatedSellingPriceUpper"),
        listing_price=market.get("listingPrice") or unit_data.get("listingPrice"),
        listing_sqm_price=market.get("squareMeterPrice")
        or unit_data.get("listingSquareMeterPrice"),
        common_costs=market.get("monthlyCosts")
        or market.get("commonCosts")
        or unit_data.get("commonCosts"),
        days_on_market=market.get("daysOnMarket") or unit_data.get("daysOnMarket"),
        sale_status=market.get("saleStatus") or unit_data.get("saleStatus"),
        market_placement_score=market.get("marketPlacementScore")
        or unit_data.get("marketPlacementScore"),
    )


def parse_similar_units_json(response_data: dict) -> list[SimilarUnit]:
    units: list[SimilarUnit] = []
    for item in response_data.get("units", []):
        geometry = item.get("geometry", {})
        lon, lat = _extract_coordinates(geometry)
        specification = item.get("specification", {})
        market = item.get("marketData", {})
        units.append(
            SimilarUnit(
                unit_code=item.get("unitCode", ""),
                address=item.get("address"),
                lat=lat or item.get("lat"),
                lng=lon or item.get("lon"),
                property_type=specification.get("propertyType") or item.get("propertyType"),
                floor=specification.get("floor") or item.get("floor"),
                rooms=specification.get("rooms") or item.get("rooms"),
                construction_year=specification.get("constructionYear")
                or item.get("constructionYear"),
                usable_area=specification.get("usableArea") or item.get("usableArea"),
                listing_price=market.get("listingPrice") or item.get("listingPrice"),
                selling_price=market.get("sellingPrice") or item.get("sellingPrice"),
                shared_debt=market.get("jointDebt") or item.get("sharedDebt"),
                common_costs=market.get("monthlyCosts") or item.get("commonCosts"),
                sqm_price=market.get("squareMeterPrice") or item.get("squareMeterPrice"),
                days_on_market=market.get("daysOnMarket") or item.get("daysOnMarket"),
                sale_status=market.get("saleStatus") or item.get("saleStatus"),
                finalized_at=item.get("finalizedAt") or market.get("finalizedAt"),
                listing_status=item.get("listingStatus", "RECENTLY_SOLD"),
            )
        )
    return units


def build_unit_vector(unit: EiendomUnit) -> str:
    """Build a base64url-encoded unit_vector from EiendomUnit data."""
    payload = UnitVector(
        lon=unit.lng or 0.0,
        lat=unit.lat or 0.0,
        ptype=unit.property_type or "APARTMENT",
        floor=unit.floor,
        rooms=unit.rooms,
        built=unit.construction_year,
        area=unit.usable_area,
        price=unit.listing_price or unit.estimated_selling_price,
    )
    packed = msgpack.packb(payload.model_dump(), use_bin_type=True)
    encoded = base64.urlsafe_b64encode(packed).decode("utf-8").rstrip("=")
    return encoded


def decode_unit_vector(vector_str: str) -> dict:
    """Decode a base64url unit_vector for debugging."""
    padding = 4 - (len(vector_str) % 4)
    if padding != 4:
        vector_str += "=" * padding
    packed = base64.urlsafe_b64decode(vector_str.encode("utf-8"))
    return msgpack.unpackb(packed, raw=False)


async def search_unit_from_finn_url(
    finn_url: str,
    client: HTTPClient | None = None,
) -> EiendomUnit | None:
    if not EIENDOM_NO_ENABLED or not finn_url:
        logger.info("Eiendom.no unit search is disabled or finn_url is empty")
        return None

    client = client or HTTPClient(
        base_url=EIENDOM_NO_BASE_URL,
        request_delay_seconds=EIENDOM_NO_REQUEST_DELAY_SECONDS,
    )
    response = await client.get(
        "/geodata/units/search/",
        params={"search": finn_url},
    )
    data = response.json()
    units = data.get("units", [])
    if not units:
        return None
    return parse_eiendom_unit_json(units[0])


async def get_unit(
    unit_code: str,
    client: HTTPClient | None = None,
) -> EiendomUnit | None:
    if not EIENDOM_NO_ENABLED:
        logger.info("Eiendom.no enrichment is disabled")
        return None

    client = client or HTTPClient(
        base_url=EIENDOM_NO_BASE_URL,
        request_delay_seconds=EIENDOM_NO_REQUEST_DELAY_SECONDS,
    )
    path = f"/geodata/units/{unit_code}/"
    response = await client.get(path)
    data = response.json()
    units = data.get("units") or []
    if not units and isinstance(data, dict) and data.get("unitCode"):
        return parse_eiendom_unit_json(data)
    if not units:
        return None
    return parse_eiendom_unit_json(units[0])


async def get_eiendom_unit(
    unit_code: str,
    client: HTTPClient | None = None,
) -> EiendomUnit | None:
    return await get_unit(unit_code, client=client)


async def get_similar_units(
    unit_vector: str,
    listing_status: str = EIENDOM_NO_SIMILAR_UNITS_DEFAULT_STATUS,
    client: HTTPClient | None = None,
) -> list[SimilarUnit]:
    if not EIENDOM_NO_ENABLED:
        logger.info("Eiendom.no similar-units disabled")
        return []

    client = client or HTTPClient(
        base_url=EIENDOM_NO_BASE_URL,
        request_delay_seconds=EIENDOM_NO_REQUEST_DELAY_SECONDS,
    )
    response = await client.get(
        "/geodata/units/similar/",
        params={"unit_vector": unit_vector},
    )
    data = response.json()
    units = parse_similar_units_json(data)

    listing_status = (listing_status or "").upper()
    if listing_status == "RECENTLY_SOLD":
        units = [
            unit
            for unit in units
            if unit.sale_status and unit.sale_status.upper() == "SOLD" and unit.finalized_at
        ]
    elif listing_status == "FOR_SALE":
        units = [
            unit for unit in units if unit.sale_status and unit.sale_status.upper() == "FORSALE"
        ]

    return units


def resolve_unit_from_finn_url(finn_url: str) -> str | None:
    """Resolve the FINN URL into a unit identifier or unitCode placeholder."""
    if not finn_url:
        return None
    candidate = normalize_finnkode(extract_finnkode_from_url(finn_url))
    if candidate:
        return candidate
    return None


async def enrich_ad_with_eiendom_no(
    ad: Any,
    unit_code: str | None = None,
    client: HTTPClient | None = None,
) -> EiendomUnit | None:
    if not unit_code:
        return None
    unit = await get_eiendom_unit(unit_code, client=client)
    if unit is None:
        return None
    unit.unit_vector = build_unit_vector(unit)
    return unit
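`build_unit_vector` and `decode_unit_vector` rely on unpadded base64url: the trailing `=` characters are stripped on encode and have to be restored before decoding. The round trip can be sketched with the standard library alone. Note the swap: this sketch serializes with `json` instead of `msgpack` so that it runs without third-party packages, and `encode_vector` / `decode_vector` are hypothetical names.

```python
import base64
import json


def encode_vector(payload: dict) -> str:
    """Serialize a payload and encode it as base64url without '=' padding."""
    packed = json.dumps(payload, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(packed).decode("ascii").rstrip("=")


def decode_vector(vector: str) -> dict:
    """Restore the stripped padding, then decode and deserialize."""
    # -len % 4 yields 0..3, the number of '=' characters that were removed.
    padded = vector + "=" * (-len(vector) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


sample = {"lon": 10.75, "lat": 59.91, "ptype": "APARTMENT", "rooms": 3}
token = encode_vector(sample)
print(token)
print(decode_vector(token) == sample)
```

The `-len(vector) % 4` expression is equivalent to the `4 - (len % 4)` check with its `!= 4` guard used in `decode_unit_vector`, just written without the special case.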
@@ -0,0 +1,122 @@
"""HTTP client with retries, delays, and error handling."""

import asyncio
import logging

import httpx

logger = logging.getLogger(__name__)


class HTTPClient:
|
||||||
|
"""HTTP client with configurable retries, delays, and timeout."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
base_url: str = "",
|
||||||
|
user_agent: str = "personal-finn-eiendom-analyzer/0.1",
|
||||||
|
request_delay_seconds: float = 0.0,
|
||||||
|
retries: int = 1,
|
||||||
|
timeout_seconds: float = 30.0,
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Initialize HTTP client.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
base_url: Base URL for requests
|
||||||
|
user_agent: User-Agent header value
|
||||||
|
request_delay_seconds: Delay between requests (to be respectful)
|
||||||
|
retries: Number of retry attempts for failed connections
|
||||||
|
timeout_seconds: Request timeout
|
||||||
|
"""
|
||||||
|
self.base_url = base_url
|
||||||
|
self.user_agent = user_agent
|
||||||
|
self.request_delay_seconds = request_delay_seconds
|
||||||
|
self.timeout = httpx.Timeout(timeout_seconds)
|
||||||
|
self.transport = httpx.AsyncHTTPTransport(retries=retries)
|
||||||
|
self.last_request_time: float | None = None
|
||||||
|
|
||||||
|
async def get(self, url: str, **kwargs) -> httpx.Response:
|
||||||
|
"""
|
||||||
|
Make async GET request with delay and error handling.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
url: URL to fetch
|
||||||
|
**kwargs: Additional httpx arguments
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
httpx.Response
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
httpx.HTTPStatusError if status is 4xx or 5xx
|
||||||
|
"""
|
||||||
|
headers = kwargs.pop("headers", {})
|
||||||
|
if "User-Agent" not in headers:
|
||||||
|
headers["User-Agent"] = self.user_agent
|
||||||
|
|
||||||
|
for attempt in range(self._get_retries() + 1):
|
||||||
|
await self._apply_delay()
|
||||||
|
|
||||||
|
async with httpx.AsyncClient(
|
||||||
|
timeout=self.timeout,
|
||||||
|
base_url=self.base_url if not url.startswith("http") else "",
|
||||||
|
) as client:
|
||||||
|
try:
|
||||||
|
response = await client.get(url, headers=headers, **kwargs)
|
||||||
|
if response.status_code < 500:
|
||||||
|
response.raise_for_status()
|
||||||
|
logger.debug(f"GET {url} -> {response.status_code}")
|
||||||
|
return response
|
||||||
|
if attempt < self._get_retries():
|
||||||
|
await asyncio.sleep(2**attempt)
|
||||||
|
continue
|
||||||
|
response.raise_for_status()
|
||||||
|
return response
|
||||||
|
except httpx.HTTPStatusError as e:
|
||||||
|
logger.error(f"HTTP {e.response.status_code} for {url}")
|
||||||
|
raise
|
||||||
|
except httpx.RequestError as e:
|
||||||
|
logger.error(f"Request failed for {url}: {e}")
|
||||||
|
raise
|
||||||
|
|
||||||
|
def _get_retries(self) -> int:
|
||||||
|
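"""Get retries count from the transport's connection pool."""
|
||||||
|
# AsyncHTTPTransport has no `_retries`; httpcore keeps it on the private pool.
|
||||||
|
pool = getattr(self.transport, "_pool", None)
|
||||||
|
return getattr(pool, "_retries", 1)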
|
||||||
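The retry loops in `get` and `post` sleep `2**attempt` seconds between attempts whenever the server answers with a 5xx status, and return or raise immediately otherwise. The sketch below isolates that control flow from httpx so the timing can be checked without a network. `retry_on_server_error`, `fetch`, and the injectable `sleep` are hypothetical names introduced for illustration; they are not part of the module.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def retry_on_server_error(
    fetch: Callable[[], Awaitable[int]],
    retries: int = 1,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> int:
    """Call fetch() until it returns a non-5xx status or attempts run out.

    fetch is a stand-in for one HTTP request and returns only a status code.
    """
    status = 0
    for attempt in range(retries + 1):
        status = await fetch()
        if status < 500:
            return status  # success or client error: never retried
        if attempt < retries:
            await sleep(2**attempt)  # 1 s, then 2 s, then 4 s, ...
    return status  # last 5xx status after exhausting all attempts


async def demo() -> tuple[int, list[float]]:
    statuses = iter([503, 502, 200])
    delays: list[float] = []

    async def fake_fetch() -> int:
        return next(statuses)

    async def fake_sleep(seconds: float) -> None:
        delays.append(seconds)  # record the delay instead of waiting

    result = await retry_on_server_error(fake_fetch, retries=2, sleep=fake_sleep)
    return result, delays
```

Running `asyncio.run(demo())` exercises two failed attempts followed by a success. Injecting `sleep` keeps the example fast and makes the backoff schedule observable.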
|
|
||||||
|
async def post(self, url: str, **kwargs) -> httpx.Response:
|
||||||
|
"""Make async POST request with delay and error handling."""
|
||||||
|
headers = kwargs.pop("headers", {})
|
||||||
|
if "User-Agent" not in headers:
|
||||||
|
headers["User-Agent"] = self.user_agent
|
||||||
|
|
||||||
|
for attempt in range(self._get_retries() + 1):
|
||||||
|
await self._apply_delay()
|
||||||
|
|
||||||
|
async with httpx.AsyncClient(
|
||||||
|
timeout=self.timeout,
|
||||||
|
base_url=self.base_url if not url.startswith("http") else "",
|
||||||
|
) as client:
|
||||||
|
try:
|
||||||
|
response = await client.post(url, headers=headers, **kwargs)
|
||||||
|
if response.status_code < 500:
|
||||||
|
response.raise_for_status()
|
||||||
|
logger.debug(f"POST {url} -> {response.status_code}")
|
||||||
|
return response
|
||||||
|
if attempt < self._get_retries():
|
||||||
|
await asyncio.sleep(2**attempt)
|
||||||
|
continue
|
||||||
|
response.raise_for_status()
|
||||||
|
return response
|
||||||
|
except httpx.HTTPStatusError as e:
|
||||||
|
logger.error(f"HTTP {e.response.status_code} for {url}")
|
||||||
|
raise
|
||||||
|
except httpx.RequestError as e:
|
||||||
|
logger.error(f"Request failed for {url}: {e}")
|
||||||
|
raise
|
||||||
|
|
||||||
|
async def _apply_delay(self):
|
||||||
|
"""Apply delay between requests if configured."""
|
||||||
|
if self.request_delay_seconds > 0:
|
||||||
|
await asyncio.sleep(self.request_delay_seconds)
|
||||||
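`_apply_delay` sleeps the full `request_delay_seconds` before every request, and `last_request_time` is initialized in `__init__` but not read anywhere in the code shown here. One possible refinement, sketched below under those assumptions, sleeps only for whatever is left of the interval since the previous request. `Throttle` and its injectable `clock` and `sleep` are hypothetical and exist only for this illustration.

```python
import asyncio
import time
from collections.abc import Awaitable, Callable


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: float | None = None  # time of the previous request

    async def wait(self) -> float:
        """Sleep for the remainder of the interval; return the time slept."""
        remaining = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            remaining = max(0.0, self.min_interval - elapsed)
        if remaining > 0:
            await self._sleep(remaining)
        self._last = self._clock()
        return remaining
```

With this shape the first request is never delayed, and a slow caller that already waited longer than the interval pays no extra sleep.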
@@ -0,0 +1,160 @@
|
|||||||
|
"""FastMCP stdio server for FINN real estate analysis and Eiendom.no enrichment."""
|
||||||
|
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
|
||||||
|
from mcp.server.fastmcp import FastMCP
|
||||||
|
|
||||||
|
from .analysis import analyze_search
|
||||||
|
from .eiendom_no import (
|
||||||
|
build_unit_vector,
|
||||||
|
decode_unit_vector,
|
||||||
|
get_similar_units,
|
||||||
|
get_unit,
|
||||||
|
search_unit_from_finn_url,
|
||||||
|
)
|
||||||
|
from .service import get_or_fetch_ad, get_or_fetch_eiendom_unit
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
mcp = FastMCP("finn_eiendom_mcp")
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description=(
|
||||||
|
"Analyze a FINN.no real estate search URL. Scrapes listing cards,"
|
||||||
|
" fetches details, enriches with Eiendom.no data, scores, and ranks."
|
||||||
|
)
|
||||||
|
)
|
||||||
|
async def finn_analyze_search(
|
||||||
|
search_url: str,
|
||||||
|
max_pages: int = 3,
|
||||||
|
detail_limit: int = 20,
|
||||||
|
include_details: bool = True,
|
||||||
|
include_eiendom_no: bool = True,
|
||||||
|
) -> str:
|
||||||
|
"""Analyze a FINN search URL and return ranked listing results."""
|
||||||
|
try:
|
||||||
|
result = await analyze_search(
|
||||||
|
search_url,
|
||||||
|
max_pages=max_pages,
|
||||||
|
fetch_details=include_details,
|
||||||
|
detail_limit=detail_limit,
|
||||||
|
include_eiendom_no=include_eiendom_no,
|
||||||
|
)
|
||||||
|
return json.dumps(result)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error analyzing search: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description=(
|
||||||
|
"Fetch full detail for a FINN listing by finnkode."
|
||||||
|
" Checks cache first; use force_refresh=True to bypass."
|
||||||
|
)
|
||||||
|
)
|
||||||
|
async def finn_get_ad(finnkode: str, force_refresh: bool = False) -> str:
|
||||||
|
"""Fetch FINN ad details by finnkode."""
|
||||||
|
try:
|
||||||
|
ad = await get_or_fetch_ad(finnkode, force_refresh=force_refresh)
|
||||||
|
return ad.model_dump_json()
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error fetching ad {finnkode}: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description="Resolve an Eiendom.no unit_code from a FINN listing URL. "
|
||||||
|
"Returns unit_code, address, lat, lng or an error if not found."
|
||||||
|
)
|
||||||
|
async def finn_resolve_eiendom_unit(finn_url: str) -> str:
|
||||||
|
"""Resolve Eiendom.no unit from FINN URL."""
|
||||||
|
try:
|
||||||
|
unit = await search_unit_from_finn_url(finn_url)
|
||||||
|
if unit is None:
|
||||||
|
return json.dumps(
|
||||||
|
{
|
||||||
|
"error": True,
|
||||||
|
"message": "Eiendom.no unit could not be resolved from FINN URL",
|
||||||
|
}
|
||||||
|
)
|
||||||
|
return json.dumps(
|
||||||
|
{
|
||||||
|
"unit_code": unit.unit_code,
|
||||||
|
"address": unit.address,
|
||||||
|
"lat": unit.lat,
|
||||||
|
"lng": unit.lng,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error resolving unit from {finn_url}: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description="Fetch full Eiendom.no unit data by unit_code. Checks SQLite cache (24h TTL)."
|
||||||
|
)
|
||||||
|
async def finn_get_eiendom_unit(unit_code: str, force_refresh: bool = False) -> str:
|
||||||
|
"""Fetch Eiendom.no unit details by unit_code."""
|
||||||
|
try:
|
||||||
|
unit = await get_or_fetch_eiendom_unit(unit_code, force_refresh=force_refresh)
|
||||||
|
if unit is None:
|
||||||
|
return json.dumps({"error": True, "message": "Eiendom.no unit not found"})
|
||||||
|
return unit.model_dump_json()
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error fetching unit {unit_code}: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description="Fetch comparable recently-sold or for-sale units from Eiendom.no using a "
|
||||||
|
"base64-encoded unit vector. Returns list of similar units with sale prices."
|
||||||
|
)
|
||||||
|
async def finn_get_similar_units(unit_vector: str, listing_status: str = "RECENTLY_SOLD") -> str:
|
||||||
|
"""Fetch similar units from Eiendom.no."""
|
||||||
|
try:
|
||||||
|
units = await get_similar_units(unit_vector, listing_status)
|
||||||
|
return json.dumps([unit.model_dump(mode="json") for unit in units])
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error fetching similar units: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description="Build a base64-encoded unit vector for a given Eiendom.no unit_code. "
|
||||||
|
"The vector is used as input to finn_get_similar_units."
|
||||||
|
)
|
||||||
|
async def finn_build_unit_vector(unit_code: str) -> str:
|
||||||
|
"""Build unit vector for Eiendom.no unit."""
|
||||||
|
try:
|
||||||
|
unit = await get_unit(unit_code)
|
||||||
|
if unit is None:
|
||||||
|
return json.dumps({"error": True, "message": "Eiendom.no unit not found"})
|
||||||
|
return json.dumps({"unit_code": unit.unit_code, "unit_vector": build_unit_vector(unit)})
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error building unit vector for {unit_code}: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool(
|
||||||
|
description="Decode a base64 unit vector into human-readable JSON (lat, lon, property type, "
|
||||||
|
"floor, rooms, construction year, area, price)."
|
||||||
|
)
|
||||||
|
def finn_decode_unit_vector(unit_vector: str) -> str:
|
||||||
|
"""Decode unit vector to readable format."""
|
||||||
|
try:
|
||||||
|
result = decode_unit_vector(unit_vector)
|
||||||
|
return json.dumps(result)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error decoding unit vector: {e}")
|
||||||
|
return json.dumps({"error": True, "message": str(e)})
|
||||||
|
|
||||||
|
|
||||||
|
def main() -> None:
|
||||||
|
"""Run the FastMCP stdio server."""
|
||||||
|
mcp.run(transport="stdio")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
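Every tool above repeats the same `try`/`except` that logs the failure and returns `{"error": true, "message": ...}` as JSON. A minimal sketch of how that envelope could be factored into a decorator follows. `json_error_envelope` and `demo_tool` are hypothetical names, and how such a decorator composes with `@mcp.tool` has not been verified here.

```python
import asyncio
import functools
import json
import logging
from collections.abc import Awaitable, Callable
from typing import Any

logger = logging.getLogger(__name__)


def json_error_envelope(
    func: Callable[..., Awaitable[str]],
) -> Callable[..., Awaitable[str]]:
    """Turn any exception raised by an async tool into a JSON error payload."""

    @functools.wraps(func)
    async def wrapper(*args: Any, **kwargs: Any) -> str:
        try:
            return await func(*args, **kwargs)
        except Exception as exc:  # broad on purpose: tools should not raise
            logger.error("Error in %s: %s", func.__name__, exc)
            return json.dumps({"error": True, "message": str(exc)})

    return wrapper


@json_error_envelope
async def demo_tool(unit_code: str) -> str:
    """Hypothetical tool body; fails on an empty unit_code."""
    if not unit_code:
        raise ValueError("unit_code is required")
    return json.dumps({"unit_code": unit_code})
```

`functools.wraps` keeps the wrapped function's name and docstring, which matters if a framework derives the tool name from them.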
@@ -0,0 +1,128 @@
|
|||||||
|
"""Pydantic models for FINN ads and Eiendom.no units."""
|
||||||
|
|
||||||
|
from datetime import UTC, datetime
|
||||||
|
|
||||||
|
from pydantic import BaseModel, ConfigDict, Field
|
||||||
|
|
||||||
|
|
||||||
|
class FinnSearchCard(BaseModel):
|
||||||
|
"""FINN search result card (minimal fields from search listing)."""
|
||||||
|
|
||||||
|
finnkode: str
|
||||||
|
url: str
|
||||||
|
title: str | None = None
|
||||||
|
address: str | None = None
|
||||||
|
area_m2: int | None = None
|
||||||
|
asking_price: int | None = None
|
||||||
|
total_price: int | None = None
|
||||||
|
common_costs: int | None = None
|
||||||
|
property_type: str | None = None
|
||||||
|
ownership_type: str | None = None
|
||||||
|
bedrooms: int | None = None
|
||||||
|
floor: str | None = None
|
||||||
|
broker_company: str | None = None
|
||||||
|
|
||||||
|
|
||||||
|
class FinnAd(BaseModel):
|
||||||
|
"""FINN listing detail with all available fields."""
|
||||||
|
|
||||||
|
finnkode: str
|
||||||
|
url: str
|
||||||
|
title: str | None = None
|
||||||
|
address: str | None = None
|
||||||
|
postal_area: str | None = None
|
||||||
|
district: str | None = None
|
||||||
|
property_type: str | None = None
|
||||||
|
ownership_type: str | None = None
|
||||||
|
asking_price: int | None = None
|
||||||
|
total_price: int | None = None
|
||||||
|
shared_debt: int | None = None
|
||||||
|
common_costs: int | None = None
|
||||||
|
municipal_fee: int | None = None
|
||||||
|
other_fees: int | None = None
|
||||||
|
area_m2: int | None = None
|
||||||
|
rooms: int | None = None
|
||||||
|
bedrooms: int | None = None
|
||||||
|
floor: str | None = None
|
||||||
|
construction_year: int | None = None
|
||||||
|
energy_rating: str | None = None
|
||||||
|
heating: str | None = None
|
||||||
|
has_balcony: bool | None = None
|
||||||
|
has_terrace: bool | None = None
|
||||||
|
has_elevator: bool | None = None
|
||||||
|
has_parking: bool | None = None
|
||||||
|
has_garage: bool | None = None
|
||||||
|
listing_description: str | None = None
|
||||||
|
broker_name: str | None = None
|
||||||
|
broker_company: str | None = None
|
||||||
|
first_seen_at: datetime = Field(default_factory=lambda: datetime.now(UTC))
|
||||||
|
last_seen_at: datetime = Field(default_factory=lambda: datetime.now(UTC))
|
||||||
|
detail_fetched_at: datetime | None = None
|
||||||
|
eiendom_unit_code: str | None = None
|
||||||
|
|
||||||
|
model_config = ConfigDict()  # datetimes already serialize as ISO 8601 via model_dump_json()
|
||||||
|
|
||||||
|
|
||||||
|
class EiendomUnit(BaseModel):
|
||||||
|
"""Eiendom.no unit detail with market data."""
|
||||||
|
|
||||||
|
unit_code: str
|
||||||
|
address: str | None = None
|
||||||
|
lat: float | None = None
|
||||||
|
lng: float | None = None
|
||||||
|
property_type: str | None = None
|
||||||
|
floor: int | None = None
|
||||||
|
rooms: int | None = None
|
||||||
|
construction_year: int | None = None
|
||||||
|
usable_area: int | None = None
|
||||||
|
estimated_selling_price: int | None = None
|
||||||
|
estimated_selling_price_lower: int | None = None
|
||||||
|
estimated_selling_price_upper: int | None = None
|
||||||
|
listing_price: int | None = None
|
||||||
|
listing_sqm_price: int | None = None
|
||||||
|
common_costs: int | None = None
|
||||||
|
days_on_market: int | None = None
|
||||||
|
sale_status: str | None = None
|
||||||
|
market_placement_score: str | None = None
|
||||||
|
unit_vector: str | None = None
|
||||||
|
fetched_at: datetime = Field(default_factory=lambda: datetime.now(UTC))
|
||||||
|
|
||||||
|
model_config = ConfigDict()  # datetimes already serialize as ISO 8601 via model_dump_json()
|
||||||
|
|
||||||
|
|
||||||
|
class SimilarUnit(BaseModel):
|
||||||
|
"""Eiendom.no similar unit (comp) result."""
|
||||||
|
|
||||||
|
unit_code: str
|
||||||
|
address: str | None = None
|
||||||
|
lat: float | None = None
|
||||||
|
lng: float | None = None
|
||||||
|
property_type: str | None = None
|
||||||
|
floor: int | None = None
|
||||||
|
rooms: int | None = None
|
||||||
|
construction_year: int | None = None
|
||||||
|
usable_area: int | None = None
|
||||||
|
listing_price: int | None = None
|
||||||
|
selling_price: int | None = None
|
||||||
|
shared_debt: int | None = None
|
||||||
|
common_costs: int | None = None
|
||||||
|
sqm_price: int | None = None
|
||||||
|
days_on_market: int | None = None
|
||||||
|
sale_status: str | None = None
|
||||||
|
finalized_at: datetime | None = None
|
||||||
|
listing_status: str = Field(default="RECENTLY_SOLD")
|
||||||
|
|
||||||
|
model_config = ConfigDict()  # datetimes already serialize as ISO 8601 via model_dump_json()
|
||||||
|
|
||||||
|
|
||||||
|
class UnitVector(BaseModel):
|
||||||
|
"""Unit vector payload for similar-units API."""
|
||||||
|
|
||||||
|
lon: float
|
||||||
|
lat: float
|
||||||
|
ptype: str # property type: APARTMENT, HOUSE, etc.
|
||||||
|
floor: int | None = None
|
||||||
|
rooms: int | None = None
|
||||||
|
built: int | None = None # construction year
|
||||||
|
area: int | None = None # usable area
|
||||||
|
price: int | None = None # listing or estimated price
|
||||||
@@ -0,0 +1,88 @@
|
|||||||
|
"""Normalization and parsing helpers."""
|
||||||
|
|
||||||
|
import re
|
||||||
|
|
||||||
|
|
||||||
|
def normalize_price(price_str: str | None) -> int | None:
|
||||||
|
"""
|
||||||
|
Normalize Norwegian formatted price to integer.
|
||||||
|
Example: "7 200 991 kr" -> 7200991
|
||||||
|
"""
|
||||||
|
if not price_str:
|
||||||
|
return None
|
||||||
|
# Remove "kr" and spaces, keep only digits
|
||||||
|
normalized = re.sub(r"[^\d]", "", price_str)
|
||||||
|
try:
|
||||||
|
return int(normalized) if normalized else None
|
||||||
|
except ValueError:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def normalize_area(area_str: str | None) -> int | None:
|
||||||
|
"""
|
||||||
|
Normalize area string to integer.
|
||||||
|
Example: "77 m²" -> 77
|
||||||
|
"""
|
||||||
|
if not area_str:
|
||||||
|
return None
|
||||||
|
cleaned = area_str.replace(" ", "")
|
||||||
|
match = re.search(r"(\d+(?:[.,]\d+)?)", cleaned)
|
||||||
|
if match:
|
||||||
|
value = match.group(1).replace(",", ".")
|
||||||
|
try:
|
||||||
|
return int(float(value))
|
||||||
|
except ValueError:
|
||||||
|
return None
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def normalize_number(num_str: str | None) -> int | None:
|
||||||
|
"""
|
||||||
|
Normalize Norwegian formatted number to integer.
|
||||||
|
Handles text like "3 500 kr/mnd" and "7,2".
|
||||||
|
"""
|
||||||
|
if not num_str:
|
||||||
|
return None
|
||||||
|
cleaned = re.sub(r"[^\d,\.]", "", num_str)
|
||||||
|
cleaned = cleaned.replace(" ", "")
|
||||||
|
if "," in cleaned:
|
||||||
|
cleaned = cleaned.replace(".", "").replace(",", ".")
|
||||||
|
else:
|
||||||
|
cleaned = cleaned.replace(".", "")
|
||||||
|
try:
|
||||||
|
return int(float(cleaned)) if cleaned else None
|
||||||
|
except ValueError:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def normalize_finnkode(finnkode: str | None) -> str | None:
|
||||||
|
"""Normalize finnkode to string, strip whitespace."""
|
||||||
|
if not finnkode:
|
||||||
|
return None
|
||||||
|
return str(finnkode).strip()
|
||||||
|
|
||||||
|
|
||||||
|
def extract_finnkode_from_url(url: str) -> str | None:
|
||||||
|
"""
|
||||||
|
Extract finnkode from FINN URL.
|
||||||
|
Example: https://www.finn.no/realestate/homes/ad.html?finnkode=462400360 -> 462400360
|
||||||
|
"""
|
||||||
|
match = re.search(r"finnkode=(\d+)", url)
|
||||||
|
if match:
|
||||||
|
return match.group(1)
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def text_to_bool(text: str | None) -> bool:
|
||||||
|
"""Convert text to boolean."""
|
||||||
|
if not text:
|
||||||
|
return False
|
||||||
|
return text.strip().lower() in ("ja", "yes", "true", "1", "y")
|
||||||
|
|
||||||
|
|
||||||
|
def clean_text(text: str | None) -> str | None:
|
||||||
|
"""Clean and normalize text: strip, collapse whitespace."""
|
||||||
|
if not text:
|
||||||
|
return None
|
||||||
|
cleaned = " ".join(text.split())
|
||||||
|
return cleaned if cleaned else None
|
||||||
@@ -0,0 +1,146 @@
|
|||||||
|
"""Scoring engine for FINN listings enriched with Eiendom.no data."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from .models import EiendomUnit, SimilarUnit
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
def _clamp(value: float, min_value: float, max_value: float) -> float:
|
||||||
|
return max(min_value, min(max_value, value))
|
||||||
|
|
||||||
|
|
||||||
|
def score_market_position(unit: EiendomUnit | None) -> float:
|
||||||
|
if unit is None or not unit.estimated_selling_price or unit.listing_price is None:
|
||||||
|
return 0.0
|
||||||
|
ratio = unit.listing_price / unit.estimated_selling_price
|
||||||
|
if ratio <= 0.9:
|
||||||
|
return 20.0
|
||||||
|
if ratio <= 1.0:
|
||||||
|
return 16.0 + (1.0 - ratio) * 40.0
|
||||||
|
if ratio <= 1.1:
|
||||||
|
return 12.0 - (ratio - 1.0) * 40.0
|
||||||
|
return 5.0
|
||||||
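`score_market_position` is a piecewise function of the ratio between listing price and estimated selling price. The sketch below restates it on plain numbers and tabulates a few ratios. `market_position_score` and `table` are names introduced here, and the guard against a non-positive estimate is an addition of this sketch.

```python
def market_position_score(listing_price: float, estimated_price: float) -> float:
    """Piecewise score on listing/estimate ratio, mirroring score_market_position."""
    if estimated_price <= 0:
        return 0.0  # guard added in this sketch to avoid division by zero
    ratio = listing_price / estimated_price
    if ratio <= 0.9:
        return 20.0
    if ratio <= 1.0:
        return 16.0 + (1.0 - ratio) * 40.0  # 20 at ratio 0.9 down to 16 at 1.0
    if ratio <= 1.1:
        return 12.0 - (ratio - 1.0) * 40.0  # just under 12 down to 8 at 1.1
    return 5.0


# Listing prices scored against a fixed estimate of 1,000,000.
table = {
    price: round(market_position_score(price, 1_000_000), 2)
    for price in (850_000, 950_000, 1_000_000, 1_050_000, 1_200_000)
}
```

The table makes two steps visible: the score drops from 16 to just under 12 as soon as the ratio exceeds 1.0, and from 8 to 5 once it exceeds 1.1. Whether those jumps are intentional is not stated in the source.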
|
|
||||||
|
|
||||||
|
def score_economy(ad: Any, unit: EiendomUnit | None) -> float:
|
||||||
|
if ad.total_price is None:
|
||||||
|
return 0.0
|
||||||
|
if unit and unit.estimated_selling_price:
|
||||||
|
ratio = ad.total_price / unit.estimated_selling_price
|
||||||
|
if ratio <= 0.95:
|
||||||
|
return 20.0
|
||||||
|
if ratio <= 1.0:
|
||||||
|
return 15.0
|
||||||
|
if ratio <= 1.05:
|
||||||
|
return 10.0
|
||||||
|
return 6.0
|
||||||
|
if ad.asking_price and ad.total_price <= ad.asking_price:
|
||||||
|
return 12.0
|
||||||
|
return 8.0
|
||||||
|
|
||||||
|
|
||||||
|
def score_comparable_sales(listings: list[SimilarUnit], listing_price: int | None) -> float:
|
||||||
|
if not listings or listing_price is None:
|
||||||
|
return 0.0
|
||||||
|
selling_prices = [unit.selling_price for unit in listings if unit.selling_price]
|
||||||
|
if not selling_prices:
|
||||||
|
return 0.0
|
||||||
|
average = sum(selling_prices) / len(selling_prices)
|
||||||
|
ratio = listing_price / average
|
||||||
|
score = (1.0 - abs(ratio - 1.0)) * 20.0
|
||||||
|
return float(_clamp(score, 0.0, 20.0))
|
||||||
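`score_comparable_sales` awards 20 points when the listing price equals the average selling price of the comparable units and removes 0.2 points for every percent of deviation in either direction. The following sketch restates that formula on plain lists; `comparable_sales_score` is a hypothetical name used only here.

```python
def comparable_sales_score(selling_prices: list[int], listing_price: int) -> float:
    """20 points at parity with the comp average, clamped to the range 0..20."""
    prices = [p for p in selling_prices if p]  # ignore missing or zero prices
    if not prices:
        return 0.0
    ratio = listing_price / (sum(prices) / len(prices))
    return max(0.0, min(20.0, (1.0 - abs(ratio - 1.0)) * 20.0))


comps = [4_800_000, 5_200_000]  # average is 5,000,000
at_parity = comparable_sales_score(comps, 5_000_000)
ten_percent_over = comparable_sales_score(comps, 5_500_000)
ten_percent_under = comparable_sales_score(comps, 4_500_000)
```

Because the formula uses the absolute deviation, a listing priced 10% below the comparable average scores the same as one priced 10% above it (about 18 points in both cases).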
|
|
||||||
|
|
||||||
|
def score_location(address: str | None, district: str | None) -> float:
|
||||||
|
if not address and not district:
|
||||||
|
return 0.0
|
||||||
|
if district and "oslo" in district.lower():
|
||||||
|
return 15.0
|
||||||
|
if address and "oslo" in address.lower():
|
||||||
|
return 12.0
|
||||||
|
return 7.0
|
||||||
|
|
||||||
|
|
||||||
|
def score_layout_and_potential(description: str | None, rooms: int | None) -> float:
|
||||||
|
score = 0.0
|
||||||
|
if rooms and rooms >= 4:
|
||||||
|
score += 10.0
|
||||||
|
if description and "potensial" in description.lower():
|
||||||
|
score += 8.0
|
||||||
|
return float(_clamp(score, 0.0, 20.0))
|
||||||
|
|
||||||
|
|
||||||
|
def score_outdoor_and_view(description: str | None) -> float:
|
||||||
|
if not description:
|
||||||
|
return 0.0
|
||||||
|
score = 5.0 if "utsikt" in description.lower() or "balkong" in description.lower() else 0.0
|
||||||
|
return float(_clamp(score, 0.0, 15.0))
|
||||||
|
|
||||||
|
|
||||||
|
def score_rental_potential(description: str | None) -> float:
|
||||||
|
if not description:
|
||||||
|
return 0.0
|
||||||
|
score = 10.0 if "hybel" in description.lower() or "leie" in description.lower() else 0.0
|
||||||
|
return score
|
||||||
|
|
||||||
|
|
||||||
|
def score_renovation_upside(description: str | None, asking_price: int | None) -> float:
|
||||||
|
score = 0.0
|
||||||
|
if description and "renover" in description.lower():
|
||||||
|
score += 10.0
|
||||||
|
if asking_price and asking_price > 0:
|
||||||
|
score += 5.0
|
||||||
|
return float(_clamp(score, 0.0, 15.0))
|
||||||
|
|
||||||
|
|
||||||
|
def score_risk(description: str | None, unit: EiendomUnit | None) -> float:
|
||||||
|
if unit is None:
|
||||||
|
return -10.0
|
||||||
|
if description and "usikker" in description.lower():
|
||||||
|
return -10.0
|
||||||
|
return 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def score_ad(
|
||||||
|
ad: Any, unit: EiendomUnit | None, similar_units: list[SimilarUnit]
|
||||||
|
) -> dict[str, float]:
|
||||||
|
scores = {
|
||||||
|
"economy": score_economy(ad, unit),
|
||||||
|
"market_position": score_market_position(unit),
|
||||||
|
"comparable_sales": score_comparable_sales(
|
||||||
|
similar_units, ad.total_price or ad.asking_price
|
||||||
|
),
|
||||||
|
"location": score_location(ad.address, ad.district),
|
||||||
|
"layout": score_layout_and_potential(ad.listing_description, ad.rooms),
|
||||||
|
"outdoor": score_outdoor_and_view(ad.listing_description),
|
||||||
|
"rental_potential": score_rental_potential(ad.listing_description),
|
||||||
|
"renovation": score_renovation_upside(ad.listing_description, ad.asking_price),
|
||||||
|
"risk": score_risk(ad.listing_description, unit),
|
||||||
|
}
|
||||||
|
scores["total"] = float(_clamp(sum(scores.values()), 0.0, 100.0))
|
||||||
|
return scores
|
||||||
|
|
||||||
|
|
||||||
|
def classify_ad(scores: dict[str, float]) -> list[str]:
|
||||||
|
categories: list[str] = []
|
||||||
|
total = scores.get("total", 0.0)
|
||||||
|
if total >= 70:
|
||||||
|
categories.append("bargain_candidate")
|
||||||
|
if total >= 60:
|
||||||
|
categories.append("safe_candidate")
|
||||||
|
if 50 <= total < 70:
|
||||||
|
categories.append("lifestyle_candidate")
|
||||||
|
if scores.get("renovation", 0.0) >= 8:
|
||||||
|
categories.append("renovation_candidate")
|
||||||
|
if scores.get("rental_potential", 0.0) >= 5:
|
||||||
|
categories.append("hybel_candidate")
|
||||||
|
if scores.get("risk", 0.0) < 0:
|
||||||
|
categories.append("risk_object")
|
||||||
|
if total < 30:
|
||||||
|
categories.append("not_interesting")
|
||||||
|
if 30 <= total < 60:
|
||||||
|
categories.append("manual_review_required")
|
||||||
|
return categories
|
||||||
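The categories returned by `classify_ad` are not mutually exclusive, since several thresholds overlap. A table-driven restatement, shown below as a sketch, makes the overlaps easy to inspect. `RULES` and `classify` are hypothetical names; the thresholds and their order are copied from the function above.

```python
from collections.abc import Callable

Scores = dict[str, float]

# (label, predicate) pairs in the same order as the if-statements in classify_ad.
RULES: list[tuple[str, Callable[[Scores], bool]]] = [
    ("bargain_candidate", lambda s: s.get("total", 0.0) >= 70),
    ("safe_candidate", lambda s: s.get("total", 0.0) >= 60),
    ("lifestyle_candidate", lambda s: 50 <= s.get("total", 0.0) < 70),
    ("renovation_candidate", lambda s: s.get("renovation", 0.0) >= 8),
    ("hybel_candidate", lambda s: s.get("rental_potential", 0.0) >= 5),
    ("risk_object", lambda s: s.get("risk", 0.0) < 0),
    ("not_interesting", lambda s: s.get("total", 0.0) < 30),
    ("manual_review_required", lambda s: 30 <= s.get("total", 0.0) < 60),
]


def classify(scores: Scores) -> list[str]:
    """Return every label whose predicate matches, preserving rule order."""
    return [label for label, matches in RULES if matches(scores)]


strong = classify({"total": 72.0, "renovation": 15.0})
middling = classify({"total": 55.0, "risk": -10.0})
weak = classify({"total": 20.0})
```

A listing with a total of 72 is both a bargain and a safe candidate, while one at 55 with a negative risk score is flagged as a lifestyle candidate, a risk object, and a manual-review case at the same time.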
@@ -0,0 +1,194 @@
|
|||||||
|
"""FINN search scraping and parsing."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import re
|
||||||
|
|
||||||
|
from bs4 import BeautifulSoup
|
||||||
|
|
||||||
|
from . import cache
|
||||||
|
from .config import FINN_CACHE_TTL_SEARCH_MINUTES
|
||||||
|
from .http import HTTPClient
|
||||||
|
from .models import FinnSearchCard
|
||||||
|
from .parser import (
|
||||||
|
clean_text,
|
||||||
|
extract_finnkode_from_url,
|
||||||
|
normalize_area,
|
||||||
|
normalize_finnkode,
|
||||||
|
normalize_number,
|
||||||
|
normalize_price,
|
||||||
|
)
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_search_page(url: str, client: HTTPClient | None = None) -> str:
|
||||||
|
"""Fetch a FINN search page HTML."""
|
||||||
|
client = client or HTTPClient(request_delay_seconds=0.0)
|
||||||
|
response = await client.get(url)
|
||||||
|
return response.text
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_search_page_cached(
|
||||||
|
url: str,
|
||||||
|
client: HTTPClient | None = None,
|
||||||
|
conn: cache.sqlite3.Connection | None = None,
|
||||||
|
use_cache: bool = True,
|
||||||
|
) -> str:
|
||||||
|
"""Fetch a FINN search page with optional SQLite caching."""
|
||||||
|
client = client or HTTPClient(request_delay_seconds=0.0)
|
||||||
|
conn = conn or cache.init_db()
|
||||||
|
if use_cache:
|
||||||
|
cached_html = cache.get_search_page(conn, url)
|
||||||
|
if cached_html:
|
||||||
|
logger.debug("Using cached search page: %s", url)
|
||||||
|
return cached_html
|
||||||
|
|
||||||
|
html = await fetch_search_page(url, client=client)
|
||||||
|
cache.save_search_page(conn, url, html, ttl_minutes=FINN_CACHE_TTL_SEARCH_MINUTES)
|
||||||
|
return html
|
||||||
|
|
||||||
|
|
||||||
|
def extract_ad_links(html: str) -> list[str]:
|
||||||
|
"""Extract listing URLs from FINN search HTML."""
|
||||||
|
soup = BeautifulSoup(html, "html.parser")
|
||||||
|
links = []
|
||||||
|
for article in soup.select("article.listing-card, article.sf-search-ad"):
|
||||||
|
anchor = article.select_one("a[href*='finnkode']")
|
||||||
|
if anchor and anchor.get("href"):
|
||||||
|
links.append(clean_text(anchor.get("href")) or "")
|
||||||
|
return links
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_int_from_text(text: str, pattern: str) -> int | None:
|
||||||
|
match = re.search(pattern, text, re.I)
|
||||||
|
if match:
|
||||||
|
return normalize_number(match.group(1))
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_area_from_text(text: str) -> int | None:
|
||||||
|
matches = re.findall(r"(\d+(?:[.,]\d+)?)\s*(?:m²|m2|kvm)", text, re.I)
|
||||||
|
if matches:
|
||||||
|
return normalize_area(matches[-1])
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_price_from_text(text: str, label: str) -> int | None:
|
||||||
|
pattern = rf"{label}[:\s]*([\d\s]+kr)"
|
||||||
|
match = re.search(pattern, text, re.I)
|
||||||
|
if match:
|
||||||
|
return normalize_price(match.group(1))
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def extract_search_cards(html: str) -> list[FinnSearchCard]:
|
||||||
|
"""Parse FINN search HTML and return a list of FinnSearchCard objects."""
|
||||||
|
logger.debug("Extracting FINN search cards")
|
||||||
|
soup = BeautifulSoup(html, "html.parser")
|
||||||
|
cards: list[FinnSearchCard] = []
|
||||||
|
|
||||||
|
for card in soup.select("article.listing-card, article.sf-search-ad"):
|
||||||
|
data_id = card.get("data-id")
|
||||||
|
anchor = card.select_one("a[href*='finnkode']")
|
||||||
|
url = anchor.get("href") if anchor else ""
|
||||||
|
finnkode = normalize_finnkode(data_id or extract_finnkode_from_url(url))
|
||||||
|
if not finnkode:
|
||||||
|
logger.debug("Skipping card with missing finnkode")
|
||||||
|
continue
|
||||||
|
|
||||||
|
title_elem = card.select_one(".title, h2.sf-realestate-heading, a.sf-search-ad-link")
|
||||||
|
address_elem = card.select_one(".location, .sf-realestate-location")
|
||||||
|
area_elem = card.select_one(".area")
|
||||||
|
price_elem = card.select_one(".price")
|
||||||
|
common_costs_elem = card.select_one(".common-costs")
|
||||||
|
bedrooms_elem = card.select_one(".bedrooms")
|
||||||
|
property_type_elem = card.select_one(".property-type")
|
||||||
|
ownership_type_elem = card.select_one(".ownership-type")
|
||||||
|
broker_elem = card.select_one(".broker-company")
|
||||||
|
|
||||||
|
card_text = clean_text(card.get_text(" ") or "")
|
||||||
|
|
||||||
|
bedrooms = None
|
||||||
|
if bedrooms_elem:
|
||||||
|
bedrooms = normalize_number(bedrooms_elem.get_text())
|
||||||
|
elif card_text:
|
||||||
|
bedrooms = _extract_int_from_text(card_text, r"(\d+)\s*soverom")
|
||||||
|
|
||||||
|
common_costs = None
|
||||||
|
if common_costs_elem:
|
||||||
|
common_costs = normalize_number(common_costs_elem.get_text())
|
||||||
|
elif card_text:
|
||||||
|
common_costs = _extract_int_from_text(
|
||||||
|
card_text, r"(?:Fellesutg|Felleskost(?:er)?)[^\d]*(\d+[\d\s]*)kr"
|
||||||
|
)
|
||||||
|
|
||||||
|
total_price = None
|
||||||
|
if price_elem:
|
||||||
|
total_price = normalize_price(price_elem.get_text())
|
||||||
|
if not total_price and card_text:
|
||||||
|
total_price = _extract_price_from_text(card_text, r"Totalpris")
|
||||||
|
if not total_price and card_text:
|
||||||
|
first_price_match = re.search(r"([\d\s]+kr)", card_text)
|
||||||
|
if first_price_match:
|
||||||
|
total_price = normalize_price(first_price_match.group(1))
|
||||||
|
|
||||||
|
area_m2 = None
|
||||||
|
if area_elem:
|
||||||
|
area_m2 = normalize_area(area_elem.get_text())
|
||||||
|
elif card_text:
|
||||||
|
area_m2 = _extract_area_from_text(card_text)
|
||||||
|
|
||||||
|
card_data = FinnSearchCard(
|
||||||
|
finnkode=finnkode,
|
||||||
|
url=url or "",
|
||||||
|
title=clean_text(title_elem.get_text()) if title_elem else None,
|
||||||
|
address=clean_text(address_elem.get_text()) if address_elem else None,
|
||||||
|
area_m2=area_m2,
|
||||||
|
asking_price=None,
|
||||||
|
total_price=total_price,
|
||||||
|
common_costs=common_costs,
|
||||||
|
property_type=clean_text(property_type_elem.get_text()) if property_type_elem else None,
|
||||||
|
ownership_type=clean_text(ownership_type_elem.get_text())
|
||||||
|
if ownership_type_elem
|
||||||
|
else None,
|
||||||
|
bedrooms=bedrooms,
|
||||||
|
floor=None,
|
||||||
|
broker_company=clean_text(broker_elem.get_text()) if broker_elem else None,
|
||||||
|
)
|
||||||
|
cards.append(card_data)
|
||||||
|
logger.debug("Parsed FINN search card %s", finnkode)
|
||||||
|
|
||||||
|
return cards
|
||||||
|
|
||||||
|
|
||||||
|
def find_next_page_url(html: str) -> str | None:
|
||||||
|
"""Return the FINN search next page URL if present."""
|
||||||
|
soup = BeautifulSoup(html, "html.parser")
|
||||||
|
next_link = soup.select_one("a[rel='next']")
|
||||||
|
if next_link and next_link.get("href"):
|
||||||
|
return clean_text(next_link.get("href"))
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_search_pages(
|
||||||
|
start_url: str,
|
||||||
|
max_pages: int = 1,
|
||||||
|
client: HTTPClient | None = None,
|
||||||
|
use_cache: bool = True,
|
||||||
|
) -> list[FinnSearchCard]:
|
||||||
|
"""Fetch paginated FINN search pages and parse search cards."""
|
||||||
|
client = client or HTTPClient(request_delay_seconds=0.0)
|
||||||
|
conn = cache.init_db()
|
||||||
|
url = start_url
|
||||||
|
all_cards: list[FinnSearchCard] = []
|
||||||
|
|
||||||
|
for _ in range(max_pages):
|
||||||
|
html = await fetch_search_page_cached(url, client=client, conn=conn, use_cache=use_cache)
|
||||||
|
all_cards.extend(extract_search_cards(html))
|
||||||
|
next_url = find_next_page_url(html)
|
||||||
|
if not next_url:
|
||||||
|
break
|
||||||
|
url = next_url
|
||||||
|
logger.debug("Following next page link: %s", url)
|
||||||
|
|
||||||
|
return all_cards
|
||||||
@@ -0,0 +1,35 @@
"""Service layer for cache-aware fetching of FINN ads and Eiendom.no units."""

import logging

from .ad import fetch_ad_details
from .cache import get_eiendom_unit as get_cached_eiendom_unit
from .cache import get_finn_ad, init_db, save_eiendom_unit, save_finn_ad
from .config import FINN_CACHE_PATH
from .eiendom_no import get_unit
from .models import EiendomUnit, FinnAd

logger = logging.getLogger(__name__)


async def get_or_fetch_ad(finnkode: str, force_refresh: bool = False) -> FinnAd:
    """Get FinnAd from cache or fetch fresh. Never returns None."""
    conn = init_db(FINN_CACHE_PATH)
    ad = None if force_refresh else get_finn_ad(conn, finnkode, ttl_hours=24)
    if ad is None:
        ad = await fetch_ad_details(finnkode)
        save_finn_ad(conn, ad)
    return ad


async def get_or_fetch_eiendom_unit(
    unit_code: str, force_refresh: bool = False
) -> EiendomUnit | None:
    """Get EiendomUnit from cache or fetch fresh."""
    conn = init_db(FINN_CACHE_PATH)
    unit = None if force_refresh else get_cached_eiendom_unit(conn, unit_code, ttl_hours=24)
    if unit is None:
        unit = await get_unit(unit_code)
        if unit is not None:
            save_eiendom_unit(conn, unit)
    return unit
||||||
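Note on the service layer: both functions above follow the same "read cache unless forced, otherwise fetch and store" shape. The sketch below restates that shape generically so it can be run without the project. `TTLCache` and `get_or_fetch` are hypothetical names, and the in-memory dictionary only stands in for the SQLite cache used by the commit.

```python
import asyncio
from datetime import datetime, timedelta, timezone


class TTLCache:
    """In-memory stand-in for the SQLite cache (illustrative only)."""

    def __init__(self):
        self._items = {}  # key -> (stored_at, value)

    def get(self, key, ttl_hours):
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        # Entries older than the TTL are treated as a cache miss.
        if datetime.now(timezone.utc) - stored_at > timedelta(hours=ttl_hours):
            return None
        return value

    def save(self, key, value):
        self._items[key] = (datetime.now(timezone.utc), value)


async def get_or_fetch(cache, key, fetch, ttl_hours=24, force_refresh=False):
    """Return a cached value, or await fetch(key) and store a non-None result."""
    value = None if force_refresh else cache.get(key, ttl_hours)
    if value is None:
        value = await fetch(key)
        if value is not None:
            cache.save(key, value)
    return value
```

The `if value is not None` guard mirrors `get_or_fetch_eiendom_unit`, which may legitimately get no unit back and should not cache that absence.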
@@ -0,0 +1,49 @@
[project]
name = "finn-eiendom-mcp"
version = "0.1.0"
description = "Private FINN and Eiendom.no real estate MCP scout"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
    "beautifulsoup4>=4.12.0",
    "httpx>=0.27.0",
    "lxml>=5.0.0",
    "mcp[cli]>=1.0.0",
    "msgpack>=1.0.0",
    "pydantic>=2.8.0",
    "pydantic-settings>=2.4.0",
    "python-dotenv>=1.0.0",
]

[project.scripts]
finn-eiendom-mcp = "finn_eiendom.mcp_server:main"

[dependency-groups]
dev = [
    "ipython>=8.0.0",
    "mypy>=1.10.0",
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "respx>=0.21.0",
    "ruff>=0.6.0",
]

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "SIM"]
ignore = []

[tool.ruff.lint.per-file-ignores]
"tests/fixtures.py" = ["E501"]

[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"

[tool.mypy]
python_version = "3.12"
strict = true
plugins = []
@@ -0,0 +1 @@
"""Test fixtures and utilities."""
@@ -0,0 +1,236 @@
"""Fixture data for testing without hitting live APIs."""
# noqa: E501

SAMPLE_FINN_SEARCH_HTML = """
<!DOCTYPE html>
<html lang="no">
<head><title>FINN.no - Leiligheter til salgs</title></head>
<body>
  <div class="listings">
    <article class="listing-card" data-id="462400360">
      <a href="https://www.finn.no/realestate/homes/ad.html?finnkode=462400360" class="listing-link">
        <div class="title">Flott 3-roms i Ferner</div>
        <div class="meta">
          <span class="area">77 m²</span>
          <span class="price">7 200 991 kr</span>
          <span class="price-per-sqm">93 500 kr/m²</span>
        </div>
      </a>
      <div class="details">
        <span class="bedrooms">3</span>
        <span class="location">Grünerløkka, Oslo</span>
        <span class="common-costs">3 500 kr/mnd</span>
      </div>
    </article>
    <article class="listing-card" data-id="460784945">
      <a href="https://www.finn.no/realestate/homes/ad.html?finnkode=460784945" class="listing-link">
        <div class="title">Leilighet med potensial - må renoveres</div>
        <div class="meta">
          <span class="area">65 m²</span>
          <span class="price">6 500 000 kr</span>
          <span class="price-per-sqm">100 000 kr/m²</span>
        </div>
      </a>
      <div class="details">
        <span class="bedrooms">2</span>
        <span class="location">Sagene, Oslo</span>
        <span class="common-costs">2 800 kr/mnd</span>
      </div>
    </article>
  </div>
</body>
</html>
"""

# noqa: E501
SAMPLE_FINN_SEARCH_HTML_NEW = """
<!DOCTYPE html>
<html lang="no">
<head><title>FINN.no - Leiligheter til salgs</title></head>
<body>
  <div class="listings">
    <article class="relative isolate sf-search-ad card card--cardShadow">
      <div class="col-span-2 p-16 grid sm:grid-cols-2">
        <h2 class="h4 mb-0 col-span-2 mt-12 sm:mt-24 sf-realestate-heading">IDYLLISKE ILADALEN - Lekker 3-roms loftsleilighet fra 2016 | Privat, solrik takterrasse | Peis | Gulvareal på 77kvm | Sentralt, men rolig</h2>
        <a href="https://www.finn.no/realestate/homes/ad.html?finnkode=462880791" class="sf-search-ad-link s-text!">IDYLLISKE ILADALEN - Lekker 3-roms loftsleilighet fra 2016 | Privat, solrik takterrasse | Peis | Gulvareal på 77kvm | Sentralt, men rolig</a>
        <div class="mt-4 sf-line-clamp-2 sm:order-first sm:text-right sm:mt-0 sm:ml-16 sf-realestate-location">Lofotgata 4B, Oslo</div>
        <div class="col-span-2 mt-16 flex justify-between sm:mt-4 sm:block space-x-12 font-bold">62 m² 6 750 000 kr</div>
        <div class="col-span-2 sm:flex sm:items-baseline sm:justify-between">Totalpris: 7 253 377 kr ∙ Fellesutg.: 7 067 kr ∙ Andel ∙ Leilighet ∙ 2 soverom</div>
      </div>
    </article>
  </div>
</body>
</html>
"""

SAMPLE_FINN_LISTING_HTML = """
<!DOCTYPE html>
<html lang="no">
<head><title>Flott 3-roms i Ferner - FINN.no</title></head>
<body>
  <div class="listing-details">
    <div class="heading">
      <h1>Flott 3-roms i Ferner</h1>
      <div class="price">Totalpris: 7 200 991 kr</div>
    </div>
    <div class="properties">
      <dl>
        <dt>Adresse</dt>
        <dd>Fernerveien 42, 0554 Oslo</dd>
        <dt>Område</dt>
        <dd>Grünerløkka</dd>
        <dt>Postnummer</dt>
        <dd>0554</dd>
        <dt>Eierform</dt>
        <dd>Eierbolig</dd>
        <dt>Eiendomstype</dt>
        <dd>Leilighet</dd>
        <dt>Prisantydning</dt>
        <dd>7 200 000 kr</dd>
        <dt>Totalpris</dt>
        <dd>7 200 991 kr</dd>
        <dt>Fellesgjeld</dt>
        <dd>0 kr</dd>
        <dt>Felles utgifter</dt>
        <dd>3 500 kr/mnd</dd>
        <dt>Boligareal</dt>
        <dd>77 m²</dd>
        <dt>Rom</dt>
        <dd>4</dd>
        <dt>Soverom</dt>
        <dd>3</dd>
        <dt>Etasje</dt>
        <dd>4. etasje</dd>
        <dt>Byggeår</dt>
        <dd>2005</dd>
        <dt>Energimerking</dt>
        <dd>C</dd>
        <dt>Oppvarming</dt>
        <dd>Fjernvarme</dd>
        <dt>Balkonger/terrasser</dt>
        <dd>Ja, balkonger</dd>
        <dt>Heis</dt>
        <dd>Ja</dd>
        <dt>Parkering/garasje</dt>
        <dd>Privat parkering</dd>
      </dl>
    </div>
    <div class="description">
      <h2>Beskrivelse</h2>
      <p>Flott beliggenhet med fin utsikt over Oslo. Moderne kjøkken og bad.</p>
      <p>Klar til visning!</p>
    </div>
    <div class="broker">
      <div class="broker-info">
        <span class="broker-name">Meglerhuset AS</span>
        <span class="broker-contact">Telefon: 21 00 00 00</span>
      </div>
    </div>
  </div>
</body>
</html>
"""

SAMPLE_FINN_LISTING_HTML_NEW = """
<!DOCTYPE html>
<html lang="no">
<head><title>Romslig 5-roms i 5.etasje med heisadkomst</title></head>
<body>
  <div data-testid="object-details">
    <h1>Romslig 5-roms i 5.etasje med heisadkomst | 2 hybler | 4 balkonger | Ingen dokavgift!</h1>
    <span data-testid="object-address">Hegdehaugsveien 3, 0352 Oslo</span>
    <span data-testid="local-area-name">Homansbyen</span>
    <section data-testid="pricing-details">
      <div data-testid="pricing-incicative-price">Prisantydning10 900 000 kr</div>
      <div data-testid="pricing-total-price"><dt>Totalpris</dt><dd>10 986 901 kr</dd></div>
      <div data-testid="pricing-joint-debt"><dt>Fellesgjeld</dt><dd>76 911 kr</dd></div>
      <div data-testid="pricing-common-monthly-cost"><dt>Felleskost/mnd.</dt><dd>12 011 kr</dd></div>
    </section>
    <section data-testid="key-info">
      <div data-testid="info-property-type">BoligtypeLeilighet</div>
      <div data-testid="info-ownership-type">EieformAndel</div>
      <div data-testid="info-bedrooms">Soverom2</div>
      <div data-testid="info-rooms">Rom5</div>
      <div data-testid="info-construction-year">Byggeår1938</div>
      <div data-testid="info-usable-i-area">Internt bruksareal124 m² (BRA-i)</div>
    </section>
    <section data-testid="object-facilities">FasiliteterBalkong/TerrasseParkettHeis</section>
    <section data-testid="om boligen">
      <h2>Om boligen</h2>
      <p>Her bor du med kort vei til daglige behov og offentlig transport.</p>
    </section>
  </div>
</body>
</html>
"""

SAMPLE_EIENDOM_UNIT_JSON = {
    "units": [
        {
            "unitCode": "c-gxw-xmyum-s2a",
            "address": "Fernerveien 42, 0554 Oslo",
            "municipality": "Oslo",
            "lat": 59.9287,
            "lon": 10.7803,
            "propertyType": "APARTMENT",
            "floor": 4,
            "rooms": 4,
            "constructionYear": 2005,
            "usableArea": 77,
            "estimatedSellingPrice": 7650000,
            "estimatedSellingPriceLower": 6900000,
            "estimatedSellingPriceUpper": 8400000,
            "listingPrice": 7200000,
            "listingSquareMeterPrice": 93500,
            "commonCosts": 3500,
            "daysOnMarket": 12,
            "saleStatus": "FOR_SALE",
            "marketPlacementScore": "ABOVE_AVERAGE",
            "similarUnitCount": 12,
            "averageSquareMeterPrice": 98000,
        }
    ]
}

SAMPLE_EIENDOM_SIMILAR_UNITS_JSON = {
    "units": [
        {
            "unitCode": "c-recent-1",
            "address": "Birketveien 10, 0554 Oslo",
            "lat": 59.9290,
            "lon": 10.7810,
            "propertyType": "APARTMENT",
            "floor": 3,
            "rooms": 3,
            "constructionYear": 2004,
            "usableArea": 75,
            "listingPrice": 7100000,
            "sellingPrice": 7050000,
            "sharedDebt": 0,
            "commonCosts": 3400,
            "squareMeterPrice": 94000,
            "daysOnMarket": 18,
            "saleStatus": "SOLD",
            "finalizedAt": "2024-05-01",
        },
        {
            "unitCode": "c-recent-2",
            "address": "Sommers gate 5, 0554 Oslo",
            "lat": 59.9280,
            "lon": 10.7820,
            "propertyType": "APARTMENT",
            "floor": 2,
            "rooms": 4,
            "constructionYear": 2006,
            "usableArea": 80,
            "listingPrice": 7400000,
            "sellingPrice": 7350000,
            "sharedDebt": 0,
            "commonCosts": 3600,
            "squareMeterPrice": 91875,
            "daysOnMarket": 22,
            "saleStatus": "SOLD",
            "finalizedAt": "2024-04-28",
        },
    ]
}
@@ -0,0 +1,45 @@
from finn_eiendom.ad import scrape_ad
from tests.fixtures import SAMPLE_FINN_LISTING_HTML, SAMPLE_FINN_LISTING_HTML_NEW


def test_scrape_ad():
    ad = scrape_ad(
        SAMPLE_FINN_LISTING_HTML,
        url="https://www.finn.no/realestate/homes/ad.html?finnkode=462400360",
    )
    assert ad.finnkode == "462400360"
    assert ad.title == "Flott 3-roms i Ferner"
    assert ad.address == "Fernerveien 42, 0554 Oslo"
    assert ad.area_m2 == 77
    assert ad.asking_price == 7200000
    assert ad.total_price == 7200991
    assert ad.common_costs == 3500
    assert ad.rooms == 4
    assert ad.bedrooms == 3
    assert ad.floor == "4. etasje"
    assert ad.construction_year == 2005
    assert ad.energy_rating == "C"
    assert ad.heating == "Fjernvarme"
    assert "Moderne kjøkken" in ad.listing_description
    assert ad.broker_company == "Meglerhuset AS"


def test_scrape_ad_new_structure():
    ad = scrape_ad(
        SAMPLE_FINN_LISTING_HTML_NEW,
        url="https://www.finn.no/realestate/homes/ad.html?finnkode=455978973",
    )
    assert ad.finnkode == "455978973"
    assert ad.title.startswith("Romslig 5-roms i 5.etasje")
    assert ad.address == "Hegdehaugsveien 3, 0352 Oslo"
    assert ad.property_type == "Leilighet"
    assert ad.ownership_type == "Andel"
    assert ad.asking_price == 10900000
    assert ad.total_price == 10986901
    assert ad.common_costs == 12011
    assert ad.area_m2 == 124
    assert ad.rooms == 5
    assert ad.bedrooms == 2
    assert ad.construction_year == 1938
    assert ad.floor == "5. etasje"
    assert "kort vei" in ad.listing_description.lower()
@@ -0,0 +1,71 @@
import tempfile
from datetime import UTC, datetime, timedelta
from pathlib import Path

from finn_eiendom.cache import (
    get_eiendom_unit,
    get_finn_ad,
    get_search_page,
    get_similar_units,
    init_db,
    save_eiendom_unit,
    save_finn_ad,
    save_search_page,
    save_similar_units,
)
from finn_eiendom.models import EiendomUnit, FinnAd, SimilarUnit


def test_cache_roundtrip():
    with tempfile.TemporaryDirectory() as tmpdir:
        db_path = Path(tmpdir) / "cache.sqlite"
        conn = init_db(str(db_path))

        ad = FinnAd(finnkode="1234", url="https://example.com", title="Test")
        save_finn_ad(conn, ad)
        loaded_ad = get_finn_ad(conn, "1234")
        assert loaded_ad is not None
        assert loaded_ad.finnkode == "1234"
        assert loaded_ad.url == "https://example.com"

        unit = EiendomUnit(unit_code="abc", address="Oslo")
        save_eiendom_unit(conn, unit)
        loaded_unit = get_eiendom_unit(conn, "abc")
        assert loaded_unit is not None
        assert loaded_unit.address == "Oslo"

        comps = [
            SimilarUnit(unit_code="x1"),
            SimilarUnit(unit_code="x2"),
        ]
        save_similar_units(conn, "abc", "RECENTLY_SOLD", comps)
        loaded_comps = get_similar_units(conn, "abc", "RECENTLY_SOLD")
        assert len(loaded_comps) == 2
        assert loaded_comps[0].unit_code == "x1"


def test_search_page_cache_roundtrip():
    with tempfile.TemporaryDirectory() as tmpdir:
        conn = init_db(str(Path(tmpdir) / "cache.sqlite"))

        html = "<html><body>search page</body></html>"
        url = "https://www.finn.no/realestate/homes/search.html"

        save_search_page(conn, url, html, ttl_minutes=5)
        loaded_html = get_search_page(conn, url)
        assert loaded_html == html


def test_finn_ad_cache_ttl_expiration():
    with tempfile.TemporaryDirectory() as tmpdir:
        conn = init_db(str(Path(tmpdir) / "cache.sqlite"))

        ad = FinnAd(
            finnkode="1234",
            url="https://example.com",
            title="Test",
            detail_fetched_at=datetime.now(UTC) - timedelta(hours=2),
        )
        save_finn_ad(conn, ad)
        expired_ad = get_finn_ad(conn, "1234", ttl_hours=1)
        assert expired_ad is None
||||||
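Note on TTL expiry: the last test above saves an ad fetched two hours ago and expects a one-hour TTL to hide it. The real `cache.py` is not shown in this part of the diff, so the following is only a sketch of how such a read-time check could work on SQLite. The table layout and the names `open_sketch_db`, `save_row`, and `load_row` are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_sketch_db(path=":memory:"):
    """Open a SQLite database with a single illustrative cache table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ads "
        "(finnkode TEXT PRIMARY KEY, payload TEXT, fetched_at TEXT)"
    )
    return conn


def save_row(conn, finnkode, payload, fetched_at=None):
    """Store a payload together with the UTC time it was fetched."""
    fetched_at = fetched_at or datetime.now(timezone.utc)
    conn.execute(
        "INSERT OR REPLACE INTO ads VALUES (?, ?, ?)",
        (finnkode, payload, fetched_at.isoformat()),
    )
    conn.commit()


def load_row(conn, finnkode, ttl_hours=24):
    """Return the payload, or None if it is missing or older than the TTL."""
    row = conn.execute(
        "SELECT payload, fetched_at FROM ads WHERE finnkode = ?", (finnkode,)
    ).fetchone()
    if row is None:
        return None
    payload, fetched_at = row
    age = datetime.now(timezone.utc) - datetime.fromisoformat(fetched_at)
    return payload if age <= timedelta(hours=ttl_hours) else None
```

Checking the age at read time, rather than deleting rows on a schedule, lets each caller choose its own TTL for the same stored row.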
@@ -0,0 +1,44 @@
from finn_eiendom.eiendom_no import (
    build_unit_vector,
    decode_unit_vector,
    parse_eiendom_unit_json,
    parse_similar_units_json,
    resolve_unit_from_finn_url,
)
from tests.fixtures import (
    SAMPLE_EIENDOM_SIMILAR_UNITS_JSON,
    SAMPLE_EIENDOM_UNIT_JSON,
)


def test_parse_eiendom_unit_json():
    unit = parse_eiendom_unit_json(SAMPLE_EIENDOM_UNIT_JSON["units"][0])
    assert unit.unit_code == "c-gxw-xmyum-s2a"
    assert unit.address == "Fernerveien 42, 0554 Oslo"
    assert unit.estimated_selling_price == 7650000
    assert unit.listing_sqm_price == 93500


def test_unit_vector_roundtrip():
    unit = parse_eiendom_unit_json(SAMPLE_EIENDOM_UNIT_JSON["units"][0])
    vector = build_unit_vector(unit)
    decoded = decode_unit_vector(vector)
    assert decoded["ptype"] == "APARTMENT"
    assert decoded["area"] == 77
    assert decoded["price"] == 7200000
    assert isinstance(decoded, dict)
    assert decoded["lon"] == unit.lng


def test_parse_similar_units_json():
    comps = parse_similar_units_json(SAMPLE_EIENDOM_SIMILAR_UNITS_JSON)
    assert len(comps) == 2
    assert comps[0].unit_code == "c-recent-1"
    assert comps[1].selling_price == 7350000


def test_resolve_unit_from_finn_url():
    unit_code = resolve_unit_from_finn_url(
        "https://www.finn.no/realestate/homes/ad.html?finnkode=462400360"
    )
    assert unit_code == "462400360"
@@ -0,0 +1,83 @@
"""Tests for HTTP client retry logic."""

import httpx
import pytest
import respx

from finn_eiendom.http import HTTPClient


@pytest.mark.asyncio
async def test_get_retries_on_500():
    """Test that HTTPClient retries on 500 errors and succeeds on second attempt."""
    client = HTTPClient(request_delay_seconds=0.0, retries=2)

    with respx.mock:
        route = respx.get("https://example.com/api")
        route.side_effect = [
            httpx.Response(500, text="Server Error"),
            httpx.Response(200, text="Success"),
        ]

        response = await client.get("https://example.com/api")
        assert response.status_code == 200


@pytest.mark.asyncio
async def test_get_raises_on_404():
    """Test that HTTPClient raises on 4xx errors immediately."""
    client = HTTPClient(request_delay_seconds=0.0, retries=2)

    with respx.mock:
        respx.get("https://example.com/api").mock(return_value=httpx.Response(404))

        with pytest.raises(httpx.HTTPStatusError) as exc_info:
            await client.get("https://example.com/api")

        assert exc_info.value.response.status_code == 404


@pytest.mark.asyncio
async def test_get_retries_on_502_bad_gateway():
    """Test that HTTPClient retries on 502 Bad Gateway."""
    client = HTTPClient(request_delay_seconds=0.0, retries=2)

    with respx.mock:
        route = respx.get("https://example.com/api")
        route.side_effect = [
            httpx.Response(502, text="Bad Gateway"),
            httpx.Response(200, text="Success"),
        ]

        response = await client.get("https://example.com/api")
        assert response.status_code == 200


@pytest.mark.asyncio
async def test_post_retries_on_503():
    """Test that HTTPClient retries POST on 503 Service Unavailable."""
    client = HTTPClient(request_delay_seconds=0.0, retries=2)

    with respx.mock:
        route = respx.post("https://example.com/api")
        route.side_effect = [
            httpx.Response(503, text="Service Unavailable"),
            httpx.Response(201, json={"success": True}),
        ]

        response = await client.post("https://example.com/api", json={"test": "data"})
        assert response.status_code == 201


@pytest.mark.asyncio
async def test_get_eventually_fails_on_persistent_500():
    """Test that HTTPClient gives up after max retries."""
    client = HTTPClient(request_delay_seconds=0.0, retries=1)

    with respx.mock:
        respx.get("https://example.com/api").mock(return_value=httpx.Response(500))

        with pytest.raises(httpx.HTTPStatusError) as exc_info:
            await client.get("https://example.com/api")

        assert exc_info.value.response.status_code == 500
||||||
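Note on the retry policy: the tests above pin down three behaviors: 5xx responses are retried, 4xx responses raise immediately, and a persistent 5xx raises once attempts run out. The sketch below shows one loop with those properties, without depending on `httpx`. It assumes `retries` counts extra attempts after the first one, which the tests do not settle. `StatusError` and `request_with_retries` are hypothetical names, not the project's `HTTPClient`.

```python
import asyncio


class StatusError(Exception):
    """Raised for a 4xx response, or a 5xx response after the last attempt."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


async def request_with_retries(send, retries=2, delay_seconds=0.0):
    """Await send() until it yields a status below 400 or attempts run out.

    send is any zero-argument coroutine function returning an int status code.
    """
    attempts = max(1, retries + 1)  # assumption: retries = extra attempts
    for attempt in range(attempts):
        status = await send()
        if status < 400:
            return status
        # Client errors are not retried; server errors are, until the last try.
        if status < 500 or attempt == attempts - 1:
            raise StatusError(status)
        await asyncio.sleep(delay_seconds)
```

A real client would typically also back off between attempts and treat transport errors (timeouts, connection resets) as retryable; both are left out here to keep the control flow visible.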
@@ -0,0 +1,69 @@
"""Tests for the MCP server tools."""

import json

from finn_eiendom.mcp_server import (
    finn_decode_unit_vector,
    mcp,
)


def test_mcp_server_has_correct_tools():
    """Assert that the MCP server has all expected tools."""
    import asyncio

    async def check_tools():
        tools = await mcp.list_tools()
        tool_names = {tool.name for tool in tools}
        expected_tools = {
            "finn_analyze_search",
            "finn_get_ad",
            "finn_resolve_eiendom_unit",
            "finn_get_eiendom_unit",
            "finn_get_similar_units",
            "finn_build_unit_vector",
            "finn_decode_unit_vector",
        }
        assert expected_tools.issubset(tool_names), f"Missing tools: {expected_tools - tool_names}"

    asyncio.run(check_tools())


def test_finn_decode_unit_vector_returns_json():
    """Test that finn_decode_unit_vector returns valid JSON with expected keys."""
    from unittest.mock import patch

    test_vector = {
        "lon": 10.7,
        "lat": 59.9,
        "ptype": "APARTMENT",
        "floor": 3,
        "rooms": 3,
        "built": 2000,
        "area": 80,
        "price": 5000000,
    }

    with patch("finn_eiendom.mcp_server.decode_unit_vector", return_value=test_vector):
        result = finn_decode_unit_vector("dGVzdA==")

    data = json.loads(result)
    assert "lon" in data
    assert "lat" in data
    assert "ptype" in data
    assert data["lat"] == 59.9
    assert data["lon"] == 10.7


def test_finn_decode_unit_vector_error_handling():
    """Test that finn_decode_unit_vector handles errors gracefully."""
    from unittest.mock import patch

    with patch(
        "finn_eiendom.mcp_server.decode_unit_vector", side_effect=Exception("decode failed")
    ):
        result = finn_decode_unit_vector("invalid")

    data = json.loads(result)
    assert data.get("error") is True
    assert "message" in data
||||||
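Note on tool error handling: the last test expects a failing tool to return JSON containing `"error": true` and a `"message"` key instead of raising. The wrapper in `mcp_server.py` is not part of this excerpt, so the following only sketches one way to produce that envelope. `json_tool_result` is a hypothetical helper.

```python
import json


def json_tool_result(func, *args, **kwargs):
    """Run func and return its result as JSON, or a JSON error envelope.

    Hypothetical helper for illustration; the envelope keys match what the
    test above checks ("error" and "message"), nothing more is assumed.
    """
    try:
        return json.dumps(func(*args, **kwargs), ensure_ascii=False)
    except Exception as exc:  # tools report failures as data, not tracebacks
        return json.dumps({"error": True, "message": str(exc)})
```

Returning the failure as data keeps the MCP session alive and gives the calling model something it can read and act on.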
@@ -0,0 +1,45 @@
from finn_eiendom.parser import (
    clean_text,
    extract_finnkode_from_url,
    normalize_area,
    normalize_finnkode,
    normalize_number,
    normalize_price,
)


def test_normalize_price():
    assert normalize_price("7 200 991 kr") == 7200991
    assert normalize_price("1 234") == 1234
    assert normalize_price(None) is None


def test_normalize_area():
    assert normalize_area("77 m²") == 77
    assert normalize_area("100,5 m²") == 100
    assert normalize_area("") is None


def test_normalize_number():
    assert normalize_number("3 500 kr/mnd") == 3500
    assert normalize_number("7,2") == 7
    assert normalize_number("1.234") == 1234
    assert normalize_number(None) is None


def test_normalize_finnkode():
    assert normalize_finnkode(" 462400360 ") == "462400360"
    assert normalize_finnkode(None) is None


def test_extract_finnkode_from_url():
    assert (
        extract_finnkode_from_url("https://www.finn.no/realestate/homes/ad.html?finnkode=462400360")
        == "462400360"
    )
    assert extract_finnkode_from_url("https://www.finn.no/realestate/homes/ad.html") is None


def test_clean_text():
    assert clean_text("  Hello   world \n") == "Hello world"
    assert clean_text(None) is None
||||||
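Note on number normalization: the parser tests treat spaces and dots as thousands separators, a comma as the decimal mark, and truncate any decimals. `parser.py` itself is not in this excerpt, so the sketch below is just one function that satisfies those same examples. `normalize_number_sketch` is a hypothetical name. One pitfall it avoids: `"²".isdigit()` is true in Python, so filtering `77 m²` with `str.isdigit` would yield 772.

```python
from __future__ import annotations

import re

# ASCII-only digits, optionally grouped by spaces or dots, with an optional
# decimal part after a comma. re.ASCII keeps "²" from matching \d.
_NUMBER_RE = re.compile(r"\d[\d .]*(?:,\d+)?", flags=re.ASCII)


def normalize_number_sketch(text: str | None) -> int | None:
    """Parse the first Norwegian-formatted number in text, dropping decimals."""
    if not text:
        return None
    # FINN-style pages often use non-breaking spaces as thousands separators.
    cleaned = text.replace("\u00a0", " ").replace("\u202f", " ")
    match = _NUMBER_RE.search(cleaned)
    if match is None:
        return None
    integer_part = match.group(0).split(",", 1)[0]
    return int(integer_part.replace(" ", "").replace(".", ""))
```

Only the first number in the string is read, so a combined cell such as `62 m² 6 750 000 kr` yields the area, not the price.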
@@ -0,0 +1,22 @@
from finn_eiendom.models import EiendomUnit, FinnAd
from finn_eiendom.scoring import classify_ad, score_ad


def test_score_ad_and_classify():
    ad = FinnAd(
        finnkode="462400360",
        url="https://www.finn.no/realestate/homes/ad.html?finnkode=462400360",
        title="Flott 3-roms i Ferner",
    )
    unit = EiendomUnit(
        unit_code="c-gxw-xmyum-s2a",
        estimated_selling_price=7650000,
        listing_price=7200000,
        property_type="APARTMENT",
        usable_area=77,
        rooms=4,
    )
    scores = score_ad(ad, unit, [])
    assert scores["market_position"] >= 0
    categories = classify_ad(scores)
    assert isinstance(categories, list)
@@ -0,0 +1,38 @@
from finn_eiendom.search import extract_ad_links, extract_search_cards
from tests.fixtures import SAMPLE_FINN_SEARCH_HTML, SAMPLE_FINN_SEARCH_HTML_NEW


def test_extract_search_cards():
    cards = extract_search_cards(SAMPLE_FINN_SEARCH_HTML)
    assert len(cards) == 2
    assert cards[0].finnkode == "462400360"
    assert cards[0].url.endswith("finnkode=462400360")
    assert cards[0].area_m2 == 77
    assert cards[0].total_price == 7200991
    assert cards[0].common_costs == 3500
    assert cards[1].bedrooms == 2


def test_extract_search_cards_new_format():
    cards = extract_search_cards(SAMPLE_FINN_SEARCH_HTML_NEW)
    assert len(cards) == 1
    assert cards[0].finnkode == "462880791"
    assert cards[0].url.endswith("finnkode=462880791")
    assert cards[0].address == "Lofotgata 4B, Oslo"
    assert cards[0].area_m2 == 62
    assert cards[0].total_price == 7253377
    assert cards[0].common_costs == 7067
    assert cards[0].bedrooms == 2


def test_extract_ad_links():
    links = extract_ad_links(SAMPLE_FINN_SEARCH_HTML)
    assert len(links) == 2
    assert "finnkode=462400360" in links[0]
    assert "finnkode=460784945" in links[1]


def test_extract_ad_links_new_format():
    links = extract_ad_links(SAMPLE_FINN_SEARCH_HTML_NEW)
    assert len(links) == 1
    assert "finnkode=462880791" in links[0]
@@ -0,0 +1,97 @@
"""Tests for the service layer (cache-aware fetching)."""

from unittest.mock import patch

import pytest

from finn_eiendom.models import EiendomUnit, FinnAd
from finn_eiendom.service import get_or_fetch_ad, get_or_fetch_eiendom_unit


@pytest.mark.asyncio
async def test_get_or_fetch_ad_uses_cache():
    """Test that get_or_fetch_ad returns cached ad without fetching."""
    mock_ad = FinnAd(finnkode="123", url="http://example.com")

    with (
        patch("finn_eiendom.service.init_db"),
        patch("finn_eiendom.service.get_finn_ad", return_value=mock_ad) as mock_get,
        patch("finn_eiendom.service.fetch_ad_details") as mock_fetch,
    ):
        result = await get_or_fetch_ad("123")

    assert result.finnkode == "123"
    mock_get.assert_called_once()
    mock_fetch.assert_not_called()


@pytest.mark.asyncio
async def test_get_or_fetch_ad_fetches_when_cache_miss():
    """Test that get_or_fetch_ad fetches when cache is empty."""
    mock_ad = FinnAd(finnkode="123", url="http://example.com")

    with (
        patch("finn_eiendom.service.init_db"),
        patch("finn_eiendom.service.get_finn_ad", return_value=None),
        patch("finn_eiendom.service.fetch_ad_details", return_value=mock_ad) as mock_fetch,
        patch("finn_eiendom.service.save_finn_ad") as mock_save,
    ):
        result = await get_or_fetch_ad("123")

    assert result.finnkode == "123"
    mock_fetch.assert_called_once_with("123")
    mock_save.assert_called_once()


@pytest.mark.asyncio
async def test_get_or_fetch_ad_force_refresh():
    """Test that force_refresh=True bypasses cache."""
    mock_ad = FinnAd(finnkode="123", url="http://example.com")

    with (
        patch("finn_eiendom.service.init_db"),
        patch("finn_eiendom.service.get_finn_ad", return_value=mock_ad) as mock_get,
        patch("finn_eiendom.service.fetch_ad_details", return_value=mock_ad) as mock_fetch,
        patch("finn_eiendom.service.save_finn_ad") as mock_save,
    ):
        result = await get_or_fetch_ad("123", force_refresh=True)

    assert result.finnkode == "123"
    mock_get.assert_not_called()
    mock_fetch.assert_called_once_with("123")
    mock_save.assert_called_once()


@pytest.mark.asyncio
async def test_get_or_fetch_eiendom_unit_uses_cache():
    """Test that get_or_fetch_eiendom_unit returns cached unit without fetching."""
    mock_unit = EiendomUnit(unit_code="test-code")

    with (
        patch("finn_eiendom.service.init_db"),
        patch("finn_eiendom.service.get_cached_eiendom_unit", return_value=mock_unit) as mock_get,
        patch("finn_eiendom.service.get_unit") as mock_fetch,
    ):
        result = await get_or_fetch_eiendom_unit("test-code")

    assert result.unit_code == "test-code"
    mock_get.assert_called_once()
    mock_fetch.assert_not_called()


@pytest.mark.asyncio
async def test_get_or_fetch_eiendom_unit_fetches_when_cache_miss():
    """Test that get_or_fetch_eiendom_unit fetches when cache is empty."""
    mock_unit = EiendomUnit(unit_code="test-code")

    with (
        patch("finn_eiendom.service.init_db"),
        patch("finn_eiendom.service.get_cached_eiendom_unit", return_value=None),
        patch("finn_eiendom.service.get_unit", return_value=mock_unit) as mock_fetch,
        patch("finn_eiendom.service.save_eiendom_unit") as mock_save,
    ):
        result = await get_or_fetch_eiendom_unit("test-code")

    assert result.unit_code == "test-code"
    mock_fetch.assert_called_once_with("test-code")
    mock_save.assert_called_once()