finn-mcp/.github/instructions/tests.instructions.md
2026-05-16 06:54:17 +00:00

---
name: Test rules
description: Testing conventions for parser, cache, scoring, service, MCP, CLI, and architecture
applyTo: "tests/**/*.py"
---
# Test rules
## Runtime
Tests run in the project-local `.venv`. From the project root with the venv activated:
```bash
pytest                            # full suite
pytest tests/test_service.py -v   # one file
pytest -k "shortlist"             # one keyword
pytest --lf                       # rerun last failures
```
`pytest-asyncio` is configured in `[tool.pytest.ini_options]` with `asyncio_mode = "auto"`, so `async def` tests run without an `@pytest.mark.asyncio` decorator.
## Never do live network calls
No real HTTP in unit tests. Mock with `respx` (sits in front of `httpx.AsyncClient`):
```python
import httpx
import respx

from finn_eiendom import http as http_module

# SAMPLE_FINN_SEARCH_HTML comes from the loader helpers in tests/fixtures.py.


@respx.mock
async def test_finn_search_fetch_uses_user_agent():
    # Intercept the outgoing request so nothing leaves the process.
    route = respx.get("https://www.finn.no/realestate/homes/search.html").mock(
        return_value=httpx.Response(200, html=SAMPLE_FINN_SEARCH_HTML)
    )
    client = http_module.HTTPClient(user_agent="test-agent")
    resp = await client.get("https://www.finn.no/realestate/homes/search.html")
    assert resp.status_code == 200
    assert route.calls.last.request.headers["user-agent"] == "test-agent"
```
## Fixtures
Fixture-driven testing for parsers and APIs:
* FINN search HTML → `tests/fixtures/finn_search.html`.
* FINN listing HTML → `tests/fixtures/finn_ad_*.html`.
* Eiendom.no unit search JSON → `tests/fixtures/eiendom_unit_search.json`.
* Eiendom.no unit detail JSON → `tests/fixtures/eiendom_unit_detail.json`.
* Eiendom.no similar-units JSON → `tests/fixtures/eiendom_similar.json`.
Loader helpers live in `tests/fixtures.py` (e.g. `SAMPLE_FINN_SEARCH_HTML`, `SAMPLE_EIENDOM_UNIT_JSON`). Add new fixtures there rather than inlining large strings in test files.
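A loader helper along these lines keeps fixture access in one place. This is a minimal sketch, not the project's actual `tests/fixtures.py`: `load_text`, `load_json`, and the `base_dir` parameter are hypothetical names chosen for illustration.

```python
import json
from pathlib import Path
from typing import Any


def load_text(name: str, base_dir: Path) -> str:
    """Read a fixture file as UTF-8 text (hypothetical helper)."""
    return (base_dir / name).read_text(encoding="utf-8")


def load_json(name: str, base_dir: Path) -> Any:
    """Read a fixture file and parse it as JSON (hypothetical helper)."""
    return json.loads(load_text(name, base_dir))
```

A module-level constant such as `SAMPLE_FINN_SEARCH_HTML` could then be defined as `load_text("finn_search.html", Path(__file__).parent / "fixtures")`, assuming the directory layout described in this file.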
## Test layout
```
tests/
  fixtures/               # raw HTML / JSON inputs
  fixtures.py             # loader helpers
  conftest.py             # shared pytest fixtures (tmp DB, http client, etc.)
  test_parser.py          # number/area/date/URL/finnkode normalization
  test_search.py          # FINN search HTML → cards
  test_ad.py              # FINN listing HTML → FinnAd
  test_eiendom_no.py      # unit search/detail/similar JSON, unit_vector encode/decode
  test_scoring.py         # all scoring components + classifier
  test_cache.py           # SQLite read/write/TTL
  test_http.py            # retry on 5xx, raise on 4xx, delay applied (new)
  test_service.py         # get_or_fetch_*, analyze_* (new)
  test_formatting.py      # render_* json/markdown/table (new)
  test_mcp_server.py      # tool registration + error envelope (expanded)
  test_cli.py             # typer CliRunner (new)
  test_architecture.py    # import-graph invariants (new)
```
## What to test per category
### Parsers (`test_parser`, `test_search`, `test_ad`, `test_eiendom_no`)
* Missing fields → `None`, not exception.
* Norwegian number formats: `7 200 991 kr`, `kr 7 200 991`, `7.200.991`.
* URL normalization (relative → absolute).
* Finnkode extraction from various URL shapes.
* Area parsing: `77 m²`, `77m2`, `77 kvm`.
* Price parsing (asking vs total vs shared debt).
* Eiendom.no JSON edge cases: empty `units`, missing `valuation`, missing `latestMarketData`.
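The number and area cases above can be pinned down with a small sketch. `parse_nok` and `parse_area_m2` are hypothetical names, not the project's real parser API; the point is the shape of the behavior the tests should assert: every listed format normalizes to the same value, and missing input yields `None` rather than an exception.

```python
import re
from typing import Optional

# Matches "77 m²", "77m2", "77 kvm" (hypothetical pattern for illustration).
_AREA_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(?:m²|m2|kvm)", re.IGNORECASE)


def parse_nok(text: Optional[str]) -> Optional[int]:
    """Parse a Norwegian price string into whole kroner, or None if absent."""
    if not text:
        return None
    # Drop everything that is not a digit: spaces, NBSP, dots, "kr".
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else None


def parse_area_m2(text: Optional[str]) -> Optional[float]:
    """Parse an area such as '77 m²', '77m2' or '77 kvm' into square metres."""
    if not text:
        return None
    match = _AREA_RE.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))
```

Stripping every non-digit is only safe for whole-krone amounts; a price with decimals (for example `7 200 991,50`) would need a stricter pattern.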
### Unit vectors (`test_eiendom_no`)
* msgpack encoding + base64url without padding.
* Decode roundtrip.
* Missing optional fields (floor, rooms, built).
* Both lon/lat orderings handled.
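The padding-free base64url step is the part most likely to break a roundtrip, so it is worth a dedicated test. The sketch below is stdlib-only and swaps msgpack for JSON bytes as a stand-in serializer; `encode_token` and `decode_token` are hypothetical names, and the real `unit_vector` field layout is not reproduced here.

```python
import base64
import json
from typing import Any


def encode_token(payload: Any) -> str:
    """Serialize and encode as base64url with the '=' padding stripped."""
    # Stand-in serializer: JSON bytes where the project uses msgpack.
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_token(token: str) -> Any:
    """Reverse encode_token, restoring the stripped padding first."""
    # base64 input length must be a multiple of 4; add back what was removed.
    padding = "=" * (-len(token) % 4)
    raw = base64.urlsafe_b64decode(token + padding)
    return json.loads(raw)
```

A roundtrip test should cover payloads of several lengths, since whether padding is needed depends on the byte length of the serialized payload.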
### Scoring (`test_scoring`)
* Each component in isolation.
* Total clamped to 0–100.
* Risk penalties applied (negative range).
* Bargain classification triggers on the expected signal mix.
* Hybel classification: documented / possible / unclear / not relevant.
* Explainability: explanation list non-empty when score is non-trivial.
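The clamping and penalty rules can be illustrated with a short sketch. `total_score` is a hypothetical function, and the assumption that penalties arrive as negative numbers follows the "negative range" wording above rather than the real scoring module.

```python
from typing import Iterable


def total_score(components: Iterable[float], penalties: Iterable[float]) -> float:
    """Sum positive components and negative penalties, clamped to 0..100."""
    raw = float(sum(components) + sum(penalties))
    return max(0.0, min(100.0, raw))
```

Tests for the clamp should hit both edges: a component mix that would exceed 100 and a penalty mix that would drop below 0.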
### Cache (`test_cache`)
* Read after write returns same object.
* TTL expiry returns `None`.
* JSON roundtrip preserves all fields.
* `init_db` is idempotent on existing DBs.
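The four cache behaviors can be exercised against an in-memory SQLite database. This is a minimal sketch under stated assumptions: the table schema, the `write` and `read` names, and the injectable `now` parameter are hypothetical. Only `init_db` is named in the source, and its real signature may differ.

```python
import json
import sqlite3
import time
from typing import Any, Optional


def init_db(conn: sqlite3.Connection) -> None:
    # IF NOT EXISTS makes repeated calls safe on an existing database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL, stored_at REAL NOT NULL)"
    )


def write(conn: sqlite3.Connection, key: str, value: Any,
          now: Optional[float] = None) -> None:
    stored_at = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
        (key, json.dumps(value), stored_at),
    )


def read(conn: sqlite3.Connection, key: str, ttl_seconds: float,
         now: Optional[float] = None) -> Optional[Any]:
    current = time.time() if now is None else now
    row = conn.execute(
        "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
    ).fetchone()
    # A missing row and an expired row both read as a cache miss.
    if row is None or current - row[1] > ttl_seconds:
        return None
    return json.loads(row[0])
```

Passing `now` explicitly lets a TTL test move the clock forward without sleeping.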
### HTTP (`test_http`)
* Retries on 500/502/503/504 with backoff (count exactly N retries).
* Raises immediately on 404 / 4xx.
* Applies `request_delay` between calls.
* Honors `user_agent`.
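Counting retries exactly is easiest when both the transport and the sleep function are injectable. The sketch below is not the project's `HTTPClient`: `fetch_with_retry`, `HTTPStatusError`, and the exponential backoff schedule are assumptions made for illustration.

```python
import time
from typing import Callable, Tuple

RETRY_STATUSES = {500, 502, 503, 504}


class HTTPStatusError(Exception):
    """Hypothetical error type carrying the failing status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_retries: int = 3,
    backoff: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(); retry the listed 5xx statuses with exponential backoff."""
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status < 400:
            return body
        # 4xx fails immediately; 5xx fails once the retry budget is spent.
        if status not in RETRY_STATUSES or attempt == max_retries:
            raise HTTPStatusError(status)
        sleep(backoff * 2 ** attempt)
    raise AssertionError("unreachable")
```

In a test, pass a fake `fetch` that yields a scripted sequence of statuses and a `sleep` that records its arguments; the call count and the recorded delays can then be asserted without waiting.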
### Service (`test_service`)
The service tests are the heart of the suite. They cover orchestration end-to-end against fixtures.
* `test_get_or_fetch_ad_uses_cache`: second call hits cache, no HTTP.
* `test_get_or_fetch_ad_fetches_when_cache_miss`: first call hits HTTP, then writes cache.
* `test_get_or_fetch_ad_force_refresh`: `force_refresh=True` bypasses cache.
* `test_analyze_search_with_fixtures`: full run from search HTML → shortlist.
* `test_find_similar_to_liked_uses_liked_feedback`: only seeds from `liked` verdicts.
Use a tmp SQLite DB via the `tmp_path` pytest fixture:
```python
import pytest


@pytest.fixture
def tmp_db(tmp_path, monkeypatch):
    # Point the cache at a throwaway file that pytest cleans up.
    db_path = tmp_path / "finn.sqlite"
    monkeypatch.setenv("FINN_CACHE_PATH", str(db_path))
    return db_path
```
### Formatting (`test_formatting`)
* `render_shortlist(result, "json")` is parseable JSON and roundtrips.
* `render_shortlist(result, "markdown")` contains the score and at least one risk.
* `render_<thing>(result, "xml")` raises `ValueError` listing supported formats.
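The unsupported-format rule is simplest to satisfy with a dispatch table. The following is a hypothetical stand-in for `render_shortlist`, not the project's implementation: the result is modeled as a plain dict, the `table` format is omitted, and the error message wording is an assumption.

```python
import json
from typing import Any, Callable, Dict


def _as_json(result: Dict[str, Any]) -> str:
    return json.dumps(result, ensure_ascii=False, indent=2)


def _as_markdown(result: Dict[str, Any]) -> str:
    lines = [f"**Score:** {result['score']}"]
    lines += [f"* Risk: {risk}" for risk in result.get("risks", [])]
    return "\n".join(lines)


RENDERERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "json": _as_json,
    "markdown": _as_markdown,
}


def render_shortlist(result: Dict[str, Any], fmt: str) -> str:
    """Render with the named format, or raise ValueError listing the options."""
    try:
        renderer = RENDERERS[fmt]
    except KeyError:
        supported = ", ".join(sorted(RENDERERS))
        raise ValueError(
            f"Unsupported format {fmt!r}; supported: {supported}"
        ) from None
    return renderer(result)
```

Because the supported list is derived from the table, a test that checks the error message keeps passing when a new format is registered.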
### MCP (`test_mcp_server`)
* `test_mcp_server_has_correct_tools` — all 14 `finn_*` tool names registered.
* `test_finn_decode_unit_vector_returns_json` — happy path.
* `test_finn_analyze_search_handles_error` — error envelope shape: `{"error": True, "code": ..., "message": ...}`.
Use the `mcp` SDK's testing helpers; don't spawn a subprocess.
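The envelope shape itself can be tested independently of the SDK. The decorator below is a sketch under stated assumptions: `with_error_envelope` is a hypothetical name, and using the exception class name as `code` is an illustrative choice, not the server's documented behavior.

```python
import functools
from typing import Any, Callable


def with_error_envelope(tool: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so failures come back as an error envelope dict."""

    @functools.wraps(tool)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return tool(*args, **kwargs)
        except Exception as exc:
            # Assumed mapping: exception class name becomes the error code.
            return {"error": True, "code": type(exc).__name__, "message": str(exc)}

    return wrapper
```

A test then only needs a tool that raises and an assertion on the three keys.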
### CLI (`test_cli`)
Use Typer's `CliRunner`:
```python
from typer.testing import CliRunner

from finn_eiendom.cli import app

runner = CliRunner()


def test_cli_help():
    result = runner.invoke(app, ["--help"])
    assert result.exit_code == 0
    assert "analyze-search" in result.stdout
```
Patch `service.<function>` with `monkeypatch` so CLI tests don't exercise the full stack — that's covered by `test_service.py`.
### Architecture (`test_architecture`)
Static checks of the module dependency graph:
* No `import httpx` outside `finn_eiendom/http.py`.
* No `import sqlite3` outside `finn_eiendom/cache.py`.
* No `BeautifulSoup` import outside `search.py` and `ad.py`.
* No `msgpack` import outside `eiendom_no.py`.
* `mcp_server.py` only imports from `service`, `formatting`, `models`, `config`, `mcp`, stdlib, `pydantic`.
* `cli.py` only imports from `service`, `formatting`, `models`, `config`, `typer`, stdlib.
* `service.py` does not import from `mcp_server` or `cli`.
Implementation: walk `.py` files under `finn_eiendom/` with `ast`, collect imports, assert allowed sets per module.
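The `ast` walk described above might look like the sketch below. `top_level_imports` and `find_violations` are hypothetical helper names, and the check covers only the "no `import X` outside module Y" rules. The per-module allow-lists for `mcp_server.py` and `cli.py` would also need relative imports resolved, which this sketch skips.

```python
import ast
from typing import Dict, List, Set


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level package names imported by a module's source."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return found


def find_violations(
    sources: Dict[str, str], banned: str, allowed_in: Set[str]
) -> List[str]:
    """List files that import `banned` but are not in `allowed_in`."""
    return sorted(
        name
        for name, source in sources.items()
        if name not in allowed_in and banned in top_level_imports(source)
    )
```

In the real test, `sources` could be built with `{p.name: p.read_text(encoding="utf-8") for p in Path("finn_eiendom").rglob("*.py")}` and asserted as `find_violations(sources, "httpx", {"http.py"}) == []`. Because `ast.walk` visits nested nodes, imports inside functions are caught as well.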
## Best practices
* One assertion per test (or per closely related group). Long tests fail in ways that are hard to diagnose.
* Test names describe the behavior: `test_get_or_fetch_ad_uses_cache_within_ttl`.
* Use `monkeypatch` for env vars and `tmp_path` for files. No `os.environ` mutation.
* No `time.sleep` — use `freezegun` if a test depends on time, or refactor the code under test to take a `now` parameter.
* No "smoke tests" that ping real servers — those go under a separately-marked `pytest -m live` suite and are not part of CI.
## When uncertain about test tooling
Use `context7` for pytest, respx, freezegun, or Typer testing:
```
context7:resolve-library-id → "pytest-dev/pytest" / "lundberg/respx"
context7:query-docs(id, "respx mock httpx async post")
```
See `docs.instructions.md`.