finn-mcp/.github/instructions/tests.instructions.md
2026-05-16 06:54:17 +00:00


name: Test rules
description: Testing conventions for parser, cache, scoring, service, MCP, CLI, and architecture
applyTo: tests/**/*.py

Test rules

Runtime

Tests run in the project-local .venv. From the project root with the venv activated:

pytest                                 # full suite
pytest tests/test_service.py -v        # one file
pytest -k "shortlist"                  # one keyword
pytest --lf                            # rerun last failures

pytest-asyncio is configured under [tool.pytest.ini_options] with asyncio_mode = "auto", so async def tests run without an @pytest.mark.asyncio decorator.

Never do live network calls

No real HTTP in unit tests. Mock with respx (sits in front of httpx.AsyncClient):

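import httpx
import respx

from finn_eiendom import http as http_module
from tests.fixtures import SAMPLE_FINN_SEARCH_HTML  # loader helper; import path is an assumption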

@respx.mock
async def test_finn_search_fetch_uses_user_agent():
    route = respx.get("https://www.finn.no/realestate/homes/search.html").mock(
        return_value=httpx.Response(200, html=SAMPLE_FINN_SEARCH_HTML)
    )
    client = http_module.HTTPClient(user_agent="test-agent")
    resp = await client.get("https://www.finn.no/realestate/homes/search.html")
    assert resp.status_code == 200
    assert route.calls.last.request.headers["user-agent"] == "test-agent"

Fixtures

Fixture-driven testing for parsers and APIs:

  • FINN search HTML → tests/fixtures/finn_search.html.
  • FINN listing HTML → tests/fixtures/finn_ad_*.html.
  • Eiendom.no unit search JSON → tests/fixtures/eiendom_unit_search.json.
  • Eiendom.no unit detail JSON → tests/fixtures/eiendom_unit_detail.json.
  • Eiendom.no similar-units JSON → tests/fixtures/eiendom_similar.json.

Loader helpers live in tests/fixtures.py (e.g. SAMPLE_FINN_SEARCH_HTML, SAMPLE_EIENDOM_UNIT_JSON). Add new fixtures there rather than inlining large strings in test files.
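
A minimal sketch of what such loader helpers could look like. The function names load_text and load_json, the FIXTURE_DIR constant, and the mapping from constants to files are assumptions for illustration, not the project's actual code:

```python
import json
from pathlib import Path
from typing import Any

# Assumed location, relative to the project root where pytest is invoked.
FIXTURE_DIR = Path("tests") / "fixtures"


def load_text(name: str, base: Path = FIXTURE_DIR) -> str:
    """Return the contents of a raw fixture file as text."""
    return (base / name).read_text(encoding="utf-8")


def load_json(name: str, base: Path = FIXTURE_DIR) -> Any:
    """Return a parsed JSON fixture."""
    return json.loads(load_text(name, base))


# tests/fixtures.py could then define each constant once, for example
# (which file backs which constant is a guess):
# SAMPLE_FINN_SEARCH_HTML = load_text("finn_search.html")
# SAMPLE_EIENDOM_UNIT_JSON = load_json("eiendom_unit_search.json")
```

Keeping the read logic in one place means a test file only imports a constant and never handles paths or encodings itself.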

Test layout

tests/
  fixtures/                # raw HTML / JSON inputs
  fixtures.py              # loader helpers
  conftest.py              # shared pytest fixtures (tmp DB, http client, etc.)
  test_parser.py           # number/area/date/URL/finnkode normalization
  test_search.py           # FINN search HTML → cards
  test_ad.py               # FINN listing HTML → FinnAd
  test_eiendom_no.py       # unit search/detail/similar JSON, unit_vector encode/decode
  test_scoring.py          # all scoring components + classifier
  test_cache.py            # SQLite read/write/TTL
  test_http.py             # retry on 5xx, raise on 4xx, delay applied  (new)
  test_service.py          # get_or_fetch_*, analyze_*                  (new)
  test_formatting.py       # render_* json/markdown/table               (new)
  test_mcp_server.py       # tool registration + error envelope         (expanded)
  test_cli.py              # typer CliRunner                            (new)
  test_architecture.py     # import-graph invariants                    (new)

What to test per category

Parsers (test_parser, test_search, test_ad, test_eiendom_no)

  • Missing fields → None, not exception.
  • Norwegian number formats: 7 200 991 kr, kr 7 200 991, 7.200.991.
  • URL normalization (relative → absolute).
  • Finnkode extraction from various URL shapes.
  • Area parsing: 77 m², 77m2, 77 kvm.
  • Price parsing (asking vs total vs shared debt).
  • Eiendom.no JSON edge cases: empty units, missing valuation, missing latestMarketData.
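
As an illustration of the number, area, and missing-field cases above, here is a hedged sketch. parse_nok and parse_area_m2 are hypothetical stand-ins for the project's parser functions, and the sketch handles whole kroner only:

```python
import re
from typing import Optional


def parse_nok(text: Optional[str]) -> Optional[int]:
    """Parse a Norwegian price string into whole kroner, or None if absent."""
    if not text:
        return None
    digits = re.sub(r"\D", "", text)  # drops spaces, NBSP, dots, and "kr"
    return int(digits) if digits else None


def parse_area_m2(text: Optional[str]) -> Optional[int]:
    """Parse an area such as '77 m²', '77m2' or '77 kvm' into square metres."""
    if not text:
        return None
    match = re.search(r"(\d+)\s*(?:m²|m2|kvm)", text, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def test_price_formats():
    # The last case uses non-breaking spaces, which scraped HTML often contains.
    for raw in ("7 200 991 kr", "kr 7 200 991", "7.200.991", "7\u00a0200\u00a0991 kr"):
        assert parse_nok(raw) == 7200991


def test_area_formats():
    for raw in ("77 m²", "77m2", "77 kvm"):
        assert parse_area_m2(raw) == 77


def test_missing_fields_return_none():
    assert parse_nok(None) is None
    assert parse_nok("") is None
    assert parse_area_m2("ukjent") is None
```

In the real suite these loops would more naturally be written with pytest.mark.parametrize so each input is reported as its own case.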

Unit vectors (test_eiendom_no)

  • msgpack encoding + base64url without padding.
  • Decode roundtrip.
  • Missing optional fields (floor, rooms, built).
  • Both lon/lat orderings handled.
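
The base64url-without-padding step can be tested on its own. The sketch below treats the payload as opaque bytes and leaves the msgpack step out; b64url_encode and b64url_decode are hypothetical names, not the project's API:

```python
import base64


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as URL-safe base64 with the trailing '=' padding removed."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode URL-safe base64, restoring the padding that was stripped."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def test_roundtrip_for_every_padding_length():
    # Inputs of length 0..3 cover every padding case; the last one would
    # produce '+' and '/' under the standard (non-URL-safe) alphabet.
    for raw in (b"", b"a", b"ab", b"abc", b"\xfb\xff\xfe"):
        token = b64url_encode(raw)
        assert "=" not in token
        assert "+" not in token and "/" not in token
        assert b64url_decode(token) == raw
```

Covering each input length modulo 3 matters because an encoder that forgets to re-pad on decode only fails for some lengths.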

Scoring (test_scoring)

  • Each component in isolation.
  • Total clamped to 0–100.
  • Risk penalties applied (negative range).
  • Bargain classification triggers on the expected signal mix.
  • Hybel classification: documented / possible / unclear / not relevant.
  • Explainability: explanation list non-empty when score is non-trivial.
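
A small sketch of the clamping and penalty checks, assuming a hypothetical total_score function in which risk penalties are components with negative values. The real scoring API may be shaped differently:

```python
from typing import Dict


def total_score(components: Dict[str, float]) -> float:
    """Sum all components (penalties are negative) and clamp to 0..100."""
    raw = sum(components.values())
    return max(0.0, min(100.0, raw))


def test_total_is_clamped_at_both_ends():
    assert total_score({"price": 80, "location": 70}) == 100.0
    assert total_score({"price": 10, "risk": -40}) == 0.0


def test_penalty_lowers_total():
    base = total_score({"price": 60})
    penalised = total_score({"price": 60, "risk": -15})
    assert penalised == base - 15
```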

Cache (test_cache)

  • Read after write returns same object.
  • TTL expiry returns None.
  • JSON roundtrip preserves all fields.
  • init_db is idempotent on existing DBs.
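
The cache checks can be sketched against an in-memory SQLite database. The table layout and the put and get helpers below are assumptions for illustration; only init_db is named in the list above, and its signature here is a guess. Passing now explicitly lets the test exercise TTL expiry without sleeping:

```python
import json
import sqlite3
from typing import Any, Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the cache table; safe to call on an already-initialised DB."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "key TEXT PRIMARY KEY, payload TEXT NOT NULL, stored_at REAL NOT NULL)"
    )


def put(conn: sqlite3.Connection, key: str, value: Any, now: float) -> None:
    """Store a JSON-serialisable value together with its write time."""
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, payload, stored_at) VALUES (?, ?, ?)",
        (key, json.dumps(value), now),
    )


def get(conn: sqlite3.Connection, key: str, ttl: float, now: float) -> Optional[Any]:
    """Return the cached value, or None if it is missing or older than ttl."""
    row = conn.execute(
        "SELECT payload, stored_at FROM cache WHERE key = ?", (key,)
    ).fetchone()
    if row is None or now - row[1] > ttl:
        return None
    return json.loads(row[0])


def test_ttl_expiry_and_idempotent_init():
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    ad = {"finnkode": "123", "price": 7200991, "tags": ["balkong"]}
    put(conn, "ad:123", ad, now=1000.0)
    init_db(conn)  # second call must neither raise nor wipe existing rows
    assert get(conn, "ad:123", ttl=60, now=1030.0) == ad
    assert get(conn, "ad:123", ttl=60, now=1061.0) is None
```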

HTTP (test_http)

  • Retries on 500/502/503/504 with backoff (count exactly N retries).
  • Raises immediately on 404 / 4xx.
  • Applies request_delay between calls.
  • Honors user_agent.
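
To show how "count exactly N retries" can be asserted without real waiting, here is a sketch with the transport and the sleep function injected. get_with_retry and HTTPStatusError are hypothetical; in the real suite the same idea would be expressed through respx routes and the project's HTTPClient:

```python
import time
from typing import Callable, List

RETRYABLE = {500, 502, 503, 504}


class HTTPStatusError(Exception):
    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def get_with_retry(
    send: Callable[[], int],
    max_retries: int = 3,
    backoff: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds, fails permanently, or retries run out."""
    for attempt in range(max_retries + 1):
        status = send()
        if status < 400:
            return status
        if status not in RETRYABLE or attempt == max_retries:
            raise HTTPStatusError(status)
        sleep(backoff * 2**attempt)  # exponential backoff; injectable for tests
    raise AssertionError("unreachable")


def test_retries_exactly_twice_on_503():
    statuses = iter([503, 503, 200])
    delays: List[float] = []
    assert get_with_retry(lambda: next(statuses), sleep=delays.append) == 200
    assert delays == [0.5, 1.0]  # two retries, so two backoff sleeps


def test_raises_immediately_on_404():
    calls: List[int] = []

    def send() -> int:
        calls.append(1)
        return 404

    try:
        get_with_retry(send, sleep=lambda _seconds: None)
    except HTTPStatusError as exc:
        assert exc.status == 404
    else:
        raise AssertionError("expected HTTPStatusError")
    assert len(calls) == 1
```

Recording the requested delays in a list verifies both the retry count and the backoff schedule, and keeps time.sleep out of the test run.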

Service (test_service)

The service tests are the heart of the suite. They cover orchestration end-to-end against fixtures.

  • test_get_or_fetch_ad_uses_cache — second call hits cache, no HTTP.
  • test_get_or_fetch_ad_fetches_when_cache_miss — first call hits HTTP, then writes cache.
  • test_get_or_fetch_ad_force_refresh — force_refresh=True bypasses cache.
  • test_analyze_search_with_fixtures — full run from search HTML → shortlist.
  • test_find_similar_to_liked_uses_liked_feedback — only seeds from liked verdicts.
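
The cache-related cases above come down to counting how often the fetch layer is called. The sketch below uses a synchronous, simplified get_or_fetch_ad with an assumed signature and a dict as the cache; the real service function is likely async and backed by SQLite:

```python
from typing import Callable, Dict


def get_or_fetch_ad(
    finnkode: str,
    cache: Dict[str, dict],
    fetch: Callable[[str], dict],
    force_refresh: bool = False,
) -> dict:
    """Return a cached ad, or fetch it and store the result."""
    if not force_refresh and finnkode in cache:
        return cache[finnkode]
    ad = fetch(finnkode)
    cache[finnkode] = ad
    return ad


class CountingFetcher:
    """Stand-in for the HTTP layer that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, finnkode: str) -> dict:
        self.calls += 1
        return {"finnkode": finnkode, "version": self.calls}


def test_second_call_hits_cache():
    cache, fetch = {}, CountingFetcher()
    first = get_or_fetch_ad("123", cache, fetch)
    second = get_or_fetch_ad("123", cache, fetch)
    assert first == second
    assert fetch.calls == 1  # no second fetch


def test_force_refresh_bypasses_cache():
    cache, fetch = {}, CountingFetcher()
    get_or_fetch_ad("123", cache, fetch)
    refreshed = get_or_fetch_ad("123", cache, fetch, force_refresh=True)
    assert fetch.calls == 2
    assert refreshed["version"] == 2
```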

Use a tmp SQLite DB via the tmp_path pytest fixture:

import pytest


@pytest.fixture
def tmp_db(tmp_path, monkeypatch):
    db_path = tmp_path / "finn.sqlite"
    monkeypatch.setenv("FINN_CACHE_PATH", str(db_path))
    return db_path

Formatting (test_formatting)

  • render_shortlist(result, "json") is parseable JSON and roundtrips.
  • render_shortlist(result, "markdown") contains the score and at least one risk.
  • render_<thing>(result, "xml") raises ValueError listing supported formats.
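
A hedged sketch of the three formatting checks. This render_shortlist is a toy stand-in with an assumed result shape (a dict with score and risks), written only so the assertions have something to run against:

```python
import json

SUPPORTED_FORMATS = ("json", "markdown", "table")


def render_shortlist(result: dict, fmt: str) -> str:
    """Render a result in one of the supported formats (toy implementation)."""
    if fmt == "json":
        return json.dumps(result, ensure_ascii=False)
    if fmt == "markdown":
        risks = "\n".join(f"- {risk}" for risk in result["risks"])
        return f"**Score:** {result['score']}\n\n{risks}"
    if fmt == "table":
        return f"score | {result['score']}"
    raise ValueError(
        f"Unsupported format {fmt!r}; expected one of {', '.join(SUPPORTED_FORMATS)}"
    )


RESULT = {"score": 72, "risks": ["høy fellesgjeld"]}


def test_json_roundtrips():
    assert json.loads(render_shortlist(RESULT, "json")) == RESULT


def test_markdown_has_score_and_risk():
    text = render_shortlist(RESULT, "markdown")
    assert "72" in text and "høy fellesgjeld" in text


def test_unknown_format_lists_supported():
    try:
        render_shortlist(RESULT, "xml")
    except ValueError as exc:
        assert all(name in str(exc) for name in SUPPORTED_FORMATS)
    else:
        raise AssertionError("expected ValueError")
```

Under pytest the last test would usually be written with pytest.raises(ValueError, match=...); the try/except form is used here only to keep the sketch free of third-party imports.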

MCP (test_mcp_server)

  • test_mcp_server_has_correct_tools — all 14 finn_* tool names registered.
  • test_finn_decode_unit_vector_returns_json — happy path.
  • test_finn_analyze_search_handles_error — error envelope shape: {"error": True, "code": ..., "message": ...}.

Use the mcp SDK's testing helpers; don't spawn a subprocess.
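
The error envelope shape can be pinned down with a small assertion helper that several tool tests share. Both functions below are hypothetical, as is the assumption that tools return the envelope as a JSON string; the code value "upstream_error" is made up for the example:

```python
import json
from typing import Any, Dict


def error_envelope(code: str, message: str) -> Dict[str, Any]:
    """Build an error payload in the documented envelope shape."""
    return {"error": True, "code": code, "message": message}


def assert_error_envelope(payload: str) -> Dict[str, Any]:
    """Parse a tool's JSON output and check it matches the envelope shape."""
    data = json.loads(payload)
    assert data["error"] is True
    assert isinstance(data["code"], str) and data["code"]
    assert isinstance(data["message"], str) and data["message"]
    return data


def test_error_envelope_shape():
    payload = json.dumps(error_envelope("upstream_error", "FINN returned 503"))
    data = assert_error_envelope(payload)
    assert set(data) == {"error", "code", "message"}
```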

CLI (test_cli)

Use Typer's CliRunner:

from typer.testing import CliRunner
from finn_eiendom.cli import app

runner = CliRunner()

def test_cli_help():
    result = runner.invoke(app, ["--help"])
    assert result.exit_code == 0
    assert "analyze-search" in result.stdout

Patch service.<function> with monkeypatch so CLI tests don't exercise the full stack — that's covered by test_service.py.

Architecture (test_architecture)

Static checks of the module dependency graph:

  • No import httpx outside finn_eiendom/http.py.
  • No import sqlite3 outside finn_eiendom/cache.py.
  • No BeautifulSoup import outside search.py and ad.py.
  • No msgpack import outside eiendom_no.py.
  • mcp_server.py only imports from service, formatting, models, config, mcp, stdlib, pydantic.
  • cli.py only imports from service, formatting, models, config, typer, stdlib.
  • service.py does not import from mcp_server or cli.

Implementation: walk .py files under finn_eiendom/ with ast, collect imports, assert allowed sets per module.
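
One way the ast walk might look, as a sketch. It covers only the "no X outside Y" rules, ignores relative imports, and matches files by name; the helper names and the RESTRICTED table are assumptions (bs4 is the import name of BeautifulSoup):

```python
import ast
from pathlib import Path
from typing import Dict, List, Set, Tuple


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level module names imported by a piece of source code."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def find_violations(
    package_dir: Path, restricted: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """List (file, module) pairs where a restricted module is imported
    outside the files that are allowed to use it."""
    violations = []
    for path in sorted(package_dir.rglob("*.py")):
        imported = top_level_imports(path.read_text(encoding="utf-8"))
        for module, allowed_files in restricted.items():
            if module in imported and path.name not in allowed_files:
                violations.append((path.name, module))
    return violations


# Rules taken from the list above.
RESTRICTED = {
    "httpx": {"http.py"},
    "sqlite3": {"cache.py"},
    "bs4": {"search.py", "ad.py"},
    "msgpack": {"eiendom_no.py"},
}


def test_no_restricted_imports_outside_owner_modules():
    # Assumes pytest is run from the project root.
    assert find_violations(Path("finn_eiendom"), RESTRICTED) == []
```

The per-module allow-lists for mcp_server.py and cli.py could reuse top_level_imports and assert that the returned set is a subset of the allowed names.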

Best practices

  • One assertion per test (or per closely related group). Long tests fail in ways that are hard to diagnose.
  • Test names describe the behavior: test_get_or_fetch_ad_uses_cache_within_ttl.
  • Use monkeypatch for env vars and tmp_path for files. No os.environ mutation.
  • No time.sleep — use freezegun if a test depends on time, or refactor the code under test to take a now parameter.
  • No "smoke tests" that ping real servers — those go under a separately-marked pytest -m live suite and are not part of CI.

When uncertain about test tooling

Use context7 for pytest, respx, freezegun, or Typer testing:

context7:resolve-library-id   →  "pytest-dev/pytest" / "lundberg/respx"
context7:query-docs(id, "respx mock httpx async post")

See docs.instructions.md.