Files
finn-mcp/.github/instructions/clean-code.instructions.md
T
2026-05-16 06:54:17 +00:00

6.7 KiB

name, description, applyTo
name description applyTo
Clean code rules Best-practice standards for all production and test code **/*.py

Clean code rules

These rules apply everywhere — every module, every function, every test. They are intentionally opinionated. If a rule conflicts with the architecture rules in PRD.md §17, the architecture rules win. If it conflicts with another best practice here, pick the one that produces the simpler, more readable result.

Single responsibility

  • One job per function. If a function name needs "and" to describe it, it's two functions.
  • One job per module. parser.py parses. cache.py caches. formatting.py formats. Don't mix.
  • One job per class. We rarely need classes outside Pydantic models, dataclasses, and the HTTPClient. Avoid OO for OO's sake.

Function size

  • Aim for under 30 lines of body.
  • Past 50 lines it's a code smell — extract helpers.
  • If you've got more than 3 levels of nesting, the function wants splitting (extract the inner block into a helper named after what it does).

Naming

  • Names describe intent, not implementation. get_or_fetch_ad, not process_ad. render_shortlist_markdown, not format2.
  • Verbs for actions (fetch_, parse_, score_, render_).
  • Nouns for data (FinnAd, EiendomUnit, shortlist).
  • Boolean variables / parameters read as predicates: force_refresh, include_eiendom_no, is_recently_sold. Not flag, not do_thing.
  • Avoid abbreviations except those well-established in the domain (url, ad, nok, bra, sqm).
  • Norwegian terms stay Norwegian when they're domain vocabulary (hybel, fellesgjeld, finnkode). Don't translate finnkode to finn_code — it's a proper noun.

Type hints

Required on every function signature, including private helpers. Mypy in strict mode is the goal.

# ❌
def parse(html, base_url=None):
    ...

# ✅
def parse(html: str, base_url: str | None = None) -> FinnAd | None:
    ...

Use modern syntax: X | None over Optional[X], list[int] over List[int], dict[str, Any] over Dict[str, Any].

Comments

  • Comments explain WHY, never WHAT. The code already says what.
  • If a comment is needed to explain what a line does, the line wants renaming or extracting.
  • Use docstrings for public functions, classes, and modules. One-line summary, blank line, optional details and examples.
  • No commented-out code. Delete it. Git remembers.
  • No # TODO without a date or issue reference. # TODO(2026-05): replace once Eiendom.no confirms ... is fine.

DRY — Don't Repeat Yourself

If you write the same logic, regex, SQL, or format string twice, extract it. The decision table in PRD.md §17.2 tells you where it belongs.

The pre-merge anti-duplication checklist (from PRD.md §17.4):

  1. Is this logic already implemented somewhere? (grep the function name and obvious keywords.)
  2. If I'm copy-pasting from another file, am I about to duplicate behavior that should live in one shared function?
  3. Can a new caller use an existing service.py function instead of writing its own orchestration?
  4. Is the same Pydantic field defined in two models? Factor out a base model.
  5. Am I formatting output in two places (CLI + MCP)? Move it to formatting.py.
  6. Am I opening a SQLite connection outside cache.py? Move it.
  7. Am I building an httpx call outside http.py? Move it.
  8. Am I writing a Norwegian-number / area / finnkode regex outside parser.py? Move it.
  9. Am I adding an env-var lookup outside config.py? Move it.
  10. Did I add a new behavior with only one front end (MCP or CLI)? If it should exist in both, the service function is missing.

A small amount of duplication is acceptable to keep boundaries clean — see PRD.md §17.8. Past a handful of lines, extract.

Errors

  • Fail loudly with actionable messages.

    # ❌
    raise ValueError("bad input")
    
    # ✅
    raise ValueError(f"Unknown listing_status {status!r}; expected one of {VALID_LISTING_STATUSES}")
    
  • No silent failures. except Exception: pass is forbidden. Catch the specific exception, log it, and either recover or re-raise.

  • Service raises; MCP wraps. Service functions raise normal exceptions. The MCP tool boundary translates them into {"error": True, "code": ..., "message": ...}. CLI lets typer handle non-zero exits.

  • Graceful degradation is explicit. If Eiendom.no enrichment fails, return a result with eiendom_unit=None and a warning, not a silently-missing field.

State

  • No global mutable state. The only module-level constants allowed are configuration values loaded from env in config.py.
  • No module-level caches (dicts, lists) that mutate. Use cache.py if you need persistence.
  • Pass dependencies in (HTTP clients, DB connections) for testability.

Dead code

  • No commented-out code.
  • No unused imports (ruff catches these — fix them, don't add # noqa).
  • No unused parameters (use _ or remove).
  • No if False: blocks "for later".
  • Functions and classes that aren't called anywhere — delete them. Git keeps history.

Magic numbers and strings

Anything that influences behavior and isn't self-explanatory belongs in config.py (env-controlled) or as a named module-level constant near the top of the file.

# ❌
if days > 90:
    confidence = "low"

# ✅
COMPS_STALE_AFTER_DAYS = 90

if days > COMPS_STALE_AFTER_DAYS:
    confidence = "low"

URLs, timeouts, retries, TTLs, status codes — never inline.

Imports

  • Standard library first, third-party second, local last, separated by blank lines.
  • Ruff's I rules sort and group these — run ruff check . --fix.
  • No wildcard imports.
  • No relative imports above one level (from ..thing import x is a smell; refactor).
  • Each module's allowed import set is enforced by tests/test_architecture.py.

Tests are first-class code

Same rules. Same type hints. Same naming. Same DRY. If a fixture is used in three test files, it goes in conftest.py. If three tests share a setup, factor it into a fixture.

Reviewing your own change before commit

A 60-second self-review:

  1. Did I add a function that already exists somewhere? (grep it.)
  2. Did I bypass service.py, http.py, cache.py, or formatting.py?
  3. Is everything typed?
  4. Did I leave a print(), breakpoint(), or commented-out block behind?
  5. Does the test for this change actually fail without the change?
  6. Did I update PRD.md or the relevant instruction file if I changed an architectural rule?

When in doubt about a library API

Use the context7 MCP server instead of guessing. See docs.instructions.md. Training-data memory of pydantic.field_validator, typer.Option, mcp.tool annotations, or httpx.AsyncClient is unreliable — they all change between versions.