--- name: Clean code rules description: Best-practice standards for all production and test code applyTo: "**/*.py" --- # Clean code rules These rules apply everywhere — every module, every function, every test. They are intentionally opinionated. If a rule conflicts with the architecture rules in `PRD.md` §17, the architecture rules win. If it conflicts with another best practice here, pick the one that produces the simpler, more readable result. ## Single responsibility * One job per function. If a function name needs "and" to describe it, it's two functions. * One job per module. `parser.py` parses. `cache.py` caches. `formatting.py` formats. Don't mix. * One job per class. We rarely need classes outside Pydantic models, dataclasses, and the `HTTPClient`. Avoid OO for OO's sake. ## Function size * Aim for under **30 lines** of body. * Past **50 lines** it's a code smell — extract helpers. * If you've got more than **3 levels of nesting**, the function wants splitting (extract the inner block into a helper named after what it does). ## Naming * Names describe **intent**, not implementation. `get_or_fetch_ad`, not `process_ad`. `render_shortlist_markdown`, not `format2`. * Verbs for actions (`fetch_`, `parse_`, `score_`, `render_`). * Nouns for data (`FinnAd`, `EiendomUnit`, `shortlist`). * Boolean variables / parameters read as predicates: `force_refresh`, `include_eiendom_no`, `is_recently_sold`. Not `flag`, not `do_thing`. * Avoid abbreviations except those well-established in the domain (`url`, `ad`, `nok`, `bra`, `sqm`). * Norwegian terms stay Norwegian when they're domain vocabulary (`hybel`, `fellesgjeld`, `finnkode`). Don't translate `finnkode` to `finn_code` — it's a proper noun. ## Type hints Required on every function signature, including private helpers. Mypy in strict mode is the goal. ```python # ❌ def parse(html, base_url=None): ... # ✅ def parse(html: str, base_url: str | None = None) -> FinnAd | None: ... ``` Use modern syntax: `X | None` over `Optional[X]`, `list[int]` over `List[int]`, `dict[str, Any]` over `Dict[str, Any]`. ## Comments * Comments explain **WHY**, never **WHAT**. The code already says what. * If a comment is needed to explain *what* a line does, the line wants renaming or extracting. * Use docstrings for public functions, classes, and modules. One-line summary, blank line, optional details and examples. * No commented-out code. Delete it. Git remembers. * No `# TODO` without a date or issue reference. `# TODO(2026-05): replace once Eiendom.no confirms ...` is fine. ## DRY — Don't Repeat Yourself If you write the same logic, regex, SQL, or format string **twice**, extract it. The decision table in `PRD.md` §17.2 tells you where it belongs. The pre-merge anti-duplication checklist (from `PRD.md` §17.4): 1. Is this logic already implemented somewhere? (`grep` the function name and obvious keywords.) 2. If I'm copy-pasting from another file, am I about to duplicate behavior that should live in one shared function? 3. Can a new caller use an existing `service.py` function instead of writing its own orchestration? 4. Is the same Pydantic field defined in two models? Factor out a base model. 5. Am I formatting output in two places (CLI + MCP)? Move it to `formatting.py`. 6. Am I opening a SQLite connection outside `cache.py`? Move it. 7. Am I building an httpx call outside `http.py`? Move it. 8. Am I writing a Norwegian-number / area / finnkode regex outside `parser.py`? Move it. 9. Am I adding an env-var lookup outside `config.py`? Move it. 10. Did I add a new behavior with only one front end (MCP or CLI)? If it should exist in both, the service function is missing. A small amount of duplication is acceptable to keep boundaries clean — see `PRD.md` §17.8. Past a handful of lines, extract. ## Errors * **Fail loudly** with actionable messages. ```python # ❌ raise ValueError("bad input") # ✅ raise ValueError(f"Unknown listing_status {status!r}; expected one of {VALID_LISTING_STATUSES}") ``` * **No silent failures.** `except Exception: pass` is forbidden. Catch the specific exception, log it, and either recover or re-raise. * **Service raises; MCP wraps.** Service functions raise normal exceptions. The MCP tool boundary translates them into `{"error": True, "code": ..., "message": ...}`. CLI lets typer handle non-zero exits. * **Graceful degradation is explicit.** If Eiendom.no enrichment fails, return a result with `eiendom_unit=None` and a warning, not a silently-missing field. ## State * No global mutable state. The only module-level constants allowed are configuration values loaded from env in `config.py`. * No module-level caches (dicts, lists) that mutate. Use `cache.py` if you need persistence. * Pass dependencies in (HTTP clients, DB connections) for testability. ## Dead code * No commented-out code. * No unused imports (ruff catches these — fix them, don't add `# noqa`). * No unused parameters (use `_` or remove). * No `if False:` blocks "for later". * Functions and classes that aren't called anywhere — delete them. Git keeps history. ## Magic numbers and strings Anything that influences behavior and isn't self-explanatory belongs in `config.py` (env-controlled) or as a named module-level constant near the top of the file. ```python # ❌ if days > 90: confidence = "low" # ✅ COMPS_STALE_AFTER_DAYS = 90 if days > COMPS_STALE_AFTER_DAYS: confidence = "low" ``` URLs, timeouts, retries, TTLs, status codes — never inline. ## Imports * Standard library first, third-party second, local last, separated by blank lines. * Ruff's `I` rules sort and group these — run `ruff check . --fix`. * No wildcard imports. * No relative imports above one level (`from ..thing import x` is a smell; refactor). * Each module's allowed import set is enforced by `tests/test_architecture.py`. ## Tests are first-class code Same rules. Same type hints. Same naming. Same DRY. If a fixture is used in three test files, it goes in `conftest.py`. If three tests share a setup, factor it into a fixture. ## Reviewing your own change before commit A 60-second self-review: 1. Did I add a function that already exists somewhere? (`grep` it.) 2. Did I bypass `service.py`, `http.py`, `cache.py`, or `formatting.py`? 3. Is everything typed? 4. Did I leave a `print()`, `breakpoint()`, or commented-out block behind? 5. Does the test for this change actually fail without the change? 6. Did I update `PRD.md` or the relevant instruction file if I changed an architectural rule? ## When in doubt about a library API Use the `context7` MCP server instead of guessing. See `docs.instructions.md`. Training-data memory of `pydantic.field_validator`, `typer.Option`, `mcp.tool` annotations, or `httpx.AsyncClient` is unreliable — they all change between versions.