150 lines
6.7 KiB
Markdown
150 lines
6.7 KiB
Markdown
---
|
|
name: Clean code rules
|
|
description: Best-practice standards for all production and test code
|
|
applyTo: "**/*.py"
|
|
---
|
|
|
|
# Clean code rules
|
|
|
|
These rules apply everywhere — every module, every function, every test. They are intentionally opinionated. If a rule conflicts with the architecture rules in `PRD.md` §17, the architecture rules win. If it conflicts with another best practice here, pick the one that produces the simpler, more readable result.
|
|
|
|
## Single responsibility
|
|
|
|
* One job per function. If a function name needs "and" to describe it, it's two functions.
|
|
* One job per module. `parser.py` parses. `cache.py` caches. `formatting.py` formats. Don't mix.
|
|
* One job per class. We rarely need classes outside Pydantic models, dataclasses, and the `HTTPClient`. Avoid OO for OO's sake.
|
|
|
|
## Function size
|
|
|
|
* Aim for under **30 lines** of body.
|
|
* Past **50 lines** it's a code smell — extract helpers.
|
|
* If you've got more than **3 levels of nesting**, the function wants splitting (extract the inner block into a helper named after what it does).
|
|
|
|
## Naming
|
|
|
|
* Names describe **intent**, not implementation. `get_or_fetch_ad`, not `process_ad`. `render_shortlist_markdown`, not `format2`.
|
|
* Verbs for actions (`fetch_`, `parse_`, `score_`, `render_`).
|
|
* Nouns for data (`FinnAd`, `EiendomUnit`, `shortlist`).
|
|
* Boolean variables / parameters read as predicates: `force_refresh`, `include_eiendom_no`, `is_recently_sold`. Not `flag`, not `do_thing`.
|
|
* Avoid abbreviations except those well-established in the domain (`url`, `ad`, `nok`, `bra`, `sqm`).
|
|
* Norwegian terms stay Norwegian when they're domain vocabulary (`hybel`, `fellesgjeld`, `finnkode`). Don't translate `finnkode` to `finn_code` — it's a proper noun.
|
|
|
|
## Type hints
|
|
|
|
Required on every function signature, including private helpers. Mypy in strict mode is the goal.
|
|
|
|
```python
|
|
# ❌
|
|
def parse(html, base_url=None):
|
|
...
|
|
|
|
# ✅
|
|
def parse(html: str, base_url: str | None = None) -> FinnAd | None:
|
|
...
|
|
```
|
|
|
|
Use modern syntax: `X | None` over `Optional[X]`, `list[int]` over `List[int]`, `dict[str, Any]` over `Dict[str, Any]`.
|
|
|
|
## Comments
|
|
|
|
* Comments explain **WHY**, never **WHAT**. The code already says what.
|
|
* If a comment is needed to explain *what* a line does, the line wants renaming or extracting.
|
|
* Use docstrings for public functions, classes, and modules. One-line summary, blank line, optional details and examples.
|
|
* No commented-out code. Delete it. Git remembers.
|
|
* No `# TODO` without a date or issue reference. `# TODO(2026-05): replace once Eiendom.no confirms ...` is fine.
|
|
|
|
## DRY — Don't Repeat Yourself
|
|
|
|
If you write the same logic, regex, SQL, or format string **twice**, extract it. The decision table in `PRD.md` §17.2 tells you where it belongs.
|
|
|
|
The pre-merge anti-duplication checklist (from `PRD.md` §17.4):
|
|
|
|
1. Is this logic already implemented somewhere? (`grep` the function name and obvious keywords.)
|
|
2. If I'm copy-pasting from another file, am I about to duplicate behavior that should live in one shared function?
|
|
3. Can a new caller use an existing `service.py` function instead of writing its own orchestration?
|
|
4. Is the same Pydantic field defined in two models? Factor out a base model.
|
|
5. Am I formatting output in two places (CLI + MCP)? Move it to `formatting.py`.
|
|
6. Am I opening a SQLite connection outside `cache.py`? Move it.
|
|
7. Am I building an httpx call outside `http.py`? Move it.
|
|
8. Am I writing a Norwegian-number / area / finnkode regex outside `parser.py`? Move it.
|
|
9. Am I adding an env-var lookup outside `config.py`? Move it.
|
|
10. Did I add a new behavior with only one front end (MCP or CLI)? If it should exist in both, the service function is missing.
|
|
|
|
A small amount of duplication is acceptable to keep boundaries clean — see `PRD.md` §17.8. Past a handful of lines, extract.
|
|
|
|
## Errors
|
|
|
|
* **Fail loudly** with actionable messages.
|
|
|
|
```python
|
|
# ❌
|
|
raise ValueError("bad input")
|
|
|
|
# ✅
|
|
raise ValueError(f"Unknown listing_status {status!r}; expected one of {VALID_LISTING_STATUSES}")
|
|
```
|
|
|
|
* **No silent failures.** `except Exception: pass` is forbidden. Catch the specific exception, log it, and either recover or re-raise.
|
|
|
|
* **Service raises; MCP wraps.** Service functions raise normal exceptions. The MCP tool boundary translates them into `{"error": True, "code": ..., "message": ...}`. CLI lets typer handle non-zero exits.
|
|
|
|
* **Graceful degradation is explicit.** If Eiendom.no enrichment fails, return a result with `eiendom_unit=None` and a warning, not a silently-missing field.
|
|
|
|
## State
|
|
|
|
* No global mutable state. The only module-level constants allowed are configuration values loaded from env in `config.py`.
|
|
* No module-level caches (dicts, lists) that mutate. Use `cache.py` if you need persistence.
|
|
* Pass dependencies in (HTTP clients, DB connections) for testability.
|
|
|
|
## Dead code
|
|
|
|
* No commented-out code.
|
|
* No unused imports (ruff catches these — fix them, don't add `# noqa`).
|
|
* No unused parameters (use `_` or remove).
|
|
* No `if False:` blocks "for later".
|
|
* Functions and classes that aren't called anywhere — delete them. Git keeps history.
|
|
|
|
## Magic numbers and strings
|
|
|
|
Anything that influences behavior and isn't self-explanatory belongs in `config.py` (env-controlled) or as a named module-level constant near the top of the file.
|
|
|
|
```python
|
|
# ❌
|
|
if days > 90:
|
|
confidence = "low"
|
|
|
|
# ✅
|
|
COMPS_STALE_AFTER_DAYS = 90
|
|
|
|
if days > COMPS_STALE_AFTER_DAYS:
|
|
confidence = "low"
|
|
```
|
|
|
|
URLs, timeouts, retries, TTLs, status codes — never inline.
|
|
|
|
## Imports
|
|
|
|
* Standard library first, third-party second, local last, separated by blank lines.
|
|
* Ruff's `I` rules sort and group these — run `ruff check . --fix`.
|
|
* No wildcard imports.
|
|
* No relative imports above one level (`from ..thing import x` is a smell; refactor).
|
|
* Each module's allowed import set is enforced by `tests/test_architecture.py`.
|
|
|
|
## Tests are first-class code
|
|
|
|
Same rules. Same type hints. Same naming. Same DRY. If a fixture is used in three test files, it goes in `conftest.py`. If three tests share a setup, factor it into a fixture.
|
|
|
|
## Reviewing your own change before commit
|
|
|
|
A 60-second self-review:
|
|
|
|
1. Did I add a function that already exists somewhere? (`grep` it.)
|
|
2. Did I bypass `service.py`, `http.py`, `cache.py`, or `formatting.py`?
|
|
3. Is everything typed?
|
|
4. Did I leave a `print()`, `breakpoint()`, or commented-out block behind?
|
|
5. Does the test for this change actually fail without the change?
|
|
6. Did I update `PRD.md` or the relevant instruction file if I changed an architectural rule?
|
|
|
|
## When in doubt about a library API
|
|
|
|
Use the `context7` MCP server instead of guessing. See `docs.instructions.md`. Training-data memory of `pydantic.field_validator`, `typer.Option`, `mcp.tool` annotations, or `httpx.AsyncClient` is unreliable — they all change between versions. |