Contributing
TRACE is a research library. Contributions are welcome — especially new language packs, fixture corpora, and apparatus-format conversions. This page describes the development workflow.
Branch model
TRACE follows a main / develop / feature-branch model:
| Branch | Role |
|---|---|
| main | Stable releases. Updated only via merge from … |
| develop | Integration branch with the latest development work. |
| … | Branched off … |
| … | Short-lived branch off … |
Pull requests target develop, not main. The PR description should summarise what changed and link to the relevant spec / plan / issue.
Quality gates
Every commit must pass both checks before merge:
flake8: flake8 src/ tests/ returns no output (PEP 8 compliant).
pytest: the full suite passes (pytest -q).
The GitHub Actions CI runs both on every push and pull request across Python 3.10, 3.11, and 3.12.
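To run both gates locally before pushing, a small wrapper can help. The following is a minimal sketch, not part of TRACE: the names GATES and run_gates are hypothetical, and it assumes flake8 and pytest are installed in the active environment and that it is run from the repository root.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Sequence

# The two gate commands from this page, invoked through the current
# interpreter so they use the active virtual environment.
GATES: List[List[str]] = [
    [sys.executable, "-m", "flake8", "src/", "tests/"],
    [sys.executable, "-m", "pytest", "-q"],
]


def run_gates(run: Optional[Callable[[Sequence[str]], int]] = None) -> bool:
    """Run each gate in order and stop at the first failure.

    `run` maps a command to its exit code; it is injectable so the
    helper can be exercised without spawning real processes.
    """
    if run is None:
        def run(cmd: Sequence[str]) -> int:
            return subprocess.run(cmd).returncode
    for cmd in GATES:
        if run(cmd) != 0:
            print("gate failed:", " ".join(cmd))
            return False
    return True
```

Calling run_gates() with no argument runs the real commands; a True result means both gates passed.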
TDD discipline
For new features and bug fixes:
Red — write a failing test that pins the desired behaviour.
Verify red — run the test, watch it fail with a meaningful error.
Green — write the minimal production code that makes the test pass.
Verify green — run the test, watch it pass; run the full suite, confirm nothing regressed.
Refactor — clean up while keeping the tests green.
If you’re tempted to write the test after the implementation, write the test first instead. Tests-after answer “what does this do?”; tests-first answer “what should this do?”
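As an illustration of the cycle above, here is a minimal sketch using a hypothetical helper, collapse_whitespace, which is not part of TRACE. The test is written first; before the function exists, running it fails with a NameError, which is the meaningful red.

```python
# Red: the test comes first and pins the desired behaviour.
def test_collapse_whitespace_joins_runs_of_spaces():
    assert collapse_whitespace("a  b \t c") == "a b c"


# Green: the minimal production code that makes the test pass.
def collapse_whitespace(text: str) -> str:
    # str.split() with no argument splits on any run of whitespace
    # and drops leading/trailing whitespace.
    return " ".join(text.split())
```

With both definitions in place, pytest -q should report the test as passing; refactoring can then proceed with the test as a safety net.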
Commit conventions
Conventional-commit prefixes are encouraged but not enforced:
| Prefix | Use for |
|---|---|
| feat | New feature |
| fix | Bug fix |
| docs | Documentation only |
| test | New tests, no production change |
| refactor | Internal restructure with no behaviour change |
| chore | Build / tooling / dependency / release prep |
| style | Code formatting / whitespace |
| ci | CI configuration |
Commit messages should explain why in addition to what — the diff already shows the what.
No AI / assistant attribution in commit messages, PR descriptions, or any shipping artefact. Plain factual summaries only.
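Since the prefixes are not enforced, a contributor who wants a local sanity check could use something like the sketch below. It is hypothetical, not a TRACE tool: it assumes the standard conventional-commit types (feat, fix, docs, test, refactor, chore, style, ci) and the usual "prefix: summary" or "prefix(scope): summary" subject shape.

```python
import re
from typing import Optional

# Assumed prefix set: the common conventional-commit types.
PREFIXES = ("feat", "fix", "docs", "test", "refactor", "chore", "style", "ci")

# Matches "prefix: summary" or "prefix(scope): summary".
_SUBJECT = re.compile(r"^(?P<prefix>[a-z]+)(\([^)]+\))?: \S")


def commit_prefix(subject: str) -> Optional[str]:
    """Return the recognised prefix of a commit subject, or None."""
    match = _SUBJECT.match(subject)
    if match and match.group("prefix") in PREFIXES:
        return match.group("prefix")
    return None
```

For example, commit_prefix("fix: handle empty witness list") returns "fix", while a subject without a recognised prefix returns None.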
Adding a new language pack
Create src/tracealign/lang/<code>/ with __init__.py, pack.py, and your normaliser / tokenizer hooks / scoring tiers.
Subclass LanguagePack. Set code, aliases, version. Override post_tokenize (optional), normalize (required), scoring_tiers (required).
Register at import time: register_language(MyLanguagePack()) in your __init__.py.
Update src/tracealign/__init__.py’s _BUILTIN_PACK_MODULES tuple so the test registry-reset helper can reload your pack.
Add tests under tests/lang/<code>/. The Hebrew pack tests are a useful template.
Adding to the Hebrew lexicon
The seed lexicon lives in src/tracealign/lang/hebrew/data/:
abbreviations.json: { "abbrev_form": ["expansion_1", "expansion_2", ...] }
plene_defective.json: [["plene_form", "defective_form"], ...]
Additions to the seed lexicon should be limited to entries that genuinely belong in a core Hebrew pack — well-known rabbinic abbreviations, broadly attested plene/defective pairs. Project-specific entries should live in the user’s own Lexica.merge() call, not in core.
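Before proposing additions, it can be useful to confirm that an edited file still has the expected shape. The sketch below checks only the two JSON shapes described above, using the placeholder values from this page; the function names are hypothetical and are not part of tracealign.

```python
import json


def check_abbreviations(data) -> bool:
    """True if data maps each abbreviation string to a list of strings."""
    return isinstance(data, dict) and all(
        isinstance(key, str)
        and isinstance(expansions, list)
        and all(isinstance(item, str) for item in expansions)
        for key, expansions in data.items()
    )


def check_plene_defective(data) -> bool:
    """True if data is a list of two-string [plene, defective] pairs."""
    return isinstance(data, list) and all(
        isinstance(pair, list)
        and len(pair) == 2
        and all(isinstance(form, str) for form in pair)
        for pair in data
    )


# Placeholder content in the documented shapes.
abbreviations = json.loads('{"abbrev_form": ["expansion_1", "expansion_2"]}')
plene_defective = json.loads('[["plene_form", "defective_form"]]')
```

In practice the two json.loads calls would be replaced by reading the files under src/tracealign/lang/hebrew/data/.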
Working with the design documents
The full design spec lives at docs/superpowers/specs/2026-04-28-trace-v0.1-design.md, and the v0.1 task breakdown at docs/superpowers/plans/2026-04-28-trace-v0.1.md. Both are kept in sync with the code; if the code's behaviour and the spec disagree, file an issue first.
The roadmap (docs/ROADMAP.md) tracks which sub-projects are done, in progress, and queued.
Reporting bugs
Open an issue at github.com/bsesic/trace/issues. A useful bug report includes:
The TRACE version (python -c "import tracealign; print(tracealign.__version__)").
A minimal code snippet that reproduces the behaviour.
What you expected vs what happened.
The language pack (if applicable) and any custom lexicon.
For alignment-correctness issues, the JSON dump of the offending AlignmentResult is the most useful single artefact.
Asking questions
Use GitHub Discussions for design questions, requests for new language packs, and feedback on the abbreviation lexicon. Issues are reserved for things that need a code change.