Onboarding

Starting point for new contributors. This page gets you to a green build and shows you where everything lives. For a guided walkthrough with explanations, follow the Start here tutorial; for a command-by-command validation run, see the quickstart.

Reach green

GitHub Codespace (primary): on the repository page, Code → Codespaces → Create codespace. Provisioning runs make bootstrap for you (installs the toolchain on Python 3.9.4 and generates the models). When the terminal is ready:

make verify     # lint + tests — confirm the environment is green
make pipeline   # produce build/acoustic_dataset.xml (schema-valid, round-trip-equal)

Local fallback: install Python 3.9.x and make, run make bootstrap yourself, then run the same make verify and make pipeline. Both paths reach the same green state (ADR 0006). Run make help to see every target.
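Before running make bootstrap locally, it can help to confirm the two prerequisites named above. The following is a minimal sketch, not part of the repository: the function name preflight and its messages are hypothetical, and it checks only the Python minor version and whether make is on PATH.

```python
import shutil
import sys


def preflight(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems that would block a local `make bootstrap`.

    Hypothetical helper for illustration; the repository may ship its own check.
    """
    problems = []
    # The page asks for Python 3.9.x, so compare only major and minor.
    if tuple(version_info[:2]) != (3, 9):
        problems.append(
            f"Python 3.9.x required, found {version_info[0]}.{version_info[1]}"
        )
    # Every target on this page is invoked through make.
    if which("make") is None:
        problems.append("make not found on PATH")
    return problems


# Report any problems for the interpreter running this script.
for issue in preflight():
    print(f"preflight: {issue}")
```

An empty result means the local machine meets the stated prerequisites; it says nothing about whether make bootstrap itself will succeed.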

Where things live

You're looking for…                                     It's here
The contract (the XSD)                                  schema/acoustic_dataset.xsd (how to change it)
The scientific seams (named, testable calc functions)   src/acoustic_dataset/acoustics/
The one place the schema object is built                src/acoustic_dataset/build.py
Generated models (never hand-edited; regenerate)        src/acoustic_dataset/models/
Example calculation input                               examples/calculation_input.json
Tests (unit / integration / golden)                     tests/
The plan & design artifacts                             specs/001-codespace-xml-scaffold/ (spec.md, plan.md, tasks.md)
The generated schema reference + ERD                    reference/schema (run make gen-schema-docs)
Why each choice was made                                docs/decisions/ (ADRs, kept in the repo)

Build the mental model

Read these, in order — they're short:

  1. Schema as the contract — the idea everything follows from.
  2. Typed data, end to end — how you write Python here: start from a structured set of parameters and keep the data typed all the way to XML.
  3. The two verification gates — why schema-valid isn't the same as correct.
  4. Pipeline data flow — how input becomes validated XML.
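As a rough illustration of items 2 and 3, the sketch below keeps a value typed from input to XML and back, then compares the result with the original. It is an assumption-laden toy, not the project's code: the Measurement class and the element names are hypothetical (the real models are generated from the XSD), and it uses only the standard library, so it does not perform XSD validation.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Measurement:
    # Hypothetical typed record standing in for a generated model.
    frequency_hz: float
    level_db: float


def to_xml(m: Measurement) -> str:
    """Serialize the typed record to an XML string."""
    root = ET.Element("measurement")
    # repr() of a float parses back to the same float in Python 3.
    ET.SubElement(root, "frequency_hz").text = repr(m.frequency_hz)
    ET.SubElement(root, "level_db").text = repr(m.level_db)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Measurement:
    """Parse the XML string back into the typed record."""
    root = ET.fromstring(text)
    return Measurement(
        frequency_hz=float(root.findtext("frequency_hz")),
        level_db=float(root.findtext("level_db")),
    )


original = Measurement(frequency_hz=1000.0, level_db=-3.5)
xml_text = to_xml(original)

# Round-trip check: what comes back must equal what went in.
round_trip_equal = from_xml(xml_text) == original

# Same structure, wrong value: the document still parses cleanly,
# but it no longer matches the typed input.
tampered = xml_text.replace("-3.5", "3.5")
tampered_equal = from_xml(tampered) == original

print(round_trip_equal, tampered_equal)
```

The tampered document has exactly the same shape as the good one, which is the point of item 3: a structural check alone would accept both, so a separate comparison against the typed data is needed to catch the wrong value.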

What's done, what's next

The full Phase 1 pipeline is in place — environment, end-to-end pipeline, migration-safety compare, the generated schema reference/ERD, and the distribution bundle with a CI drift gate. Remaining polish is tracked in specs/001-codespace-xml-scaffold/tasks.md.