Scores you can read.
Measuring how someone builds with AI is a brand-new field — so the scoring here isn't a black box. Every number is plain arithmetic over counted signals, the bands are research-anchored, and the whole contract is public, versioned, and yours to shape.
What it measures
Each dimension is scored from counted signals in your own sessions and git history, against research-anchored bands — never a curve fit to one machine.
Signal Clarity
How precisely you direct the AI — prompt specificity, and how few iterations it takes to reach a usable result.
Build Stability
Whether AI-assisted code survives — churn, revert rate, and post-edit stability over time.
Decision Weight
The weight and durability of the technical decisions you make with the AI in the loop.
Recovery Velocity
How fast and how systematically you recover when the AI is wrong.
Context Command
How well you carry context across tools and sessions instead of starting cold each time.
Orchestration Range
How many tools, models, and agents you coordinate effectively — measured only when present.
Dimensions combine into nine archetypes and twelve crafts (plus the AI Explorer baseline everyone holds), placed on a build-domain × leverage map.
Arithmetic, not vibes
Plain arithmetic
Every score is arithmetic over signals counted from your local data, mapped against research-anchored bands. The formulas are in scoring.py and published in full.
No model writes a score
A model only ever writes the narrative of your own profile and never assigns a number. Numbers are the engine's; words are opt-in.
Insufficient, never estimated
Anything that can't be measured from your data is marked insufficient and excluded — your profile shows a completeness indicator instead of a fabricated number.
No ranking, ever
No percentiles, cohorts, leaderboards, or "top X%". Positioning shows how you build. Different builder kinds are crafts, not rungs.
Calibrated for developers and AI engineers — people who build software with AI — and not yet for other kinds of AI builders. The methodology is versioned: any change ships as a transparent version bump.
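To make the principles above concrete, here is a minimal sketch of what "plain arithmetic over counted signals, mapped against bands" could look like, including the "insufficient, never estimated" rule and a completeness indicator. It is not the contents of scoring.py. The band thresholds, the minimum sample size, and the names EXAMPLE_BANDS, MIN_SAMPLES, DimensionResult, score_ratio, and completeness are all hypothetical and chosen only for illustration; the real formulas and bands are the ones published in the methodology.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Hypothetical bands: (lower bound of the ratio, points awarded), highest first.
# Placeholder values only; the real bands are defined in the published methodology.
EXAMPLE_BANDS: List[Tuple[float, int]] = [(0.90, 100), (0.75, 75), (0.50, 50), (0.0, 25)]

# Hypothetical floor: below this many counted events a dimension is not scored.
MIN_SAMPLES = 20


@dataclass
class DimensionResult:
    name: str
    score: Optional[int]  # None means "insufficient"; it is never replaced by a guess


def score_ratio(
    name: str,
    kept: int,
    total: int,
    bands: Sequence[Tuple[float, int]] = EXAMPLE_BANDS,
    min_samples: int = MIN_SAMPLES,
) -> DimensionResult:
    """Score one dimension from two counts, e.g. edits that survived vs. all edits."""
    if total < min_samples:
        # Too little data: mark insufficient instead of estimating a number.
        return DimensionResult(name, None)
    ratio = kept / total
    for lower, points in bands:
        if ratio >= lower:
            return DimensionResult(name, points)
    return DimensionResult(name, bands[-1][1])


def completeness(results: Sequence[DimensionResult]) -> float:
    """Fraction of dimensions that had enough data to be scored."""
    if not results:
        return 0.0
    measured = sum(1 for r in results if r.score is not None)
    return measured / len(results)


# Illustrative counts, not real data.
results = [
    score_ratio("Build Stability", kept=164, total=200),
    score_ratio("Orchestration Range", kept=3, total=4),
]
for r in results:
    print(r.name, "insufficient" if r.score is None else r.score)
print("completeness:", completeness(results))
```

In this sketch a dimension with too few counted events is excluded rather than estimated, and the completeness value reports how much of the profile was actually measured. No step involves a model or a comparison against other people, only counts from local data and fixed thresholds.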
The community co-owns it
As we collectively learn what "good" looks like, the methodology should learn with us. Three ways to shape it:
Bring a case or a counterexample to Discussions → Methodology.
Open a PR against SCORING-METHODOLOGY.md with evidence or a good example. Accepted changes ship as a transparent methodology-version bump.
Push on the open questions. They are the calibrations we're least sure of, so they are the best place to start.
Nothing is hidden
The full methodology is in the repo, and the tool renders it as an interactive explorer locally — the weights, the research behind each dimension, and how the weighting adapts to how you build.
Run python3 -m nextmillionai report and open /methodology for the interactive explorer, served from your own machine.
Have a better calibration? The methodology improves when builders bring evidence.