Design that ships as a PR

How I used a machine readable design system and daily AI routines as a founding design engineer to catch bugs, take execution off engineering and lift PR throughput.

Luiz Ottino · 6 min read
Design engineering · Fintech · AI · Frontend

Across recent projects, I held one rule as a design engineer. Design should arrive as a pull request, not a Figma file that someone else has to rebuild.

The most demanding of them was in cross border payments. The interfaces are full of states most people never see. KYC checks, FX quotes that expire, pending settlement, failed transfers, retries. That is exactly where visual bugs hide and where engineering time quietly disappears, because every one of those states is a screen someone has to design, build and keep consistent.

The traditional loop made this worse. Design produces a file, engineering rebuilds it in code, something drifts, QA finds it late, and a payments screen ships with a state nobody handled. So I changed the input. I stopped treating design as a picture and started treating it as code the rest of the system could read.

Design should arrive as a pull request, not a file to rebuild.

A design system the machine can read

The design system lived in markdown and was versioned like code. Tokens, component variants, accessibility rules and usage constraints, all legible to Claude Code and Cursor rather than locked inside a design tool. That single decision is what made everything after it possible. When the system is context, Claude Code and Cursor can generate a component that already matches it, and I can open the pull request myself.

design-system · machine readable context
tokens.md        colour, spacing, radius, type scale
components.md    variants, states, a11y rules per component
patterns.md      payment flows, error and empty states
constraints.md   what never to do, brand and compliance limits
Figure: Design to pull request pipeline. The old loop (handoff): Figma file → engineering rebuilds → drift & bugs → QA finds it late. The loop I built (context in, merge out): design system as markdown context → design engineer + agent (Claude Code and Cursor) → design-complete PR, ready to merge → merged.
[fig.01] The handoff loop collapses. Design enters as context and leaves as a mergeable PR.
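
To make "machine readable" concrete, here is a minimal sketch of turning a token file into something a script or an agent can query. The line format (`- name: value`), the token names and the `parseTokens` helper are all assumptions for illustration. The article does not show what tokens.md actually looked like.

```typescript
// Hypothetical excerpt of tokens.md. The real file's format is not shown in the article.
const tokensMd = `
## colour
- colour.brand: #0f5132
- colour.surface: #ffffff

## spacing
- space.sm: 8px
- space.md: 16px
`;

type TokenTable = Record<string, string>;

// Read every "- name: value" line into a name -> value table.
// Headings and blank lines are skipped.
function parseTokens(markdown: string): TokenTable {
  const tokens: TokenTable = {};
  for (const raw of markdown.split("\n")) {
    const line = raw.trim();
    if (line.indexOf("- ") !== 0) continue; // not a list item
    const sep = line.indexOf(":");
    if (sep === -1) continue; // list item without a value
    const name = line.slice(2, sep).trim();
    const value = line.slice(sep + 1).trim();
    if (name && value) tokens[name] = value;
  }
  return tokens;
}

const tokens = parseTokens(tokensMd);
console.log(tokens["space.md"]); // prints "16px"
```

The point is not the parser. Once tokens sit in plain text under version control, any tool can read them, and a change to them goes through review like any other diff.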

A daily bug hunt, before QA

The first routine ran every morning. Claude Code and Cursor swept the open pull requests and the core payment flows and produced a short triage list. Not a vague summary, a fixed output: what looks broken, where, and against which rule.
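
A fixed output is easier to picture as a type. The sketch below is a deterministic stand-in for one slice of that routine, not the routine itself, which ran through Claude Code and Cursor. The `Finding` shape, the `allowedColours` list and the `findTokenDrift` helper are hypothetical names. It flags hex colours that do not match a known token value.

```typescript
// One finding in the triage list: what looks broken, where, and against which rule.
type Finding = {
  kind: "regression" | "token-drift" | "missing-state" | "a11y-gap";
  file: string;
  line: number;
  detail: string;
};

// Hypothetical colour values allowed by the design system.
const allowedColours = ["#0f5132", "#ffffff"];

// Flag every six-digit hex colour literal that is not a known token value.
function findTokenDrift(file: string, source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    const hexes = text.match(/#[0-9a-fA-F]{6}\b/g) || [];
    for (const hex of hexes) {
      if (allowedColours.indexOf(hex.toLowerCase()) === -1) {
        findings.push({
          kind: "token-drift",
          file,
          line: index + 1, // report 1-based line numbers
          detail: `hard-coded ${hex}`,
        });
      }
    }
  });
  return findings;
}

// Hypothetical component source with one off-system colour on line 2.
const sample = [
  "const Button = () => (",
  '  <button style={{ background: "#1b6b45", color: "#ffffff" }}>Send</button>',
  ");",
].join("\n");

console.log(findTokenDrift("component.tsx", sample));
```

Keeping the shape identical every day is what makes the list scannable. Each entry can become a review comment without rewriting.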

Because the design system was the reference, Claude Code and Cursor could judge against something real rather than guess. I was catching design bugs before they reached QA, and often before engineering had looked at the PR at all. Caught early, a bug is a one line review comment. Caught in production on a payments flow, it is an incident.

Figure: Daily bug hunt. Sources (open PRs, payment flows, design system) → routine (sweep, judge vs rules; same shape, every day) → fixed output, triage: regression (component.tsx : 42), token drift (hard-coded #1b6b45), missing state (no pending / retry), a11y gap (contrast 2.9 : 1).
[fig.02] The bug hunt reads PRs and flows, judges against the system, returns a fixed triage list.
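
One accessibility rule a routine like this can judge mechanically is colour contrast. The sketch below uses the WCAG 2.x relative luminance and contrast ratio formulas. The 4.5 : 1 threshold is the WCAG AA minimum for normal-size text. Whether the design system in question used that exact threshold is an assumption, and the function names are illustrative.

```typescript
// Linearise one sRGB channel of a "#rrggbb" colour, per the WCAG 2.x definition.
function channel(hex: string, start: number): number {
  const c = parseInt(hex.slice(start, start + 2), 16) / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearised red, green and blue channels.
function luminance(hex: string): number {
  return 0.2126 * channel(hex, 1) + 0.7152 * channel(hex, 3) + 0.0722 * channel(hex, 5);
}

// Contrast ratio between two colours, from 1 (identical) to 21 (black on white).
function contrastRatio(a: string, b: string): number {
  const la = luminance(a);
  const lb = luminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// Assumed threshold: WCAG AA for normal-size text.
function passesAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}

console.log(contrastRatio("#000000", "#ffffff").toFixed(1)); // prints "21.0"
console.log(passesAA("#000000", "#ffffff")); // prints true
```

A check like this only works when the colour values are readable outside the design tool, which is why the token file comes first.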

Taking execution off engineering

The second routine was the design to PR pipeline itself. Because the system was context and not a handoff, UI work shipped as design complete pull requests. The component was built, matched the tokens, handled its states and passed the accessibility rules before an engineer opened it.

That changed where engineering time went. Instead of rebuilding interfaces and chasing visual drift, engineers stayed on the work that genuinely needed them. Payments logic, FX handling, settlement, reliability. I was not adding to their queue, I was removing the part of it that did not need an engineer in the first place.

Figure: Engineering time and pull request throughput. Left, where engineering time went: before, mostly UI rebuild; after, mostly core payments logic (larger block = where the week went). Right, merged PRs per week: the after bar is taller than the before bar. Qualitative and illustrative, with no measured values.
[fig.03] Illustrative. Relative block sizes only, not measured figures.

More shippable work in the pipeline

The throughput followed naturally. When the handoff loop collapses, the designer is no longer downstream waiting for someone to build the work. I was opening pull requests directly, every day, as part of the cadence. More design landed as code, faster, with fewer trips back and forth.

In payments this matters more than in most products, because the cost of a missed state is not a cosmetic glitch, it is a transfer that fails silently or a screen that traps a user mid flow. Enumerating those states up front, as part of the routine, meant fewer of them slipped through.

Figure: Payment state map. idle → KYC → FX quote → pending → success on the happy path; failed → retry → back to pending.
[fig.04] In payments, a missed state is an incident, not a glitch.
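
Enumerating states up front can be as plain as a typed table. This is a minimal sketch, assuming only the seven states and the edges named in the state map. A real flow will have more edges, such as an FX quote expiring, and the `canTransition` and `missingStates` helpers are hypothetical.

```typescript
type PaymentState =
  | "idle"
  | "kyc"
  | "fxQuote"
  | "pending"
  | "failed"
  | "retry"
  | "success";

// Allowed transitions, read off the state map. Record<PaymentState, ...> makes the
// compiler reject this table if a state is ever added to the type and left out here.
const transitions: Record<PaymentState, PaymentState[]> = {
  idle: ["kyc"],
  kyc: ["fxQuote"],
  fxQuote: ["pending"],
  pending: ["success", "failed"],
  failed: ["retry"],
  retry: ["pending"],
  success: [],
};

function canTransition(from: PaymentState, to: PaymentState): boolean {
  return transitions[from].indexOf(to) !== -1;
}

// Given the states a screen claims to handle, list the ones it forgot.
function missingStates(handled: PaymentState[]): PaymentState[] {
  const all = Object.keys(transitions) as PaymentState[];
  return all.filter((state) => handled.indexOf(state) === -1);
}

console.log(canTransition("failed", "retry")); // prints true
console.log(canTransition("failed", "success")); // prints false
console.log(missingStates(["idle", "kyc", "fxQuote", "success"]));
// prints [ 'pending', 'failed', 'retry' ]
```

A screen that skips pending or retry shows up as a named gap in a list, before anyone hits it mid flow.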

The principle underneath

None of this was about using more AI. It was about making design a machine readable input so the loop could run. A trigger woke the routine, it read sources it could trust, it judged against the design system, it delivered a fixed output, and the next run improved on the last.

If you take one thing from this, make it the first move. Put your design system somewhere a machine can read it. Everything else, the bug hunt, the design complete PRs, the throughput, is downstream of that one decision. Build loops, not theatre. One painful workflow first.

But the real shift is not just speed. When design already arrives as implementation, the cost of trying another version of a flow drops close to zero. I can explore many variations of the same journey and design for specific and customised scenarios: edge cases, market by market differences, states that only a few users ever hit. And I can actually ship them as options to test, rather than leaving them as screens in a file that never get built. Testing more, varying more, implementing at a higher level and faster: all of that becomes normal instead of expensive.

That is how I think about my work now.

My job is to shorten the time an engineer spends rebuilding what I already designed, so that time goes into new problems instead.
Design stops being a layer on top of the product and becomes part of the pipeline that moves it forward.

Less downstream, more in the loop. That is the kind of design engineer I am trying to be.