Typing was never the hard part

DevOpsDays Austin 2026

Ian Littman / @ian@phpc.social / @ian.im / @iansltx

Slides at ian.im/doda26

Warning: we're gonna talk about LLMs (#AI)

  1. What happened in the last ~year (not just at Anthropic)
  2. What this means for software dev, and how to provide value
  3. What this means for infra folks, and how to provide value

 

Slides are mine. LLMs didn't touch them.

We're talking about interacting with code.

LLMs can do other things, but they're out of scope for this talk.

If you have to use the tools,
they might as well be useful.

Other folks use LLMs more aggressively than me

I'm still ahead on productivity vs. manual, with a safety level I'm comfortable with.

LLMs are good for tasks with...

  1. Easy-to-describe acceptance criteria
  2. Toilsome implementation
  3. An existing pattern to follow (from context or training data)
  4. Straightforward (preferably automated) verification
  5. Small units of work (though this is no longer a hard constraint)
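
As a sketch of points 1 and 4 together: a made-up task (`slugify`, invented here for illustration) whose acceptance criteria are explicit enough to hand to an agent and cheap to verify automatically:

```python
import re

def slugify(title: str) -> str:
    """Lowercase the title, drop punctuation, join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

# The acceptance criteria, written as an automated check the agent can run:
def test_slugify():
    assert slugify("Typing Was Never the Hard Part!") == "typing-was-never-the-hard-part"
    assert slugify("  DevOpsDays  Austin 2026 ") == "devopsdays-austin-2026"
```

The test doubles as the spec: if the agent can run it, it can verify its own work.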

 

Capabilities and costs subject to change without notice

History: Better models && better harnesses

Late May 2025

  • Claude Sonnet 4 released by Anthropic
  • Claude Code went GA
  • I started using Sonnet ~1 month later, then more heavily in August
  • Opus existed but was $$$
  • Sonnet was a situational toil-reduction pick
  • Already easy to find a model that was better at jq/bash/regex than I was

September 2025

  • Sonnet 4.5
  • Qwen3-Next
  • GLM 4.6

Late November 2025

  • Opus 4.5
    • Step change in capabilities
    • Significantly less expensive
    • I started throwing larger tasks at it around year-end
  • GLM 4.7 (in December)

February 2026

  • Sonnet 4.6 + Opus 4.6
  • GPT 5.3 Codex
  • Qwen3.5 (step change on local model ability)
  • GLM 5
  • Kimi K2.5 in January (I used it a bit)

March 2026

  • Claude Code instability
  • Claude plan rate limit revisions
  • Claude code review is more widely available
  • GPT-5.4

April 2026

  • Releases (non-exhaustive list)
    • Qwen3.6
    • Gemma 4
    • GLM 5.1
    • Kimi K2.6
    • DeepSeek V4
    • GPT-5.5
    • Opus 4.7
  • Rugpulls
    • GitHub Copilot
    • Claude Enterprise

A digression on subsidies

 

vs. paying APIs per-token

May 2026

The trend

  • Capabilities are improving quickly across frontier/open-weights/local
  • Tokens are increasingly likely to cost real money, utility-billed
  • Business models based on indefinite subsidies are time bombs
  • Doing less (token count) with less (cheaper/simpler models) matters
    • Cheaper closed models (same or different vendors)
    • Open-weights on not-your-machine
    • Open-weights on your machine
  • Deterministic code is way cheaper to run than inference

The hard part ≠ the bottleneck

In Software Dev

  • Writing tests
  • All-else-equal refactors
  • Dependency upgrades, including major version bumps
  • Language ports

In Software Dev

  • Attempting bugfixes (sometimes succeeding, sometimes not)
  • Solving the clean-slate problem
  • Justifying nuking temporary code
  • Spelunking in a complex codebase
  • Reviewing new code as another set of eyes

Software dev caveats

  • Won't use new/best practices unless told to
  • Will follow bad patterns if your code has them
  • Will churn code absent guard rails
  • Test quantity/quality can vary
    • "I can't test this so I'm going to write a parallel implementation and test that"
  • If something doesn't add up, there's a decent chance the model is hallucinating
    • This is less of a problem than it used to be

Software dev caveat remediation

  • You still have to (know how to) review the code (and the tests!)
  • Don't turn your architecture brain off
  • Don't be afraid of backing out diffs
  • Take responsibility for the work (because an LLM can't)

How do devs provide value?

  • Determining guard rails (e.g. static analysis/linting) for agents to use
  • Upskilling into product
    • Deciding what needs to be done
    • Making requirements explicit
    • Figuring out how to validate acceptance criteria
  • Reviewing output
  • Prioritizing useful change
  • Minimizing noisy churn
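
One way to wire up the guard-rails bullet above, sketched in Python. The specific commands (ruff, pytest) are stand-ins for whatever static analysis and test runner your project uses:

```python
import subprocess

# Stand-in commands; substitute your project's linter and test runner.
CHECKS = [
    ["ruff", "check", "."],
    ["pytest", "-q"],
]

def run_gate(checks=CHECKS) -> bool:
    """Run each check in order; stop and report on the first failure."""
    for cmd in checks:
        try:
            result = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            print(f"SKIP: {cmd[0]} not installed")
            continue
        if result.returncode != 0:
            print(f"FAIL: {' '.join(cmd)}\n{result.stdout}{result.stderr}")
            return False
    return True
```

The point is a single pass/fail command the agent can run after every change, so churn gets caught before a human ever reviews it.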

You're (sort of) a manager now

But the computer won't be offended if you do X% of the work yourself

Infra Disclaimers

  • "Good at" and "Caveats" are secondhand for me
  • There are tons of primary sources here

In Infra

  • Net-new IaC
  • Runbooks -> bespoke case-specific troubleshooting steps
  • CI (e.g. GitHub Actions) glue code
  • Import scripts
  • Refactors

In Infra

  • Troubleshooting (>= rubber duck)
  • Sniff-testing terraform plan output
  • State validation

Infra caveats

  • Over-abstraction rather than KISS
    • Particularly on updates
  • Stuff needs testing
    • Preferably automated testing
  • Tricky IAM -> suboptimal training data -> suspect output
  • Running bash is a lot more dangerous when it can hit prod
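
For that last caveat, a crude illustration of limiting what an agent can run. The prefixes are invented for this example, and string matching is not a real security boundary — sandboxes and scoped read-only credentials are the serious version:

```python
import shlex

# Illustrative read-only command prefixes; swap in your own, and don't
# treat prefix matching as an actual security boundary.
ALLOWED_PREFIXES = [
    ["terraform", "validate"],
    ["terraform", "plan"],
    ["kubectl", "get"],
]

def is_allowed(command: str) -> bool:
    """True only if the command starts with an allowlisted prefix."""
    tokens = shlex.split(command)
    return any(tokens[: len(prefix)] == prefix for prefix in ALLOWED_PREFIXES)
```

Here `terraform plan -out=tf.plan` passes while `terraform apply` and `kubectl delete` don't.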

Infra caveat remediation

  • IaC all the things
  • Steer the model when it's overcomplicating/getting lost
  • For local models, be prescriptive
  • Give the model a way to check its work
    • terraform validate
    • terraform plan
  • If an LLM can't access it, it can't break it
  • Ensure skills match docs
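
A sketch of the "check its work" loop, leaning on terraform's documented `-detailed-exitcode` contract (0 = no changes, 1 = error, 2 = changes pending); the wrapper itself is hypothetical:

```python
import subprocess

def classify_plan(returncode: int) -> str:
    """Interpret a `terraform plan -detailed-exitcode` exit status."""
    return {0: "clean", 1: "error", 2: "changes-pending"}.get(returncode, "unknown")

def check_work(workdir: str = ".") -> str:
    """Validate the config, then report whether a plan would change anything."""
    subprocess.run(["terraform", "validate"], cwd=workdir, check=True)
    plan = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
    )
    return classify_plan(plan.returncode)
```

A "changes-pending" result after the agent claims it's done is exactly the kind of machine-readable signal it can act on without a human in the loop.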

How can Infra folks provide value?

  • Setting patterns
    • Abstraction tradeoffs
    • Infra/tool choices (not just for you)
    • Why rather than just what, described in a way your audience cares about
    • Separating useful change from noisy churn
  • Validating changes before they're made
  • Setting up deterministic artifacts

At the end of the day, a human has to...

  • Steer the ship
  • Wield the tools
  • Take responsibility

Questions? Find me here / @ian@phpc.social / @ian.im / @iansltx

Slides: https://ian.im/doda26

Thanks!