Agent experience, Opus 4.6, and building a CLI for ClickHouse

Developer experience (DX) is now old and boring, and it’s all about agent experience (AX).

As it turns out, they’re practically the same thing. Developers write code by slapping buttons with the meat sausages attached to their arms. LLMs write code by generating it like magic.

A good experience allows them to do what they need to do without thinking too hard, or needing to lower themselves to using a UI.

The platforms with the best AX today are, unsurprisingly, the platforms that had the best DX yesterday. Like Supabase. They’ve had a great SDK and CLI that meat-based users have enjoyed using for years. And now LLMs love using them too.

Why am I thinking about this?

I build with ClickHouse a lot, and while I’ve pretty much delegated all of the actual code-writing to Claude, I still find myself manually managing the local and cloud infra. I don’t need the repeatability of IaC, I just want to spin new stuff up with less effort.

ClickHouse has various language clients that let you integrate it with your app, and LLMs are great with those, but they don’t help you from the infrastructure side. If you want to build an app using ClickHouse with an LLM, you probably need to be comfortable installing and setting it up yourself. And when you want to move to ClickHouse Cloud, you’ll need to do that yourself, too.

So I wanted to see what it takes to close the gap. How can I touch nothing except the prompt?

Designing a CLI

To me, the first logical step was a CLI.

When I build apps, it’s usually with JavaScript or Python, and LLMs do great with uv or pnpm to manage Python/Node environments and dependencies. Maybe I could treat having ClickHouse like a Node environment; what if I had a pnpm env style CLI to install and manage my local ClickHouse?

So I started on chv.

Disclaimer: chv is a personal project of mine, and not in any way supported by ClickHouse Inc. You’re welcome to try it, but know that it’s just an experiment, and the code is 95% written by Claude and hasn’t been reviewed by a competent human.

The initial set of commands pretty much replicated what I could do with uv and pnpm, with the subcommands:

init
install
use
list

With these, I can easily find and pull down versions of ClickHouse and bootstrap my working dir. Now, agents have web tools, and they could go to GitHub, pull down the releases, parse them to find the version they want and then form a curl command. But this is far more expensive in context & tokens than chv use stable.
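To put some shape on that, here is a minimal Python sketch of the kind of work an alias like stable hides from the agent. It is not chv’s actual code: the resolve_alias function, the lts alias, and the sample tags are assumptions for illustration, with the tags following the vX.Y.Z.W-stable / -lts naming that ClickHouse uses for its GitHub releases.

```python
import re

# Release tags are assumed to look like "v24.8.4.13-lts" or "v24.9.2.42-stable".
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)\.(\d+)-(stable|lts)$")


def resolve_alias(alias, tags):
    """Return the newest tag whose channel suffix matches `alias`.

    Hypothetical helper: `alias` is "stable" or "lts", `tags` is a list of
    release tag strings that was already fetched from somewhere.
    """
    candidates = []
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if match and match.group(5) == alias:
            # Compare versions numerically, so 24.10 sorts after 24.9.
            version = tuple(int(part) for part in match.groups()[:4])
            candidates.append((version, tag))
    if not candidates:
        raise LookupError(f"no release found for alias {alias!r}")
    return max(candidates)[1]


# Illustrative tags only, not a real release listing.
SAMPLE_TAGS = [
    "v24.8.4.13-lts",
    "v24.9.2.42-stable",
    "v24.10.1.2812-stable",
    "not-a-release",
]
print(resolve_alias("stable", SAMPLE_TAGS))
```

Every step in there is something the agent would otherwise have to reason through in its context window, and get right, on each run.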

I also need ways to run a ClickHouse server and interact with it. LLMs are great at pnpm run dev so…why not chv run?

Now I can do chv run server to start a ClickHouse server, and chv run client to run the client to connect and interact with my server.
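Under the hood, a run command like this can be very thin, because the single clickhouse binary already exposes server and client as its own subcommands. A rough Python sketch of the dispatch, where the ~/.chv/versions layout and the build_run_command helper are assumptions for illustration rather than how chv really works:

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical layout: one directory per installed version, each holding the
# `clickhouse` binary. This is an assumption, not chv's real storage scheme.
INSTALL_ROOT = Path.home() / ".chv" / "versions"

# The two targets mentioned in the post.
RUN_TARGETS = {"server", "client"}


def build_run_command(version: str, target: str, extra_args: list[str] | None = None) -> list[str]:
    """Build the argv for `clickhouse <target>` using the selected version."""
    if target not in RUN_TARGETS:
        raise ValueError(f"unknown run target {target!r}; expected one of {sorted(RUN_TARGETS)}")
    binary = INSTALL_ROOT / version / "clickhouse"
    return [str(binary), target, *(extra_args or [])]


# The resulting list could be handed to subprocess.run(); printed here instead.
print(build_run_command("24.8", "server"))
```

Keeping the set of targets small and rejecting anything else with a clear error gives an agent a fast, readable failure instead of a confusing one from the binary itself.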

With those 5 commands, I can actually get pretty far.

But…how to get an LLM to understand and use it?

Self-discovery and Agent Skills

LLMs haven’t been trained on my CLI. They don’t even know it exists. So how can I get them to use it?

An LLM not knowing about my CLI is simply a problem of it not existing in the world. And I can solve that by releasing it and talking about it. There’s not much I can do to make LLMs know it exists while it’s a bag of bits in my home dir. So, I focused on getting them to understand using it.

LLMs know how to use CLIs. And what CLI doesn’t have --help? LLMs know this convention, and you can see them use it all the time. They are able to use --help to self-discover how to use a CLI with pretty decent results. So, the first step was to make sure I had good quality help text.

That helped noticeably, but I still found agents making mistakes. They could work out how to use commands, but not always which commands to use together. And they would often resort to running --help on every single combination of subcommand.

So I added a paragraph of text to every command’s help output, aimed at agents. It’s heavier than just listing commands with short descriptions, but by encoding common flows into this, the LLM stopped listing everything and focused on commands it thought it needed. Not always…but enough that I considered it a win.

CONTEXT FOR AGENTS:
  chv is a CLI to work with local ClickHouse and ClickHouse Cloud.

  Two main workflows:
  1. Local: Install and interact with versions of ClickHouse to develop locally.
  2. Cloud: Manage ClickHouse Cloud infrastructure and push local work to cloud.

  You can install the ClickHouse Agent Skills for best practices on using ClickHouse:

  `npx skills add clickhouse/agent-skills`

  Typical local workflow: `chv install stable && chv use stable && chv run server`.

  Use `chv <command> --help` to get more context for specific commands.
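
If your CLI happens to be built on Python’s argparse, one way to attach a block like this is the parser’s epilog. A minimal sketch, assuming argparse (chv’s own implementation may look nothing like this), with the one-line subcommand descriptions made up for illustration:

```python
import argparse

# Shortened copy of the agent-facing text from the help output above.
AGENT_CONTEXT = """\
CONTEXT FOR AGENTS:
  chv is a CLI to work with local ClickHouse and ClickHouse Cloud.

  Typical local workflow: `chv install stable && chv use stable && chv run server`.

  Use `chv <command> --help` to get more context for specific commands.
"""

# Subcommand names come from the post; the descriptions are illustrative guesses.
SUBCOMMANDS = [
    ("init", "Bootstrap the working directory"),
    ("install", "Download a ClickHouse version"),
    ("use", "Select the active ClickHouse version"),
    ("list", "List ClickHouse versions"),
]


def build_parser():
    parser = argparse.ArgumentParser(
        prog="chv",
        description="Work with local ClickHouse and ClickHouse Cloud.",
        epilog=AGENT_CONTEXT,
        # Keep the epilog's line breaks and indentation exactly as written.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    sub = parser.add_subparsers(dest="command", metavar="<command>")
    for name, help_text in SUBCOMMANDS:
        sub.add_parser(name, help=help_text)
    run = sub.add_parser("run", help="Run a ClickHouse server or client")
    run.add_argument("target", choices=["server", "client"])
    return parser


if __name__ == "__main__":
    build_parser().print_help()
```

The point of the epilog is placement: the agent sees the common workflow in the same --help call it was already going to make, so it costs no extra tool invocation.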

Agent Skills

I also created some Agent Skills that encode even deeper knowledge of using the commands. I forked the official ClickHouse Agent Skills, which already has skills to help LLMs write better SQL and schemas, and created two new skills with various guides on using the CLI.

And I called out in the CLI help that agents should install and use the skills.

This also worked pretty well. It did install the skills, and it did use them…sometimes. When it used them, the results were great. But I found that, even using the latest Opus 4.6, skill invocation is frustratingly inconsistent.

My feeling is that building Agent Skills for your CLI is worthwhile, particularly if you’re developing something new and LLMs haven’t been trained on piles of docs and examples. And I’m hopeful that skill utilisation is going to continue to improve in the models.

Going to Cloud

With all that done, the CLI was working pretty well. I could go into Claude Code, and say

Build me a competitor to Google Analytics. Use ClickHouse as the database, you have the ClickHouse CLI `chv` installed.

and it could one-shot it somewhat reliably. Install ClickHouse, start the server, bootstrap a Next.js app, build a Chart.js dashboard, add the ClickHouse JS client, create my tables, populate fake data, write the queries and wire it all up. I didn’t have to do anything.

But, my friends said they couldn’t reach it when I shared the app with them - try it out - http://localhost:3000.

So I need to be able to deploy it. Agents can already work with Vercel to deploy the Next.js part, but not with ClickHouse Cloud to take my database to prod.

The sixth subcommand:

cloud

ClickHouse Cloud already has a REST API and an OpenAPI spec, but LLMs aren’t good at using complex REST APIs. When they have to take in the full JSON OpenAPI spec and form raw cURL commands with big JSON payloads, they get it wrong. And, most of the time, it seems like they pretend the API doesn’t exist - I assume that is because most written examples use CLIs!

When I started on the cloud subcommand, it was the day that Anthropic released Opus 4.6 (I had been using 4.5). So that was its first test: I gave it the full OpenAPI spec, and told it to write a wrapper for it under the cloud prefix.

I’d say it got about 80% there. It missed various properties, didn’t read descriptions that called out deprecations, etc. so I had to fix it up, but it was certainly close enough to save me some time.
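
For a sense of what that wrapping involves, here is a small Python sketch that walks an OpenAPI document and turns each operation into a CLI-style command entry, carrying the deprecated flag along so it doesn’t get lost. The spec fragment is deliberately generic and made up for the example (it is not the ClickHouse Cloud API), and to_kebab and commands_from_spec are hypothetical helpers, not chv code.

```python
import re

# Keys inside an OpenAPI path item that are HTTP operations.
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def to_kebab(operation_id):
    """Turn a camelCase operationId into a CLI-style name, e.g. createWidget -> create-widget."""
    return re.sub(r"([a-z0-9])([A-Z])", r"\1-\2", operation_id).lower()


def commands_from_spec(spec):
    """Collect one command entry per operation, keeping the deprecation flag."""
    commands = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            commands.append({
                "name": to_kebab(operation["operationId"]),
                "method": method.upper(),
                "path": path,
                "help": operation.get("summary", ""),
                "deprecated": bool(operation.get("deprecated", False)),
            })
    return commands


# Made-up spec fragment for illustration only.
SAMPLE_SPEC = {
    "paths": {
        "/widgets": {
            "get": {"operationId": "listWidgets", "summary": "List widgets"},
            "post": {"operationId": "createWidget", "summary": "Create a widget"},
            "parameters": [],
        },
        "/widgets/legacy/{id}": {
            "get": {"operationId": "getWidgetLegacy", "deprecated": True},
        },
    }
}

for command in commands_from_spec(SAMPLE_SPEC):
    marker = " (deprecated)" if command["deprecated"] else ""
    print(f"cloud {command['name']}{marker}: {command['method']} {command['path']}")
```

A mechanical pass like this only covers the easy part. Deprecations that live in free-text descriptions, which is where the model slipped up, still need a human or a second review pass.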

This command let me operate all the usual bits of ClickHouse Cloud I need to get to prod: find my org, create a service, get my connection details.

I could now complete the loop, and after my agent was done building my app locally, I could ask it to push to cloud. The first time it successfully spun up all my cloud infra, pushed my work up, and toggled the app over was pretty enjoyable.

An interesting problem that I don’t think is solved all that well for agents is the boring admin stuff - auth, user management, and billing. I’ve seen some folks working on these areas, and I’m assuming we’ll eventually get some kind of MCP-style spec for agentic auth. I don’t think it’s wise to roll a bespoke experience yourself, and frankly, it’s a waste of time that you could use to do something more valuable.

Playing with OpenClaw

I had resisted playing with OpenClaw, but a colleague was messing with it and I couldn’t resist. I had just been using Claude Code, and I felt like a cave man compared to him chatting with his via WhatsApp.

So, in a few hours, I had a VPS on Hetzner running OpenClaw, and revived an old phone to keep things separate.

Could I really build an app and push it to cloud, with just my phone, from the sofa?

Yes.

Not first time, of course. I built chv on a Mac, and never tested it on Linux. My VPS was running Fedora, and the agent failed to install ClickHouse using chv, went down the non-CLI route and got itself stuck.

With chv 0.1.8 supporting Linux, I returned to my sofa. And it worked first time.

I asked my agent to build a security manager for OpenClaw. It installed ClickHouse, ran the server, set up my schemas, ingested SSH server logs, built a Node app, connected the two, designed security rules for hardening OpenClaw (mostly just hardening Linux)…and pushed it all to cloud when it was done.

The app it built is here: openclaw-security-manager

The app isn’t particularly novel, we’ve been able to vibe code this stuff for at least a year.

But the CLI workflow it was using to develop against ClickHouse did not exist until 2 hours prior.

Lessons

I thought I’d have to do a lot more. I didn’t relish the idea of building an SDK, but in the end, I didn’t have to. I suspect an SDK would allow some really cool stuff - what if most of this infra work was just inferred from our code? - but a super simple CLI with 6 subcommands enabled agents to take another job off of my hands.

This is already too long of a post, and I can’t be bothered to write “The 10 Lessons of Agent Experience”. So I’ll leave you with 3 things that I think are inarguably true:

LLMs consume & generate text. Focus on text. CLIs are back, baby!
LLMs like patterns. Be boring and predictable.
LLMs can understand tools, but don’t always know how to use them to accomplish a goal. It might be cheaper to give this guidance upfront, rather than let an LLM go down its own rabbit hole.