Building the ClickHouse Knowledge Base

For the past 2 months-ish I’ve been working on the ClickHouse Knowledge Base at Tinybird. While I’ve contributed to documentation before, I’ve never had to build out a documentation site from scratch. It has been a great learning opportunity, and while it’s not perfect, I’m super happy with the results. So I wanted to speak about how we got here.

The idea

Why did we choose to launch a Knowledge Base for ClickHouse? Well, we use ClickHouse under the hood at Tinybird and we love it. It’s a brilliant database & open source project. We’ve already got several ClickHouse contributors in the company, and now we’re building a team dedicated purely to open source ClickHouse development.

We don’t want to fork ClickHouse and we don’t want to build Tinybird-only ClickHouse features that no one else can benefit from. We have benefitted greatly from the ClickHouse project and it’s only fair that we give back and benefit others.

In the 3 years of building Tinybird, we have gathered a great deal of ClickHouse knowledge inside the company. So as well as helping to build ClickHouse, another way we can give back to the community is by sharing our knowledge.

We can benefit the ClickHouse project, the ClickHouse community & the wider data ecosystem with this knowledge. Of course, it is also beneficial to us as an early-stage company looking to grow; we want to establish ourselves as experts in this field, increase brand awareness and capture organic search traffic. I think it’s a win-win.

We believe we have something of value to share, and it benefits everyone to share it. So, let’s share it.

Creating content

Firstly, for a Knowledge Base to be valuable, it needs to actually contain knowledge, so we had to start capturing information.

We held an internal Doc-a-thon, where our team could write absolutely anything they wanted about ClickHouse, putting in as much effort as they wanted; we didn’t require formatting, perfect grammar or even full sentences. We knew the team was busy, so we needed to lower the effort barrier. Just write down the things you have learnt about ClickHouse.

We opened an internal Git repo and had people submit their tips, with one tip per PR. To incentivise participation, we had a prize for the most submissions, creating a little competition amongst friends. It started off slow, but in the end we closed the Doc-a-thon with 130 unique tips submitted from our team. These tips ranged from beginner-friendly advice to deep-dives of ClickHouse internals. It was great stuff.

Reviewing these tips took a long time, and the editing process took even longer (at the time of writing, I haven’t finished; I’m at about 60 of 130). There was a lot of basic editing to do, like adding metadata, inserting external references and converting tips to Markdown, but the tips also needed technical editing: ensuring each one showed a real, full example, and that the text clearly explained what was being shown, how to do it, and why it matters.
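To give a flavour of the "adding metadata" part, here is a minimal sketch of the kind of check that could flag tips still missing their metadata. It is not the script we actually used; it assumes each tip is a Markdown file with a front matter block between two `---` lines, and the required field names (`title`, `description`, `tags`) are hypothetical placeholders.

```python
# Hypothetical list of metadata fields every tip should carry.
REQUIRED_FIELDS = ("title", "description", "tags")


def front_matter_keys(markdown: str) -> set[str]:
    """Return the top-level keys of a leading '---' front matter block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()  # no front matter at all
    keys: set[str] = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys  # closing delimiter reached
        # Skip blank lines, indented (nested) lines, list items and comments.
        if not line or line[0].isspace() or line[0] in "-#":
            continue
        if ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # block never closed: treat as missing front matter


def missing_fields(markdown: str) -> list[str]:
    """List the required fields that a tip has not declared yet."""
    found = front_matter_keys(markdown)
    return [field for field in REQUIRED_FIELDS if field not in found]


tip = "---\ntitle: Use LowCardinality\ntags:\n  - types\n---\nSome tip text."
print(missing_fields(tip))
```

Running something like this over every file in the repo would give a quick to-do list of tips that still need basic editing before the technical pass.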

Don’t underestimate the importance of why; all the best documentation I have ever read as a developer doesn’t just give me something to copy & paste, but helps me to become self-sufficient so I don’t need to copy & paste it again.

Tooling

The Knowledge Base is a technical resource for technical people, and we want developers from outside Tinybird to be able to contribute to it. So, it made sense to use tools that developers are familiar with; we develop all of the content in a public, Apache 2.0-licensed repo on GitHub. The content itself is written in standard Markdown. This should make it accessible to pretty much any developer, and makes it super easy for us to manage.

Markdown is widely adopted & there is a huge ecosystem of tools that can do things with it; one such tool is Docusaurus, a static-site generator that generates web pages from Markdown files. It was developed at Facebook, it’s open-source and has a huge & active community. It’s super extensible and uses React, which is also what we currently use at Tinybird in our product. I evaluated a few other tools: GitBook is a SaaS docs product, but there’s customisation we wouldn’t have been able to do, and it would cost us more; Docsify.js isn’t a static generator, instead rendering pages on demand, which is nice for the build process, but bad for SEO.

We settled on Docusaurus, which has been fantastic; our design team did a great job customising the theme in a few hours, and we’re exploring building some custom components for things that aren’t yet supported officially. To deploy our Docusaurus site, we use Vercel. With Vercel, I simply point to the Git repo, and say Go. Like magic, my site is built & deployed. Tinybird & Vercel have a great relationship; we use Vercel to deploy the frontend of our product, and Vercel uses Tinybird for many of the analytical features they provide in their products.

The future

The job’s not done. There’s still a backlog of tips I need to get through and release. We’ve published the Knowledge Base, and we’re starting to share it out on social channels. We hope to get contributions from the community, and we’re going to give out some fun gifts to try and bootstrap that.

I’m really happy with the engagement we got from everyone at Tinybird helping to create content, the design team that made it look amazing, the marketing team that have made my writing less blunt and shared it with the world, and the support from the founders of the company. Docusaurus has impressed me and I’m now using it to build 2 more Docs-based projects, so I think we made the right choice there.

It was a lot of work to put this all together. Building great docs is not easy. I hope that we’ve done a good job and it becomes a useful resource; time will tell.