shipcat retrospective

clux December 15, 2018 Updated: September 28, 2023 [software] #rust #kubernetes

The now defunct unicorn startup babylon health needed to migrate about 50 microservices to Kubernetes in early 2018. At Kubernetes 1.8, supporting tooling was weak, but the company's pace was fast.

This is a historically updated post about shipcat, a standardisation tool written to control the declarative format and lifecycle of every microservice, and to get safety in place quickly.

This article was updated after babylon's demise in 2023. It now serves as a mini-retrospective rather than the original announcement, which is mostly broken now that the original repo is down. It adds some better showcases and examples, plus historical context, which together should help avoid some common misconceptions about why this weird tool was written.

First, a bit about the problem:

Kubernetes API

Migrating to Kubernetes was a non-trivial task for a DevOps team when the requirements were basically that we would do it for the engineers. We had to standardise, and we had to decide what a microservice ought to look like based on what was already there.

We didn't want engineers to all have to learn everything about deployments, services, configmaps, autoscalers, service accounts, RBAC, and secrets at once.

We needed validation. Admission control was new and didn't work well with gitops' need for fast client-side validation, and we just needed CI checks to prevent master from being broken.

Helm

The most successful abstraction attempt Kubernetes had seen in this space: helm. A client-side templating system (ignoring the bad server-side part) that lets you abstract away much of the above into charts (a collection of yaml go templates), ready to be filled in with helm values - the more concise yaml that developers write directly.

Simplistic usage of helm would involve having a charts folder:

charts
└── base
    ├── Chart.yaml
    ├── templates
    │   ├── configmap.yaml
    │   ├── deployment.yaml
    │   ├── hpa.yaml
    │   ├── rbac.yaml
    │   ├── secrets.yaml
    │   ├── serviceaccount.yaml
    │   └── service.yaml
    └── values.yaml

and calling it with your own myvalues.yaml substituted in:

helm template charts/base --name myapp -f myvalues.yaml | \
    kubectl apply -l app=myapp --prune -f -

which will garbage collect older kube resources with the myapp label, and start any necessary rolling upgrades in kubernetes.

Drawbacks

Even though you can avoid a lot of the common errors by re-using charts across apps, there was still very little sanity around what helm values could contain. Here are some values you could pass through a helm chart and have kubernetes accept anyway:

And that's once you've gotten over how frustrating it can be to write helm templates in the first place.

Limitations

While validation is a fixable annoyance, the bigger observation is that these helm values files become a really interesting, but entirely accidental, abstraction. They become the canonical representation of your services, yet you have no useful logic around them. You have very little validation, almost no definition of what's allowed in there (helm lint is lackluster), you have no process of standardisation, it's hard to test the sprawling automation scripts around the values files, and you have no sane way of evolving these charts.

Enter shipcat

shipcat logo

What if we could take the general idea that developers just write simplified yaml manifests for their app, but actually define that API instead? By actually defining the structs we could provide a bunch of security checking and validation on top of them, and we would have a well-defined boundary for automation / CI / dev tools.

By defining all our syntax in a library we could have CLI tools for automation and executables running as kubernetes operators using the same definitions. It effectively provided a way of versioning the platform.

This also allowed us to solve a secrets problem. We extended the manifests with syntax that allows synchronising secrets from Vault at both deploy and validation time. There are better solutions for this now, but we needed something quickly.
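
To give a rough idea of the mechanics: env values set to the IN_VAULT marker were swapped for the real secret, fetched from Vault, before templating or applying. A minimal sketch of that substitution, with a hypothetical vault_read helper standing in for the actual Vault client:

use std::collections::BTreeMap;

// hypothetical stand-in for the real Vault client call
fn vault_read(service: &str, key: &str) -> Result<String, Box<dyn std::error::Error>> {
    // ...fetch the secret stored for this service/key from Vault here...
    Ok(format!("secret-for-{}-{}", service, key))
}

/// Replace every env entry whose value is the IN_VAULT marker with the real secret.
fn resolve_secrets(
    service: &str,
    env: &mut BTreeMap<String, String>,
) -> Result<(), Box<dyn std::error::Error>> {
    for (key, value) in env.iter_mut() {
        if value == "IN_VAULT" {
            *value = vault_read(service, key)?;
        }
    }
    Ok(())
}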

Disclaimer

This style of tool was not a revolutionary (nor a clean) idea. At KubeCon EU 2018 pretty much everyone had their own wrappers around yaml to help with these problems. For instance, kubecfg, ksonnet, flux, and helmfile all tried to help out in this space, but they were all missing most of the sanity we required when we started experimenting.

so, how to homebrew Kubernetes validation in an early stage gitops world

The result, perhaps unsurprisingly, was babylon-dependent, fast-moving, and not fit for general-purpose use. But it was still very helpful for the company.

Manifests

The user interface we settled on was service-level manifests:

name: webapp
image: clux/webapp-rs
version: 0.2.0
env:
  DATABASE_URL: IN_VAULT
resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    cpu: 300m
    memory: 300Mi
replicaCount: 2
health:
  uri: /health
httpPort: 8000
regions:
- minikube
metadata:
  contacts:
  - name: "Eirik"
    slack: "@clux"
  team: Doves
  repo: https://github.com/clux/webapp-rs

This encapsulated the usual kubernetes apis that developers needed to configure themselves, along with who's responsible for the service, what regions it's deployed in, what secrets it needs (notice the IN_VAULT marker), and how resource intensive it is.

It's obviously quite limiting in terms of what you actually can do on Kubernetes, but this simple "one deployment per microservice" with some optional extras was generally sufficient for years.
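
For a sense of how the yaml maps onto code, here is a heavily simplified, hypothetical sketch of the corresponding manifest struct (the real Manifest in shipcat/structs had far more, mostly optional, fields; this sketch reuses the ResourceRequirements type shown in the next section):

use std::collections::BTreeMap;
use serde::{Deserialize, Serialize};

/// Simplified sketch of the service manifest that the yaml above deserializes into.
#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
pub struct Manifest {
    pub name: String,
    pub image: String,
    pub version: String,
    #[serde(default)]
    pub env: BTreeMap<String, String>,
    pub resources: ResourceRequirements<String>, // defined in the next section
    pub replica_count: u32,
    pub health: Option<HealthCheck>,
    pub http_port: Option<u32>,
    pub regions: Vec<String>,
    pub metadata: Metadata,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(deny_unknown_fields)]
pub struct HealthCheck {
    /// Path polled to determine whether the service is up
    pub uri: String,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(deny_unknown_fields)]
pub struct Metadata {
    pub contacts: Vec<Contact>,
    pub team: String,
    pub repo: String,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(deny_unknown_fields)]
pub struct Contact {
    pub name: String,
    pub slack: String,
}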

Strict Syntax

Because these manifests were going to be the entry point for CI pipelines and handle platform-specific validation (for medical software), we wanted maximum strictness everywhere, and that included the ability to catch errors before manifests were committed to master.

We leant heavily on serde's customisable code generation to encapsulate awkward k8s apis, and to auto-generate the boilerplate validation around types and spelling errors.

The Kubernetes structs were hand-rolled for the most part, but later incorporated parts of the k8s-openapi structs - however, these were too Option-heavy to catch most missing fields on their own.

Here are some structs we used to ensure resources and limits had the right format:

/// Kubernetes resource requests or limit
#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(deny_unknown_fields)]
pub struct Resources<T> {
    /// CPU request string
    pub cpu: T,
    /// Memory request string
    pub memory: T,
    // TODO: ephemeral-storage + extended-resources
}

/// Kubernetes resources
#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(deny_unknown_fields)]
pub struct ResourceRequirements<T> {
    /// Resource requests for k8s
    pub requests: Resources<T>,
    /// Resource limits for k8s
    pub limits: Resources<T>,
}

Here, serde enforces the "schema" validation. It catches spelling errors as extraneous keys thanks to the #[serde(deny_unknown_fields)] attribute, and it enforces the correct types. On the flip side, having the schema in code also meant we had to update the spec ourselves to, say, support ephemeral storage requirements.
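
As a quick illustration of what that buys you - a hypothetical test, assuming the structs above are in scope and serde_yaml is a dependency - a misspelled key is rejected up front instead of silently sailing through the way it would in a helm values file:

fn main() {
    let good = "
requests:
  cpu: 100m
  memory: 100Mi
limits:
  cpu: 300m
  memory: 300Mi";
    // a typo'd `requets` key would have passed straight through helm values
    let typo = good.replace("requests", "requets");

    // well-formed input parses into the typed struct
    assert!(serde_yaml::from_str::<ResourceRequirements<String>>(good).is_ok());
    // the misspelling fails deserialization thanks to deny_unknown_fields
    assert!(serde_yaml::from_str::<ResourceRequirements<String>>(&typo).is_err());
}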

Still, this provided cheap schema validation (before helm got it), and there was also a verify method that every struct could implement. This generally encapsulated common mistakes that were clearly errors and should be caught before they were sent out to the clusters:

impl ResourceRequirements<String> {
    // TODO: look at cluster config for limits?
    pub fn verify(&self) -> Result<()> {
        // (we assume defaults have been filled in before this is called)
        let n = self.normalised()?;
        let req = &n.requests;
        let lim = &n.limits;

        // 1.1 limits >= requests
        if req.cpu > lim.cpu {
            bail!("Requested more CPU than what was limited");
        }
        if req.memory > lim.memory {
            bail!("Requested more memory than what was limited");
        }
        // 1.2 sanity numbers (based on c5.9xlarge)
        if req.cpu > 36.0 {
            bail!("Requested more than 36 cores");
        }
        if req.memory > 72.0 * 1024.0 * 1024.0 * 1024.0 {
            bail!("Requested more than 72 GB of memory");
        }
        if lim.cpu > 36.0 {
            bail!("CPU limit set to more than 36 cores");
        }
        if lim.memory > 72.0 * 1024.0 * 1024.0 * 1024.0 {
            bail!("Memory limit set to more than 72 GB of memory");
        }
        Ok(())
    }
}
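
The normalised() call above is what makes those numeric comparisons possible: it converts the "100m" / "100Mi" style strings into cores and bytes. A rough sketch of what such a helper could look like (the real one handled more suffixes and produced friendlier errors), assuming the same Result and bail! setup as the rest of the code:

impl ResourceRequirements<String> {
    /// Convert cpu strings like "100m" into fractional cores, and memory strings
    /// like "100Mi" into bytes, so that verify() can compare plain numbers.
    pub fn normalised(&self) -> Result<ResourceRequirements<f64>> {
        fn num(s: &str) -> Result<f64> {
            match s.parse::<f64>() {
                Ok(v) => Ok(v),
                Err(_) => bail!("invalid resource number: {}", s),
            }
        }
        fn cpu(s: &str) -> Result<f64> {
            match s.strip_suffix('m') {
                Some(milli) => Ok(num(milli)? / 1000.0),
                None => num(s),
            }
        }
        fn memory(s: &str) -> Result<f64> {
            let units = [("Ki", 1024_f64), ("Mi", 1024_f64.powi(2)), ("Gi", 1024_f64.powi(3))];
            for &(suffix, mult) in &units {
                if let Some(n) = s.strip_suffix(suffix) {
                    return Ok(num(n)? * mult);
                }
            }
            num(s) // plain byte count
        }
        Ok(ResourceRequirements {
            requests: Resources {
                cpu: cpu(&self.requests.cpu)?,
                memory: memory(&self.requests.memory)?,
            },
            limits: Resources {
                cpu: cpu(&self.limits.cpu)?,
                memory: memory(&self.limits.memory)?,
            },
        })
    }
}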

Ultimately, the ResourceRequirements struct above was attached straight onto the core Manifest struct (representing the microservice definition above). Devs would write standard resources and be generally unaware of the constraints until they were violated:

resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    cpu: 300m
    memory: 300Mi

In this case the syntax matches the Kubernetes API directly - which was preferred - but with extra validation on top.

We did plan on moving validation to a more declarative format (like OPA policies) down the line, but there was no rush; this worked.

All of the syntax ended up in shipcat/structs - and required developer code-review to modify since it could affect the whole platform.

Once a new version of shipcat was released, we bumped a pin for it in a configuration-management monorepo containing all the manifests, and the new syntax and features became available to the whole company.

CLI Usage

Developers could check that their manifests pass validation rules locally, or wait for pre-merge validation on CI:

shipcat validate myapp # lint
shipcat template myapp # generate template output

the latter being roughly equivalent to:

shipcat values myapp | helm template charts/base

We did lean on helm charts for templating yaml, but this was an implementation detail that only a handful of engineers needed to touch, as we followed the one-chart-to-rule-them-all approach. Templates were also linted heavily with kubeval against all services in all regions during chart upgrades.

Kubernetes Usage

We had wrappers around the normal shipcat template myapp | kubectl X pipeline:

shipcat diff myapp # diff templated yaml against current cluster
shipcat apply myapp # kubectl apply the template - providing a diff and a progress bar

We didn't really apply from local machines except when testing, but we could. There was a glorified kubernetes context switcher that ensured we were pointing at the correct vault for the cluster, so it was pretty easy to test against and get accurate diffs.

The upgrade experience was much nicer than any other CLI that existed at the time: it tracked upgrades with deployment-replica progress bars, bubbled up errors, captured error logs from crashing pods, provided inline diffs pre-upgrade, gated on validation, and sent successful rollout notifications to maintainers on slack.

CI actually used this apply setup and reconciled the whole cluster in parallel using async rust with StreamExt::buffer_unordered:

shipcat cluster helm reconcile

this helped avoid the numerous tiller bugs and actually let us define a sensible amount of time to wait for a deployment to complete (there's an algorithm in there for it).
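
The parallelism itself was just the buffered stream pattern. A minimal sketch of the idea - with a hypothetical apply_service standing in for the real per-service apply, and assuming tokio, futures and anyhow as dependencies:

use futures::{stream, StreamExt, TryStreamExt};

// hypothetical stand-in for the real per-service apply/upgrade logic
async fn apply_service(svc: String) -> anyhow::Result<()> {
    println!("reconciling {}", svc);
    Ok(())
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let services = vec!["webapp".to_string(), "raftcat".to_string()];
    // run up to 8 applies concurrently, surfacing the first error
    stream::iter(services)
        .map(apply_service)
        .buffer_unordered(8)
        .try_collect::<Vec<_>>()
        .await?;
    Ok(())
}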

In the end, we almost turned it into a CD controller, but in an awkward clash of new and old tech, we just ran the above reconcile command on jenkins every 5m lol.

At the time, helm 3 was planning to architect away tiller entirely.

The dev ergonomics were one of its biggest selling points (and possibly prevented a revolt against an ops-led, mandated tool). In my later jobs, achieving a similar level of dev ergonomics would take multiple microservices talking to flux.

Perhaps all this does not seem that impressive now, but it helps if you have visited that precise layer of hell that helm 2 dominated. It had such a painful and broken CD flow.

Conclusion

Looking back at this, it's kind of a wild everything-CLI. It accomplished the goal though: it moved fast, but did so safely. It was not universally well-received, but most of the people who complained about it early on came to me later to say "i don't know how else we could have done this".

It also let us build a quick and simple service registry on top of the service spec (a controller called raftcat that cross-linked to all the tools we used for each service).

Ultimately, it's not a tool most people know about, or at the very least it's not very well understood, and this makes sense: it was tied to babylon's platform. Why would you tell people about this, except out of interest? The more surprising nail in that coffin came in late 2022, when the repo was made private without much ceremony. Now only my safety fork remains. Similar unravellings later happened to the company, but unfortunately you cannot safety-fork your share value.

Original Presentation

For anyone super interested, there is also our original talk, Babylon Health - Leveraging Kubernetes for global scale, from DoxLon2018, which provides some context.

Don't make me watch it again though.