Helm problem solving: Handy commands
I’ve spent today trying to put together a working Helm chart to get a new application deployed to EKS. Writing Helm charts isn’t something I’d done before, so there was a lot of trial and error, and a lot of trying to diagnose my various mistakes.
Mainly for my own future benefit, here’s a quick overview of the helm and kubectl commands that were most helpful for getting feedback on why things weren’t working.
List helm releases
helm list -A
Get a list of all your helm releases so you can see at a glance which have deployed successfully and which are having problems.
If you only have a modest number of releases then using -A to list across all namespaces is less fingerwork than specifying a single namespace. A useful alternative might be helm list -A --failed to just show failed deploys.
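If I remember rightly, helm list supports a few other handy filters and output formats too, so something along these lines should also work (the namespace here is just a placeholder):

# Only show releases stuck in a pending install/upgrade/rollback
helm list -A --pending

# Machine-readable output, handy for scripting
helm list -n <namespace> -o json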
More detail on a helm release
helm history <name> -n <namespace>
Provides very similar output to helm list, but for a single named release. Importantly this has an extra Description column which, for a failed deploy, will give the failure reason.
helm history my-awesome-service -n my-awesome-namespace
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1 Wed Nov 13 14:52:19 2024 pending-install my-awesomeservice-0.1.0 1.16.0 Initial install underway
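If the one-line description isn’t enough, a couple of other helm subcommands are worth a try: as far as I know, helm status reports the current state of the release (plus any notes the chart prints), and helm get manifest shows the rendered templates, which is useful for spotting templating mistakes. The release and namespace names here are just the placeholders from above:

helm status my-awesome-service -n my-awesome-namespace
helm get manifest my-awesome-service -n my-awesome-namespace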
Looking at pods
kubectl get pods -A
This is much like helm list: it gives an overview of all deployed pods and their status. If you’re like me, you’ll find one that’s in a CrashLoopBackOff state.
NAMESPACE NAME READY STATUS RESTARTS AGE
my-awesome-namespace my-awesome-service-5879d8cc59-59gkn 0/1 CrashLoopBackOff 4 (87s ago) 3m4s
This will confirm that something is wrong, but not what that something is.
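To start working out what that something is, kubectl describe on the failing pod is a sensible next step; it should show the container state, last exit code and recent events for that pod. The pod name below is just the example one from the output above:

kubectl describe pod my-awesome-service-5879d8cc59-59gkn -n my-awesome-namespace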
Pod events
kubectl get events --sort-by='.lastTimestamp' -n <namespace>
When you’ve got a helm deploy with a failing pod, this will show you at what point in the creation process things went wrong.
The sort-by is required. I don’t know what the default sort order of the output is, but it’s not helpful. Sorting by lastTimestamp puts the most recent messages at the bottom.
There’s extra filtering that can be applied, but for me just being able to see events from the last few minutes was plenty to get an idea as to what was going wrong.
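For example, if I’ve understood the field selectors correctly, you can narrow the events down to a single pod rather than the whole namespace (pod and namespace names are placeholders):

kubectl get events -n <namespace> --field-selector involvedObject.name=<pod_name> --sort-by='.lastTimestamp'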
Helpful for,
- Problems with fetching the docker image
- Badly configured readiness probes
- Cases where your pod is falling over too early in its startup for kubectl logs to show anything
Application logs
kubectl logs -n <namespace> -f <pod_name>
Finally, getting the logs from the application itself (if you’re not super organised and using something like CloudWatch or OpenTelemetry Logging). You can still get logs from a pod in the CrashLoopBackOff state, but if you get no output it may mean something’s gone wrong early in pod creation - check the pod/namespace events.
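One flag worth knowing: when a pod is crash looping, the current container may not have produced any output yet, but --previous should (as I understand it) give you the logs from the last failed run:

kubectl logs -n <namespace> <pod_name> --previous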
Helpful for,
- Configuration issues
- If your centralised logging isn’t working
Happy problem solving.