Multiple Kubernetes cluster monitoring with Prometheus

If you have multiple Kubernetes clusters, there are a few ways to implement monitoring with Prometheus. Generally speaking, the Prometheus operator assumes it is monitoring the cluster it runs in. That still leaves two variations: are we talking about “querying data over multiple clusters” or “having the data from multiple clusters in one cluster”? For example, it’s possible to have a Prometheus instance in each cluster and one central Prometheus to gather it all. Or we can scrape all the clusters with a single central Prometheus.


I’m actually a team member of Thanos. However, I don’t have any (financial) gains to be made here, unlike some other tooling whose posts merely exist to promote their product with weird comparisons. Cough. It starts with a V and ends with ictoria. Cough. So at some point I show a few options, which include Thanos. I only do this because I’m familiar with Thanos. There are alternatives such as:

and there are a lot of solutions which allow for remote write, including but not limited to:

Again, I wrote this blog to give people options and share my knowledge. So, without further ado, let’s start with option 1:

One cluster with Prometheus to scrape multiple clusters

For this, I would simply refer to the documentation of Prometheus itself, more specifically the kubernetes_sd_config section:

# The API server addresses. If left empty, Prometheus is assumed to run inside
# of the cluster and will discover API servers automatically and use the pod's
# CA certificate and bearer token file at /var/run/secrets/
[ api_server: <host> ]

The short story here is that we authenticate ourselves (again, there are various options) against an external cluster and, based on your configuration, you can discover services, nodes, pods and ingresses. Depending on further configuration, you can apply any other ‘rules’ as you would in your own cluster. Normally the Prometheus operator has already done that for you.
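To make that a bit more concrete, here is a minimal sketch of a scrape job that discovers pods in an external cluster. The API server URL, the file paths, the job name and the `cluster` label are all assumptions for illustration and not taken from a real setup. Keep in mind that discovery only tells Prometheus which targets exist; the discovered pod addresses still have to be reachable from the cluster Prometheus runs in.

```yaml
scrape_configs:
  - job_name: 'external-cluster-pods'   # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
        # Hypothetical API server address of the external cluster.
        api_server: https://external-cluster.example.com:6443
        # Credentials used for discovery against that API server.
        # Both paths are assumptions; mount them however you prefer.
        authorization:
          credentials_file: /etc/prometheus/secrets/external-cluster/token
        tls_config:
          ca_file: /etc/prometheus/secrets/external-cluster/ca.crt
    relabel_configs:
      # Only keep pods that carry the annotation prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Attach a static label so the series can be told apart per cluster.
      - target_label: cluster
        replacement: external-cluster
```

The `cluster` label is just one way to keep series from different clusters apart; without something like it, identical pods in two clusters are hard to distinguish in queries.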

My personal opinion on the pros and cons:




One cluster with Prometheus using federation to scrape other clusters

Honestly, I’m only writing this because it is an option, though I don’t really consider it a real solution. Obviously I’m biased by my experiences and this might work perfectly fine for you.

With federation you will need:

As per the example on the Prometheus website:

scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s

    honor_labels: true
    metrics_path: '/federate'

    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'

    static_configs:
      - targets:
        - 'source-prometheus-1:9090'
        - 'source-prometheus-2:9090'
        - 'source-prometheus-3:9090'

This means data is actually duplicated and we are now just scraping the external Prometheus for all its data.




One cluster with Prometheus using exposed metric endpoints in other clusters

As long as Prometheus can reach a metrics endpoint, it can scrape it. So if you only have a few endpoints you need to scrape (which is 99.9% of the time not the case), you could simply expose your /metrics endpoints via an ingress or whatever option you want.
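As a rough sketch, scraping such an exposed endpoint from the central Prometheus is a plain static scrape job. The hostname, job name and `cluster` label below are hypothetical, and I’m assuming the ingress terminates TLS on port 443.

```yaml
scrape_configs:
  - job_name: 'external-app'            # hypothetical job name
    scheme: https
    metrics_path: /metrics
    static_configs:
      - targets:
          # Hypothetical ingress hostname in the other cluster.
          - 'app.cluster-b.example.com:443'
        labels:
          # Static label to mark where these series come from.
          cluster: cluster-b
```

If you expose metrics this way, you probably also want some form of authentication in front of the endpoint, since it is now reachable from outside the cluster.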



One cluster with Thanos Querier and Thanos sidecars in other clusters

With this setup we leverage Thanos. We run a sidecar next to Prometheus in each external cluster. Now we can have a querier in our observer cluster and set each sidecar as a store. It will look like this:


As you can see, there is also an S3 bucket. This is for long term storage, which is optional and not required.
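To give an idea of the wiring, here is a sketch of the container arguments for the querier in the observer cluster. The sidecar hostnames are hypothetical, and I’m assuming each sidecar’s gRPC port (10901 by default) is reachable from the observer cluster. Recent Thanos releases use the `--endpoint` flag for this; older releases use `--store` instead, so check which one your version expects.

```yaml
# Fragment of a container spec for Thanos Query in the observer cluster.
containers:
  - name: thanos-query
    args:
      - query
      - --http-address=0.0.0.0:10902
      - --grpc-address=0.0.0.0:10901
      # One entry per external cluster; both hostnames are hypothetical.
      - --endpoint=thanos-sidecar.cluster-a.example.com:10901
      - --endpoint=thanos-sidecar.cluster-b.example.com:10901
```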


Cons / Neutral

One cluster with Thanos Querier and Thanos sidecars and query in other clusters

This option is nearly the same as above. Yet rather than connecting directly to the sidecars in the observed clusters, we add a Thanos Query component in each external cluster. This makes it easier, because you only have to place an ingress on your query component, which can then be made highly available with a load balancer.


I’m not going over the pros and cons here again; the only difference is how you ‘chain’ everything together.

External clusters use remote-write

We have discussed pull-based options, but it’s also possible to push metrics. This option is quite useful when it’s hard or impossible to allow externals to ‘enter’ your cluster. At that point, you could simply send your data.


In this example I have used Thanos receivers for this capability. However, Prometheus just has a generic remote write feature, and there is a complete list of possible options. Do check whether the option actually works for your use case.
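A minimal sketch of the sending side, assuming a Thanos receiver is exposed behind a hypothetical hostname. Thanos Receive accepts remote write requests on the `/api/v1/receive` path; the `cluster` external label is my own addition so the central side can tell the senders apart.

```yaml
# prometheus.yml in an external cluster (sketch).
global:
  external_labels:
    # Hypothetical label identifying the sending cluster.
    cluster: cluster-a

remote_write:
  # Hypothetical ingress hostname in front of Thanos Receive.
  - url: https://thanos-receive.observer.example.com/api/v1/receive
```

For any other remote write backend, the `url` changes but the shape of the configuration stays the same.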



Using Thanos but with long term storage

I’m including this option because you should think about the long-term vision for your metrics. What I personally like about Thanos is the way we can store data. We can limit the retention time on our Prometheuses and ‘push’ our data to an object store like S3.

It looks like this:


What happens is that Prometheus prepares its blocks and every 2 hours these get uploaded to S3. With the Thanos Store component we can now query this data on S3. For data newer than 2 hours, we still use a Query or Sidecar endpoint to get those metrics.
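The bucket itself is described in a small object storage configuration file that both the sidecar and the Store component read. A sketch, where the bucket name, endpoint and region are assumptions and credentials are left out on purpose:

```yaml
# Thanos object storage configuration (sketch).
type: S3
config:
  bucket: thanos-metrics                  # hypothetical bucket name
  endpoint: s3.eu-west-1.amazonaws.com    # assumed region endpoint
  region: eu-west-1
```

This file is passed to the components with the `--objstore.config-file` flag. On the Prometheus side, the retention itself is limited with `--storage.tsdb.retention.time`.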

The advantage is that we can leverage Thanos to have highly available, pluggable, long-term storage for our metrics, with the option to chain every component in such a way that we can have a central ‘observer’ cluster.

General tips you should think about

In the end there are a lot of options you can implement. There are, however, also many solutions that can be chained or plugged together, which makes it really neat to “do whatever you want”.

To end this post, I’ve drawn another example of a few possibilities with remote write!


