Observability is a quality of a system that refers to how well its internal state can be measured externally. As a computer program evolves into a full-blown production system, this quality becomes increasingly important. One way to make a software system more observable is to export metrics, that is, to report a quantitative description of the running system's state in some externally visible way. For instance, we can expose an HTTP endpoint that shows how many errors have occurred since the process started. In this post, we will explore how to build more observable Ent applications using Prometheus.
Ent is a simple yet powerful entity framework for Go that makes it easy to build and maintain applications with large data models.
Prometheus is an open source monitoring system developed by engineers at SoundCloud in 2012.
It includes an embedded time series database and many integrations to third-party systems.
The Prometheus client exposes the process's metrics via an HTTP endpoint (usually /metrics). The Prometheus scraper discovers this endpoint, polls it at a fixed interval (typically 30s), and writes the results into a time-series database.
Prometheus is just one example of a class of metric collection backends. Many others, such as AWS CloudWatch and InfluxDB, exist and are in wide use in the industry. Towards the end of this post, we will discuss a possible path to a unified, standards-based integration with any such backend.
To expose an application's metrics using Prometheus, we need to create a Prometheus Collector. A collector collects a set of metrics from your server.
In our example, we will be using two types of metrics that can be stored in a collector: Counters and Histograms. Counters are monotonically increasing cumulative metrics that represent how many times something has happened. They are commonly used to count the number of requests a server has processed or errors that have occurred. Histograms sample observations into buckets of configurable sizes and are commonly used to represent latency distributions (e.g., how many requests returned in under 5ms, 10ms, 100ms, 1s, etc.). In addition, Prometheus allows metrics to be broken down by labels. This is useful, for example, for counting requests while breaking the counter down by endpoint name.
Let’s see how to create such a collector using the official Go client. To do so, we will use a package in the client called promauto that simplifies the process of creating collectors. Here is a simple example of a collector that counts events (for example, total requests or the number of request errors):
Hooks are a feature of Ent that allows adding custom logic before and after operations that change the data entities.
A mutation is an operation that changes something in the database. There are five types of mutations: Create, UpdateOne, Update, DeleteOne, and Delete.
In Ent, there are two types of mutation hooks - schema hooks and runtime hooks. Schema hooks are mainly used for defining custom mutation logic on a specific entity type, for example, syncing entity creation to another system. Runtime hooks, on the other hand, are used to define more global logic for adding things like logging, metrics, tracing, etc.
For our use case, runtime hooks are the right choice, because the metrics are only valuable if they cover all operations on all entity types:
With all of the introductions complete, let’s cut to the chase and show how to use Prometheus and Ent hooks together to create an observable application. Our goal with this example is to export these metrics using a hook:
|Name|Description|
|---|---|
|ent_operation_total|Number of ent mutation operations|
|ent_operation_error|Number of failed ent mutation operations|
|ent_operation_duration_seconds|Time in seconds per operation|
Each of these metrics will be broken down by labels into two dimensions:
mutation_type: Entity type that is being mutated (User, BlogPost, Account etc.).
mutation_op: The operation that is being performed (Create, Delete etc.).
Let’s start by defining our collectors:
Next, let’s define our new hook:
After defining our hook, let’s see next how to connect it to our application and how to use Prometheus to serve an endpoint that exposes the metrics in our collectors:
After accessing / on our server a few times (using curl or a browser), go to /metrics. There you will see the output from the Prometheus client:
In the top part, we can see the histogram, which reports the number of operations in each “bucket”. After that, we can see the total number of operations and the number of errors. Each metric is accompanied by its description, which is also shown when querying with the Prometheus dashboard.
The Prometheus client is only one component of the Prometheus architecture. To run a complete system, including a scraper that polls your endpoint, a Prometheus server that stores your metrics and answers queries, and a simple UI to interact with it, I recommend reading the official documentation or using the docker-compose.yaml in this example repo.
As we’ve mentioned above, there is an abundance of metric collection backends available today, Prometheus being just one of many successful projects. While these solutions differ in many dimensions (self-hosted vs. SaaS, different storage engines with different query languages, and more), from the perspective of the metric reporting client they are virtually identical.
In cases like these, good software engineering principles suggest that the concrete backend should be abstracted away from the client using an interface. This interface can then be implemented by backends so client applications can easily switch between the different implementations. Such changes have been happening in our industry in recent years. Consider, for example, the Open Container Initiative or the Service Mesh Interface: both are initiatives that strive to define a standard interface for a problem space, with the aim of creating an ecosystem of implementations of the standard. In the observability space, the same convergence is occurring, with OpenCensus and OpenTracing currently merging into OpenTelemetry.
As nice as it would be to publish an Ent + Prometheus extension similar to the one presented in this post, we are firm believers that observability should be solved with a standards-based approach. We invite everyone to join the discussion on what is the right way to do this for Ent.
We started this post by presenting Prometheus, a popular open-source monitoring solution. Next, we reviewed “Hooks”, a feature of Ent that allows adding custom logic before and after operations that change the data entities. We then showed how to integrate the two to create observable applications using Ent. Finally, we discussed the future of observability in Ent and invited everyone to join the discussion to shape it.
Have questions? Need help with getting started? Feel free to join our Slack channel.