Source: Cloudprober: open source black-box monitoring software from Open Source
Ever wonder if users can actually access your microservices? Seeing timeouts in your applications and not sure whether it’s the network or your servers being too busy? Curious about the 99th-percentile network latency between your on-premises data center and services running in the cloud?
Cloudprober, which we open sourced last year, answers questions like these and more. It’s black-box monitoring software that “probes” your systems and services and generates metrics based on probe results. This kind of monitoring strategy doesn’t make assumptions about how your service is implemented and it works at the same layer as your service’s users. You can make changes to your service’s implementation with peace of mind, knowing you’ll notice if a change prevents users from accessing the service.
A probe can be anything: a ping, an HTTP request, or even a custom program that mimics how your services are consumed (for example, creating and accessing a blog post). Cloudprober builds and exports standard metrics, and provides a way to easily integrate them with your existing monitoring stack, such as Prometheus–Grafana, Stackdriver and soon InfluxDB. Cloudprober is written in Go and works on all major platforms: Linux, Mac OS, and Windows. It’s released as a static binary as well as a Docker image.
Here’s an example probe config that runs an HTTP probe against your forwarding rules and exports data to Stackdriver and Prometheus:
# Probe all forwarding rules that contain web-fr in their name.
# Export data to Stackdriver
# Prometheus exporter
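The three comments above correspond to one probe block and two surfacer blocks (a surfacer is Cloudprober's name for a data export channel). As a rough sketch of what such a config might look like, assuming Cloudprober's protobuf text format: the probe name, regex, port, interval, and timeout below are illustrative assumptions, not values taken from the original config.

```textproto
# Sketch only: the name, regex, port, and timings are hypothetical.
probe {
  name: "web-fr-http"
  type: HTTP
  targets {
    # Probe all forwarding rules that contain web-fr in their name.
    gce_targets {
      forwarding_rules {}
    }
    regex: ".*web-fr.*"
  }
  interval_msec: 5000  # run the probe every 5 seconds
  timeout_msec: 1000   # treat a request as failed after 1 second
  http_probe {
    port: 8080
  }
}

# Export data to Stackdriver
surfacer {
  type: STACKDRIVER
}

# Prometheus exporter
surfacer {
  type: PROMETHEUS
}
```

Each surfacer block adds one export channel, so removing or adding a block should be enough to change where the metrics go without touching the probe definition.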
Run Cloudprober with this config from the command line:
./cloudprober --config_file $HOME/cloudprober/cloudprober.cfg
This example probe config highlights two major features of Cloudprober: automatic, continuous discovery of cloud targets, and data export over multiple channels (Stackdriver and Prometheus in this case). Cloud deployments are dynamic and change often. Cloudprober’s dynamic target discovery means you have one less thing to worry about when making minor infrastructure changes. Data export in various formats helps it integrate well with your existing monitoring setup.
Other features include:
Though most of the cloud support is specific to Google Cloud Platform (GCP), it’s easy to add support for other providers. Cloudprober has an extensible architecture so you can add new types of targets, probes and monitoring backends.
Cloudprober was built by the Cloud Networking Site Reliability Engineering (SRE) team at Google to monitor network availability and associated features. Today, it’s used by several other Google Cloud SRE teams as well.
By Manu Garg, Cloud Networking Team