In the first part of our PromCon 2023 recap, we spent an episode of OpenObservability Talks exploring the Perses open source project. We learned that heavy users of open source Grafana were grappling with the problems of managing a vast number of dashboards, and with the need to manage dashboards as code in a GitOps fashion.
In this second part, I’d like to cover other noteworthy Prometheus updates around integration with OpenTelemetry, sharding of scraping and alerts, native histogram support and more. My guest on this episode was Augustin Husson, a Prometheus maintainer, as well as a principal engineer at Amadeus. Augustin guided us through the noteworthy developments and intriguing discussions that took place during the conference’s hallway track.
Furthermore, he shared exclusive insights from the DevDay, a post-PromCon gathering where Prometheus maintainers came together to shape the future strategy and roadmap of Prometheus, including the highlight of the Prometheus 3.0 release plan.
OpenTelemetry was the Talk of PromCon 2023
Augustin noted that OpenTelemetry was a theme that resonated across various sessions at this year’s PromCon. The enthusiasm surrounding OpenTelemetry was palpable, with numerous engaging talks shedding light on its increasing adoption.
What started primarily as a tool for traces has now expanded to encompass metrics as well. Interestingly, attendees expressed a strong desire to witness Prometheus serving as the backend for storing OpenTelemetry metrics, indicating a growing appetite among users for this collaboration.
One important step in this collaboration was the recently announced Prometheus support for receiving OTLP metrics. Other noteworthy advancements on this front include the Prometheus Java client (v1.0.0), which offers OpenTelemetry metrics and tracing support, and a new Arrow-based OTLP protocol, whose authors claimed impressive performance gains such as halving bandwidth usage and CPU costs. These claims certainly piqued the interest of attendees, adding an exciting dimension to the event.
This year’s conference also saw end users sharing practices of how they run OpenTelemetry and Prometheus, and actively contributing suggestions on how these two projects could collaborate effectively in production scenarios. These discussions delved into compatibility concerns at both the protocol and domain levels, adding depth and substance to the discourse.
Prometheus Operator, Scrape Sharding, Monitoring VMs Updates from PromCon
In the realm of Prometheus, a pivotal topic that garnered significant attention at this year’s event was the Prometheus Operator, a critical component for productization and operational efficiency. A significant update that stole the spotlight was the introduction of a new “scrape config” Custom Resource. This innovation marked a major shift, empowering users to monitor Virtual Machines (VMs) outside of Kubernetes—an endeavor once considered a nightmarish ordeal.
Prior to this development, monitoring VMs beyond Kubernetes posed formidable challenges, deterring many from exploring this avenue. However, with the introduction of scrape config, a world of possibilities unfolded. This enhancement not only simplified the process but also extended support to every discovery mechanism supported by Prometheus.
“It’s possible to support all discovery in Prometheus Operator,” Augustin says. “That’s a huge opportunity for a new contributor to start a new project in Prometheus, and in Prometheus Operator especially.”
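To make the scrape config idea more concrete, here is a minimal sketch of what a resource pointing at VMs outside the cluster might look like. It builds the manifest as a plain Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. It assumes the `ScrapeConfig` kind in the `monitoring.coreos.com/v1alpha1` API group with a `staticConfigs` list. The helper `build_scrape_config`, the resource name, the namespace, the label values and the target addresses are all hypothetical. Whether a given Prometheus instance picks the resource up depends on how that instance is configured to select scrape configs, so treat this as an illustration rather than a ready-to-use manifest.

```python
import json


def build_scrape_config(name, namespace, targets, target_labels=None):
    """Return a ScrapeConfig manifest (as a dict) with one static target group.

    Hypothetical helper for illustration; field names assume the
    monitoring.coreos.com/v1alpha1 ScrapeConfig custom resource.
    """
    static_config = {"targets": list(targets)}
    if target_labels:
        # Labels attached to every series scraped from these targets.
        static_config["labels"] = dict(target_labels)

    return {
        "apiVersion": "monitoring.coreos.com/v1alpha1",
        "kind": "ScrapeConfig",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"staticConfigs": [static_config]},
    }


# Example values are made up: two VMs running an exporter on port 9100.
manifest = build_scrape_config(
    name="external-vms",
    namespace="monitoring",
    targets=["10.0.0.11:9100", "10.0.0.12:9100"],
    target_labels={"environment": "legacy-vm"},
)

if __name__ == "__main__":
    # JSON output can be saved to a file and applied with kubectl.
    print(json.dumps(manifest, indent=2))
```

Static targets are only the simplest case. The same resource is where other discovery mechanisms would be configured, which is the opportunity Augustin points to for new contributors.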
Alert Sharding, Native Histogram Updates from PromCon
Furthermore, the conference touched upon the vital subject of alert rule sharding. The absence of robust alert rule sharding support had been a significant hurdle for many, leading to hesitancy in adopting Prometheus Operator. Augustin experienced this pain firsthand at Amadeus: “you don’t want to provide the Alertmanager instance to everyone, because that means in a single button you can shut down your data center monitoring, which is not super great.”
Shopify also shared this pain in their PromCon end-user talk, along with the internal solution they developed to overcome this hurdle. The onus is on the Prometheus community to make alert rule sharding an integral part of the open source project. I wonder if Amadeus and Shopify could collaborate on that one, to replace their individual home-grown solutions.
In the dynamic landscape of Prometheus, a significant stride was also taken with the integration of native histograms, a development that stirred excitement and anticipation within the community. The concept of native histograms was not entirely new, but its stability and its default activation were a milestone worth noting. Importantly, this does not signify the removal of the classic histogram, which stays in place to maintain backward compatibility for existing users.
For additional updates, check out this post.
Prometheus 3.0 Around the Corner: Updates from the Maintainers’ DevDay
After the trumpets ceased and all the PromCon attendees left, some of the real fun started: the maintainers got together the following day for their internal DevDay, to make the most important decisions on strategy and roadmap.
The highlight of DevDay was by far the decision on Prometheus 3.0, marking a significant milestone in the evolution of this powerful monitoring tool. The ambitious goal is to announce Prometheus 3.0 at KubeCon Europe 2024 in Paris. Let’s keep our fingers crossed.
This major version has been a topic of discussion and dreams within the community, and now, it’s finally becoming a reality. The team’s approach to this release emphasizes user continuity, assuring users that the transition to Prometheus 3.0 will be seamless without any breaking changes.
One of the standout features of Prometheus 3.0 is its revamped user interface. The maintainers envision a fresh, intuitive UI that integrates PromLens, a tool requested by users for its robust query building and debugging capabilities. The roadmap also includes a new Alertmanager UI based on React, after the maintainers realized that Elm, the language behind the existing UI, never gained traction as a common language.
Prometheus 3.0 also brings forth native support for OpenTelemetry, a significant step toward seamless integration and streamlined monitoring. Notably, this support will be activated by default, marking a stable foundation for users interested in leveraging OpenTelemetry metrics. Additionally, Prometheus aspires to become the default backend for storing OpenTelemetry metrics, a move that signals a new era for users desiring a singular OpenTelemetry-focused monitoring experience.
Want to learn more? Check out the latest OpenObservability Talks episode, PromCon Recap: Unveiling Perses and Prometheus Ecosystem Updates.