Observability Is More About Practices Than Tools | Justin Hartung – Qarik 

Guest: Justin Hartung (LinkedIn)
Company: Qarik Group (Twitter)
Show: TFiR: T3M

Observability is more than just buying an observability tool. Qarik helps companies develop cloud-native practices, such as using data to drive decisions. Contrary to what all the popular observability vendors would like you to believe, you don’t actually need observability software to start developing this practice.

In this episode of TFiR: T3M, Justin Hartung, Managing Partner at Qarik, shares his insights on how a data-driven culture enhances observability within enterprises.

Evolution of Observability:

  • Observability started out as a way to understand what went wrong during incident management. Then companies like Google and Netflix began building on microservices and developed distributed tracing, which lets every component send telemetry back to a central system to show how all those systems work together. Google, through its SRE practices, then moved from reactive to proactive measurement: service-level indicators (SLIs) signal that something is about to go south, so teams can intervene before it does.
  • These practices and research papers led some crafty vendors to rebrand monitoring as observability, i.e., understanding the state of a system based on its outputs or the data that it generates.
  • Today, there are still enterprises that are not embracing the data that is generated by their systems to help them make data-driven decisions.
  • More and more signals will be used to understand the full picture. Engineers and product managers are starting to take in FinOps data. For example, when optimizing outcomes for the user, what if a change brings in $5 more in revenue but costs $10 more to deliver? Without the FinOps data, companies would look only at the revenue drivers.
  • Observability will expand way beyond measuring the technical components. For example, with developer workflow, the number of times developers get interrupted will drive the cost of creating software.
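
The FinOps example above comes down to simple arithmetic, sketched here in Python. This is only an illustration: the `ChangeImpact` class, the `verdict` function, and the per-user framing are hypothetical names and assumptions, not anything described in the interview.

```python
from dataclasses import dataclass


@dataclass
class ChangeImpact:
    """Per-user effect of a product change (hypothetical model for illustration)."""
    revenue_delta: float  # extra dollars brought in per user
    cost_delta: float     # extra dollars spent per user (the FinOps signal)

    def net_delta(self) -> float:
        # Net effect once cost is set against revenue.
        return self.revenue_delta - self.cost_delta


def verdict(impact: ChangeImpact, use_finops: bool = True) -> str:
    """Return 'ship' or 'rethink' depending on which signals are considered."""
    value = impact.net_delta() if use_finops else impact.revenue_delta
    return "ship" if value > 0 else "rethink"


# The figures from the example: $5 more coming in, $10 more going out.
change = ChangeImpact(revenue_delta=5.0, cost_delta=10.0)
print(verdict(change, use_finops=False))  # decision from revenue drivers alone
print(verdict(change, use_finops=True))   # decision with cost data included
```

Looking only at the revenue side, the change appears worthwhile; once the cost signal is included, the same change loses $5 per user.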

Characteristics of Enterprises with a Data-Driven Culture:

  • Instead of shooting the messengers for bringing bad news based on data, company leaders embrace the incredible insight, fix it, and improve it. This creates a culture of wanting to learn more and using it to make decisions.
  • They nurture people at different levels to create dashboards and to use the systems to work with the data they have and the data they generate.
  • Developers are given the right visibility into the system, which makes it easy for them to operate it. There is some type of platform engineering that removes the cognitive overload from the developers. Otherwise, they will spend more time trying to figure out the system and less time on feature development.
  • There are tools and platforms that allow people to consume data in a meaningful way, while fostering the culture of embracing data.

On Observability and Generative AI:

  • AI has been used in a lot of tools for a while now, but Hartung admits that ChatGPT and the whole generative AI movement have taken him by surprise.
  • Companies in the observability space are leveraging the same techniques to understand correlation and causation.
  • The danger with generative AI is the lack of observability. If you let everyone start using AI, they’ll start going to non-sanctioned systems, start copying and pasting intellectual property into destination systems, and then information starts leaking. Companies today don’t have any observability on who’s using what, which can lead to an unknown attack vector and erosion of information security and intellectual property.

The Qarik advantage:

  • They are obsessed with data and firmly believe that if you can’t measure it, you can’t improve it.
  • They sit with their customers first, rather than do something for them. It’s about helping companies and the individuals at those companies understand what questions they should be asking, what data they should be gathering, and then how they use that to make a decision.
  • They instill the practice of asking questions, adding more data, and improving over time. That way, the company’s capabilities continue to evolve and grow beyond their engagement with Qarik.
  • They believe a company’s competitive advantage is not just its source code, but the combination of the systems it builds, how these systems work together, the customer experience, and the products that it assembles from those components.
  • They sit with engineers in teams to help them learn how to use data. They also work with the executives to understand how to set goals and how to reward people based on the data, rather than penalize people for having something that looks bad.
  • When you create a data-driven culture, you enable different use cases you didn’t think were possible.

This summary was written by Camille Gregory.