Rookout’s Snapshots Aims To Help Developers With Application Observability


Guest: Liran Haimovitch
Company: Rookout

Developers deal with bugs daily, but in the complex cloud-native world it can be particularly difficult to get a clear idea of what is going on. While there is great value in metrics, logs, and traces, each has its limits. Rookout believes there is room for a powerful component such as ‘Snapshots’ that gives developers more control.

In this episode of TFiR: Newsroom, Swapnil Bhartiya sits down with Liran Haimovitch, CTO and Co-Founder at Rookout, to talk about the company’s new Snapshots feature. Haimovitch explains how Snapshots can empower developers and give them a clear view of the state of the application, which is particularly helpful when dealing with bugs. He also discusses the benefits of Snapshots and the shift toward developer-led observability.

Key highlights from the video interview are:

  • Rookout has announced Snapshots as a standalone feature, which it is calling the fourth pillar of observability. It allows software engineers to capture the state of the application with a single line of code or a single click of the mouse, and to see the exact variable values or the stack traces. Haimovitch goes into detail about the new feature.
  • Snapshots have existed in various forms for decades. Logs have also been around for decades, but they do not scale well in some advanced use cases, so metrics and then traces were invented to address that problem. Haimovitch discusses how these provide a big-picture view, but with the shift-left movement developers need accuracy, and that is where Snapshots come in.
  • Haimovitch goes into depth about the challenges developers face when writing log lines for an error, as they often struggle to know what to include. Error messages can end up minimal and lackluster, whereas Snapshots let you say, while writing the code, "take a snapshot here," and instantly get the full state of the application with all the local variables and stack traces.
  • Developers are fixing bugs every day, and Snapshots can help them zoom in on the code they are trying to investigate and fix. Haimovitch explains that it also helps with troubleshooting and onboarding. He talks about the effect cloud-native computing has had on debugging.
  • Haimovitch does not know what the future holds for observability. People have been asking about observability for AI and how software engineering will be done amid these paradigm shifts. He is unsure whether the current direction of observability is the best way forward.
  • The biggest gap Haimovitch sees among customers is engineers’ need to zoom in. He explains how Snapshots help them gain a comprehensive picture, letting software engineers see their code clearly and troubleshoot anything that goes wrong or behaves unexpectedly.
  • Haimovitch believes that Snapshots will complement things like logs and metrics rather than replace them. However, he feels that the use of logs may decrease or be shifted from hot storage to cold storage because of the increased use of Snapshots and the move toward dynamic observability. He explains why this is a better approach. 
  • Logging is very expensive because of the engineering effort spent setting up the logs initially and then optimizing them by adding variables. Haimovitch feels that observability needs to be more reactive and explains why it is more effective for developers to specify in real time the data they need.
  • Rookout started four years ago with a tool that allowed you to set a non-breaking breakpoint anywhere in the code, which could be turned into a Snapshot, log, metric, or trace. Haimovitch discusses how it was predominantly used for Snapshots, though, and how this led the company to realize that it needed to create a separate Snapshots feature.
  • Haimovitch shares a use case: Rookout has added the ability for the agent in the application to automatically take Snapshots when interesting things happen, such as when tests fail.
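
To make the idea of capturing application state "with a single line of code" more concrete, here is a minimal Python sketch using only the standard library. It is not Rookout's API or implementation; the names `Snapshot`, `take_snapshot`, `SNAPSHOTS`, and `apply_discount` are hypothetical and exist only for illustration. The sketch records the calling function's local variables and call stack at the point where it is invoked.

```python
import inspect
import traceback
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Snapshot:
    """Hypothetical container for captured state (not Rookout's data model)."""
    function: str               # name of the function where the snapshot was taken
    line: int                   # line number inside that function
    local_vars: Dict[str, str]  # repr() of every local variable
    stack: List[str]            # formatted call stack, outermost frame first


# Hypothetical in-memory store; a real tool would ship snapshots elsewhere.
SNAPSHOTS: List[Snapshot] = []


def take_snapshot() -> Snapshot:
    """Capture the caller's local variables and call stack (CPython only)."""
    caller = inspect.currentframe().f_back
    return Snapshot(
        function=caller.f_code.co_name,
        line=caller.f_lineno,
        local_vars={name: repr(value) for name, value in caller.f_locals.items()},
        stack=traceback.format_stack(caller),
    )


def apply_discount(price: float, percent: float) -> float:
    """Example business function with one snapshot line on a suspicious path."""
    discounted = price * (1 - percent / 100)
    if discounted < 0:
        SNAPSHOTS.append(take_snapshot())  # the "single line of code"
    return discounted


if __name__ == "__main__":
    apply_discount(10.0, 150.0)
    snap = SNAPSHOTS[-1]
    print(snap.function, snap.local_vars)
    print("".join(snap.stack))
```

Compared with a hand-written log line, the developer does not have to decide in advance which variables to include: every local is captured along with the stack. The same hypothetical `take_snapshot()` call could be placed in a test failure hook to approximate the automatic capture described in the last bullet.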

This summary was written by Emily Nicholls.