Although enterprises are building a lot of software, they do not always have a clear gauge of how well it is running. A lack of collaboration between teams can make this even more challenging, and this is where site reliability engineering (SRE) comes into play. Yet there are often misconceptions about the role and how SREs can be used to their full potential within organizations.
In this episode of TFiR: T3M, RackN CEO Rob Hirschfeld talks about the role of SREs and where they fit alongside other roles and within the organization as a whole. He goes on to talk about some of the things organizations get wrong with SREs and how RackN is helping its enterprise customers.
Key highlights from this video interview are:
- Hirschfeld talks about Google’s definition of SRE, in which a team of very advanced engineers works at a system level to improve the reliability of their applications. He goes into detail about what the role can look like in terms of making applications resilient and robust and improving observability.
- SRE teams run the code the company is trying to run and work to improve the performance and stability of that code. Hirschfeld discusses how the skill set differs from what has traditionally come under DevOps and what this means from a hiring perspective.
- Hirschfeld tells us they had expected to see more of an intersection between SREs, IaC, and DevOps, but this has not been the case. He also talks about how platform engineering differs from SRE and where the two fit together when it comes to working on infrastructure as code.
- Hirschfeld feels that organizations bringing in SRE teams may find their expectations are not totally aligned if they expect the SREs to focus a lot on the infrastructure. He talks about how the role is very narrowly focused on the application side, and what this means for observability.
- While you will always have specialization in specific fields, Hirschfeld believes that IT infrastructure is more siloed now than in the past. He sees organizations putting SREs in individual teams but not creating an environment conducive to collaboration. He believes we need to work out how to have better sharing within organizations.
- Hirschfeld advises that if you are hiring someone for the SRE role, you should be aware that it has a tightly constrained scope of work. Expanding that scope would make them more of an Ops engineer or DevOps engineer, whose focus would be more on maintaining systems and infrastructure.
- Looking at the tools someone uses can be a helpful indicator of what their specialty is and the mix or make-up of their role. While it does not mean they will not use other tools outside of their specialty, it can help you see where their focus lies.
- Hirschfeld says that although their enterprise customers have built a lot of software, they do not always understand whether it is running well. This is where SRE can be brought in, and Hirschfeld takes us through some of the scenarios where SRE is being applied within organizations and the benefits it can bring.
- RackN helps customers implement infrastructure pipelines so that they can take a piece of infrastructure and transform it into a production state through a repeatable, consistent process into which standards can be injected. Hirschfeld discusses the benefits for customers of taking this approach.
This summary was written by Emily Nicholls.