Guest: Roman Kharkovski
Company: Qarik Group
Show: Let’s Talk

Recent reports, including Flexera’s 2023 State of the Cloud Report, show that managing cloud costs is the number one challenge for small, medium, and large organizations alike. FinOps grew out of the realization that companies need to understand and control these costs.

In this episode of TFiR: Let’s Talk, Qarik Principal Architect Roman Kharkovski shares his insights on the FinOps practice and principles and how Qarik is equipped to help companies implement them.

Current trends in the market:

  • Kharkovski thinks 99% of organizations don’t take cloud cost into account. Overuse or improper allocation of resources is common. Sometimes, product managers don’t even know how much their product costs the organization in cloud consumption.
  • According to Google Cloud’s State of Kubernetes Cost Optimization Report, one out of 10 GKE clusters is completely idle at any given point in time. Many clusters are also over-provisioned: 30% of clusters allocate 35 times more resources than they consume.
  • The driving factor behind rising costs is the lack of incentives for engineering and product teams to control them. If a company’s priority is to deliver features and functions as fast as possible, then that is what its teams optimize for. Product design documents rarely include cost control and cost optimization.
  • Generally, the most effective FinOps implementation is a centralized, cross-functional team that executes the practice, instills the culture, and runs the training, planning, and ongoing monitoring.
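The idle and over-provisioned clusters mentioned above can be flagged with a simple ratio of requested to consumed resources. The following is a minimal sketch, not a method described in the episode or the Google Cloud report: the `ClusterUsage` type, the `classify` function, and the idle threshold are hypothetical, and the default ratio of 35 simply reuses the figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class ClusterUsage:
    """Hypothetical usage record for one cluster."""
    name: str
    requested_cpu: float  # vCPUs requested by workloads
    used_cpu: float       # average vCPUs actually consumed


def classify(cluster: ClusterUsage,
             idle_threshold: float = 0.01,
             overprovision_ratio: float = 35.0) -> str:
    """Label a cluster as idle, over-provisioned, or ok.

    Both thresholds are illustrative assumptions and should be tuned.
    """
    if cluster.used_cpu <= idle_threshold:
        return "idle"
    if cluster.requested_cpu / cluster.used_cpu >= overprovision_ratio:
        return "over-provisioned"
    return "ok"


if __name__ == "__main__":
    fleet = [
        ClusterUsage("batch", requested_cpu=10.0, used_cpu=0.0),
        ClusterUsage("web", requested_cpu=70.0, used_cpu=2.0),
        ClusterUsage("api", requested_cpu=4.0, used_cpu=2.0),
    ]
    for c in fleet:
        print(c.name, classify(c))
```

In practice the usage numbers would come from a monitoring or billing export rather than hard-coded records.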

Key FinOps principles:

  • Collaboration between teams (engineering, finance, business units, and product managers). Together, they need to design the incentives, OKRs, and KPIs used to measure achievements.
  • Visibility into your cloud consumption. To achieve that, you need to provide easy access to the cost structure in near real-time.
  • Ongoing cost optimization, i.e., shifting cost decisions left, optimizing rates, rightsizing virtual machines, and monitoring continuously.

Challenges companies face when it comes to FinOps:

  • Resistance to change. Everybody likes to do what they’re used to doing. Solution: Give proper incentives. If a team has an incentive to reduce the cost of its project, API, or component, it will try to do so.
  • Difficulty in getting proper cost attribution. To know exactly how much of the organization’s cloud bill a particular component, API, or entire product consumes, resources must be labeled and tagged at the cloud level. Very often this is done manually, which makes labels and tags inaccurate or inconsistent. Solution: Provide automated tooling.
  • Lack of visibility. If engineers and product managers cannot see up-to-date information about consumption, it is difficult for them to make cost-based decisions. Solution: Provide frictionless access to cost data.
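To illustrate the kind of automated tooling that can help with cost attribution, here is a minimal sketch that checks resources for required labels and rolls up cost per product. It is an assumption-laden example, not Qarik’s tooling: the required label keys, the resource record layout, and the function names are all hypothetical.

```python
# Hypothetical labeling policy: every resource must carry these keys.
REQUIRED_LABELS = ("team", "product", "environment")


def normalize_labels(labels: dict) -> dict:
    """Lower-case and trim keys and values so 'Product' and 'product' match."""
    return {k.strip().lower(): v.strip().lower() for k, v in labels.items()}


def missing_labels(labels: dict, required=REQUIRED_LABELS) -> list:
    """Return the required label keys that are absent or empty."""
    normalized = normalize_labels(labels)
    return [key for key in required if not normalized.get(key)]


def attribute_costs(resources: list) -> dict:
    """Sum cost per product; unlabeled spend goes to 'unattributed'."""
    totals = {}
    for resource in resources:
        labels = normalize_labels(resource["labels"])
        owner = labels.get("product") or "unattributed"
        totals[owner] = totals.get(owner, 0.0) + resource["cost"]
    return totals


if __name__ == "__main__":
    # Made-up records standing in for a billing export.
    resources = [
        {"labels": {"Product": "Checkout"}, "cost": 10.0},
        {"labels": {"product": "checkout"}, "cost": 5.0},
        {"labels": {}, "cost": 2.5},
    ]
    print(attribute_costs(resources))
    print(missing_labels({"Team": "Payments ", "product": "checkout"}))
```

Tracking the size of the "unattributed" bucket over time is one simple way to see whether labeling discipline is improving.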

The FinOps culture is very important, but it does not happen overnight. You need to:

  • explain to people why it matters, what it means to them, and what it means to the organization.
  • train people on proper practices and reinforce them.
  • give incentives.
  • provide visibility and promote the best practices and lessons learned.

The Qarik advantage when it comes to FinOps:

  • It has engineers and principal architects who have significant experience with Google Cloud, including some ex-Googlers.
  • It has FinOps practitioners, including Kharkovski, who obtained his FinOps Certified Practitioner (FOCP) certification from The Linux Foundation.
  • The company has a comprehensive program to help customers establish their FinOps organization, create the charter of responsibilities, create a FinOps plan, implement FinOps practices, use the tools provided by Google Cloud, build Looker dashboards for cost visibility, implement cost optimization, architect applications in a cost-effective way, optimize storage, network, and compute costs, and carry out ongoing monitoring and improvements.

This summary was written by Camille Gregory.
