
Skymel NeuroSplit reduces inferencing compute costs by up to 60%


Skymel’s NeuroSplit adaptive inferencing technology helps enterprises developing AI-enabled applications reduce cloud server costs by up to 60%. In this clip, Sushant Tripathy, Co-Founder and CTO, and Neetu Pathak, Co-Founder and CEO of Skymel, discuss the challenges enterprises face and how Skymel’s technology helps solve them. Tripathy says, “With Skymel, you can scale horizontally using older generation GPUs, which would be a lot cheaper and that’s where essentially the cost savings come in.”

Introduction to Skymel’s adaptive inferencing technology and its target audience

  • Tripathy discusses the company’s adaptive inferencing technology for AI, which reduces cloud server costs by up to 60% by splitting inference tasks between the end-user device and the cloud.
  • Tripathy explains how Skymel’s technology allows enterprises to use older generation GPUs for deploying advanced AI models, which helps address the challenges of scarcity and high cost of the latest GPUs.
  • Skymel’s target audience is enterprises developing end-user AI-enabled applications. Tripathy tells us that although cloud service providers could also benefit from their technology, their key focus is on AI application providers.

Addressing AI Scalability and Cost Optimization Challenges with Skymel’s library

  • Tripathy discusses the challenges of leveraging older GPUs and cost optimization for AI applications, explaining that the primary issues are the cost of talent and scalability.
  • Tripathy talks about how Skymel addresses these challenges by allowing horizontal scaling with older, cheaper GPUs to reduce costs.
  • Skymel’s library dynamically allocates compute between end-user devices and cloud servers. This reduces response latency and lowers the compute required on backend servers, which makes it practical to run on older, cheaper GPUs.
  • Tripathy demonstrates an image tagging model, showing how their solutions maintain user experience while still cutting backend costs by up to 60%.

How NeuroSplit’s AI inference technology helps developers

  • Pathak explains how Skymel’s library, NeuroSplit, decides whether to run the entire model on the device or to split it between the device and the cloud, based on the idle compute available on the end-user device.
  • Pathak discusses how NeuroSplit helps developers prioritize latency, cost, accuracy, or privacy, optimizing the inference accordingly.
  • Tripathy highlights NeuroSplit’s fit in multi-cloud and hybrid cloud environments, explaining that when no idle device compute is available, it functions like a traditional cloud backend, sitting as an orchestration layer.
  • Skymel offers a low-touch developer experience: developers upload their models to the dashboard, which splits the models and provides an endpoint URL for integration. Tripathy explains the benefits for developers.
  • Tripathy discusses what sets Skymel apart from its competitors.
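
To make the device/cloud decision described above more concrete, here is a minimal illustrative sketch in Python. It is not Skymel’s API or algorithm: the names `Placement` and `choose_placement`, the `idle_compute` signal, and the simple proportional rule are all assumptions made for illustration. It only shows the three outcomes the summary mentions: the whole model on the device, a split between device and cloud, and everything on the cloud when the device has no idle compute.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Hypothetical result type: how many model layers run where."""
    device_layers: int  # layers executed on the end-user device
    cloud_layers: int   # layers executed on the cloud backend


def choose_placement(total_layers: int, idle_compute: float) -> Placement:
    """Pick a split point from the fraction of idle device compute.

    ASSUMPTION: `idle_compute` is a value in [0, 1] and the split is
    simply proportional to it. A real system would also weigh latency,
    cost, accuracy, and privacy priorities.
    """
    if total_layers < 1:
        raise ValueError("total_layers must be at least 1")
    if not 0.0 <= idle_compute <= 1.0:
        raise ValueError("idle_compute must be between 0 and 1")

    # Run the first `device_layers` layers locally, the rest in the cloud.
    device_layers = int(total_layers * idle_compute)
    return Placement(device_layers, total_layers - device_layers)


if __name__ == "__main__":
    for idle in (0.0, 0.5, 1.0):
        print(idle, choose_placement(10, idle))
```

With no idle compute the sketch sends every layer to the cloud, which mirrors the fallback to a traditional cloud backend that Tripathy describes.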

Skymel’s growth plans, including target audience and potential use cases 

  • Pathak discusses Skymel’s funding and growth plans following their emergence from stealth mode.
  • Skymel’s target users are those concerned with cost or user experience. Pathak highlights some of the company’s potential use cases including social media applications and real-time recommendation systems.
  • Pathak emphasizes Skymel’s ability to reduce costs while providing the best user experience and how this makes it suitable for scenarios that require hyper-personalization without extensive data handling.

Guests: Sushant Tripathy (LinkedIn) | Neetu Pathak (LinkedIn)
Company: Skymel (Twitter)
Show: Let’s Talk

This summary was written by Emily Nicholls.
