How Kubernetes 1.36 Handles GPU Scheduling, DRA, and Kubelet Security | Ryota Sawada, Kubernetes | TFiR

Kubernetes 1.36 adds native GPU scheduling via Workload Aware Scheduling and DRA, plus stable fine-grained Kubelet authorization. Ryota Sawada, Release Lead, explains what changed.

By Monika Chauhan May 27, 2026

AI and ML workloads demand coordinated GPU scheduling, fine-grained resource allocation, and tight authorization controls that Kubernetes has not historically provided natively. Platform teams have been absorbing that gap with community plugins, custom controllers, and overly permissive Kubelet access. Those workarounds carry real operational and security debt.

In this interview on TFiR, Kubernetes 1.36 Release Lead Ryota Sawada covers the new Workload Aware Scheduling APIs, Dynamic Resource Allocation admin access, fine-grained Kubelet API authorization at stable, the beta graduation feedback process, and the structure and naming of the Haru release.

Guest: Ryota Sawada, Kubernetes 1.36 Release Lead
Show: TFiR

Here is what every platform engineer and cluster administrator needs to know.

Technical Deep Dive

Q: What is Workload Aware Scheduling in Kubernetes 1.36 and why was it added?

Kubernetes 1.36 Release Lead Ryota Sawada explains that Workload Aware Scheduling, abbreviated WAS, is a new scheduling capability introduced to address the demands of AI and ML workloads that require coordinated pod execution. Prior to this feature, teams relied on community solutions to schedule groups of pods simultaneously. Gang scheduling support was introduced in 1.35, and 1.36 extends it by splitting the scheduling template into two distinct APIs: Workload and PodGroup.

“The workload aware scheduling breaks that template part into workload and the actual runtime object into PodGroup, and that clear separation gives us even further clear connection point for the DRA.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes
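The all-or-nothing idea behind gang scheduling can be sketched in a few lines of Python. This is a conceptual model only, not the Kubernetes scheduler or the 1.36 Workload and PodGroup API shapes: the GangRequest class, the gang_schedule function, and the node names are hypothetical, and GPUs are reduced to a simple per-node counter.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class GangRequest:
    """Hypothetical stand-in for a group of pods that must start together."""
    name: str
    gpus_per_pod: int
    min_count: int  # number of pods that must all be placeable


def gang_schedule(group: GangRequest, free_gpus: dict[str, int]) -> dict[str, str] | None:
    """Place all min_count pods or none of them.

    free_gpus maps node name to free GPU count. Returns a pod-to-node
    mapping on success, or None (leaving free_gpus untouched) on failure.
    """
    remaining = dict(free_gpus)  # work on a copy so a failed attempt has no side effects
    placement: dict[str, str] = {}
    for i in range(group.min_count):
        # First-fit: pick the first node that still has enough free GPUs.
        node = next((n for n, free in remaining.items() if free >= group.gpus_per_pod), None)
        if node is None:
            return None  # one pod cannot fit, so nothing is scheduled
        remaining[node] -= group.gpus_per_pod
        placement[f"{group.name}-{i}"] = node
    free_gpus.update(remaining)  # commit only after the whole gang fits
    return placement
```

Scheduling pods one at a time would happily place the first pod of a job that can never fully start; the copy-then-commit step here is what keeps a partial placement from holding GPUs that other jobs could use.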

Q: How do the new Workload and PodGroup APIs connect to Dynamic Resource Allocation for GPU workloads?

Sawada explains that Dynamic Resource Allocation, DRA, handles the GPU and peripheral resources needed for ML inference and LLM tooling on Kubernetes. The separation of the Workload and PodGroup APIs in 1.36 creates a clear connection point between the scheduling layer and the DRA resource claims. PodGroup now understands DRA resource claims directly, which means GPU resources and the compute that must run alongside them can be managed together through native Kubernetes primitives rather than external tooling.

“Now that we can combine those powers from DRA, making sure that the resource claim is something that the PodGroup understands and PodGroup understands the DRA, so that gives us the full control of how we can actually tackle the AI ML solutions that we need.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes

Q: What security problem does fine-grained Kubelet API authorization solve in multi-tenant clusters?

Sawada describes the core problem as over-entitlement in observability tooling. Before 1.36, authorization at the Kubelet level was coarse-grained, meaning an observability stack granted read access to Kubelet data could also potentially exec into containers or exercise administrative controls it did not need. With fine-grained Kubelet API authorization now at general availability, operators can restrict a specific observability component to only health check access, with all other Kubelet sub-resources managed under separate authorization controls.

“This gives us a new sub resource saying that we do want to just have the health check and that’s the only thing that this particular observability stack needs, and everything else related to more control is going to be managed by a separate sub resource.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes
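As a rough illustration of the health-check-only case Sawada describes, the sketch below builds an RBAC ClusterRole that grants read access to the kubelet health endpoint and nothing else. It assumes the subresource names used by the fine-grained kubelet authorization work (nodes/healthz, alongside nodes/pods and nodes/configz, with nodes/proxy as the older broad permission); the role name and helper functions are hypothetical, and the exact behavior should be checked against the documentation for the cluster version in use.

```python
import json

# Assumed subresource names: "nodes/proxy" is the older catch-all kubelet
# permission; "nodes/healthz" is one of the narrower subresources.
BROAD_RESOURCES = {"nodes/proxy", "nodes/*", "*"}


def health_only_cluster_role(name: str) -> dict:
    """Build a ClusterRole that can only read the kubelet health endpoint."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {"apiGroups": [""], "resources": ["nodes/healthz"], "verbs": ["get"]},
        ],
    }


def grants_broad_access(role: dict) -> bool:
    """Return True if any rule includes a catch-all kubelet permission."""
    return any(BROAD_RESOURCES & set(rule.get("resources", [])) for rule in role["rules"])


if __name__ == "__main__":
    # "kubelet-health-reader" is a hypothetical role name.
    role = health_only_cluster_role("kubelet-health-reader")
    assert not grants_broad_access(role)
    print(json.dumps(role, indent=2))
```

Because kubectl accepts JSON manifests, the printed object could be reviewed and applied like any other ClusterRole, then bound to the observability stack's service account in place of a nodes/proxy grant.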

Q: What DRA changes in Kubernetes 1.36 improve day-to-day cluster administration?

Sawada highlights the addition of administrative access to DRA resources as a notable improvement for cluster operators. Resources that have been claimed by a pod or are actively running can now be accessed by administrators directly through Kubernetes for cleanup, inspection, or troubleshooting tasks. Previously, operators had to embed workaround logic or rely on third-party solutions. SELinux integration is also cited as part of the broader set of security and control improvements in this release.

“Kubernetes natively supporting such DRA control and other controls, such as the security aspects of SELinux being part of it, so that you have more control without actually relying on something third party or having your own control.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes
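For readers who want a feel for what an admin access request might look like, here is a hedged sketch that builds the two objects involved as plain Python dictionaries. The field layout (an adminAccess flag under the request's exactly block, and a namespace label of resource.kubernetes.io/admin-access) reflects the resource.k8s.io/v1 API as we understand it and is an assumption to verify against your cluster version; the object names and the device class gpu.example.com are hypothetical.

```python
import json


def admin_namespace(name: str) -> dict:
    """Namespace labeled to permit admin-access claims (label name assumed)."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {"resource.kubernetes.io/admin-access": "true"},
        },
    }


def admin_access_claim(name: str, namespace: str, device_class: str) -> dict:
    """ResourceClaim asking for admin access to every device in a class.

    Admin access is meant for inspection and cleanup of devices that may
    already be in use by running workloads.
    """
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "devices": {
                "requests": [
                    {
                        "name": "inspect",
                        "exactly": {
                            "deviceClassName": device_class,
                            "allocationMode": "All",
                            "adminAccess": True,
                        },
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    print(json.dumps(admin_namespace("gpu-admin"), indent=2))
    print(json.dumps(admin_access_claim("gpu-inspect", "gpu-admin", "gpu.example.com"), indent=2))
```

Scoping the capability to a labeled namespace keeps ordinary tenants from requesting admin access to devices that other workloads are using.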

Q: How does Kubernetes 1.36 balance shipping new features with maintaining production stability?

Sawada explains that a release cycle usually runs 15 weeks or more, during which the release team continuously validates test cases and test scenarios. Features reaching stable (GA) must meet strict documentation and testing criteria set by the release team in coordination with contributing SIGs. The expectation for stable features is that they are production-grade and that the API will not require further breaking changes. Of the 71 enhancements in Kubernetes 1.36, which Sawada describes as the largest number of KEPs in a single release, about 18 reach stable, with contributions from SIG Node, SIG Network, and DRA-focused groups among others.

“Hitting the GA stable feature is definitely a challenge for many KEPs, but we have so many contributions from all the different SIGs, and it is a bar that’s set really high and it has been that case for many releases in the past, but it’s not slowing down.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes

Q: What is the difference between alpha and beta in Kubernetes feature graduation, and why does it matter?

Sawada describes beta as a significant threshold because features at that stage are usually, though not always, enabled by default, so users can test them without first turning on a feature gate. Alpha features require deliberate operator action to activate, which limits the breadth of real-world testing. Beta features reach a much wider range of clusters and workloads, generating the community signal that informs final refinements before graduation to general availability. As an example, he notes that Workload Aware Scheduling is still alpha in 1.36.

“Beta usually means that it’s enabled by default and that gives us the users the chance to actually test it and see if it really has any benefit or if it may have any problems with the running workloads that are already in your cluster.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes
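The practical difference can be modeled with a small Python sketch. It is a simplified illustration, not Kubernetes source code: it assumes the usual defaults (alpha off, beta on, stable always on), which Sawada himself qualifies with "usually", and the gate name ExampleAlphaFeature is hypothetical. The only real interface referenced is the --feature-gates flag accepted by Kubernetes components.

```python
from __future__ import annotations

# Typical default state per maturity stage (an assumption; individual
# features can deviate, especially at beta).
DEFAULTS = {"alpha": False, "beta": True, "stable": True}


def gate_enabled(stage: str, override: bool | None = None) -> bool:
    """Return whether a feature is active given its stage and an optional override."""
    if stage == "stable":
        return True  # modeled as always on; no operator action changes it
    return DEFAULTS[stage] if override is None else override


def feature_gates_flag(overrides: dict[str, bool]) -> str:
    """Render explicit overrides in the --feature-gates=Name=true,... form."""
    pairs = (f"{name}={'true' if on else 'false'}" for name, on in sorted(overrides.items()))
    return "--feature-gates=" + ",".join(pairs)


if __name__ == "__main__":
    # An alpha feature does nothing until an operator opts in.
    print(gate_enabled("alpha"))                      # False
    print(gate_enabled("alpha", override=True))       # True
    print(gate_enabled("beta"))                       # True
    print(feature_gates_flag({"ExampleAlphaFeature": True}))
```

The model makes the feedback argument concrete: an alpha feature is exercised only by clusters whose operators set an override, while a default-on beta feature is exercised by every cluster that upgrades.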

Q: How do community feedback loops from beta features shape stable GA releases in Kubernetes?

Sawada explains that the beta phase is designed to surface signal from a broader set of use cases than any single team or contributor can anticipate. When a feature is enabled by default, clusters across different organizations and workload types encounter it in real conditions, revealing edge cases and enterprise blind spots that development teams may not encounter in isolation. That feedback is incorporated before the feature graduates to stable, ensuring the GA release reflects actual production requirements across a wide user base.

“The beta is a significant one that makes sure that we can support more use cases and more, get more signal from the community and fine tune it in a way that general available solution would take more advice and more inputs from the community.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes

Q: What is the Kubernetes 1.36 release name and what does it mean?

Sawada named the release “Haru,” a Japanese word with multiple layered meanings including spring, a clear sunny sky, and the sense of looking toward a distant horizon. The release coincides with a spring timeframe and the name was chosen to reflect both the scope of what 1.36 delivers and the forward momentum of the Kubernetes project. The previous release, 1.35, was named Timbernetes and was led by Drew Hagen.

“That name Haru is to give us the chance to soar into the clear sky and maybe look at the tomorrow’s dawn and further future ahead.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes

Q: Where can practitioners find detailed technical documentation for each Kubernetes 1.36 enhancement?

Sawada points practitioners to the Kubernetes blog, where the official 1.36 release announcement is published. Alongside the release announcement, the team publishes individual feature blogs, each focused on a specific Kubernetes Enhancement Proposal or group of related KEPs, providing technical background, history, and implementation details. Sawada states these feature blogs are scheduled to publish daily in the period following the release, covering the full scope of what Kubernetes 1.36 enables.

“There will be one published every day, that’s the schedule that we are looking at, so there will be a lot more coming and a lot more clarity on how that 1.36 Kubernetes release is going to affect you or have a significant impact in some cases.” — Ryota Sawada, Kubernetes 1.36 Release Lead, Kubernetes

Resources & Documentation

Kubernetes Blog, official release announcements and per-KEP feature blogs for Kubernetes 1.36
Kubernetes Scheduling and Eviction Documentation, reference documentation covering scheduling concepts including gang scheduling and resource allocation
Dynamic Resource Allocation (DRA), Kubernetes documentation on DRA for GPU and peripheral resource management
Kubernetes Enhancements Repository (KEPs), full index of Kubernetes Enhancement Proposals including those in 1.36

***

Full Raw Transcript

Swapnil Bhartiya: Kubernetes doesn’t stand still, and that’s exactly why it keeps leading. It’s one of the biggest open source projects out there, along with the Linux kernel. And now we have version 1.36, which also proves that the project continues to deliver. Here we have another wave of features across stability, security and, obviously, scale. Today we have with us Ryota Sawada, release lead for Kubernetes 1.36, to unpack this release. Now if you look at this release, Ryota, there is so much: so many stable features, beta features, alpha features. But I think the focus, forget about the rest of the world, is more on AI and AI infrastructure. That’s what we saw at KubeCon as well. Can you talk about how this release specifically improves resource management and scheduling for the massive GPU and data-heavy demands of modern AI agents?

Ryota Sawada: AI and ML workloads are definitely a challenge, and I think it is something that we have highlighted in the past few releases, especially from 1.33, where I was the communication lead and was writing the blog. I could see there is significant focus and pressure around AI solutions being able to run on Kubernetes, and 1.36 marks another new chapter where we have the new Workload Aware Scheduling, which is shortened to WAS. We never had this sort of support before. You may have been able to use a community solution to have a number of pods go out at the same time. This was actually one of the things that was introduced in 1.35; it’s called gang scheduling. But the support for it has been further increased, and the main takeaway from Workload Aware Scheduling is that we have two new APIs. One is called Workload, as the name suggests, and the other one is PodGroup. Before, we only had a single pod to schedule. Most of the time you have a Deployment, a StatefulSet, whatever that may be, but when it comes to scheduling, one pod gets scheduled. With the Workload API and the PodGroup API, gang scheduling was something that was supported in 1.35, but at the same time the 1.36 Workload Aware Scheduling breaks that template part into Workload and the actual runtime object into PodGroup. And that clear separation gives us even further clear connection point for the DRA, Dynamic Resource Allocation. The resources that we use for ML inference and LLM tooling, GPUs and other peripherals that we need to connect on Kubernetes, those are supported by DRA. How does it really connect to the compute that we need to run alongside it? Those are the changes with WAS and DRA. And now that we can combine those powers from DRA, making sure that the resource claim is something that the PodGroup understands and PodGroup understands the DRA.
So that gives us the full control of how we can actually tackle the AI ML solutions that we need. And the implication is that we have native Kubernetes support for those massively demanding tasks and the new feature of the AI days to come.

Swapnil Bhartiya: Thank you for talking about the AI and ML workloads. Now let’s talk about another important area, which is security. The fine-grained Kubelet API authorization has also graduated to stable. For CISOs who are trying to secure multi-tenant clusters, how does that feature change the security posture against both internal misuse and, of course, external threats?

Ryota Sawada: Absolutely. So the security around observability is something that we need to handle very carefully. Observability is there to provide us the stability, the security, the reliability, to make sure that everything is running in production, and production-grade support is something that we can only do from an observability stack. But with an observability stack there is always the threat of the stack not just taking out the data, but having access to potentially get into the cluster with too much power, too much entitlement, to be able to do things that it’s not supposed to. It should be about getting the data, but it could potentially go into that territory of something that observability shouldn’t do. And that’s a security challenge around observability. With this fine-grained authz for Kubelet, observability is definitely something we don’t want to lose, but on top of it we want to secure it even further. Before 1.36, the challenge around authorization (authz) with Kubelet was that it was very coarse, and you only get access to retrieve the data around Kubelet along with extra access: you might be able to exec into a container, you might have extra power that you’re not supposed to have or don’t really need for the observability tasks. But with the general availability of this feature of the fine-grained control, this gives us a new sub resource saying that we do want to just have the health check and that’s the only thing that this particular observability stack needs, and everything else related to more control, more use cases of administration and control, is going to be managed by a separate sub resource. So that gives us a clear separation and just a better security posture as a whole in Kubernetes.

Swapnil Bhartiya: Now if you look at enhancements, there are 71 enhancements in this release. Some matter more directly to day-to-day builders than others. Can you talk about which updates in this release are really aimed at simplifying, if we can use that word for Kubernetes, the developer experience and reducing the complexity of getting code from a local environment into production?

Ryota Sawada: Absolutely. So related to Workload Aware Scheduling, there is DRA, Dynamic Resource Allocation. There are a few changes around how the pods are scheduled and how the resources are managed, but on top of that, DRA has been around for some time at this point. From 1.32 and 1.33 there has been quite a bit of DRA focus in the release blogs, and you can see all the blogs about how DRA has been evolving, and that will be something that we’ll be talking about in version 1.36 as well. One thing that caught my eye is cluster administration tasks and how they are made easier with the recent changes, not just in DRA but in general in this 1.36 release. As you said, there are many Kubernetes enhancement proposals in this release, and it’s actually the largest number of KEPs in a single release. So there are so many things that happened. One thing that I can talk about with DRA is that there is administration access to the resource. That resource may be claimed by a pod or may be something that’s already running, and you can get access to it with the admin access, maybe allowing you to do a bit of cleanup, maybe figuring out what’s going on as an administrator. That may not be exactly how the developer interacts with Kubernetes, but for Kubernetes as a whole as a development environment, that administration access with DRA is definitely going to give you more control, more ease of use, because before that you had to maybe have another logic embedded or work around it somehow else. So with Kubernetes natively supporting such DRA control and other controls, such as the security aspects of SELinux being part of it, you have more control without actually relying on something third party or having your own control. That just gives us more control, and we don’t have to think too much about how we integrate with that extra tooling. Kubernetes does it natively.
I think that’s the biggest change, the biggest improvement that we can talk about in Kubernetes 1.36.

Swapnil Bhartiya: If I’m not wrong, there are almost 18 features that reach stable stage in this release, which says a lot about the maturity and discipline of the project itself. Can you talk about how the release team is balancing innovation and stability? I mean, of course Kubernetes runs in production, so that is not even a question. But talk a bit about how you folks manage this balance to ensure long-term stability and, at the same time, more peace of mind for maintainer sustainability as well.

Ryota Sawada: Yeah, so one of the things that we definitely take into account with the release team specifically is: how can we make sure that the release is stable? How can we make sure that the release is ready for end users, so that they can just use it without really worrying about what this release contains? So the enhancement proposal and all the structure around release management is definitely catered so that something surprising would not happen, especially around stable GA features. Those are rock solid. Those are meant to be rock solid. We do have the expectation that everything is done to the point that we cannot change too much more. And the expectation is that obviously we have very clear documentation and very clear testing in place. We have lots of testing that happens around the release team, so we make sure that everything is ready not just on the release day, not just when the release happens, but during the cycle of the release, which is usually 15 weeks or more. We are checking all the test cases and test scenarios and making sure that every test is running correctly and working fine. We have very clear and rock-solid templates and also criteria, making sure that everything is up at the level that we require for a production-grade Kubernetes environment and Kubernetes experience. Hitting the GA stable feature is definitely a challenge for many KEPs, but at the same time we have so many contributions from all the different SIGs, not just the release team, but many SIGs related to the DRA, maybe related working groups as well, as well as SIG Node and SIG Network. Many, many teams are actually working to get this out in a very stable and solid production-ready environment. So it is a bar that’s set really high and it has been that case for many releases in the past, but I think it’s not slowing down.
We are only having more and more contributions and more stable enhancements that are part of the release now.

Swapnil Bhartiya: Now there are also 16 features that entered beta. Can you talk about the main feedback loops that are there, or that you’re looking for from the community, to make sure that these features also mature, become stable, and graduate? And if you can, also talk about what kind of engagement there is with the larger community to address some real-world enterprise blind spots. Because here’s the problem: developers don’t know the use cases because they are working on the code. And different companies have different developers, each focused on their own area. So this involvement, engagement and feedback does help them a lot. So can you talk about this process of going from beta to stable?

Ryota Sawada: I think there is a significant difference between alpha and beta in most cases. This is not always the case, but in beta people do expect that the feature itself is enabled and is ready to test without doing too much administration work to get that alpha feature enabled in the Kubernetes environments that you’re running. So beta usually, not always, means that it’s enabled by default and that gives us the users the chance to actually test it and see if it really has any benefit or if it may have any problems with the running workloads that are already in your cluster. That gives us the chance to actually look at how those beta features may really help. Take scheduling, maybe Workload Aware Scheduling: that’s still alpha. If it becomes beta and is enabled by default, everyone will be able to use it without really enabling anything from the Kubernetes setup perspective. So the beta is a significant one that makes sure that we can support more use cases and more, get more signal from the community and fine tune it in a way that general available solution would take more advice and more inputs from the community, making sure that it’s relevant for not just one person, not just one team, not just a small percentage of the users, but also different use cases. And we can definitely have more support as a whole in the beta release. That really gives us the chance to get ready for the stable GA graduation. And that is another hurdle that everyone aims for and it is definitely a challenge. But also beta is such a significant milestone and we have quite a few in this version 1.36 release as well.

Swapnil Bhartiya: Were there any specific themes or features in this release cycle that stood out to you? I’m not talking about the Kubernetes community in general, but you personally. And also, every Kubernetes release has a theme or a code name. What was it this time? So there are two questions put together.

Ryota Sawada: So the previous release, 1.35, was called Timbernetes and it was a world tree, and it was led by Drew Hagen. For 1.36 I had the chance and honor to lead it and name it. The release name is Haru. It’s a Japanese word with multiple meanings, mainly spring. It happens to be a spring release. It also means a clear sunny sky, and also a remote distance and horizon. So there is a lot to cover in 1.36 and there is so much more to come with Kubernetes, not just 1.36. We are already looking at 1.37 and onwards, and that name Haru is to give us the chance to soar into the clear sky and maybe look at the tomorrow’s dawn and further future ahead.

Swapnil Bhartiya: Excellent.

Ryota Sawada: Thank you.

Swapnil Bhartiya: Before we wrap this up, anything else that you would like to add which is really important in terms of this release that you would like to highlight?

Ryota Sawada: Yes, one thing. A Kubernetes release always has a release announcement blog that goes out on the Kubernetes blog, which you can find at the top of the Kubernetes blog. And it comes with what we call feature blogs, and each feature blog specifically highlights one Kubernetes enhancement proposal in most cases, or sometimes a group of KEPs as a whole. And that’s the chance for us to look into the actual technical details, how that particular change is relevant, how it’s actually done, and some of the background history to it. So there will be more feature blogs to come in the next month or so. So stay tuned. There will be one published every day, that’s the schedule that we are looking at, so there will be a lot more coming and a lot more clarity on how that 1.36 Kubernetes release is going to affect you or have a significant impact in some cases. And that’s something that I am looking forward to reading myself. It is great material to look at, read through and understand what Kubernetes 1.36 really enables you to do.

Swapnil Bhartiya: Ryota, thank you so much for joining us and for walking us through the Kubernetes 1.36 release. I really appreciate your sharing both the technical details and, of course, the bigger picture behind this release. Thank you so much for your time and I look forward to the next release. Thank you.

Ryota Sawada: Thank you very much for having me.
