How to Use AI Agents in Open Source Without Losing Architectural Control | Madelyn Olson, AWS | TFiR

AI-assisted contributions have caused code volume inside open source projects to surge faster than any maintainer team can review. The review bottleneck is not a tooling problem. It is a judgment problem: automated tools can check whether code runs, but they cannot determine whether code belongs in a project. When that distinction collapses, technical debt, architectural drift, and undetected security vulnerabilities follow.

In this interview on TFiR, Madelyn Olson, Valkey Project Maintainer and Principal Engineer at AWS, walks through how the Valkey project has built a layered system of AI agents, provenance guards, adversarial testing harnesses, and automated backporting pipelines to absorb the AI contribution surge while keeping human maintainers focused on the decisions that actually define the project.

Guest: Madelyn Olson, Valkey Project Maintainer and Principal Engineer at AWS
Show: TFiR

Here is what every open source maintainer and platform engineer needs to know.

Technical Deep Dive

Q: What has AI-assisted development actually done to open source contribution volume?

Madelyn Olson, Valkey Project Maintainer and Principal Engineer at AWS, reported that since approximately December, the Valkey project has seen roughly a 30% increase in pull requests opened. More significantly, the number of lines of code changed across those PRs increased by approximately 500%. Contributors are not only submitting more frequently; each submission is substantially larger than before AI tooling became common. Code review, which Olson says has long been the bottleneck inside the project, is now under far more pressure than the PR count alone suggests.

“The bottleneck that we’ve actually seen inside the project for a long time is actually reviewing that code.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS
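The two reported figures can be combined to estimate how much larger the average PR has become. The short calculation below is an illustration rather than something stated in the interview. It assumes that a "500% increase" means six times the earlier total and that both percentages cover the same period.

```python
# Approximate figures reported in the interview.
pr_increase = 0.30    # ~30% more pull requests opened
loc_increase = 5.00   # ~500% more lines of code changed in total

# Assumption: "X% increase" means new_total = old_total * (1 + X).
pr_multiplier = 1 + pr_increase     # 1.3x as many PRs
loc_multiplier = 1 + loc_increase   # 6.0x as many changed lines

# Average lines changed per PR grows by the ratio of the two multipliers.
loc_per_pr_multiplier = loc_multiplier / pr_multiplier

print(f"Average PR size multiplier: {loc_per_pr_multiplier:.2f}x")
```

Under those assumptions the average PR is roughly four to five times larger, which is why review effort grows much faster than the count of PRs.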

Q: What are the two dimensions of code review, and which one can AI actually handle?

Olson describes code review as operating on two separate axes. The first is functional correctness: will this code crash, what bugs does it contain, does it handle edge cases? The second is architectural fit: is this the right code for the project, does it use the correct abstractions, does it integrate cleanly with the existing codebase? AI is reasonably capable on the first dimension and significantly weaker on the second. Foundational models carry a strong bias toward generating new code and toward specific structural patterns, which makes them unreliable judges of whether a change belongs in a specific project’s architecture.

“A lot of the foundational models have a very strong bias for generating new code. They often are not good at abstracting code in a good way to incorporate new functionality.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How does Valkey use AI agents to assist with code review without replacing maintainer judgment?

Olson uses an agent during code review specifically to summarize changes and identify the key modifications in a PR. This handles the orientation cost of reading a large diff and lets her focus directly on architectural questions. The agent performs the work it is well suited for: extraction and summarization. Maintainers then apply judgment on whether the change fits the project’s direction, a question the agent is not asked to answer.

“I have an agent that I use while reviewing code. It does the thing that AI is pretty good at. Summarize this code for me, tell me what the key changes are.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How did Valkey use adversarial AI testing to find security vulnerabilities?

Valkey built an adversarial testing harness in which an agent is given a code change and instructed to attempt to crash it or produce corrupted data. If the agent finds a failure, it reports back. The harness runs deterministic checks for crashes and data corruption, then instructs the agent to try a large number of different approaches. This methodology uncovered an embargoed CVE within the Valkey project as well as several additional esoteric bugs. Olson notes that proactive discovery by the project is significantly preferable to disclosure arriving from an external security researcher with an embargo deadline.

“We said go try 10,000 different things, come up with new things, and let us know what you find. It found the CVE we talked about. It found a bunch of other esoteric bugs.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS
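The general shape of such a harness can be sketched in a few lines. This is a toy illustration, not Valkey's tooling: a seeded random generator stands in for the agent that proposes attacks, `BuggyCounterStore` is a hypothetical system under test with a planted bug, and a plain dictionary serves as the reference model for the deterministic corruption check.

```python
import random


class BuggyCounterStore:
    """Hypothetical system under test: a key -> integer store with a planted bug."""

    def __init__(self):
        self.data = {}

    def incr(self, key, amount):
        self.data[key] = self.data.get(key, 0) + amount

    def delete(self, key):
        # Planted bug: deleting a missing key raises instead of being a no-op.
        del self.data[key]

    def get(self, key):
        return self.data.get(key, 0)


def run_trial(rng, steps=20):
    """Apply one random command sequence; return a failure report or None."""
    store, model, history = BuggyCounterStore(), {}, []
    for _ in range(steps):
        op = rng.choice(["incr", "delete", "get"])
        key = rng.choice(["a", "b", "c"])
        history.append((op, key))
        try:
            if op == "incr":
                store.incr(key, 1)
                model[key] = model.get(key, 0) + 1
            elif op == "delete":
                store.delete(key)
                model.pop(key, None)
            # Deterministic corruption check against the reference model.
            if store.get(key) != model.get(key, 0):
                return {"kind": "corruption", "history": history}
        except Exception as exc:
            # Deterministic crash check: any unexpected exception is a finding.
            return {"kind": "crash", "error": repr(exc), "history": history}
    return None


def adversarial_search(trials=1000, seed=0):
    """Try many command sequences and collect every failure found."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        report = run_trial(rng)
        if report is not None:
            failures.append(report)
    return failures


failures = adversarial_search()
print(f"{len(failures)} failing sequences found")
```

Each report carries the command history that triggered the failure, so a maintainer can replay it. The pass/fail decision comes from the deterministic checks, not from the generator's own opinion of the code.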

Q: What is the Valkey provenance guard and what license problem does it solve?

Valkey is a fork of Redis, and the two projects operate under incompatible licenses. Redis is tri-licensed, with SSPL and AGPL among the options; Valkey uses BSD. As Olson explains it, commits made under the Redis terms cannot be applied to the BSD-licensed Valkey codebase, even though AGPL is itself an open source license. The provenance guard is a tool that maintains a large list of commit hashes from Redis and checks incoming contributions against that list. A secondary LLM-based check then reviews any match. Olson flagged that AI agents contributing to Valkey could inadvertently pull in Redis commits without understanding the license boundary, and that the project’s DCO check alone is not sufficient to catch this. The provenance guard is designed to catch that case automatically so maintainers do not need to audit it manually on every PR.

“What if an agent goes and tries to pick a Redis commit and apply it to Valkey, not knowing that they aren’t allowed to do this?” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS
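The first, deterministic stage of such a check amounts to a set lookup. The sketch below is a guess at the shape, not the Valkey tool: the interview does not say what exactly is hashed, so this example assumes a fingerprint of normalized patch text, since a re-applied commit would not keep its original commit ID. The function names are hypothetical, and the secondary LLM review is not modeled.

```python
import hashlib


def fingerprint(patch_text: str) -> str:
    """Hash a patch after dropping blank lines and surrounding whitespace."""
    normalized = "\n".join(
        line.strip() for line in patch_text.splitlines() if line.strip()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_blocklist(upstream_patches):
    """Precompute fingerprints for every patch that must not be imported."""
    return {fingerprint(patch) for patch in upstream_patches}


def check_contribution(patches, blocklist):
    """Return the indexes of incoming patches whose fingerprint is blocklisted."""
    return [i for i, patch in enumerate(patches) if fingerprint(patch) in blocklist]


# Made-up patch text for illustration only.
upstream = ["+int x = 1;\n-int x = 0;\n"]
blocklist = build_blocklist(upstream)

incoming = [
    "  +int x = 1;  \n\n-int x = 0;\n",  # same change, different whitespace
    "+int y = 2;\n",                      # unrelated change
]
flagged = check_contribution(incoming, blocklist)
print(flagged)
```

An exact-match lookup like this is cheap enough to run on every PR, and it only catches verbatim or near-verbatim copies. That limitation is one reason a second, fuzzier review stage is useful for anything that matches.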

Q: How does Valkey automate the backporting process using AI agents?

Valkey’s release process requires selectively moving commits from the development branch into a release branch, since not every development commit is appropriate for a given release. Valkey built an agent that takes the identified commits, applies them to the release branch, runs the test suite, and attempts to resolve any test failures on its own. If the agent cannot resolve the failures, the work escalates to maintainers. The goal is that maintainers are only involved to confirm which commits should be backported, not to perform the mechanical work of doing it. Olson noted this was demonstrated in practice during the Valkey 9.1 release, which shipped during Open Source Summit Minneapolis.

“The ideal case is the maintainers don’t get involved at all. The maintainers are just there to say, these are the correct commits to be backporting.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS
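The control flow described above (apply, test, attempt a fix, escalate) can be sketched as follows. Everything here is hypothetical: the callables are injected stand-ins for real version-control and test commands, the retry limit is an arbitrary choice, and reverting a commit that cannot be fixed is an assumption rather than something the interview describes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BackportResult:
    applied: List[str] = field(default_factory=list)
    escalated: List[str] = field(default_factory=list)


def backport(
    commits: List[str],
    apply_commit: Callable[[str], bool],
    run_tests: Callable[[], bool],
    attempt_fix: Callable[[str], bool],
    revert_commit: Callable[[str], None],
    max_fix_attempts: int = 2,
) -> BackportResult:
    """Apply maintainer-approved commits one by one, escalating what fails."""
    result = BackportResult()
    for sha in commits:
        if not apply_commit(sha):
            # Could not even be applied (for example, a conflict): hand to a human.
            result.escalated.append(sha)
            continue
        ok = run_tests()
        attempts = 0
        while not ok and attempts < max_fix_attempts:
            attempts += 1
            # Only rerun the tests if the fix attempt claims to have changed something.
            ok = attempt_fix(sha) and run_tests()
        if ok:
            result.applied.append(sha)
        else:
            revert_commit(sha)
            result.escalated.append(sha)
    return result


# Tiny in-memory stand-ins so the control flow can be exercised.
broken = set()


def fake_apply(sha):
    if sha in ("c2", "c3"):
        broken.add(sha)      # these commits break the test suite
    return sha != "c4"       # c4 fails to apply


def fake_tests():
    return not broken


def fake_fix(sha):
    if sha == "c2":          # only c2 is fixable automatically
        broken.discard(sha)
        return True
    return False


def fake_revert(sha):
    broken.discard(sha)


outcome = backport(["c1", "c2", "c3", "c4"], fake_apply, fake_tests, fake_fix, fake_revert)
print(outcome)
```

The list of commits is the only input a maintainer supplies. Everything that cannot be applied and verified automatically lands in `escalated` instead of being merged.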

Q: What percentage of maintainer work has Valkey offloaded to AI agents?

Olson estimates that the automation Valkey has built offloads roughly 20 to 50% of day-to-day maintainer work, with significant variation depending on the week. During a release week, a higher proportion is offloaded because the backporting pipeline handles much of the mechanical work. On a typical week, the proportion is lower and Olson spends the majority of her time on deep technical work. Core project decisions remain with human maintainers in all cases.

“Some of the automation we’ve built offloads probably like 20 to 50% of the day-to-day work of maintainers to AI.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: What is AI slop in open source contributions, and how do maintainers identify and handle it?

Olson frames AI slop as having two distinct levels. The first is code or documentation that is simply wrong, typically because the contributor pointed an LLM at an issue without providing sufficient project context. The output looks plausible, but misses edge cases and is not well integrated with the existing codebase. The second level is subtler: code that is largely correct but contains quiet bugs or is not production-ready. For the first type, Valkey will decline the contribution if the contributor shows no genuine understanding of the project. For the second type, maintainers work with the contributor to identify what was missed, because a human in the loop who understands the feedback can improve over time.

“How the slop was generated is more important, not necessarily that it was bad.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How has Valkey changed its good-first-issue process because of AI agents?

Valkey previously maintained a public list of issues labeled as good first tasks to help new contributors get started. Olson observed that automated agents, including publicly available tools, began picking up these issues and attempting to solve them without any human contributor involvement. Since those issues are designed to bring people into the project rather than to close tickets efficiently, automated resolution defeated the purpose. Valkey removed the public label and now onboards new contributors through its Slack channel, where maintainers can hand-assign first issues to individuals directly.

“These issues aren’t strategically important. They’re designed in such a way to help get people involved.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: What is the AGENTS.md file in the Valkey codebase and how does it improve AI contribution quality?

Valkey maintains an AGENTS.md file in the repository specifically so that contributors can point their LLM or coding agent at it before working on the codebase. The file provides project-specific context that general-purpose models lack by default, including architectural conventions and patterns that Valkey uses. The goal is to raise the baseline quality of what an agent will produce by giving it the information it needs to generate code that fits the project rather than code that merely compiles. Olson frames this as raising the bar for what a typical agent can contribute, rather than lowering the bar for what the project accepts.

“The whole goal of that is to raise the bar of what a typical agent will be able to do.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How does Valkey use AI agents to prototype architectural trade-offs more quickly?

Olson described Valkey’s exploration of SSD-based storage as an alternative to pure in-memory operation, given the cost and availability constraints on RAM. Rather than sequentially building and evaluating each architectural option, Valkey uses agents to rapidly prototype multiple variants in parallel: storing both keys and values on disk, keeping only values on disk with an in-memory index, storing values on a network-attached volume such as EBS, or offloading to a service like S3. Agents can generate and test these variants quickly because they are well suited to unbounded exploration. By the time maintainers evaluate results and choose a production direction, they have empirical data from multiple approaches rather than a single untested design.

“Agents can very quickly go and validate all of those trade-offs. Because as I said, they’re very good at generating lots of code, and if something doesn’t work, they can try new things.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS
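Comparing variants is easier when each one sits behind the same small interface. The sketch below is illustrative only: the `Store` protocol, the two classes, and the `evaluate` function are hypothetical names, and only two of the options mentioned are modeled (everything in memory, and values on disk with an in-memory index). It says nothing about how Valkey's own prototypes are built.

```python
import os
import tempfile
import time
from typing import Dict, Optional, Protocol, Tuple


class Store(Protocol):
    def put(self, key: str, value: bytes) -> None: ...
    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryStore:
    """Baseline variant: keys and values both live in RAM."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class ValuesOnDiskStore:
    """Variant: values in an append-only file, index of (offset, length) in RAM."""

    def __init__(self, path: str) -> None:
        self._file = open(path, "w+b")
        self._index: Dict[str, Tuple[int, int]] = {}

    def put(self, key: str, value: bytes) -> None:
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(value)
        self._index[key] = (offset, len(value))

    def get(self, key: str) -> Optional[bytes]:
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self._file.seek(offset)
        return self._file.read(length)

    def close(self) -> None:
        self._file.close()


def evaluate(store: Store, n: int = 1000) -> dict:
    """Run the same write-then-read workload against any variant."""
    start = time.perf_counter()
    for i in range(n):
        store.put(f"key:{i}", f"value-{i}".encode())
    correct = all(store.get(f"key:{i}") == f"value-{i}".encode() for i in range(n))
    return {"correct": correct, "seconds": time.perf_counter() - start}


with tempfile.TemporaryDirectory() as tmp:
    disk = ValuesOnDiskStore(os.path.join(tmp, "values.log"))
    results = {
        "in_memory": evaluate(InMemoryStore()),
        "values_on_disk": evaluate(disk),
    }
    disk.close()

for name, result in results.items():
    print(name, result["correct"], f"{result['seconds']:.4f}s")
```

Because every variant is exercised by the same `evaluate` function, adding another option, such as a network-attached backend, means writing one more class rather than a new benchmark.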

Q: How does Olson use AI when writing production-quality code versus exploratory prototypes?

Olson draws a clear line between the exploratory phase and the production coding phase. During exploration, agents are given wide latitude to generate and test options. When writing production code, Olson works in a pair-programming mode with the AI: she and the agent write tests together first to constrain the problem correctly, then the agent attempts an implementation to those specs. She then reviews and corrects the output to ensure the abstractions and architecture align with what the project requires. The agent accelerates the speed of writing well-constrained, test-backed code, but the architectural frame is set by the human.

“We’ll typically write all the tests together to make sure everything’s constrained correctly. Then I’ll maybe let go of the agent and let it go try to implement everything to those specs.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How does Valkey ensure that AI agents do not drive the project’s technical direction?

Olson notes that Valkey sits at the engineering application layer rather than the research frontier, which means ideas come from reading papers and translating research into production-viable implementations. Maintainers generate all of the project’s architectural ideas by staying engaged with research literature and understanding real-world workload requirements. Agents are then used to build out those ideas. The creative and directional work is not delegated. Olson is explicit that Valkey maintainers are engineers, not researchers, and their role is to take known ideas and make them practical for production, a judgment that requires human context agents do not have.

“We come up with all the ideas and then we work with AI agents to go build them. It’s mostly us that are the ones coming up with all the ideas.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: What is the risk of offloading too much reasoning to AI agents, and where should maintainers draw the line?

Olson identifies two distinct risks. The first is hallucination: agents making high-confidence decisions in domains where their error rate is too high to trust without human verification. The second is more subtle. If maintainers offload the parts of their work they actually enjoy, they lose the intrinsic motivation that sustains long-term open source involvement. Olson is explicit that automation should target the work maintainers want to stop doing, not the work they find meaningful. She declined a suggestion to replace her direct engagement with users on Valkey Slack with an agent-driven knowledge base, because that interaction is precisely what she values.

“If there’s a part of the job you enjoy, keep doing it. Don’t offload it. That’s not keeping your sanity. That’s not maintaining your mental health.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How should open source maintainers who are reluctant to use AI think about where to start?

Olson’s advice is grounded in self-awareness rather than adoption pressure. Maintainers should first identify what originally motivated them to work on the project, then identify which parts of their current workload are eroding that motivation through repetition or grind. AI agents should be directed at the second category only. Olson is clear that she is not personally an enthusiast of AI across the board, but she treats it as a tool to be evaluated on fitness for specific tasks. She points out that the act of figuring out how a tool applies to your specific problem is itself a valuable and even enjoyable engineering exercise.

“Try to figure out what drives you to work on open source. Go find the task inside the project that you don’t like doing that the AI is well situated to offload.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Q: How has the Valkey project grown since its founding, and what role has AI played in that growth?

Olson describes Valkey’s growth trajectory as rapid, attributing it in part to the energy that came from building something new with a focused team working on problems they found genuinely engaging. The automation Valkey has built using AI agents has allowed that small maintainer team to handle a contribution volume that would otherwise require significantly more human capacity. Olson connects this to the broader potential for other open source projects: AI is not a replacement for good project governance, but it can help a small, motivated maintainer group absorb more community contribution without burning out.

“The joy of learning and trying to figure out how you can use these tools to solve some of your problems will get you a long way.” — Madelyn Olson, Valkey Project Maintainer and Principal Engineer, AWS

Resources & Documentation

  • Valkey, BSD-licensed open source in-memory data store and Redis fork, maintained by the Linux Foundation
  • Valkey on GitHub, source repository including the AGENTS.md file for LLM context
  • Valkey Slack Community, official community channel for contributor onboarding and user support

***

Full Raw Transcript

Swapnil Bhartiya: Hi, this is your host, Swapnil Bhartiya, and we are here at Open Source Summit in Minneapolis and we have with us once again Madelyn Olson, Valkey project maintainer and principal engineer at AWS. Good to see you again.

Madelyn Olson: Yeah, great to see you again.

Swapnil Bhartiya: And this time we are, of course we’ll talk about Valkey. But this time I think the biggest discussion and I think you folks are also delivering a keynote is about AI and open source AI in general when it comes to software. Either way, it’s kind of becoming a point of friction, but also it’s bringing a lot of value also because the fact is that because of AI, a lot of code is being written, vibe coding is happening and a lot of developers are using it just to assist them as well. But which also means that a lot of code is coming. And now how do you maintain the balance between the quantity of code versus quality of code? At the same time you may also want to use it. And once again you have to ensure that you are able to. So let’s first of all talk about your favorite project because you said, let’s talk about Valkey. So let’s start with Valkey. How are you folks using AI internally for Valkey?

Madelyn Olson: So yes, the main project that I’m a maintainer of is Valkey. And I think it’s great to add some numbers to everything we’re talking about.

Swapnil Bhartiya: No, you’re talking about that.

Madelyn Olson: So since about December, we’ve seen about a 30% increase in the number of PRs getting opened to the project. But more importantly, we’ve seen about a 500% increase in the number of lines of code changed across those PRs. So the big difference is not only are people making more changes, those changes tend to be significantly bigger, they have way more changes involved. And the bottleneck that we’ve actually seen inside the project for a long time is actually reviewing that code.

Swapnil Bhartiya: Right?

Madelyn Olson: And when we review code, we look at it more or less in two different ways. We look both at whether this code is functionally correct and whether this code is the right code for the project. Is this the right architecture? And from a maintainer perspective, I think AI is kind of okay at that first one. It’s like, hey, is this code going to crash? What bugs are there? But it’s not very good at that second one. A lot of the foundational models, they have a very strong bias for generating new code. They have a very strong bias for building stuff a specific way. And they often are not good at abstracting code in a good way to incorporate new functionality. So we definitely haven’t been able to scale maintainers very well to be able to incorporate this sudden increase in code. And we’ve done some stuff to optimize that. I, for example, have an agent that I use while reviewing code and it does the thing that AI is pretty good at. It’s like, hey, summarize this code for me. Tell me what the key changes are. I used an agent not too long ago to find a lot of bugs inside the Valkey project. We actually found a CVE that’s currently under embargo, which I can talk about a little bit later, just not today while it’s under embargo. But the methodology that we use to find these bugs works very well to just apply on code PRs, which is basically adversarial testing, which is, hey, go take this change and just go try to crash it, go try to find bugs. And then if you do find a bug, report back, which builds a lot of confidence that that code actually isn’t going to break in production, which is very important for Valkey. Valkey is an in-memory database used for caching and you really don’t want that system crashing in production. So we’ve definitely been using AI for that. Is this functionally correct? But the architecture half is a little bit harder. You can’t necessarily ask the AI, is this code good for the project? Because it’ll be like, oh, of course. 
And you’ll probe it and it’ll be like, oh, you’re totally right, I misunderstood. As we’ve all had, AI agents love to double back on what they’ve said. And so we’re trying inside the Valkey project to automate as much of that first half as possible so that maintainers can focus on that second half, that maintainership half. I’m going to be doing a keynote tomorrow talking about AI, how we’re using AI agents inside the project. And one of the key things we’re going to be highlighting is basically how we use agents to automate some of the backporting process within the project. So, unrelated to this conference, we actually launched Valkey 9.1 this morning. And part of that process involves taking commits from the development branch into the actual release branch, because not everything that goes in development is going to go for a specific release. And so we built an agent that’s responsible for taking the commits, applying them from the development branch to the release branch and then if any tests fail, it can go and try to resolve them itself. And if all of that fails, we still have the backstop of maintainers to go and look. But the ideal case is the maintainers don’t get involved at all. The maintainers are just there to say like, hey, this is correct. These are the correct set of commits to be backporting. So we’re trying to take that load off of the maintainers.

Swapnil Bhartiya: This is actually very important because some projects, and I knew that the pitch will come for a lot of maintainers, they are overwhelmed in the past because they could not keep up. So if you look at Valkey, there are a lot of other projects. What percentage of work is being done by agents versus humans? And how is it freeing your time to focus on the things that you wanted to focus earlier, which you could not, because the grind that happens when you have to review the code was just too much.

Madelyn Olson: And so I kind of highlight like, the core decision making inside the project is still done by the maintainers. We’re not handing that over.

Swapnil Bhartiya: So what percentage? If you talk about like 90% AI, 10% maintainers, too much.

Madelyn Olson: Trying to come up with a number is probably going to be a little difficult. Like we’ve definitely with some of the automation we’ve built, that kind of offloads probably, like 20 to 50% of the day to day work of maintainers, we’ve been able to offload to AI. And that also obviously changes week to week. So like a week like this one where we’re doing a release, a lot more of the work has been offloaded because a lot of that backporting stuff’s been automated. But on a typical week, I might be doing almost all of my time doing this deep work, because it’s fun also.

Swapnil Bhartiya: Right? Sometimes it’s fun, yeah. To just dive into those. We hear the word slop a lot these days in terms of when it comes to a maintainer’s job. How would you define it and how do you catch it? Because that’s the tricky part. And actually when I was working on your interview, I just wanted to do some research. So Claude told me that you are founder of Valkey. I said no, I interviewed her. This is my interview. Oh, I’m sorry, you’re correct. So you have to keep an eye. It’s like dealing with a toddler which will lie all the time. So how do you define slop and how do you catch it?

Madelyn Olson: That’s a good question. So the way I think about AI slop is there’s a couple different parts. There’s the code that’s generated is just wrong, or documentation that’s generated is just wrong. It’s basically pulling from incorrect sources. And that’s mostly because you’re not providing enough context into the LLMs themselves. What we see is a lot of new contributors will often say, like, hey, Claude, go solve this issue. And they’ll point to some issue in Valkey and Claude will give its best guess. It will try to figure it out. What you need to do is give it a lot more context. If you just point at the issue, it’s going to probably come up with code that looks correct because that’s what they’re very good at doing. But it’s probably missing edge cases. It’s probably not well integrated inside the code. And when we see those as maintainers, the first thing is either to say if it’s the type of person who’s probably never going to be good at contributing code, we might just say, please don’t contribute. One of the things we’ve noticed is because it’s so easy to point an LLM at an issue that the bar for who’s contributing has come down a little bit. And so those are people who either, if they are excited but don’t have the understanding yet, are excited to be involved. And so we still want to encourage those people. One of the things we’ve changed in the project is we used to keep a list of issues inside the Valkey project which we labeled as good first task. So if you wanted to get involved, you could go pick these up. And one of the things we noticed is a lot of LLMs like various tools were going and picking these issues and just trying to solve them themselves, which doesn’t really solve the problem. These issues aren’t strategically important. They’re designed in such a way to help get people involved. So we still have another mechanism like you join our Slack and we will hand that individually to people. 
And that’s how we get people to actually do these first issues. So if you’re just showing up and generating a bunch of code because you just pointed the LLM at the project, we’re probably just going to gently push you away. And so that’s kind of like the first level of slop. The second level of slop is someone who genuinely does kind of understand what’s going on. The code, at least more or less looks correct, but it’s kind of subtly not production ready, maybe a subtle bug. That’s one of those cases where it’s actually just not too hard to just gently poke their agent to be like, hey, you’re missing these bugs. Think about these cases. So just because the code is not necessarily as good as a human would write, that doesn’t mean you should just throw it out. You should still work with them, help them grow. Because as long as there’s a human on the other side looking at these inputs and being like, hey, I understand, I missed these cases, next time, hopefully they’ll produce better code and continue to be involved in the project. So it’s kind of like how the slop was generated that’s more important, not necessarily that it was bad.

Swapnil Bhartiya: You said the bar is kind of low, but have you also changed some expectations which would not even be realistic two years ago, before agents, where you’re like, okay, these are the compromises we’re making. We are lowering the expectations of the bar so that AI agents can also contribute versus humans who are like, oh, that’s not even something that you can ask me to do.

Madelyn Olson: Yeah. And so just to clarify, when I say the bar for someone to be able to contribute is lower, our bar is definitely not lower. And we’ve, if anything, been trying to use agentic AI and code generation to raise the bar. We want to add more verification, more validation.

Swapnil Bhartiya: So the change of the bar is that you have increased the height of your bar, but lowered the bar for people to be able to contribute.

Madelyn Olson: Okay, perfect. That’s what’s important. If someone shows up and they have a simple feature, there are some simple features which you can just contribute and AI agents are helping with that. We have an AGENTS.md file inside the codebase so that you can basically point your LLM to it, your Claude Code to it and be like, hey, use this file as an understanding for the codebase. And the whole goal of that is to raise the bar of what a typical agent will be able to do.

Swapnil Bhartiya: Can you talk about what the provenance guard function does? What does it do that typical traditional review would miss?

Madelyn Olson: Yeah, so what you’re talking about is one of our goals recently in the project is to build more automation and verification so we can offload work from the maintainers. So one of the unique constraints that Valkey has is that Valkey is a fork of another project called Redis and they have different but incompatible licenses. So Redis is tri-licensed, one of which is AGPL, which is an open source license. Ours is BSD. So we can’t take commits from AGPL and apply them onto BSD licenses. So one of our concerns has been that someone might unintentionally think they are allowed to, since they’re both open source. Not everyone is deeply familiar with the nuances of those licenses. And one of our other concerns is, what if an agent does this? What if an agent goes and tries to pick a Redis commit and apply it to Valkey, not knowing that they aren’t allowed to do this? So one of the things we built is a tool that basically keeps a large list of hashes of commits from Redis. And we’re also basically able to quickly check, see, hey, is there a match here? And then on top of that, we can use LLMs to do a secondary level check to be like, hey, is this okay? And the goal is to add confidence that this edge case, which we’re worried about, we don’t spend so much time as maintainers worrying about. Because it’s possible they might not declare where the provenance came from of the code. And there’s a DCO check which we use, basically says you’re allowed to do this, but that’s still not a sufficient check. We want to make sure that we’re not accepting this code. So that’s been our goal as a project, to continue to try to build this automation. And I think that’s a great use of AI. As a maintainer, I can write a lot of code more quickly that’s very tailored to what the Valkey project needs. And so we’ve been trying to do a lot in the project.

Swapnil Bhartiya: You did talk about some of the tedious tasks, but if you look at things like backporting, CI testing, how much are you handing over to AI and AI agents?

Madelyn Olson: I’ll give another example, which is one of the features that we’re thinking a lot about inside Valkey is being able to support SSDs as an alternative storage of data and not just RAM. Obviously RAM is very expensive these days, there’s a RAM shortage. And so one of the things we can do very quickly is evaluate a bunch of different trade-offs with prototypes. Basically we can go and say, hey, here’s the high level interface we want to build. Now go try to store both the keys and the values on disk. Just store the values on disk, keep the index in memory, maybe store the keys in memory and the values on an external network attached drive, like an EBS volume or maybe S3 or maybe something like Dynamo. And so agents can very quickly go and validate all of those. Because as I said, they’re very good at generating lots of code and they’re very good if something doesn’t work. They can try new things, especially if you give them sort of guardrails to think through. And that helps us prototype a lot more quickly. And so when we actually get to the time where it’s like, okay, we’ve evaluated a bunch of prototypes, now which is the actual choice we want to make, we have a lot more information than we did before. So AI is great in that sense. But then when I actually go to write code myself and be like, hey, this is the actual production quality code that I want to be writing, I’ll typically do much more like a pair programming type of thing with the AI. We’ll typically write all the tests together to make sure everything’s constrained correctly. Then I’ll maybe let go of the agent and let it go try to implement everything to those specs. And then there’s always going to be that interface because we need to get that code generation pointed more towards the right abstractions and the right architecture. 
So AI is both good in the unbounded exploration phase and then also when you want to write the code, it’s good at helping accelerate the actual speed at which you can write well-constrained and well-focused code.
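
Editor's sketch: the prototyping approach described above (fix one high-level interface, then try several storage layouts behind it) might look roughly like the following. This is a minimal illustration, not Valkey code. `ValueStore`, `MemoryStore`, `DiskValueStore`, and `round_trip` are hypothetical names, and the append-only file layout is an assumption chosen only to show "values on disk, index in memory".

```python
import os
from typing import Dict, Optional, Protocol, Tuple


class ValueStore(Protocol):
    """Hypothetical high-level interface every prototype must satisfy."""

    def put(self, key: str, value: bytes) -> None: ...

    def get(self, key: str) -> Optional[bytes]: ...


class MemoryStore:
    """Baseline prototype: keys and values both stay in RAM."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class DiskValueStore:
    """Prototype: values in an append-only file, index (offset, length) in RAM."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._index: Dict[str, Tuple[int, int]] = {}
        open(path, "ab").close()  # create the file if it does not exist

    def put(self, key: str, value: bytes) -> None:
        with open(self._path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # new values always go at the end
            f.write(value)
        self._index[key] = (offset, len(value))  # overwrite points at newest copy

    def get(self, key: str) -> Optional[bytes]:
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        with open(self._path, "rb") as f:
            f.seek(offset)
            return f.read(length)


def round_trip(store: ValueStore) -> bool:
    """Run the same small workload against any prototype and check the results."""
    store.put("a", b"first")
    store.put("b", b"second")
    store.put("a", b"updated")
    return store.get("a") == b"updated" and store.get("b") == b"second"
```

Because every prototype answers to the same interface, the same workload can be pointed at each one, which is what makes a side-by-side comparison of trade-offs cheap. A network-attached or object-store variant would be one more class behind `ValueStore`.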

Swapnil Bhartiya: AI is very good at what you explained, but sometimes AI is not very good at coming up with new ideas because it’s always someone else’s idea that it learned from. How do you also ensure that when we look at the Valkey project, the trajectory, the vision, the evolution of Valkey is guided by humans, not by AI?

Madelyn Olson: I think the real mitigation we have for that is just making sure humans are in the loop for all of the key decisions that are being made. Valkey is actually in a unique position. It’s not really on the cutting edge of technology. It’s kind of at the edge of the application of it, the engineering side. So we do rely a lot on papers. We do read a lot of papers to figure out what we should be building. And that’s where a lot of the actual ideas come from. We’re not researchers. Most Valkey maintainers are engineers. We go and try to figure out how do we apply this to make it practical for production workloads. And so we’re just deeply involved in the entire process. We come up with all the ideas and then we work with AI agents to go build them. But yeah, it’s mostly us that are the ones coming up with all the ideas.

Swapnil Bhartiya: From your perspective, since Valkey came out, and then agentic AI, and now there is a foundation for that as well, how has AI impacted open source? We talked about the toil, but I mean in general, not just for Valkey. We hear mixed stories and a lot of horror stories, but AI can also find bugs that humans would never find, because there are a lot of things it can check, and it can help fix those bugs. Have you seen that, because of AI, Valkey itself grew very fast, and that, done right, the way you approach it, AI can help other open source projects grow faster?

Madelyn Olson: At the end of the day, you really have to accept the fact that AI is just another tool. I’m not the biggest fan of AI for a bunch of different reasons, but the truth is it’s not going away and it’s not going away anytime soon. And so I think you do have to have this optimistic view of trying to figure out what’s the best way to use your tool to solve your problem. And you kind of already said everything that I was going to say, like go find the task inside the project that you don’t like doing that the AI is well situated to offload. I don’t really think you should be giving AI high level decisions of the project, but it’s pretty good at doing things where it can exploit its non-determinism. It can go try lots of things. As I said, we did a bunch of security reviews and basically the harness we set up was, hey, here’s a deterministic test and it looks for crashes, it looks for corrupted streaming of data. If you can find any of these cases, let us know. And then we just said, go try 10,000 different things, come up with new things, and let us know what you find. And it found a couple, it found the CVE we talked about. It found a bunch of other esoteric bugs. And it’s better for the AI to have found that proactively than some end user finding it or some security researcher who has an embargo date they want to release it on. So you need to at least be thinking about that as a maintainer. There’s a lot of stuff that we sort of internalize. We have to do this work and we have to say, no, these AI agents are tools. We should be letting the tools do what they’re good at. And that doesn’t mean handing over everything. There’s a phrase I hear a lot, which was from product managers at Amazon. They’re like, well, why can’t agents do that? And the answer in many cases is no, agents can’t do that. The risk of hallucination is too high. At some point, we really want humans to be thinking about the direction of the project. 
If we offload all of the reasoning and thinking to the agents, then the humans aren’t doing the part they enjoy. Like, you’re not required to use agents. If there’s a part of the job you enjoy, keep doing it. Don’t offload it. Offloading it is not keeping your sanity. It’s not maintaining your mental health. There’s a long history of maintainers feeling burnout. I myself was very burned out back when I worked on Redis open source for a while. And I came out of it just because Valkey happened and it was very fun to work on Valkey. I got to focus on all the stuff that I like to do. We got to build lots of things. And so I think AI is at an interesting place where a lot of people are experimenting. A lot of the stuff I mentioned is not super novel, not super unique. But the joy of learning and trying to figure out how you can use these tools to solve some of your problems will get you a long way.
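
Editor's sketch: the harness described in this answer pairs a deterministic check (did it crash, did data come back corrupted) with a large number of varied attempts. The following is a toy illustration of that shape only. It is not the Valkey harness. `encode`, `decode`, `check`, and `run_harness` are hypothetical, the framing format is invented with a bug planted on purpose, and a seeded random generator stands in for the AI agent that proposed inputs in the real setup.

```python
import random
from typing import List, Optional, Tuple


def encode(payload: bytes) -> bytes:
    """Toy length-prefixed frame. Planted bug: the length wraps at 256."""
    return bytes([len(payload) % 256]) + payload


def decode(frame: bytes) -> bytes:
    """Read the one-byte length prefix, then that many payload bytes."""
    length = frame[0]
    return frame[1:1 + length]


def check(payload: bytes) -> Optional[str]:
    """Deterministic oracle: report a crash or a corrupted round trip."""
    try:
        decoded = decode(encode(payload))
    except Exception as exc:  # any unexpected exception counts as a crash
        return f"crash: {type(exc).__name__}"
    if decoded != payload:
        return "corruption"
    return None


def run_harness(attempts: int) -> List[Tuple[int, str]]:
    """Try many generated inputs; return (seed, verdict) for each failure."""
    findings: List[Tuple[int, str]] = []
    for seed in range(attempts):
        rng = random.Random(seed)  # seeded, so every finding can be replayed
        size = rng.randrange(400)
        payload = bytes(rng.randrange(256) for _ in range(size))
        verdict = check(payload)
        if verdict is not None:
            findings.append((seed, verdict))
    return findings
```

The point of the split is that the generator can be as creative or non-deterministic as it likes, while the verdict on each attempt stays reproducible: any reported seed can be replayed against `check` by a human before a fix is written.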

Swapnil Bhartiya: You actually hit on a very important point, which is burnout. I am also a gamer and a woodworker. I use power tools, but I am the operator. They are just a tool. A lot of games have a lot of grinding so I lose interest in the game. The same thing with a lot of projects, people burn out because it’s repeated, it’s grinding. So they don’t enjoy it. So I flipped it. What advice do you have for maintainers or developers, if not a total playbook, on how they should use AI? Because some are hesitant, some are totally reluctant, some don’t like AI at all. That will actually help them deal with burnout because then they will get back to the exciting part of the project they’re involved with, which is the exciting new things. Let AI do the toil. So your advice to maintainers and developers, how they should use AI and agents?

Madelyn Olson: I would say try to figure out what drives you to work on open source. The original thing for me was always that I like to help people. I originally got involved in Redis open source when there were bugs that I found inside the managed ElastiCache service that I would rather give out so that other people didn’t face the same issues that we did inside the ElastiCache service. And I also enjoyed just going through issues, finding other unrelated bugs, and going to help solve people’s problems. That’s what I truly enjoyed about the project. And over time, as I took on more leadership in Redis open source and then in Valkey, I sort of lost a lot of that connection to the actual end users’ problems that we were solving. And the thing that I’ve been trying very hard to do is, if I automate something with AI, to not go do more of the things that burn me out, but to take that time and spend it on stuff that I actually want to be spending time on, which is going and solving the little things, the little issues people have. I love being on the Valkey Slack when someone’s like, hey, I have this little weird edge case. Maybe I can’t easily solve it, but I like going and talking to them and understanding why they have it. And someone on my team literally said, why don’t you offload that to an agent? You could install a knowledge base and it could go talk to the person. And I said, but that defeats the entire purpose. That’s not what I want to do. And what I see with some folks, not all maintainers, is that they feel this pressure to automate everything with AI, and they’re like, this is what I love, why would I automate everything? So really, just be curious about the tool and go and try to automate the things that you don’t want to do. Don’t try to automate everything.

Swapnil Bhartiya: Madelyn, thank you so much for joining me and for talking about the things that keep you excited and how AI is helping you with that. Thanks also for sharing the way you folks are using AI, because that helps other projects learn from it, and the growth momentum speaks for itself. So once again, thank you for your time today, and I look forward to chatting with you again.

Madelyn Olson: Yeah, so great talking. Thank you. Thank you.
