
Striim Is Solving Three Major Problems Of Real-Time Data Movement: Analytics, Operations & Scale


Guest: Alok Pareek
Company: Striim
Show: Let’s Talk

Striim focuses on real-time data movement which solves three problems for their customers: Real-time analytics, real-time operations, and real-time data movement at a large scale.

In the words of Alok Pareek, Founder and Executive Vice President of Products at Striim, “Our customers are trying to just rewrite their applications, either for digital transformation or to take advantage of new customer experiences. To provide better, more agile, faster services, you need to have real-time data. And so, we really focused on that specific market.”

But is there a market for real-time data? Striim believes the outlook and framework of products like IBM’s InfoSphere are very batch-oriented and that, according to Pareek, “introduces a very interesting problem that when you move things in batches, let’s say, end of day, or sometimes for very large companies that have just massive volumes of data, it might be three to five days. The freshness of that data may or may not be sufficient.” With real-time data, it’s a very interactive outlook.

The market for real-time data is very diverse. Striim has customers in the banking and financial sector, healthcare, travel, logistics, transportation, and retail.

With regard to the modernization of data architecture, Pareek says it’s “important to make sure that modernization principles are being applied to the data itself. So, if you have data which is, let’s say in certain formats, that may not be easily shareable or accessible, then as part of not only converging and consolidating data across multiple systems, could you also modernize it so that it’s easily shareable.” Pareek goes on to mention the concepts of citizen data, or data democratization, where we “want to look at data in very different ways to kind of understand what value could they add to the business. So, in a sense, I think data becomes the product at some point.”

Version 4.0 of the Striim platform helps make things simple and automated, with out-of-the-box wizards to ease the complexity for users and much-improved monitoring.

The summary of the show is written by Jack Wallen 


Swapnil Bhartiya: Hi, this is Swapnil Bhartiya here, and welcome to another episode of Let’s Talk. And today, we have with us Alok Pareek, founder and executive vice president of products at Striim. Alok, it’s great to have you on the show.

Alok Pareek: Thank you so much Swapnil, pleasure to be here.

Swapnil Bhartiya: Today, we are going to mostly talk about the launch of Striim 4.0, but before we go there, this is, I think, the first time that we are talking to somebody from Striim, if I’m not wrong. So, tell us a bit about the company itself. What do you folks do? Because, you are also the founder of the company. So, talk about the pinpoint that you saw in the industry that you wanted to kind of fix and address, and you created a company.

Alok Pareek: Yeah, gladly. So, we focus on real-time data movement in the industry, and this real-time data movement solves three very interesting problems for our customers. That has to do with real-time analytics, real-time operations, and then just the real-time data movement at a very large scale. And so, in this day and age, as our customers are trying to just rewrite their applications, either for digital transformation or to take advantage of new customer experiences, or to provide better, more agile, faster services, you need to have real-time data. And so, we really focused on that specific market.

Swapnil Bhartiya: Can you quickly also talk about the [SPS 00:01:29], the market for of course, real-time data. What are SPS’ within industries, what kind of workloads that kind of use real-time data?

Alok Pareek: Sure, sure. Glad to. So, I mean I think it’s best contrasted with how a lot of businesses today are actually in fact moving data. To a large degree, if we examine the data integration landscape, you had a lot of infrastructure platforms. These could be the likes of Informatica, or Talend, or IBM’s own InfoSphere family, et cetera. In general, what we see is that to a large degree, the outlook and the framework of these products is very batch oriented, to a large degree. And, that fundamentally introduces a very interesting problem that when you move things in batches, let’s say, end of day, or sometimes for very large companies that have just massive volumes of data, it might be three to five days. The freshness of that data may or may not be sufficient to satisfy a number of different services that one has to offer for their services, or their customers, or their associates, or partners.

And so, in real-time data integration, what we really mean is, it’s a very interactive outlook. No longer am I satisfied with the fact that… And, if you’re analyzing all of my, for example, purchasing history over the last month, and then you have some sort of a recommendation for me, we are just one click away from selection of services, or selection of our choices, because of just the fact that you and I use our mobile devices, we can change our minds very quickly. There’s a lot of price fluctuations for our services or maybe an improvement in services. So, given that if you don’t really have real-time data, it’s very, very difficult to actually compete with others who do have real-time data.

So, what we try to do is we shrink the latency between many of your operational systems and their downstream systems that might be trying to do analytics, or AI, or you’re writing your own logic on top of that to serve your customers. So, we try to really make sure that that latency is very interactive, to the degree that you still have the attention of the customer or the consumer, and they’re still interacting with the business. And, that’s really what this whole real-time data space is all about.

Swapnil Bhartiya: Is it specific to certain industries? Or, real-time data integration makes sense for most modern businesses today who are looking at of course, when we talk about cloud, we talk about multi-cloud, there is no single cloud. We do on-prem, so if we talk about hybrid or multi-cloud strategy, so talk about who is it for?

Alok Pareek: Yeah, no, great question, Swapnil. So, of course our view and this is not a biased view, we have hundreds of customers and our own customer base is actually very, very diverse. So, we have a number of customers who are in the banking and financial sector area. We have a number of customers who are in the healthcare area. We have lots of customers in travel, logistics, transportation, and then finally we have a lot of customers in retail. So, if you take a look at this landscape, clearly this is something that is more of a ubiquitous type of a requirement. But, there are certain industries where this makes a lot more sense, particularly in retail, what we are seeing is, you have some mammoth marketplaces now that are created online, and you know who I’m talking about.

And so, you can always go and purchase almost any item there. So, if you are trying to compete in retail with these newer players that have emerged, and you are still invested in a lot of legacy infrastructure, then how do you make sure that your inventory, and your product, and your catalog, and your order management systems have visibility such that your fulfillment of the order is live in real-time? You don’t want to end up with your consumers, who mostly now have this omni-channel kind of a mindset, where I might actually go ahead and order online, and I might go pick up at a brick and mortar, and if the item is not available, I’m going to have a very, very poor customer experience. So in retail, we do see this quite a bit, but like I said, it is universal.

We are seeing a lot of applications in healthcare, where increasingly, because of the current pandemic, oftentimes clinicians, and doctors, and physicians, and their assistants, and rehab facilities, they all want to see a temporal 360 of either the patient, or the records. So, how do you actually bring this information together so that the urgency and the low latency of it can be satisfied? And, we’ve all wanted to get there for the last 20, 30 years. The interesting thing is, it’s possible and many are actually doing it today. So, just the answer to your question now Swapnil, for sure is that, it’s not specific to a specific vertical, I do think that this is a very, very broad-based requirement. It just so happens that certain sectors are a lot more at risk if they don’t adopt right away, if they don’t transform right away.

Swapnil Bhartiya: How much importance is given to the modernization of data architecture, as we do talk about just the transformation, everybody talks about the right application with this code, that code. What about data? How much are you seeing there? How much focus is being given, if it is not enough focus, should it be given? And if yes, why?

Alok Pareek: Sure, sure. Yeah. That’s a very good question and very interesting question as well. I mean, I can share with you what we are seeing and my own view. Just starting back from my graduate school days at Stanford, we always used to understand that there’s a separation of concerns between business logic, application logic and the data. We tried this in the ’60s, and ’50s and ’70s where the data and the application logic coexisted, and that created all sorts of problems. And which is what led to the separation of concerns, relational models, and these other models came into picture. Besides these or that, data could… And, the storage layer could be somewhat separated out. And, it is as relevant today, in my view.

If you take a look at what… Let’s talk about cloud native applications. So, if I’m starting out fresh, I have a new startup, that’s one thing, I have a luxury to pick cloud native applications. And perhaps, all of the data comes in the future of my undertaking, which is the rare case. Most of the businesses that exist today, and there’s millions of these businesses, they’ve been operating for a number of years. So, through their own organic growth or through mergers and acquisitions, they just have an ensemble of these different systems, applications, right from servers, and finally, the data. So, it’s very important to handle data uniformly and that actually survives. And, that outlasts almost any coding language, or programming language, or any new architecture pattern, if you will.

So, the second part of it that we see, is it’s also important to make sure that modernization principles are being applied to the data itself. So, if you have data which is, let’s say in certain formats, that may not be easily shareable or accessible, then as part of not only converging and consolidating data across multiple systems, could you also modernize it so that it’s easily shareable. And, this pertains to these concepts of citizen data, or data democratization. The time where there were a few experts in an organization, and they had the power because they could understand and interpret the data through their programs, that is a passé view of the world. What we increasingly see is that there are smart people, there’re smart people in your marketing organization, engineering organization, in your financial organization, risk organization.

And, these guys want to look at data in very different ways to kind of understand what value could they add to the business. So, in a sense, I think data becomes the product at some point. And so, unless your products are increasingly beginning to reflect the importance of data within them, I think that you’re going to have more of an antiquated type of service, antiquated type of offering. So, I think data is hugely important, it’s going to continue to be important. And, the last point I’ll make there, in my own view, is we’ve taken several steps at this to try and get everything together maybe in Hadoop, or in maybe data warehouses, and operational systems.

I strongly think that’s going to continue to exist. There’s no such thing in my view as getting all the data in one place, this one-size-fits-all paradigm. I think many have commented on this before. It is just a super complex, super challenging thing to arrive at, both in terms of the throughput and the latency requirements that all different types of diverse applications have. So, I don’t know if I answered your question. That’s kind of the overall take on data in my view.

Swapnil Bhartiya: No, you did answer. Also, this is a topic which is open-ended, there are so many ways where we can go with that. So, thanks for explaining that. Now we talked about real-time, of course, integration, we talked about industries which are using it. Now, I want to talk a bit about what are you folks doing to, I mean, of course, help users? That’s what we are here for, right? To help users with what our technology folks are building there. And, if you can also share the iterations of the platform itself, the version 4.0 was out. So, in a way, what I want to understand is that let’s just reflect on this decision, and how much of that is there info that you released to address some of the challenges to solve some of the problems. So, talk about it.

Alok Pareek: Yeah, sure. So, let me actually, just maybe step a little bit deeper and peel some of the layers. So, Striim is a platform, it has two Is, as you know, it’s spelled deliberately. Because one of the Is represents integration and the other I represents intelligence. And, when we talk about intelligence, we talk about it a lot. The founders in this company had a very strong background in database application in our prior products. And then, I had a database background. So, one of the things is, this requirement and need for real-time data, we’ve been working with this for many years. So, I would say close to 15, 20 years. What was missing was, as you move data, there’s a lot of trending, and a lot of analytics that customers begin to ask us. Could you go ahead and run these analytics for me?

So, the question is why do you care? I mean, you have such amazing analytic systems, why can’t you just move the data and then analyze it? Okay. And, it’s a very interesting analogy here of, if you’re… Maybe, just give a simple example, if I have a large jar and I’m trying to put in different types of marbles in it, which have different colors. If I already know that someone’s going to ask me for count of the marbles by color, I can dump all the marbles in the jar and then actually go ahead and allocate a few people to sort them out and do a count. And then, that’s a group by query, and then you can actually come up with a result.

But in many cases, if I know ahead of time, that that question is interesting. The best time to count that up is while you are adding the marbles in the jar. Okay. And this is the intelligence piece that is a foundational principle in Striim. That, not only is it a real-time data integration platform, but we are aware, as the data is coming in, we have the luxury to count up the data and the metadata, so that there’s a number of continuous queries that can be answered in real-time. And, if I’m looking out for example, a window, and I should be able to tell you that there are 3 blue cars, and 15 orange cars, and 2 trucks and so forth, no matter what, it’s my view.
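Editor's note: the marble analogy can be sketched in a few lines of Python. This is only an illustration of the general idea (counting while the data arrives instead of running a group-by afterwards, plus a count over a sliding window of recent events). It is not Striim's API; the names `batch_count`, `StreamingCounter`, and `SlidingWindowCounter` are hypothetical and chosen for this example.

```python
from collections import Counter, deque


def batch_count(jar):
    """Batch style: everything is already in the jar, then we group by color."""
    return Counter(jar)


class StreamingCounter:
    """Streaming style: the per-color count is updated as each marble arrives."""

    def __init__(self):
        self.counts = Counter()

    def add(self, color):
        self.counts[color] += 1
        return self.counts[color]


class SlidingWindowCounter:
    """Per-key counts over only the most recent `size` events."""

    def __init__(self, size):
        self.size = size
        self.events = deque()
        self.counts = Counter()

    def add(self, key):
        self.events.append(key)
        self.counts[key] += 1
        # Evict the oldest event once the window is over capacity.
        if len(self.events) > self.size:
            old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]
        return dict(self.counts)


marbles = ["red", "blue", "red", "green", "blue", "red"]

streaming = StreamingCounter()
window = SlidingWindowCounter(size=3)
for marble in marbles:
    streaming.add(marble)           # answer is always up to date
    latest_window = window.add(marble)

batch_result = batch_count(marbles)  # answer only after a full pass
```

Both approaches end with the same totals. The difference is that the streaming counter can answer at any moment during ingestion, while the batch version has to wait for the data to land and then scan it.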

So, this windowing capability, is something that’s a very powerful thing that can be done on the fly, in the integration pipeline. This is where Striim is very, very different compared to a lot of the traditional players. In fact, many of the emerging startups are beginning to think about this when they talk about streaming analytics, but we’ve had streaming analytics in our data integration platform from day one, that’s how it was designed. In fact, we had our own patent filing way back in 2014, and it’s been granted. So, people are welcome to read it. So, this is what’s new. So, now coming back to your question about Striim 4.0 specifically. Over the years, we put in a lot of the features and the functionality into the platform. So, we focused a lot on making sure there’s hundreds of connectors, and we can move data in real-time, and the system could be scalable. So, we solved those problems.

In 4.0, the key problem in my view that we have solved is number one, just making it simple, and making it automated. So, the user experience part of it, how do I actually design a pipeline so that it’s simple, and I don’t have to make too many modifications. And, almost as if I’m judging the intent of the user, and giving you a very, very rich experience, smooth experience. So, we’ve focused a lot on a number of features to do that, where we’ve reorganized our flow designers, and there’s a lot of out-of-the-box wizards that users can use. So, I think there’s a lot of emphasis in that area. Number two, is on the monitoring side, because one of the things that we learned along the way, was when you have real-time… If you move in real-time, by definition when that data is not available, it becomes somewhat mission critical, because your businesses and services start relying on it.

And this is a very interesting example. I was talking to one of the CIOs of a telecommunications company. And, it was a little bit of an interesting discussion and an escalation. And, they told me, look, “We lost 15, 20 minutes on this thing, because they’ve done some maintenance.” And I asked them, “Well, what were you guys doing before using us?” And, the response was, “Well, we were getting data, we would basically have all these subscriptions, and end of the day you’d get a report.” But, once you implement real-time, you’re getting the report every minute. And so, all of the executives in the company start questioning, “Hey, what is the current count if you’re not getting that data?” So, coming back to the second piece of it, the monitoring around it is also something that we’ve invested a lot in, to try to make it simple.

A very simple example I gave is when you go to an airport and you walk to the baggage claim area, there’s a monitor there. And the monitor, when it says your first bag has arrived, that gives you a sense of reassurance. You don’t have to do it, but when it’s there, you actually appreciate the value of it. So, we’ve tried to give that kind of an experience for our data subscribers who are building these pipelines. So, when the first event in a specific pipeline comes in, or if that event is absent for a while and you’re expecting it, how do you alert on it? So, there’s a lot of emphasis on simplicity and monitoring, so that these critical applications have the right alerts and people can actually respond to them in a very business-efficient manner.
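Editor's note: the two alert conditions described here (the first event arriving, and an expected event being absent for too long) can be sketched as a small watchdog. This is a minimal illustration under stated assumptions, not how Striim implements monitoring. The class `PipelineMonitor`, its parameters, and the alert labels are hypothetical, and timestamps are passed in explicitly as plain seconds so the example does not depend on a real clock.

```python
class PipelineMonitor:
    """Records a notice on the first event and an alert after too much silence."""

    def __init__(self, max_silence_seconds, started_at=0.0):
        self.max_silence = max_silence_seconds
        self.started_at = started_at
        self.last_event_at = None   # no event seen yet
        self.alerts = []

    def on_event(self, now):
        # The "your first bag has arrived" moment.
        if self.last_event_at is None:
            self.alerts.append(("first_event", now))
        self.last_event_at = now

    def check(self, now):
        # Measure silence from the last event, or from startup if none arrived.
        reference = self.started_at if self.last_event_at is None else self.last_event_at
        if now - reference > self.max_silence:
            self.alerts.append(("silence", now))
            return True
        return False


monitor = PipelineMonitor(max_silence_seconds=60)
monitor.on_event(now=10)          # first event arrives 10 s after startup
quiet_early = monitor.check(now=50)   # 40 s of silence, within tolerance
quiet_late = monitor.check(now=100)   # 90 s of silence, exceeds tolerance
```

In a real deployment the `check` call would run on a timer and the alert would go to whoever owns the pipeline. The point of the sketch is only that absence of data has to be detected actively, because a silent pipeline produces no event to react to.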

Swapnil Bhartiya: Excellent. Excellent. Once again, thanks for explaining things in detail. I think now I have everything, we understand about the company, we understand of course the latest release, and also thanks for going back a bit and explaining the Striim origin as well. I think now I have everything else, and of course, we can always get you folks back on the show whenever you want. Should we wrap this up for today?

Alok Pareek: Sounds good, Swapnil, I mean, unless you have any other questions, I’m always happy to come back. And, I just wanted to just also mention that today we have… Almost every city block that you travel, you’ll definitely see a Striim customer. These are trucks going by delivering packets. These are retailers. These are shops that you can walk into. And, I’m really excited that we are part of this modern digital economy, that a lot of the customers are beginning to rely on our real-time data pipelines.

Swapnil Bhartiya: Excellent. Alok, thank you so much for taking time out today to talk about, of course, the Striim platform and, more importantly, the challenges for real-time data integration and how people should actually focus on data architecture as well. So, thanks for sharing all those insights. And, as you rightly said, let’s get you back on the show soon, but thanks for your time today. Thank you.

Alok Pareek: Thank you.