
What You Need to Move from the Edge to the Core, and Back Again


The edge and the core complement each other well. While each has its own processing capabilities and benefits, it's often best to have the two working in concert, so organizations can take advantage of instantaneous processing at the edge and the core's power to derive actionable intelligence from their datasets.

Tight coupling of compute and storage resources at the edge is critical for fast processing as data becomes available. This will be particularly important as organizations move from batch-oriented systems to real-time data processing. The ability to manage this processing is the power of an edge-oriented architecture.

Additionally, organizations need to aggregate large amounts of data from many edge locations, and run machine learning algorithms on those aggregated data sets to understand trends on a more global level. That requires getting those data sets from the edge to the core, and getting the refined algorithms that enable instant processing back to the edge, while initiating processing and analysis in real time. This calls for a combination of infrastructure and an automated data pipeline that can process, transfer and analyze large volumes of data remarkably fast.

Let’s take a look at how data processing from the edge to the core (and back again) is transforming cities. Traffic lights equipped with cameras that monitor traffic patterns, speeding and more, can process data in real-time and provide insights into congestion, pedestrian flow and other factors. They can act autonomously and respond immediately to what’s happening within those patterns to help keep traffic moving. The data they capture can be sent back to the core for deeper processing that can be used to further enhance the algorithms the traffic cameras will use in the future to optimize traffic flow. City planners can also use this information to glean insights for longer-term infrastructure planning.

A serverless and automated pipeline

Serverless computing, paired with an automated data pipeline, is the engine that drives the edge-to-core process. Made easier by open source projects like the Kubernetes-based Knative, serverless supports accelerated, agile application development. Serverless also automatically scales up or down with demand, saving cost and memory.

“Eventing” is one of the most important facets of Knative serverless computing. Knative defines eventing as the “universal subscription, delivery and management of events. Developers can build modern apps by attaching compute resources to a data stream with declarative event connectivity and a developer-friendly object model.”

In eventing, data uploaded into buckets triggers an event: for example, a message noting that a file of a certain size was uploaded at a particular time. The event is passed down to the serverless function, which connects to the storage, retrieves the file and analyzes it. The data is thus processed automatically and instantaneously, without a system having to continually poll for new content.
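The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a real Knative function: the event field names (bucket, name, size) are hypothetical and vary by storage provider, and the storage call is injected as a plain function so the sketch stays self-contained.

```python
# Sketch of an event-driven handler: a file lands in a bucket, an event
# describing it is delivered, and the function retrieves and analyzes the file.

def parse_upload_event(event: dict) -> dict:
    """Extract the bucket and object name that triggered the event."""
    data = event["data"]
    return {"bucket": data["bucket"], "object": data["name"], "size": data["size"]}

def handle_event(event: dict, fetch) -> dict:
    """React to an upload event: locate the file, retrieve it, analyze it.

    `fetch` stands in for a storage-client call (e.g. a GET against the
    bucket); injecting it keeps the handler storage-agnostic.
    """
    info = parse_upload_event(event)
    payload = fetch(info["bucket"], info["object"])   # retrieve the file
    info["lines"] = payload.count(b"\n")              # placeholder "analysis"
    return info

# Simulate the notification a bucket upload might emit.
event = {
    "type": "storage.object.finalized",
    "data": {"bucket": "edge-uploads", "name": "sensor-0042.csv", "size": 2048},
}
result = handle_event(event, fetch=lambda b, o: b"ts,speed\n0,41\n1,38\n")
print(result["object"], result["lines"])
```

The key point the sketch captures is that nothing polls: the handler runs only when an event arrives, which is what lets a serverless platform scale the function to zero between uploads.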

During the initial assessment phases, inference at the edge allows real-time processing and analysis to take place in the field. Raw data derived from the inference is sent back to the core for more intensive machine learning and training. New inference models are then sent from the core back to the edge to inform processing of the next set of data, and so on.
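The round trip described above can be reduced to a toy loop. In this sketch the "model" is just a numeric threshold and "training" recomputes it from the aggregated raw data; a real deployment would exchange serialized ML models rather than a single number.

```python
# Toy edge-core loop: the edge scores readings with its current model,
# ships the raw readings to the core, and the core returns an updated model.

def edge_infer(readings, threshold):
    """Edge-side inference: flag readings above the current threshold."""
    return [r > threshold for r in readings]

def core_train(all_readings):
    """Core-side 'training': recompute the threshold from aggregated data."""
    return sum(all_readings) / len(all_readings)  # mean as the new cutoff

threshold = 50.0                              # initial model pushed to the edge
aggregated = []
for batch in [[40, 60, 55], [30, 70, 65]]:    # data arriving at the edge
    flags = edge_infer(batch, threshold)      # real-time inference in the field
    aggregated.extend(batch)                  # raw data sent back to the core
    threshold = core_train(aggregated)        # core trains, ships a new model
print(round(threshold, 2))
```

Each pass through the loop is one cycle of the edge-to-core-and-back pattern: the edge acts immediately on what it sees, while the core keeps refining the model from the growing aggregate.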

Potential in other industries

Smart cities are certainly not the only use case for serverless computing and automated data pipelines. Other industries, including healthcare, can also benefit from the ability to move data from the edge to the core and back.

For example, say a patient’s CT scans are placed into a data pipeline that runs between a doctor’s office and a central location (the core). Once they reach the core, a learning process can be triggered automatically, whereby artificial intelligence and machine learning compare those CTs with others.

Additional insights can then be derived from this learning process. Those insights can be instantaneously delivered back to the software the doctor is using to analyze the images, helping her provide a more accurate diagnosis for the patient. Each time this happens, the system will become smarter, leading to continuous improvements in patient care.

The right infrastructure for the job

Data and infrastructure have always been intricately intertwined, and that will remain true for the foreseeable future. Indeed, as organizations move closer to the edge, the need for an infrastructure that can support the intelligent movement of data from edge-to-core and back again has become more important than ever.

There must be a system that allows this transfer of information to happen quickly, securely, reliably and automatically. Open, serverless infrastructures and data pipelines are the answer. With this infrastructure in place, organizations will be able to leverage the edge and the core working together to process large amounts of information and derive actionable intelligence in moments, not hours or days.