Hello System Management!

Posted May 28, 2019 in Blog, EdgeX Foundry

Written by Akram Ahmad, EdgeX Foundry contributor and Principal Software Engineer at Dell Technologies

For those of you not yet familiar with the canonical way of introducing new technology in the world of computer programming (I am thinking specifically of the “Hello World!” program made famous by programming legends Kernighan and Ritchie with their C programming language), please allow me to clarify what may be an admittedly enigmatic title for this blog post. Essentially, it was with the EdgeX Foundry Delhi Release that the team had the pleasure of introducing EdgeX System Management capability to the world! Hence, “Hello System Management!” (More on the Edinburgh Release in just a bit.)

It has been my privilege to help design, implement, and shepherd System Management (or “SM” for short) to date, and I look forward to continuing that work. With that in mind, I would like to give you a flavor of the capabilities that SM brings to the table.

You can think of the System Management Agent (SMA), in particular, as a brand-new service which serves as the coordinator for control plane information (i.e. status, configuration, and metrics for EdgeX services). The SMA also performs control actions on EdgeX services (i.e. starting, stopping, and restarting them). Cloud or third-party systems can, in turn, call on the API provided by the SMA to trigger those actions or to get the control plane data they need. In a nutshell, the SMA can serve as a one-stop shop for managing a deployed instance of EdgeX.

Each EdgeX microservice has a corresponding management API that the SMA calls on to help control that service (e.g. to stop the service) or to fetch its latest configuration or metrics. The SMA, along with the management API provided by each service, will be expanded in future releases of EdgeX and will one day offer control plane data and actions via alternate protocols (for example, via SNMP, the well-known network management protocol that is part of the TCP/IP suite powering the Internet as we know it today).
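To make that division of labor a little more concrete, here is a minimal, in-memory sketch of an agent that forwards start, stop, and restart actions to the services it manages. Please treat it as an illustration only: the class names `ManagedService` and `ManagementAgent`, the `operate` method, and the service names are all hypothetical, and nothing here is the actual SMA API.

```python
from dataclasses import dataclass


@dataclass
class ManagedService:
    """Hypothetical stand-in for the management API of one EdgeX service."""
    name: str
    running: bool = True

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False


class ManagementAgent:
    """Hypothetical coordinator that forwards actions to managed services."""

    ACTIONS = ("start", "stop", "restart")

    def __init__(self, services):
        # Index the services by name so callers can address them individually.
        self._services = {service.name: service for service in services}

    def operate(self, action, names):
        """Apply one action to each named service and report the outcome."""
        if action not in self.ACTIONS:
            raise ValueError(f"unsupported action: {action}")
        results = {}
        for name in names:
            service = self._services.get(name)
            if service is None:
                results[name] = "unknown service"
                continue
            # A restart is modeled as a stop followed by a start.
            if action in ("stop", "restart"):
                service.stop()
            if action in ("start", "restart"):
                service.start()
            results[name] = "running" if service.running else "stopped"
        return results


# Illustrative service names; a real deployment defines its own.
agent = ManagementAgent([ManagedService("core-data"), ManagedService("core-metadata")])
print(agent.operate("stop", ["core-data"]))
```

The point of the sketch is simply that the caller talks to one object (the agent) and names the services it cares about, while the agent does the per-service work.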

I invite you to hold on to the thought that, for the constellation of services that will be offered via EdgeX, there needs to be a “controller” of sorts…

Now, let’s turn to the truism that an IoT platform like EdgeX is used to collect data from “Things.” Put another way, the platform ingests data that is physically sensed by IoT sensors and devices. Work associated with collecting, managing, and distributing sensed data is exactly the kind of work associated with a “data plane.” On the other hand, the kind of work associated with operating and managing the IoT platform software and infrastructure is best described as “control plane” operations.

Control plane work includes getting the IoT platform and infrastructure running (or shut down), configuring the platform software for the particular use case, and understanding the health and status of the software platform (is it running, and what resources is it using?). Analysis of control plane data may be used to take action as well, but that action revolves around the IoT platform itself, not the sensed or controlled world. For example, in the control plane, it may be determined that a service needs to be restarted because it is consuming too much memory.
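The memory example above boils down to a small decision rule. Here is a sketch of it, assuming the metrics arrive as a simple mapping of service name to bytes in use; the function name `services_to_restart`, the 256 MiB limit, and the readings are made up for illustration.

```python
def services_to_restart(memory_by_service, limit_bytes):
    """Return the names of services whose reported memory exceeds the limit."""
    return sorted(
        name for name, used in memory_by_service.items() if used > limit_bytes
    )


MIB = 1024 ** 2

# Hypothetical readings: one service is well under the limit, one is over it.
readings = {"core-data": 80 * MIB, "core-command": 310 * MIB}
print(services_to_restart(readings, limit_bytes=256 * MIB))  # prints ['core-command']
```

Notice that nothing in this rule looks at sensor readings. It acts only on information about the platform itself, which is what makes it a control plane decision.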

This is where the SMA comes in!

The System Management (SM) service will assist in protecting EdgeX by reducing the surface area open to an API attack. Rather than opening up access to every service to the central management system, the SM service acts as a single-point proxy to the control plane of all EdgeX services. The SMA, therefore, reduces the number of access points to EdgeX and, with it, the number of potential security vulnerabilities. It also allows the central management system to be loosely coupled to EdgeX: for any EdgeX deployment, the central management system needs to know about just one access address (the address of the SM service).
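One way to picture the single access point is as a lookup that only the proxy holds. In the sketch below, the central system would know just the proxy, while the internal service addresses stay private to it. The class name `SingleEntryProxy`, the resource names, and the hostnames and ports are all assumptions made for the sake of the example, not EdgeX defaults.

```python
class SingleEntryProxy:
    """Hypothetical proxy that hides per-service addresses behind one entry point."""

    RESOURCES = ("config", "metrics", "health")

    def __init__(self, internal_addresses):
        # Only the proxy holds this mapping; callers never see it directly.
        self._internal = dict(internal_addresses)

    def resolve(self, service, resource):
        """Translate a (service, resource) request into an internal URL."""
        if resource not in self.RESOURCES:
            raise ValueError(f"unsupported resource: {resource}")
        if service not in self._internal:
            raise KeyError(f"unknown service: {service}")
        return f"{self._internal[service]}/{resource}"


# Made-up internal addresses, for illustration only.
proxy = SingleEntryProxy({
    "core-data": "http://core-data.internal:9001",
    "core-command": "http://core-command.internal:9002",
})
print(proxy.resolve("core-data", "metrics"))
```

Adding a service to the deployment changes the proxy's mapping, but the central management system still has only the one address to remember.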

Before digging deeper, let’s recap what we’ve learned so far: System Management (SM) functionality, as determined by the EdgeX community, is generally associated with control plane data and operations. The control plane (and System Management) is about managing the IoT platform and infrastructure. The data plane is all about managing and understanding the physical world that the IoT platform is there to observe and control. Think about it: Whether one is talking about towering skyscrapers or flimsy tents rigged on the grounds of a park, there remains, as ever, the crucial need for control. Without coordination, things can get chaotic in a heartbeat.

Crucially, SM is also about retrieving and providing information about the status of the services it manages. Eventually, building on this capability, SM will provide the means to reconfigure the services themselves. At this time, with the Edinburgh Release, SM can provide performance and memory metrics for requested services. Likewise, SM can provide detailed configuration information for the services requested by users of SM, as well as the health status of those services (whether given services are up or down).
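As a rough sketch of the “up or down” part, the function below builds a health report for a list of requested services. How a service is actually probed is left to a caller-supplied function, since that detail is not covered here; the names `health_report` and `fake_probe`, and the service names, are hypothetical.

```python
def health_report(names, probe):
    """Report "up" or "down" for each requested service.

    `probe` is any callable that takes a service name and returns True when
    the service responds. A probe that raises is treated as "down".
    """
    report = {}
    for name in names:
        try:
            report[name] = "up" if probe(name) else "down"
        except Exception:
            report[name] = "down"
    return report


def fake_probe(name):
    """Stand-in probe: pretends one service is unreachable."""
    if name == "core-command":
        raise ConnectionError("no response")
    return name == "core-data"


print(health_report(["core-data", "core-metadata", "core-command"], fake_probe))
```

Treating a failed probe as “down” rather than as an error keeps the report complete even when some services cannot be reached.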

In other words, while control is a critical capability, SM is about more than just control. By the same token, we want to make it abundantly clear that we are building System Management (SM) capability to facilitate other central systems, and not to be those central systems. In a nutshell, EdgeX SM is about helping promote interoperability: in this case, allowing you to manage EdgeX with your choice of central management system.

Let’s shift gears a bit now: When you look at a typical fog deployment, a larger management system will want to manage the control plane of the edge systems as well as all the intermediate and upper level nodes and resources of the overall deployment. Just as there is a management system to control all the nodes and infrastructure within a cloud data center, and across cloud data centers, so too there will likely be management systems that will manage and control all the nodes (from edge to cloud) and infrastructure of a complete fog or IoT deployment.

If you will be so kind as to allow me the use of just one more metaphor, it will be this one: Think of a team of workhorses ploughing the land (EdgeX services). Then think of the driver (System Management). Finally, and without going too crazy about the farming metaphor (all metaphors, including this one, can carry only so much water), I invite you to imagine two scenarios: (1) the one without the other, and (2) the two (i.e. the team of workhorses and the driver) working in unison. If you associated chaos with the first scenario, and clockwork unison with the second, you are in good company.

So with the Edinburgh Release, we will continue building SM capability to facilitate other central systems. Again, the goal is not to be those central systems, but rather to facilitate those systems. May your System Management (SM) learnings continue, and may the community be the better for it!

If you have questions or comments, visit the EdgeX Foundry Slack Channel and share your thoughts in the #community channel. Or, join the LF Edge Slack Channel and share your thoughts in the #EdgeX channel.