
eKuiper Issues 1.4 Release & Discusses 2022 Roadmap


Written by Jiyong Huang, Chair of the eKuiper Technical Steering Committee and Senior Software Engineer of EMQ

eKuiper, a Stage 1 (at-large) project under the LF Edge umbrella, is a Go implementation of a lightweight stream processing engine that can run in various IoT edge scenarios for real-time data analysis. Running near the data source at the edge, eKuiper improves system response speed and security while saving network bandwidth and storage costs.

The eKuiper project usually has one release per quarter with new features, followed by several patch versions. The current active release is v1.4, the biggest update since joining LF Edge! We had solid community contributions for this release, including requirement collection, code contributions, testing, and trials. We are grateful to share the updates and hope they benefit our users.

Rule Pipeline: Building Complex Business Logic with Flexibility

eKuiper uses SQL to define business logic, which lowers the development threshold. Simple SQL statements can efficiently express practical requirements such as filtering, aggregation, and data conversion. However, some complex scenarios are difficult to address with a single SQL statement; even when it is possible, the statement itself becomes too complex to maintain.

Based on the new in-memory source and sink, the rule pipeline connects multiple SQL rules easily, efficiently, and flexibly, improving the readability and maintainability of SQL statements in complex business scenarios. Rules are connected with in-memory topics that resemble MQTT topics and support wildcard subscriptions, enabling an exceptionally flexible and efficient pipeline. Besides improving business expressiveness, rule pipelining can also improve runtime performance in certain complex scenarios. For example, when multiple rules need to process data filtered by the same condition, extracting that filter into a predecessor rule means the filtering is computed only once, significantly reducing the overall computation when there are many rules.
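
As a minimal sketch of such a pipeline (the stream, rule, and topic names here are hypothetical), a predecessor rule publishes its filtered output to an in-memory topic, and a downstream stream bound to that topic consumes it:

```
# Predecessor rule: filter once and publish to an in-memory topic
{
  "id": "ruleFilter",
  "sql": "SELECT * FROM demoStream WHERE temperature > 30",
  "actions": [ { "memory": { "topic": "pipeline/filtered" } } ]
}

# Downstream stream bound to that topic (wildcards such as pipeline/# also work)
CREATE STREAM filteredStream () WITH (DATASOURCE="pipeline/filtered", TYPE="memory", FORMAT="json");

# Downstream rule: aggregate the already-filtered data
{
  "id": "ruleAggregate",
  "sql": "SELECT avg(temperature) AS avg_temp FROM filteredStream GROUP BY TUMBLINGWINDOW(ss, 10)",
  "actions": [ { "log": {} } ]
}
```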

Portable Plugin: Making Extensions Easier

The original version of eKuiper supported an extension scheme based on the native Go plugin system, with individual extension points for source, sink, and function (UDF). However, due to the limitations of the Go plugin system, writing and using plugins is not easy even for users familiar with Go, let alone users of other languages. The community has given eKuiper a lot of feedback about plugin development, operation, and deployment, along with various operational issues.

To balance development efficiency and runtime efficiency, v1.4.0 adds a new portable plugin system that lowers the threshold of plugin development. The new portable plugin uses the nng protocol for inter-process communication and supports multiple languages: Go and Python SDKs are provided now, and more SDKs will be added in subsequent versions according to user requirements. The compile/deploy process is simplified, and a portable plugin runs like a normal program written in its language, without additional restrictions. Because portable plugins run in separate processes, a plugin crash will not affect eKuiper itself.

Native plugins and portable plugins can coexist. Users can choose either plugin implementation, or mix them, according to their needs.

Shared Connections: Source/Sink Multiplexed Connections

eKuiper provides a rich set of sources and sinks to access external systems and send results to them. Many of these are input/output pairs for the same external system type; for example, MQTT and EdgeX each have a corresponding source and sink. In the new version, users can put all connection-related configuration in the connection.yaml file and then specify in the source/sink configuration which connection to use, without repeating the configuration. Shared connection instances reduce the overhead of maintaining multiple connections. In scenarios where users are limited in the number of connections or ports to external systems, shared connections help meet those limits. Shared connections also allow eKuiper to support the EdgeX secure data bus.
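
As an illustrative sketch (the server address, credentials, and topic are hypothetical), a connection defined once in connection.yaml can be referenced from a source configuration with a connection selector:

```
# connections/connection.yaml: define the shared connection once
mqtt:
  localConnection:
    server: "tcp://127.0.0.1:1883"
    username: "ekuiper"
    password: "secret"

# mqtt source configuration: reference the shared connection
demo_conf:
  connectionSelector: mqtt.localConnection
  topic: "devices/+/messages"
```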

Other enhancements

  • Support for configuration via environment variables
  • By default, eKuiper uses SQLite, an embedded database, to store metadata such as streams and rules, allowing for no external dependencies at runtime. In the new version, users can choose Redis as the metadata storage solution
  • Rule status returns error reason
  • Optimized SQL runtime, reducing CPU usage by up to 70% in a shared-source user scenario
  • Sink dynamic parameter support: e.g., an MQTT sink can set the topic to a field value in the result so that data can be sent to a dynamic topic (see the sketch after this list)
  • Authentication support: user-configurable JWT-based authentication for REST API
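
A minimal sketch of the dynamic sink parameter (the field and topic names are hypothetical) uses eKuiper's data template syntax to resolve the MQTT topic from each result:

```
{
  "id": "ruleDynamicTopic",
  "sql": "SELECT deviceId, temperature FROM demoStream",
  "actions": [
    {
      "mqtt": {
        "server": "tcp://127.0.0.1:1883",
        "topic": "readings/{{.deviceId}}"
      }
    }
  ]
}
```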

2022 Roadmap

As a young project, eKuiper is still far from perfect. There is a long list of feature requests from users and the community. eKuiper will continue to grow in the new year and remains open to anyone interested in making edge computing both powerful and easy to use. The team will focus on four themes:

  • Enrich API and syntax
    • more SQL clauses (IN, BETWEEN, LIKE, etc.)
    • streaming features (dynamic table, etc.)
    • more handy functions (changed, unnest, statistical functions)
  • Stability & Performance
    • Allow features to be selected by build tag and provide a minimal core version
    • Incremental window
    • Continuous memory footprint and CPU optimization
  • Extension
    • Stabilize portable plugin runtime and support more languages
    • More HTTP source types
  • Build & Deployment
    • Cluster and HA
    • Android support

Please check the 2022 Roadmap in our GitHub project for details, and stay tuned to the eKuiper community.

Cloud Services at the Edge


This post was first published on the IBM blog at this link; it has been reposted here with permission. Some content has been redacted so as not to be seen as an endorsement by LF Edge.

By Ashok Iyengar, Executive Cloud Architect & Gerald Coon, Architect & Dev Leader, IBM Cloud Satellite

Where the enterprise edge ends, where the far edge begins and what, if any, are the various points of intersection?

What do AWS Outpost, Azure Stack, Google Anthos and IBM Cloud Satellite have in common? Each one of them is essentially an extension of their public cloud offering into an enterprise’s on-premises location or edge facility. This is, in fact, the hybrid cloud platform paradigm.

Each vendor's offering has its nuances, and they even support different hardware for building the on-premises components of a hybrid cloud infrastructure. But the end goal is the same: combine the compute and storage of public cloud services with an enterprise's data center, what some might call the Enterprise Edge. It is worth pointing out that IBM Cloud Satellite is built on the value of Red Hat OpenShift Container Platform (RHOCP). This blog post will discuss where the enterprise edge ends, where the far edge begins and what, if any, are the various points of intersection.

To reiterate from previous blogs in this series, edge encompasses far edge devices all the way to the public cloud, with enterprise edge and network edge along the way. The various edges (network, enterprise, far edge) are shown on the left side of Figure 1 along with the major components of a platform product that include the cloud region, the tunnel link, a control plane, and different remote Satellite locations:

Figure 1. Different edges and IBM Cloud Satellite components.

Note that one may need more than one control plane; for example, a telco location for the network team and a development location for deploying edge services.

Please make sure to check out all the installments in this series of blog posts on edge computing.

Edge components

As we have mentioned in our other blogs in this series, there are three main components in an edge topology, no matter which edge we are talking about:

  • A central hub that orchestrates and manages edge services deployment.
  • Containerized services or applications that can run on edge devices.
  • Edge nodes or devices where the applications run and data is generated.

Some edge solutions do not use agents on edge devices, while others like IBM Edge Application Manager require an agent installed on each device. An agent is a small piece of code running on edge nodes or devices to facilitate the deployment and monitoring of applications. Refer to “Architecting at the Edge” for more information.

Which cloud?

In most cases, these platform products that bring public cloud services to an on-premises location work with one cloud provider. AWS Outpost, for example, is a hardware solution meant to work only with AWS. IBM Cloud Satellite, on the other hand, has certain connectivity and resource requirements (CPU/memory) but is agnostic to the hardware. The requirements generally begin at the operating system level (Red Hat) and leave the hardware purchasing to the customer. The Red Hat hosts provided can even be EC2 instances in AWS or other cloud providers. This means IBM Cloud Satellite can bring IBM Cloud services to remote locations, as well as services from AWS, Azure, Google Cloud and more that are planned.

[…]

Overlapping or complementary technologies?

We hear the phrase “cloud-out” when describing the compute moving out toward the edge. But what we see from Figure 1 is that the services brought on-premises from the public cloud cannot quite be extended out to the far edge devices. That is where one would require a product like the IBM Edge Application Manager to deploy and manage services at scale.

A common challenge of edge workloads is training the artificial intelligence (AI) and machine learning (ML) models and using predictive model inferencing. An IBM Cloud Satellite location can act as the platform in close proximity where data can be stored and accessed, and AI/ML models can be trained and retrained before they are deployed on to edge devices. Or the apps running on the edge nodes could access a suite of AI/ML services via the Satellite location. Thus, low latency and data sovereignty are two major reasons why enterprises would want to deploy such solutions. Compliance and other security requirements are easier to implement when the cloud object storage or database is physically located on-premises.

It is easy to envision a use case where a retail chain would use a product like AWS Outpost or IBM Cloud Satellite to establish a satellite location in a city. That satellite location could then provide the required cloud-based services to all its stores in that city. These could be a common set of services like AI/ML analytics, access policies, security controls, databases, etc. — providing consistency across all environments. Consistency and access to a large set of powerful processing services are additional advantages of such deployments.

Another common example is with telecommunication service providers that are looking to monetize 5G technology by offering cloud services to their customers. Figure 3 shows a Telco MEC (Mobile Edge Computing) topology making use of IBM Cloud Satellite, IBM Edge Application Manager (IEAM) and Red Hat OpenShift Container Platform (RHOCP):

Figure 3. Telco MEC topology using IBM Cloud Satellite and IEAM services.

To provide a bit more context, MEC effectively offers localized cloud servers and services rather than depending on a larger, centralized cloud. This basically means the edge/IoT devices will communicate with more, smaller data hubs that are physically closer to them (i.e., on the “edge” of the network). Rather than online games having to send data to a distant central server, process it and send back a response — all of which slows down overall communication speeds — they will be able to access more power, closer to the gamers.

Wrap-up

In addition to the millions of devices, IoT and edge computing have the challenge of accessing and storing data in the “last mile.” Products like AWS Outpost, Azure Stack, Google Anthos and IBM Cloud Satellite complement IoT and Edge topologies. In fact, the IBM Edge Application Manager Hub is often deployed in a Satellite location or resides in the cloud. The combination of the two technologies provides a compelling solution that companies in healthcare, telecommunications and banking can use. The agnostic nature of IBM Cloud Satellite even allows it to not only bring IBM Cloud services to remote locations but also services from AWS, Azure and Google Cloud.

The IBM Cloud architecture center offers up many hybrid and multicloud reference architectures including AI frameworks. Look for the IBM Edge Computing reference architecture here.

This blog post talked about bringing cloud services to the edge in what is commonly called distributed cloud or “cloud out.” It offers the best of both worlds — public cloud services and secure on-premises infrastructure. The folks at mimik have a very interesting notion of “edge in,” wherein they describe a world of microservices, edge-device-to-edge-device communication and creating a sort of service mesh that expands the power of the edge devices toward the cloud.

Let us know what you think.

Special thanks to Joe Pearson, David Booz, Jeff Sloyer and Bill Lambertson for reviewing the article.

Introduction: Akraino’s Federated Multi-Access Edge Cloud Platform Blueprint in R5


Introduction

This blog provides an overview of the “Federated Multi-Access Edge Cloud Platform” blueprint, part of the Akraino Public Cloud Edge Interface (PCEI) blueprint family, and specifically of the key features and components implemented in Akraino Release 5. The key idea of this blueprint is a federated Multi-Access Edge Cloud (MEC) platform that showcases the interplay between the telco side and the cloud/edge side.

Prior to discussing the specifics of this blueprint implementation, the following subsection provides a brief description of what Multi-Access Edge Cloud (MEC) is and how it is emerging as an enabler for the 5G/AI-based application landscape.

What is MEC and what are its challenges?

MEC is a network architecture concept that enables cloud computing and IT capabilities to run at the edge of the network. Applications or services running on edge nodes – which are closer to end users – instead of on the cloud can enjoy the benefits of lower latency and an enhanced end-user experience. MEC essentially moves the intelligence and service control from centralized data centers to the edge of the network, closer to the users. Instead of backhauling all the data to a central site for processing, data can be analyzed, processed, and stored locally, and shared upstream when needed. MEC solutions are closely “integrated” with access networks; such environments often include WiFi and mobile access protocols such as 4G LTE/5G.

MEC opens up a plethora of potential vertical and horizontal use cases, such as Autonomous Vehicles (AV), Augmented Reality (AR) and Virtual Reality (VR), gaming, and applications enabled by Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL), like autonomous navigation, remote monitoring using Natural Language Processing (NLP) or facial recognition, video analysis, and more. These emerging 5G/AI-based applications typically exhibit characteristics such as the following:

  • Low Latency requirements
  • Mobility
  • Location Awareness
  • Privacy/Security etc.

All of these characteristics pose unique challenges when developing emerging 5G/AI applications at the network edge. Supporting such an extensive feature set with the required flexibility, dynamicity, performance, and efficiency takes careful and expensive engineering effort and needs the adoption of new ways of architecting the enabling technology landscape.

To this end, our proposed “Federated Multi-Access Edge Cloud Platform” blueprint provides the desired abstractions to address these challenges and, as a result, ushers in an application development environment that eases the development and deployment of these emerging applications. Subsequent sections delve into the details of the proposed blueprint.

Blueprint Overview: Federated Multi-Access Edge Cloud Platform

The purpose of the “Federated Multi-Access Edge Cloud Platform” blueprint is to provide an end-to-end technology solution for a mobile game deployed across multiple heterogeneous edge nodes using various network access protocols, such as mobile, WiFi, and others. This blueprint demonstrates how an application leverages a distributed, multi-access network edge environment in order to get all the benefits of edge computing.

The diagram above highlights the use case scenario. On the left-hand side is the device, which is moving from location x to location y.

The whole use case goes through four distinct steps. The first step is the service discovery flow, followed by the game service flow. Once the device actually moves, it triggers an additional session migration flow, which includes a subsequent service discovery. Finally, in step four, once this migration happens, the UE goes to the new edge node.

In order to support all this, the platform provides two key abstractions:

  • Multi-Access/Mobile Operator Network Abstraction: Multi-access means the access network can be mobile, WiFi, or whatever it takes. Multi-operator means that even for the same 4G/5G technology there can be different operators (Verizon, AT&T, etc.). There are various MEC edge nodes; they can be WiFi-based and they can come from different operators.
  • Cloud-Side Abstraction: Cloud side abstraction includes key architectural components to be described in the subsequent sections.

Functional Diagram: Federated Multi-Access Edge Cloud Platform

The key component is the federated multi-access edge platform itself. The platform sits between applications and the underlying heterogeneous edge infrastructure; it abstracts the multi-access interface and exposes developer-friendly APIs. This blueprint leverages the upstream KubeEdge project as the baseline platform, including the enhanced federation function (Karmada).

Telco/GSMA-side complexities (5GC/NEF, etc.) need to be thought through and designed appropriately in order to realize the extremely low latency (10 ms) requirements desired by typical MEC use cases. For multi-access, we may initially use a simulated mobile access environment to mimic real-time device access protocol conditions as part of the initial release(s).

Key Enabling Architectural Components

Federation Scheduler (Included in Release-5)

As a “global scheduler”, this component is responsible for application QoS-oriented global scheduling in accordance with the placement policies. Essentially, it is a decision-making capability that decides how workloads should be spread across different clusters, similar to how a human operator would. It maintains resource utilization information for all the MEC edge cloud sites. Cloud federation functionality in our blueprint is enabled using the open source Karmada project.

Karmada (Kubernetes® Armada) is a Kubernetes® management system that enables cloud-native applications to run across multiple Kubernetes® clusters and clouds with no changes to the underlying applications. By using Kubernetes®-native APIs and providing advanced scheduling capabilities, Karmada enables a truly multi-cloud Kubernetes® environment. It aims to provide turnkey automation for multi-cluster application management in multi-cloud and hybrid cloud scenarios, with key features such as centralized multi-cloud management, high availability, failure recovery, and traffic scheduling. More details on the Karmada project can be found here.
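
As a brief illustration of the Kubernetes®-native API style (the deployment and cluster names below are hypothetical), a Karmada PropagationPolicy declares which member clusters a workload should be propagated to:

```
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: game-server-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: game-server          # hypothetical edge application
  placement:
    clusterAffinity:
      clusterNames:
        - edge-cluster-1         # hypothetical MEC edge sites
        - edge-cluster-2
```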

EdgeMesh (Included in Release-5)

EdgeMesh provides service mesh capabilities for edge clouds, supporting microservice communication across clouds and edges. It offers a simple network solution for inter-communication between services in edge scenarios (east-west communication).

The network topology in edge cloud computing scenarios is quite complex. Edge nodes are often not interconnected, while direct inter-communication of traffic between applications on these nodes is a highly desirable requirement for businesses. EdgeMesh addresses these challenges by shielding applications from the complex network topology at the edge. More details on the EdgeMesh project can be found here.

Service Discovery (Not included in Release-5)

Service Discovery retrieves the endpoint address of the edge cloud service instance depending on the UE location, network conditions, signal strength, delay, App QoS requirements etc.

Mobility Management (Not included in Release-5)

The cloud-core-side mobility service subscribes to UE location tracking events and resource rebalancing scenarios. Upon UE mobility or resource rebalancing, the mobility service uses the cloud-core-side Service Discovery interface to retrieve the address of the new, appropriate location-aware edge node, and subsequently initiates the UE application state migration between edge nodes. A simple CRIU container migration strategy may not be enough; this is much more complex than typical VM migration.

Multi-Access Gateway (Not included in Release-5)

The multi-access gateway controller manages the Edge Data Gateway and the Access APIG of edge nodes. The Edge Data Gateway connects with the edge gateway (UPF) of the 5G network system and routes traffic to containers on edge nodes. The Access APIG connects with the management plane of the 5G network system (such as CAPIF) and pulls QoS, RNIS, location, and other capabilities into the edge platform.

AutoScaling (Not included in Release-5)

Autoscaling provides the capability to automatically scale the number of Pods (workloads) based on observed CPU utilization (or on other application-provided metrics). The autoscaler also provides vertical Pod autoscaling by adjusting a container's CPU and memory limits in accordance with the autoscaling policies.
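
For instance, a minimal sketch of the horizontal part in plain Kubernetes terms (the workload name and threshold are hypothetical) looks like this:

```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: game-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: game-server            # hypothetical workload
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # scale out above 70% average CPU
```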

Service Catalog (Not included in Release-5)

Service Catalog provides a way to list, provision, and bind with services without needing detailed knowledge about how those services are created or managed.

Detailed Flow of the Architectural Components

What is included in Release-5

As mentioned earlier, the purpose of this blueprint is to provide an end-to-end technology solution for a mobile game deployed across multiple heterogeneous edge nodes using various network access protocols, such as mobile, WiFi, and others. This blueprint demonstrates how an application leverages a distributed, multi-access network edge environment to realize all the benefits of edge computing.

This is the very first release of this new blueprint as part of the Akraino PCEI family. Current focus for this release is to enable only the following two key architectural components:

  1. Open source Karmada based Cloud Federation
  2. EdgeMesh functionality

This blueprint will evolve as we incorporate the remaining architectural components in subsequent Akraino releases. More information on this blueprint can be found here.

Acknowledgements

Project Technical Lead: Deepak Vij, KubeEdge MEC-SIG member – Principal Cloud Technology Strategist at Futurewei Cloud Lab.

Contributors:

  • Peng Du, Futurewei Cloud Lab.
  • Hao Xu, Futurewei Cloud Lab.
  • Qi Fei, KubeEdge MEC-SIG member – Huawei Technologies Co., Ltd.
  • Xue Bai, KubeEdge MEC-SIG member – Huawei Technologies Co., Ltd.
  • Gao Chen, KubeEdge MEC-SIG member – China Unicom Research Institute
  • Jiawei Zhang, KubeEdge MEC-SIG member – Shanghai Jiao Tong University
  • Ruolin Xing, KubeEdge MEC-SIG member – State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications
  • Shangguang Wang, KubeEdge MEC-SIG member – State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications
  • Ao Zhou, KubeEdge MEC-SIG member – State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications
  • Jiahong Ning, KubeEdge MEC-SIG member – Southeastern University
  • Tina Tsou, Arm

Vision 2022: Open Networking & Edge Predictions


By: Arpit Joshipura, GM, Networking, Edge & IoT

As we wrap up the second year of living through a global pandemic, I wanted to take a moment to both look ahead to next year, as well as recognize how the open networking and edge industry has shifted over the past year. Read below for a list of what we can expect in 2022, as well as a brief “report card” on where my industry predictions from last year landed.

  1. Dis-aggregation will enter the “re-aggregation” phase (in terms of software, organizations, and industries). This will be enabled by Super Blueprints (which bring end-to-end open source projects together), and we'll see more multi-org collaboration (e.g., standards bodies, alliances, and foundations) re-aggregating to solve common problems. Edge computing will serve as the glue that binds common IoT frameworks together across vertical industries.
  2. Realists and Visionaries will fight it out for dollars and productivity. Given that what started as a pandemic could become endemic, there will be an internal tussle between Realists (making money off of 4G), Engineers (currently coding 5G), and Visionaries (looking to 6G and beyond). In other words, the cycle continues.
  3. Security will emerge as the key differentiator in open source. Collaboration among governments and other global organizations against “bad actors” will penetrate geopolitical walls to bring a global ecosystem together, via open source.
  4. Market analysts will reinvent themselves. There is no longer a clear way to track Cloud, Telecom, Enterprise and other markets individually. There is a big market realignment in progress, with new killer use cases (such as X and Y).
  5. Seamless vertical industries will emerge, enabled by open source software. Many vertical industries will not even know (or care) how the pipe traverses their last mile to the central cloud and edges (led by Manufacturing, Retail, Energy, Healthcare & Automotive).

What did I miss? I would love to have your comments on LinkedIn.

Now let’s take a look at where my predictions from last year actually landed…

See my 2021 predictions from last year: https://www.lfnetworking.org/blog/2020/12/15/predictions-2021-networking-edge/

Hindsight 2021

Prediction 1: Telecom & Cloud ‘Plumbing’ based on 5G Open Source will drive accelerated investments from top markets (Government, Manufacturing, and Enterprises)

Grade: A. Where we netted out: Great stories on end-user deployment and momentum (Deutsche Telekom, AT&T, Orange, Bell, China Mobile, Verizon, DARPA, World Bank, Walmart, … plus cloud players like Google and Microsoft, and top global network vendors).

Prediction 2: The Last piece of the “open” puzzle will fall in place: Radio Access Network (RAN)

Grade: B. Where we netted out: The puzzle has fallen into place, but with many pieces (e.g., ORAN SC, OpenRAN, SD-RAN, Open in the name of RAN). RAN and packet core continue to be the focus in an open world.

Prediction 3: “Remote Work” will continue to be the greatest positive distraction, especially within the open source community

Grade: A+. Where we netted out: Spot on! >200% growth in commits across LF Networking and LF Edge.

Prediction 4: “Futures” (aka bells and whistle features & future-looking capabilities) will give way to “functioning blueprints”  

Grade: A. Where we netted out: The US DoD is now banking on open source for security reasons; the 5G Super Blueprint is the fastest growing initiative in open source; Akraino has 25+ deployed blueprints; and more. At the end of the day, open source has moved from classroom theory to in-field practical code.

Prediction 5: AI/ML technologies become mainstream 

Grade: B. Where we netted out: Still not there. While intent-based approaches have been incorporated into open source, common frameworks are still fragmented. Deployments are specific to carriers, countries, and enterprises.

About the Author: Arpit Joshipura is General Manager, Networking, Edge & IoT, the Linux Foundation.

EdgeX 3.0 – the Future of Edge Computing


By Jim White, EdgeX Foundry TSC Chair

Recently, the EdgeX Foundry project (a Linux Foundation project and part of the LF Edge umbrella of projects) released a dot release (v2.1) on top of our second major release, which came out this summer. This release was codenamed Jakarta and is the community's 9th overall release. Jakarta was a stabilization release and is our first long-term support (LTS) release.

In reading my blog post title, you may be asking, what is this joker-of-a-technical-steering-committee-chairman talking about?  The paint on EdgeX 2.0 and our first ever LTS isn’t even dry and he’s talking about the third major release?

To put everyone's mind at ease, EdgeX 3.0 is a long way off. EdgeX 2.0 was a very big release. The LTS release was a big commitment of support on the part of our community (we will support Jakarta for 24 months, as defined by our LTS policy). Furthermore, the community recently held its semi-annual planning meeting for the next release (codenamed Kamakura), and we know the spring 2022 release will be another dot release (EdgeX 2.2) with some additional new features but still backward compatible with all 2.x releases. So, there is nothing on the immediate horizon that says EdgeX 3.0 is imminent.

I am not going to put a timeline on EdgeX 3.0 availability. As a project, I am very proud of the fact that we have regularly released twice a year. In four years of existence, EdgeX has had just two major releases. EdgeX 3.0 would be the third and, given our current cadence, do the math and you can see that we are looking at more than a year before EdgeX 3.0 is even remotely possible. EdgeX 3.0 here serves as a metaphor for the big things that are on the horizon for the project.

What I am going to pose in this post is a vision and a roadmap for the future of EdgeX Foundry. I have had the privilege of a front-row seat in the creation and use of EdgeX for more than six years now. I don't suggest that I have all the answers or perfect vision with regard to the future, but I think my position grants me enough context and understanding to make some forecasts. History, combined with current requirements and sprinkled with a few strong technical indicators, can present a pretty good directional.

To provide future me and my prognostications some escape, I include a couple of caveats with regard to my “vision”: major industry disruption and time. What I am presenting is a vision based on technology we know today or expect tomorrow. Significant disruption (i.e., impacts on the scale of the Internet or smartphones) is not something I can predict, and it would potentially make this roadmap useless. I am also not suggesting that everything I am envisioning will be, or even has to be, in EdgeX 3.0. The vision is a guidepost, but the journey may take a bit longer and more than one major release to accomplish – so again, EdgeX 3.0 is metaphorical.

OK – that said, allow the crazy TSC chairman to paint the picture of EdgeX 3.0.

Take the Bus but Allow Walking

When EdgeX was initially introduced, all service communications were via REST.  Whether your application needed something from an EdgeX service or two EdgeX services needed to talk, we used REST over HTTP.  This was a conscious decision.  We felt that REST was clean, simple and well understood.  If you are looking to get adopted and you have a complex tool to solve complex problems, you at least want to make it easy to communicate with.  REST is pretty easy to work with.

Quality of service needs, throughput efficiency, pub/sub model vs point-to-point communications, and message size are reasons why architects choose a message bus approach over REST.  These reasons are valid reasons to adopt messaging in edge/IoT computing too.

Over time, EdgeX has instituted more message bus communications for our services. As of EdgeX 2.0, all communications from sensor to enterprise/cloud (the northward travel of the sensor data) can be done via message bus.

In EdgeX 3.0, adopters must have the opportunity to use a message bus for all communications.  Most notably, an enterprise application or cloud system should be able to send a message on a message bus (MQTT, AMQP, Redis Streams, etc.) to request actuation on a device/sensor or to get details about what is happening at the edge.  Today, communications from north to south are all via REST.  Queries into EdgeX services or requests to change metadata, schedules or any type of configuration are also via REST.  Going forward, EdgeX 3.0 needs to allow communications with all services to happen by message bus.  That means all EdgeX services will need to be outfitted with topic endpoints and subscriptions to receive messages as well as the means to publish a message to a message bus.
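
As a hedged sketch of what such north-to-south messaging could look like (the broker address, topic layout, and payload are hypothetical illustrations, not a defined EdgeX API), an enterprise application might publish an actuation request to the message bus instead of calling a REST endpoint:

```
package main

import (
	"fmt"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	// Connect to the message bus (broker address is illustrative).
	opts := mqtt.NewClientOptions().AddBroker("tcp://edge-gateway:1883")
	client := mqtt.NewClient(opts)
	if token := client.Connect(); token.Wait() && token.Error() != nil {
		panic(token.Error())
	}
	defer client.Disconnect(250)

	// Hypothetical command topic and payload: request that a device
	// service actuate a device, rather than calling its REST API.
	payload := `{"device":"valve-01","command":"setPosition","value":42}`
	token := client.Publish("edgex/command/request/valve-01", 1, false, payload)
	token.Wait()

	fmt.Println("actuation request published")
}
```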

This does not mean REST should be removed.  EdgeX is also about flexibility and simplicity.  REST interfaces suffice for some production use cases.  Further, REST makes it easier for a student of EdgeX to get up to speed and explore the platform.  REST also offers a great deal of assistance when debugging a broken system.  Use of REST allows users to walk around the services with a simple browser (whether trying to learn it or fix it), and with no additional setup or tools.

Finally, with regard to EdgeX 3.0 and messaging, the platform should also embrace AMQP as an alternative to MQTT and Redis Pub/Sub (and ZMQ in some limited cases). AMQP offers more features, such as better support for caching and proxies. It is popular in some industries (finance and business) and is used in some IoT circles (notably, it is supported in Microsoft's Azure IoT Hub). Because AMQP requires more resources (see the discussion of resource constraints below), it should not be the default message bus implementation. Providing an optional implementation of the EdgeX message bus via AMQP, however, allows EdgeX to compete with edge products using this protocol and gives it one more tentacle of flexibility.

Size Matters and Time is Relative

Since the beginning of my journey in IoT/edge computing, we have always been working in limited-resource environments, with the hope and expectation that Moore's law would eventually apply at the edge just as it has to our desktops, phones, enterprises, and clouds. This has turned out to be a false assumption.

I do think that the availability of resources at the edge is trending upward – it just happens much more slowly than every two years (per the law).  Why?  Two reasons: scale and time.

If your company were to replace all of its employee laptops, how many would it be replacing? Hundreds? Thousands? Sizeable, yes, and not cheap, but IT organizations rotate laptops about every 2-3 years as a matter of course. Agile development practices have our enterprise applications updated every few weeks.

Now imagine you are a city and you want to put an edge compute platform on every traffic signal. Back-of-the-napkin math says you are going to need about 200 to 6,000 compute platforms for a square mile of city (using 4-12 lights per intersection and about 45-550 intersections per square mile in a city – thanks Google). How many square miles in a city? Given that Philadelphia covers around 130 mi² and Seattle around 80 mi², let's just say 100 mi² as a good working number. That means we would need to field 20K to 600K platforms! The cost of sticking a high-end server with lots of resources on each traffic light would break a city. The sheer scale of the deployment is too big to expect the use of anything other than smaller/cheaper edge platforms. This is not an isolated case. IoT/edge deployments tend to get very big, very quickly.

Second reason – Moore’s law generally applies to information technology (IT) areas much better than it applies to operational technology (OT) areas.  Einstein was right – time is relative.  In OT realms, the rate of change is slower – much slower.  So even if companies, governments and institutions had access to cheaper, faster and bigger, stronger technology, deploying it into the hot, smelly, dirty, wet, corrosive and generally hard to reach places that edge computing goes to doesn’t happen overnight.  Getting to these systems can be a nightmare.  Most systems in OT stay in place at least 7 years.  It is not unusual to find edge technology that is in place for 15 years or more.  This is a far cry from the 2-3 year (or less) upgrade cycles found in IT environments.

Consequently, the environments that a platform like EdgeX needs to run in are constrained and likely to remained constrained for some considerable time into the future.

EdgeX started life as a Java platform.  It was too big to meet edge resource constraints.  We’ve worked hard to get our micro services lean and to operate in limited resource environments.  We also made an increasing number of services optional (use a minimal set of EdgeX services to support bare bones use cases).

As a project, we have to fight the urge to add to the platform such that it can only be used in expensive, non-resource constrained environments.  From the project’s onset, we used a small Raspberry Pi as our guidepost.  We were not endorsing Raspberry Pi.  It was a measuring stick.  If we could run EdgeX on that platform, we were within the realm of being able to run in a smaller, resource constrained, and yes cheaper, edge environment.

That measuring stick is still relevant to EdgeX 3.0 for reasons stated above.  

  • EdgeX needs to run in 1GB of RAM or less.
  • EdgeX needs to run on a single-core platform.
  • EdgeX needs to run within 32GB of storage space.
  • EdgeX needs to be able to ingest sensor data, make a decision (from a rules engine or other analytics package), and actuate a device in less than one second (within a single instance of EdgeX).
  • EdgeX needs to start up and be operational in 10 seconds or less.

There are some implications of this that are mentioned in other future platform decisions below (such as use of 3rd party components).  But importantly, as we add to EdgeX, we must take care not to allow EdgeX to get so big as to grow beyond the constraints of its slow-changing OT environment.

Adopt Versus Build – Finding Edge Worthy Components

For expediency, and because these areas were not our specialty, the EdgeX community chose to adopt some enterprise-grade open source products. We chose Drools initially to give us a rules engine to drive low-latency decisions at the edge. We incorporated Kong and Vault to provide our security API gateway and secret store. We use Consul for our configuration/registry service by default. We initially chose MongoDB for persistence.

These were the right decisions at the time. We took a path where we do the edge and we let others do the things that they are good at. “You do you,” we said to 3rd party components, “and we'll do edge.” Security, in particular, is not something we felt we should be creating on our own.

Over time, we have chosen alternates to help lower the EdgeX resource footprint. We replaced MongoDB with Redis. We chose eKuiper to replace Drools. Still, it was a relatively hands-off approach to incorporating best-of-breed 3rd party components. Again: “you do you” and “we'll do edge.”

But many of these choices, due to their enterprise and cloud native nature, are big – too big. They are wonderful, but they offer more capability than will ever be used at the edge. They were not built for the edge. eKuiper is an exception, but even there we have seen a large expansion of the feature set (perhaps addressing more than the edge needs?). The you-do-you, we-do-edge approach is something we need to reconsider for EdgeX 3.0. As a customer said to me recently, “EdgeX needs a diet plan.”

Have a look at the EdgeX performance numbers in the table below. Most EdgeX services have an image footprint of 25MB or less, memory usage around 11MB, and minuscule CPU utilization. Now note the size of the non-EdgeX services (those without the EdgeX prefix). The numbers are stark. Over the last few releases, EdgeX services have stayed within a few megabytes of their original performance numbers (some even getting smaller), while 3rd party services are getting bigger. It's not that EdgeX needs a diet plan so much as we need to put our 3rd party components on one.

It is the 3rd party components that cost EdgeX the most in its ability to fit on resource-constrained platforms. There is a chance that there are other, smaller options for some of these components. But are those options built for the edge, with an edge attitude toward resource constraints?

EdgeX has come to a point where it must consider one of a few options in adoption of 3rd party components:

  • Locate 3rd party component providers that are thinking edge resource constraints and use their smaller components to replace the current enterprise sub-systems we use today.
  • Partner with the current 3rd party component providers; give them our edge and resource requirements, and see if they are willing to create smaller, lighter versions of their products for the edge.
  • Create our own smaller, lighter components to address edge needs. Perhaps start by taking an existing open source product and cutting it down: more limited in functionality, but fine for the edge.
  • Allow adopters to unplug the heavy bits or implement their own.  Some use cases don’t require the 3rd party components.  When they are required, and by providing abstractions for all 3rd party components, EdgeX can continue to use the same 3rd party components as implementations for those abstractions, but make it easier for adopters and commercial implementers to create or select alternate implementations as use case resource constraints dictate.

I prefer these options in that order. Finding other alternates has so far proven to be a challenge. I recognize that other projects have their own priorities and that supporting edge computing may not rise to the level of their concern (although one would hope the size of the IoT edge opportunity would attract some). Almost certainly, a project like eKuiper (now a sister project in LF Edge) would be open to finding ways to modularize or otherwise offer resource-constrained functionality for resource-constrained edge platforms.

Where we cannot find or convince a third party open source project to help meet our component needs, EdgeX may have to look at implementing its own. A dubious-sounding task, but in actuality it may not be that daunting, given that edge needs are a cut-down subset of what the bigger enterprise versions provide. Depending on the architecture, we may find we can take a selective scalpel to the 3rd party component and create an edge-worthy edition reasonably quickly.

There are some use cases where the security components are not needed (in physically restricted environments, for example). Some adopters may simply drop the weight of our enterprise-level components if we make it easy for them. Forcing adopters to create their own components based on our abstractions should be used only as a temporary fix, or as the answer when well-known proprietary options exist for adopters to use in production deployments.

No longer can we just do the “edge parts.” If a 3rd party component was built for the enterprise/cloud and is not meeting our edge needs, we must begin to explore alternatives – including our own implementation – to satisfy edge resource constraints.

We are going to want to add additional capability to EdgeX (as discussed in various parts of this document).  We are going to want to improve the platform.  If we want to stay within the resource constraints of our target host platform, we are going to need to reduce resource consumption in 3rd party areas to allow EdgeX to grow in other areas.

Not Cloud Native – Aim for Edge Native

Cloud native is the current rage in software development.  It is a term used to describe building and running applications that exploit cloud computing delivery models.  That is, the ability for an application to access compute resources in more of an on-demand way with scale, resiliency and flexibility in mind.  Specifically, cloud native applications take advantage of micro service architectures, containerized deployment and orchestration, agile development process, 12-factor app patterns, and good DevOps automation to usher work from code to deployable artifact.

There is a movement to drive cloud native principles in software engineering to the edge.  While there are some good aspects of cloud native computing that can be applied directly, using a cloud native approach on edge applications forgets many of the constraints of the edge (see here for a good list of some of those differences).  Others are calling for a modified approach that considers the unique requirements of the edge.  That is, they are cherry-picking elements of cloud native but still considering the constraints of the edge.  This approach is called edge native.  

EdgeX has adopted some elements of the edge native approach (small micro services, service availability tolerance, good DevOps automation, etc.).  In fact, EdgeX adopted some of these elements even before the term edge native existed.  I believe that our EdgeX community would agree with the general philosophy in edge native computing.  That is, we would agree that edge applications should be “built from the ground up with the Edge in mind – just like Cloud-Native applications are built for the Cloud.”  

The definition of edge native is still somewhat nebulous and debated. Depending on the source, EdgeX adheres to some elements of edge native but not to others. And there are some guidelines for developing edge native applications that I would not suggest EdgeX adhere to blindly.

As an example, use of containers is considered a staple (required?) in many edge native guidelines.  EdgeX has always provided containers, but doesn’t require use of containers.  In fact, the project supports alternate delivery technologies (like snap packages – developed by Canonical) and understands the reality of today’s edge infrastructure.  

  • Some OT groups have not embraced containerized workloads and have suggested it may be several years before they support containers in production.
  • Some resource-constrained platforms make the use of containers impossible.
  • Certain devices/sensors are going to make an all-container strategy a challenge.

While EdgeX supports container use, it does not dictate it. Flexibility is key so long as the edge remains very heterogeneous.

What principles of edge native computing should EdgeX look to embrace that it does not embrace today?

Distributable

Edge native applications should be able to move as resources (compute, network, storage, etc.) dictate.  Services should be able to scale out to the edge or scale back to the enterprise / cloud as resources warrant.  For example, an application service could run at the far edge and provide for low-latency decisions when resources are available.  But they may also run in the enterprise or in the cloud when resources aren’t available.

In theory, EdgeX services were designed to be distributed on different hosts (on physical systems or virtual machines with different IP addresses). In reality, however, there are a number of issues with the current architecture that make distributing services across hosts difficult. Chief among them: when the services are distributed, there isn't an easy way to secure the communications between services – a necessity for edge native applications. Providing central configuration and secrets across distributed services is also not fully addressed in EdgeX. EdgeX 3.0 needs to allow any EdgeX micro service to live anywhere, anytime, and still operate securely.

Resiliency and Rapid Recovery

Enterprise and cloud platforms and associated resources (compute, network, storage, etc.) are relatively stable.  Edge platforms are notoriously unstable.  EdgeX 3.0 needs services to be more resilient to edge failures and outages.  When service failures do occur, the services need to be brought back up quickly.

This does not mean EdgeX services need to be “highly available.”  High availability (HA) is the ability of a system or service to operate continuously without failing for long, agreed upon lengths of time.  It often requires some orchestration capability to monitor the services and offer “backup” or redundant services in the face of failure.  The resource constrained nature of edge platforms makes offering HA at the edge a particular challenge.

Dependent services will go down. Resources like the network will become unavailable. The EdgeX 3.0 services must be built to withstand these issues. When a service does go down, it must be able to be restarted quickly, without requiring a lot of new configuration or setup on the part of the user (such as new security keys that cannot automatically be provisioned).

I submit that in EdgeX 3.0, a service should remain up and continue to try to acquire any dependent services or resources indefinitely, unless trying to do so creates other issues. It should also be able to use the alert/notifications service (if it is up) to report the situation to a default HTTP REST endpoint, send an email or SMS message, or otherwise alert the system's overseer. When the dependency becomes available, the service should come back online and continue normal functions without further intervention on the part of the user.
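
A minimal sketch of this idea (the dependency check and alert hook below are hypothetical placeholders, not EdgeX APIs) is an indefinite retry loop with capped exponential backoff:

```
package main

import (
	"log"
	"time"
)

// connect tries to reach a dependent resource (e.g., the message bus
// or database); the real check is deployment-specific.
func connect() error {
	// ... dial the dependency and return nil on success ...
	return nil
}

// alertOverseer stands in for the alert/notifications hook
// (HTTP endpoint, email, SMS, etc.).
func alertOverseer(err error) {
	log.Printf("dependency unavailable: %v", err)
}

// waitForDependency retries indefinitely, backing off up to a cap,
// so the service stays up while its dependency is down.
func waitForDependency() {
	backoff := time.Second
	const maxBackoff = 30 * time.Second
	for {
		if err := connect(); err == nil {
			log.Println("dependency acquired; resuming normal operation")
			return
		} else {
			alertOverseer(err)
		}
		time.Sleep(backoff)
		if backoff < maxBackoff {
			backoff *= 2 // capped exponential backoff
		}
	}
}

func main() {
	waitForDependency()
}
```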

When a service fails, it should be able to restart within the one-minute system startup time and without intervention on the part of the user beyond issuing the start (or restart) command.

The community should also offer example scripts or services that check that all services are functioning and, when a service (or a dependent facility such as the database) is down, attempt to restart it. The example “restart” service or script does not have to be part of a default EdgeX 3.0 deployment, but it should help adopters think about how to keep an edge system alive even during partial failures.

Indeed, when something like Kubernetes is used to deploy and orchestrate EdgeX services, resiliency and rapid recovery may be taken care of by that environment. But EdgeX must always be able to handle itself when platform constraints don't allow for CNCF types of deployment management.

The Other Data – EdgeX Control Plane Telemetry and Health / Monitoring

Today, EdgeX does not report on its own health and operations.  Therefore, there is no way to automate any type of higher order management of the platform.  For example, there would be no way to know if a device service is reporting sensor data as expected – at least not without manual intervention and exploration of log files (provided the right log levels are set).

Starting with the Delhi release, EdgeX offers a system management service.  This service was created to be able to perform a limited set of monitoring and control plane functions.  This includes:

  • Starting, stopping, and restarting the EdgeX services
  • Providing memory and CPU usage of each service
  • Providing a service's configuration
  • Providing an indication of whether a service is operational based on its response to ping requests

Because some of these functions (start, stop, restart, and metrics collection) depend on how the services are deployed and run (via container or on a native Linux OS), the architecture requires an “agent” to request information from a platform-dependent “executor” that carries out much of the system management functionality. The system management service therefore suffers from several issues:

  • It doesn’t work for all environments (Windows)
  • It has to be informed of all new or removed EdgeX services (not easy to do in dynamic situations)
  • The architecture (two components versus a single service) makes it more difficult to set up and run

Perhaps most importantly, the service isn't providing any capability that couldn't be obtained from other tools, depending on how EdgeX is deployed and where it is running. For example, if EdgeX is deployed via Docker containers, Portainer or Linux tools could be used for most of the system management functions, except for providing configuration, which is always available via Consul.

The intention was that the system management service would eventually be extended to get more information from each service (telemetry and event information specific to EdgeX) and make that available to monitoring services and/or other applications. The telemetry and event information is not something an outside tool could provide, since it requires knowledge of and access to EdgeX service internals.

Where that leaves the project is with a service that is redundant to better tools and technology and unable to provide much needed (and otherwise unobtainable) telemetry data that it should be providing.  For this reason, the community has decided to mark the service deprecated with the Jakarta release.  It may not be removed from the platform until something else provides for its replacement, but at least the community is signaling that the system management service in its current form is nearing end of life.

By the way, as reported earlier in this post, the system management agent (SMA) is the most expensive service in the EdgeX inventory by a factor of 10.  So, removing this service as it exists today in EdgeX paves the way for additional features going forward.

EdgeX 3.0 needs to offer much more data to adopters about its health and operations.  Telemetry or metrics from each service can be used to understand whether a sensor is reporting correctly, if the system has the proper resources to support adding additional sensors, or even if sensors are being used to flood the system with information (denial of service through sensors).

Rather than having an outside service request (pull) telemetry, each service needs to be able to publish (push) telemetry out. The telemetry can be pushed anywhere. In early implementations, EdgeX services may just be configured to send telemetry to a designated message topic, where it is up to the adopter to figure out how to collect, use, or respond to it. Later, the telemetry data can be treated as alternate edge data (control plane data vs. sensor data) that is consumed by EdgeX application services, rules engines, or other analytics services. Telemetry data does not originate from a “thing,” but it can help drive operations and actuation at the edge as necessary.
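
To make the push model concrete, here is a hedged sketch (the metric names, topic, broker address, and counter value are hypothetical, not the format proposed in the EdgeX ADR) of a service periodically publishing its own telemetry to a message topic:

```
package main

import (
	"encoding/json"
	"time"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

// ServiceTelemetry is an illustrative control plane payload.
type ServiceTelemetry struct {
	Service    string `json:"service"`
	EventsSent int64  `json:"eventsSent"`
	Timestamp  int64  `json:"timestamp"`
}

func main() {
	opts := mqtt.NewClientOptions().AddBroker("tcp://localhost:1883")
	client := mqtt.NewClient(opts)
	if token := client.Connect(); token.Wait() && token.Error() != nil {
		panic(token.Error())
	}

	// Periodically push telemetry; the adopter decides what consumes it.
	for range time.Tick(10 * time.Second) {
		t := ServiceTelemetry{
			Service:    "device-virtual", // hypothetical service name
			EventsSent: 42,               // would be a real counter in practice
			Timestamp:  time.Now().UnixNano(),
		}
		payload, _ := json.Marshal(t)
		client.Publish("edgex/telemetry/device-virtual", 0, false, payload)
	}
}
```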

For example, imagine a device service reported telemetry about the number of events it produced over a given period of time. If the telemetry – in the form of control plane Events/Readings – were picked up by application services and routed to the rules engine, the rules engine could be configured to look out for sensors reporting more events than expected. This could trigger the shutdown of a rogue sensor until a user explores the reason for the extra reporting.

Application services could also subscribe to service telemetry messages in order to filter, aggregate and otherwise prepare and ship telemetry data out of EdgeX – just as they are used to export sensor data today.

Each service is responsible for certain functions and responsibilities within an EdgeX instance.  Each service knows (or should know) what is critical to its operations and functions.  Therefore, each service should have the means to report telemetry specific and important to that service, and make this EdgeX specific information available to EdgeX as well as external systems.

The Event/Reading structure and the services that handle Event/Readings may need to be modified slightly in recognition that an Event/Reading may contain sensor data or service telemetry data.  Hopefully, telemetry and sensor data Event/Readings will differ only slightly in EdgeX 3.0, thereby requiring only minimal change.
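
As a sketch of what such a dual-purpose structure might look like — the field names are illustrative assumptions, not the actual EdgeX contract (the real v2 contracts live in go-mod-core-contracts):

```go
package sketch

// Reading sketches a structure that can carry either sensor data or service
// telemetry.  Field names are illustrative, not the actual EdgeX contract.
type Reading struct {
	DeviceName   string // sensor data: the originating "thing"; empty for telemetry
	ServiceName  string // telemetry: the originating EdgeX service
	ResourceName string // e.g. "Temperature" or "EventsProduced"
	Value        string
	Origin       int64 // timestamp (nanoseconds)
	// Kind distinguishes control plane data from sensor data so existing
	// consumers can filter without inspecting the value.
	Kind string // "sensor" or "telemetry"
}
```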

EdgeX 3.0 services will need additional configuration to control how much and what telemetry gets reported.  The user should be able to increase or decrease the telemetry reporting based on operational circumstances and use case needs (not unlike how logging output is adjusted today).  Telemetry or metrics collection can impact the performance of EdgeX services.  There may be a need, based on resource constraints in some deployments, to completely turn off telemetry collection and publication for the entire instance.
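
A hedged sketch of what such per-service configuration might look like — the option names are assumptions for illustration; the real names would be settled through the ADR process:

```go
package sketch

import "time"

// TelemetryConfig is a hypothetical per-service configuration block.
type TelemetryConfig struct {
	Enabled  bool          // master switch; off entirely for constrained deployments
	Interval time.Duration // how often metrics are published
	// Metrics selects which built-in metrics to report, letting a user dial
	// reporting up or down much like a logging level.
	Metrics []string // e.g. []string{"EventsProduced", "MemoryUsage"}
	Topic   string   // destination message topic
}
```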

The basic premise of initial EdgeX telemetry collection is already specified in a proposed EdgeX ADR.

More EdgeX 3.0 Features

In addition to the vision provided above, EdgeX 3.0 will need to add some additional capability that it does not have today.

Alternate Language App Functions SDK

EdgeX has two device service (DS) software development kits (SDKs); one in Go lang and the other in C.  Since most of the platform is created in Go lang, it seems fitting and natural to have a DS SDK in Go.  We also have a DS SDK in C because it is the most natural fit for most low-level protocols and “thing” communications.

On the north side, we have just a single App Functions SDK to create new/custom application services (AS), in Go lang.  Again, given EdgeX’s foundation in Go, a Go lang SDK here also makes sense.  Going forward, I anticipate more interfaces with artificial intelligence (AI) / machine learning (ML) and other analytics packages.  One of the more popular languages in the AI/ML communities is Python.  It would seem appropriate for EdgeX 3.0 to provide a north side SDK that is more familiar to those most likely to need and build AS.  Other organizations are using EdgeX to translate from one OT protocol (like Modbus) to another OT protocol (like BACnet or OPC UA).  OT people tend to use C and C++, and it stands to reason that an App Functions SDK in one of these languages may better support their needs.  In general, the SDKs at either the north or south end of EdgeX need to reflect the language preferences of the user groups most likely to use them.

Distributed Ledger Support

By one recent report, IoT platforms are anticipated to generate 80 zettabytes of data by 2025.  As that data begins to be shared – potentially even bought and sold on the open market – the origination, legitimacy, and ownership of that data is going to need to be monitored and managed.  This screams for distributed ledger technology (DLT).  The closer that the data can be tracked and attributed, the better.  This means that IoT/edge platforms like EdgeX will need to integrate with DLT platforms in order to put sensor data into a digital ledger.

EdgeX 3.0 will need to offer some initial integration with popular, open-source DLT platforms.  DLT can be resource intensive, so DLT at the edge may be an option used only when the use case dictates it and the edge platform can support it.
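
As a trivial illustration of one integration point, a service could fingerprint each event before handing the digest to a DLT client, keeping the raw data off-chain.  A minimal Go sketch, assuming JSON-encoded events (the ledger client itself is out of scope here):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint returns a SHA-256 digest of an event payload.  Anchoring the
// digest in a distributed ledger would let a downstream consumer verify the
// data's origin and integrity without storing the data itself on-chain.
func fingerprint(eventJSON []byte) string {
	sum := sha256.Sum256(eventJSON)
	return hex.EncodeToString(sum[:])
}

func main() {
	event := []byte(`{"device":"temp-probe-1","reading":79.3}`)
	fmt.Println(fingerprint(event)) // the value submitted to the DLT
}
```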

K8s Is Coming

I said earlier that EdgeX 3.0 will move closer to edge native, but that doesn’t mean it has to offer high availability.  However, without a doubt, Kubernetes (K8s) is moving to support the edge.  K3s, MicroK8s, KubeEdge and other efforts stemming from Kubernetes are just the forerunners of what will be some type of K8s support at the edge.

Not every organization will support its use at the edge – it can be complex.  Not all edge environments will have the resources to run some form of K8s at the edge.  And using K8s in whatever form it takes to deploy and orchestrate edge native applications will have to make some allowances for edge constraints and challenges.  But make no mistake, like winter, K8s is coming.  EdgeX 3.0 will need to do more for those looking to use K8s.  Providing deployment / orchestration assistance equal to that which we do for Docker Compose and Snaps will be imperative.

We have always claimed to be deployment/orchestration technology agnostic, and we will remain so.  But at some point, we should expect some form of K8s technology (probably a far cry from what we see today, which is largely inadequate for edge native) to become the predominant means of deploying edge applications.  We will also have to educate those coming from a cloud native world on why K8s is not always a fit for edge native.

UoM

We have already begun a conversation in the community about how to support unit of measure standards with regard to our sensor data collection.  That is, how to attribute sensor data to a well-defined unit of measure.

IoT / edge platforms have not been specific about the sensor data collected.  For that matter, devices are pretty lax with regard to how they send data.  I own two IoT temperature probes that send an integer of “793” when trying to tell me the temperature reading is 79.3 Fahrenheit.

In order for the sensor data to be more trusted and offer better value (see the DLT comments above), the data will need to be tied to an appropriate unit of measure, and that unit of measure should be well defined by some standards body.

EdgeX 3.0 will not create or dictate the unit of measure standard, but it will allow those that need more specificity to label sensor data with a unit of measure and the standard it comes from.
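
For illustration only, the labeling could be as simple as two extra fields on a reading — the names here are hypothetical; the actual shape is for the ADR process to decide:

```go
package sketch

// LabeledReading sketches a reading tied to a unit of measure and the
// standard that defines it.  Field names are illustrative assumptions.
type LabeledReading struct {
	ResourceName string  // e.g. "Temperature"
	Value        float64 // e.g. 79.3 – not the ambiguous integer 793
	Unit         string  // e.g. "[degF]", the UCUM code for degrees Fahrenheit
	UoMStandard  string  // e.g. "UCUM" – the body defining the unit code
}
```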

This discussion and the premise for a unit of measure solution are already specified in a proposed EdgeX ADR.

Automate Thing Provisioning

The “last mile” is a term that originated in supply chain management and was then adopted by telecommunications.  It describes the most difficult part of a journey or the furthest part of a network – usually found at the end or, literally, the last mile.

In edge computing, the last mile is that between our platform and the actual device or sensor.  Connecting the sensors or devices with all the protocols, data formats, complex hardware, etc. is a real challenge and is why EdgeX is so important.  EdgeX, as we know, helps to simplify and standardize how “things” of the OT world get connected to the IT world.

While EdgeX makes the last mile shorter, so to speak, there is still a fair amount of ceremony and work required to provision a new sensor.  As an EdgeX user, you still have to provide a device profile, issue the correct Metadata calls, or provide the right device configuration in order to onboard a new “thing”.  We provide some amount of device discovery and automated provisioning, but it is limited to a few protocols and doesn’t go far enough.

Many sensors/devices today can provide more information about the resources they have to offer or the actuation commands they support.  Many sensors/devices advertise their presence and offer platforms like EdgeX the ability to discover and onboard them automatically.
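
A minimal sketch of the listening half of such a flow, using only the Go standard library.  The beacon format and port are invented for illustration; real protocols (ONVIF, BACnet, etc.) each have their own discovery mechanics:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// Advertisement is a made-up beacon a device might broadcast on power-up;
// the format and port below are illustrative, not a real protocol.
type Advertisement struct {
	Name     string `json:"name"`
	Protocol string `json:"protocol"`
	Address  string `json:"address"`
}

func main() {
	// Listen for UDP broadcast beacons on an illustrative port.
	conn, err := net.ListenPacket("udp", ":9999")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	buf := make([]byte, 1024)
	for {
		n, addr, err := conn.ReadFrom(buf)
		if err != nil {
			continue
		}
		var ad Advertisement
		if json.Unmarshal(buf[:n], &ad) != nil {
			continue // not a beacon we understand
		}
		// In a real device service, provisioning rules and safeguards
		// would decide here whether to onboard the device automatically.
		fmt.Printf("discovered %q (%s) at %v\n", ad.Name, ad.Protocol, addr)
	}
}
```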

EdgeX 3.0 needs to complete the journey and make the last mile of connectivity easier for adopters.  Where possible, and with appropriate safeguards, EdgeX 3.0 should be able to onboard a new sensor/device as soon as it is powered up and in communication range of EdgeX and the host it runs on.

There will be challenges, especially with more legacy protocols.  However, an edge platform that enables “thing” connectivity simply by powering the sensor on will help to make IoT / edge computing truly more ubiquitous.  

Wrap Up

The opinions and vision expressed in this post are my own.  While I am currently the EdgeX Foundry TSC chairman, the vision depicted here is not yet the opinion adopted by the EdgeX community.  It is not the codified roadmap for EdgeX.  I hope it will be – or that the community will take this vision and improve upon it, which they usually do.  “A goal is not always meant to be reached; it often serves simply as something to aim at.”

We’ve spent the last two years working on EdgeX 2.0 and our first LTS release.  I am so proud of the work that was accomplished and where EdgeX is at, but I don’t want adopters or the community to think we have reached the end.  We are not at the end.  We are at the beginning of creating the best open source IoT / edge platform on the planet.  

 

Where the Edges Meet, Apps Land and Infra Forms: Akraino Release 5 Public Cloud Edge Interface

By Akraino, Blog

Written by Oleg Berzin, Ph.D., Co-chair of the Akraino Technical Steering Committee and Fellow, Technology and Architecture at Equinix

Introduction

In the PCEI R4 blog we described the initial implementation of the blueprint. This blog focuses on new features and capabilities implemented in the PCEI in Akraino Release 5. Before discussing the specifics of the implementation, it is useful to go over the motivation for PCEI. Among the main drivers behind PCEI are the following:

  • Public Cloud Driven Edge Computing. Edge computing infrastructure and resources are increasingly provided by public clouds (e.g., AWS Outposts, IBM Cloud Satellite, Google Anthos). In the PCEI R4 blog we described  various relationships between PCC (Public Cloud Core) and PCE (Public Cloud Edge), ranging from PCE being Fully-Coupled to PCC at hardware, virtualization, application and services layers to PCE being Fully-Decoupled from PCC at all these layers. This “degree of coupling” between PCE and PCC dictates the choice of orchestration entry points as well as the behavior of the edge infrastructure and applications running on it.
  • Hybrid infrastructure. Most practical deployments of edge infrastructure and applications are hybrid in nature, where an application deployed at the edge needs services residing in the core cloud to function (coupled model). In addition, a PCE application deployed at the edge, may need to communicate, and consume resources from multiple PCC environments.
  • Multi-Domain Interworking. Individual infrastructure domains (e.g., edge, cloud, network fabric) present their own APIs and/or other provisioning methods (e.g., CLI), making end-to-end deployment challenging in both complexity and time. A multi-domain orchestration solution is needed to handle edge, cloud, and interconnection in a uniform and consistent manner.
  • Interconnection and Federation. Need for efficient and performant interconnection and resource distribution between edge and cloud as well as between distributed edges proximal to end users. We would like to point out that a common assumption in many infrastructure orchestration solutions is that the fundamental L1/L2/L3 interconnection between edge clouds and core clouds as well as between the edges is available for overlay technologies such as SD-WAN or Service Mesh. We specifically see the need for the orchestration solution to be able to enable L2/L3 connectivity between the domains that are being orchestrated.
  • Bare Metal orchestration. As with the interconnection, many orchestration solutions assume that the bare metal compute/storage hardware and basic operating system resources are available for the deployment of virtualization and application/services layers. We would like to point out that in many scenarios this is not the case. 
  • Developer-centric capabilities. Capabilities such as Infrastructure-as-Code are becoming critical for activation and configuration of public cloud infrastructure components and interconnection, as well as for end-to-end application deployment integrated with CI/CD environments.

PCEI in Akraino R5

Public Cloud Edge Interface (PCEI) is a set of open APIs, orchestration functionalities and edge capabilities for enabling Multi-Domain Interworking across the Operator Network Edge, the Public Cloud Core and Edge, the 3rd-Party Edge as well as the underlying infrastructure such as Data Centers, Compute Hardware and Networks. 

Terraform-based Orchestration

One of the biggest challenges with multi-domain infrastructure orchestration is finding a common and uniform method of describing the required resources and parameters in different domains, especially in public clouds (PCC).  Every public cloud provides a range of service categories with a variety of services; each service has several components, and each component has multiple features with different parameters.

Terraform has emerged as a common Infrastructure-as-Code tool that makes it possible to abstract the diverse provisioning methods (API, CLI, etc.) used in the individual domains and to provision infrastructure components using a high-level language, provided a Terraform Provider is available for the domain.

The notable innovation in PCEI R5 is the integration of Terraform as a microservice within the PCEI orchestrator (CDS, see below).  This enables several important orchestration properties:

  • Uniformity – use of the same infrastructure orchestration methods across public clouds, edge clouds and interconnection domains.
  • Transparency (model-free) – the orchestrator does not need to understand the details of the individual infrastructure domains (i.e., implement their models). It only needs to know where to retrieve the Terraform plans (programs) for the domain in question and execute the plans using the specified provider (a sketch of this executor pattern appears below).
  • DevOps driven – the Terraform plans can be developed and evolved using DevOps tools and processes.

Examples of Terraform plans are shown below.
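
To make the executor pattern concrete, here is a hedged Go sketch that shells out to the Terraform CLI against a local plan directory.  The paths are placeholders and the GitLab retrieval step is omitted; in PCEI R5 the executor runs as a microservice inside CDS rather than as a standalone program:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// applyPlan runs `terraform init` and `terraform apply` against a directory
// of Terraform configuration files (e.g., fetched from GitLab beforehand).
// The orchestrator needs no model of the target domain: the plan plus its
// provider carry all of the domain-specific knowledge.
func applyPlan(planDir string) error {
	for _, args := range [][]string{
		{"-chdir=" + planDir, "init", "-input=false"},
		{"-chdir=" + planDir, "apply", "-input=false", "-auto-approve"},
	} {
		cmd := exec.Command("terraform", args...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("terraform %v: %w", args, err)
		}
	}
	return nil
}

func main() {
	// Placeholder path; in PCEI R5 the plans live in GitLab and are
	// retrieved by CDS before execution.
	if err := applyPlan("./plans/equinix-metal"); err != nil {
		os.Exit(1)
	}
}
```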

Open-Source Technologies in PCEI 

The PCEI blueprint makes use of the following open-source technologies and tools:

  • EMCO – Edge Multi-Cluster Orchestrator. EMCO is used as a multi-tenant service and application deployment orchestrator.
  • CDS – Controller Design Studio. CDS is used as the API Handler, Terraform Executor, Helm Chart Processor, Ansible Executor, GitLab Interface Handler.
  • Terraform. In PCEI R5, Terraform has been integrated with CDS to enable programmatic execution of Terraform plans by the PCEI Enabler to orchestrate PCC, PCE and interconnection infrastructure.
  • Kubernetes. Kubernetes is the underlying software stack for EMCO/CDS. Kubernetes is also used as the virtualization layer for PCE, on which edge applications are deployed using Helm.
  • Helm. Helm is used by EMCO for deployment of applications across multiple Kubernetes edge clusters.
  • GitLab. GitLab is used to store Terraform plans and state files, Helm charts, Ansible playbooks, Cluster configs for retrieval and processing by CDS using API calls.
  • Ansible. Ansible can be used by CDS to deploy Kubernetes clusters on top of bare metal and Linux.
  • Openstack. PCEI R5 can use Terraform to deploy IaaS infrastructure and apps on Openstack edge clouds.

Functional Roles and Components in the PCEI R5 Architecture

Key features and implementations in Akraino Release 5

  • Software Architecture Components
      • Edge Multi-Cluster Orchestrator (EMCO)
      • Controller Design Studio (CDS) and Controller Blueprint Adapters (CBA)
      • Helm
      • Kubernetes
      • Terraform
  • Features and capabilities
    • NBI APIs
      • GitLab Integration
      • Dynamic Edge Cluster Registration
      • Dynamic App Helm Chart Onboarding
      • Automatic creation of Service Instances in EMCO and deployment of Apps
      • Automatic Terraform Plan Execution
    • Integrated Terraform Plan Executor
      • Azure (PCC), AWS (PCC)
      • Equinix Fabric (Interconnect)
      • Equinix Metal (Bare Metal Cloud for PCE)
      • Openstack (3PE)
    • Equinix Fabric Interconnect
    • Equinix Bare Metal orchestration
    • Multi-Public Cloud Core (PCC) Orchestration (Azure, AWS)
    • Kubernetes Edge
    • Openstack Edge
    • Cloud Native 5G UPF Deployment
    • Deployment of Azure IoT Edge PCE App
    • Deployment of PCEI Location API App 
    • Simulated IoT Client Code for end-to-end validation of Azure IoT Edge 
    • Azure IoT Edge Custom Software Module Code for end-to-end validation of Azure IoT Edge

DevOps driven Multi-domain INfrastructure Orchestration (DOMINO) 

In PCEI R5 we demonstrated the use of PCEI Enabler based on EMCO/CDS with integrated programmatic Terraform executor to orchestrate infrastructure across multiple domains and deploy an edge application. The DevOps driven Multi-domain Infrastructure Orchestration demo consisted of the following:

  • Deploy EMCO 2.0, CDS and CBAs.
  • Design Infrastructure using a SaaS Infrastructure Design Studio.
      1. Edge Cloud (Equinix Metal in Dallas, TX)
      2. Public Cloud (Azure West US)
      3. Interconnect (Equinix Fabric)
  • Push to GitLab.
      1. Cluster Info
      2. Application Helm Charts (Azure IoT Edge, kube-router)
      3. Terraform Plans
        1. Azure Cloud
        2. Equinix Interconnect
        3. Equinix Metal
  • Provision Infrastructure using CDS/Terraform.
      1. Bare Metal server in Equinix Metal Cloud in Dallas, TX
      2. Deploy K8S on Bare Metal
      3. Azure Cloud in West US (Express Route, Private BGP Peering, Express Route GW, VNET, VM, IoT Hub)
      4. Interconnect Edge Cloud with Public Cloud using Equinix Fabric L2
  • Deploy Edge Application (PCE).
      1. Dynamic K8S Cluster Registration to EMCO
      2. Dynamic onboarding of App Helm Charts to EMCO
      3. Composite cloud native app deployment and end-to-end operation
        1. Azure IoT Edge
        2. Custom Resource Definition for Azure IoT Edge
        3. Kube-router for BGP peering with Azure over ExpressRoute
  • Verify end-to-end IoT traffic flow.

The video recording of the PCEI R5 presentation and demonstration can be found at this link.

For more information on PCEI R5: 

Acknowledgements

Project Technical Lead:
Oleg Berzin, Equinix

Committers: 
Kavitha Papanna, Aarna Networks
Vivek Muthukrishnan, Aarna Networks
Jian Li, China Mobile
Oleg Berzin, Equinix
Tina Tsou, Arm

Contributors: 
Mehmet Toy, Verizon
Tina Tsou, Arm
Gao Chen, China Unicom
Deepak Vij, Futurewei
Vivek Muthukrishnan, Aarna Networks
Kavitha Papanna, Aarna Networks
Amar Kapadia, Aarna Networks

 

EdgeX Performance Update

By Blog, EdgeX Foundry

Written by James Butcher, EdgeX Foundry Core Working Group Chair and Edge Xpert Product Manager at IOTech Systems

 Following the recent release of EdgeX Foundry version 2.1, codenamed “Jakarta”, I thought it would be useful to provide a quick update on some of the performance metrics of the platform as it has evolved over the last couple of release cycles.

This release of EdgeX is also the first long term support (LTS) edition, whereby the EdgeX community will support this version with critical fixes for major flaws or bugs. The project’s detailed testing strategy helps to provide the confidence that this version is robust and reliable – and is key to the community making those LTS statements. See here for more details about the LTS policy.

Recent EdgeX Working Group Changes

You may know that the EdgeX QA & Test Working Group was previously responsible for the creation and operation of the community’s testing strategy and its main Test Automation Framework (TAF). This summer, the QA & Test group was merged into the EdgeX Core Working group, and I was pleased to be given the opportunity to chair the new combined group.

A key part of the EdgeX Core Working Group is its commitment to testing which helps ensure the quality and robustness of the framework. An EdgeX version is only released, for example, when all key requirements are reliably tested and preferably automated as part of TAF.

EdgeX Performance Metrics 

Another key part of the EdgeX testing strategy is the recording and monitoring of performance metrics. Since EdgeX 1.1 (Fuji), we have been running dedicated performance tests with each release and producing formal performance reports that describe the findings.

I mentioned some of the performance testing advancements in my blog around the release of EdgeX 1.3 (Hanoi). We cover key points such as footprint, CPU usage and latency of data flow through the platform. 

In the last couple of cycles (Ireland and Jakarta), we have also been recording performance data related to running with the EdgeX Security Services.

Please find the Jakarta Performance Report here.

The data continues to show that the EdgeX microservices developed by the community are generally pretty small and lightweight. One of the EdgeX Core Services, Core Metadata, for example, has a Docker image footprint of around 17MB.

Some of the third-party services we bring in, such as the Security Services or the Registry Service, are a little larger, but the footprint of the complete EdgeX stack still requires less than 1GB of disk space.  Note also that the microservices architecture means not all services need to be deployed in all scenarios.  It’s easy to pick and choose the services needed for each use case or physical hardware capability.

So whilst quite lightweight already, we are still pushing for EdgeX to be smaller and faster where possible.  The next couple of EdgeX development cycles (Kamakura and Levski) are devoting time to this, but a nice reduction in Jakarta is a drop in the run-time memory usage of the API Gateway Security Service.  In previous EdgeX versions, memory consumption of the complete stack was recorded at around 1GB RAM, but optimizations such as configuring the API Gateway Service to run with a specific number of worker processes mean we can run much lower than that when needed.  These types of config options are invaluable in helping to tune the framework if physical resources are a concern.

Full Commercial Support and Value Add

I also wanted to mention IOTech’s commercially supported edition of EdgeX, named Edge Xpert.  Edge Xpert 2.1, based on EdgeX Jakarta, will be available very soon, so stay tuned for more info.  Head to the IOTech website to understand how Edge Xpert’s features and technical support offerings can help users deploy EdgeX-based technology more easily.

Open Community

Finally, please feel free to join our EdgeX Core meetings, where we discuss progress and other issues that need to be addressed each week.  We meet every Thursday at 8am PST.  You can find the meeting links on our page here.

 

 

How Do You Say Thank You to Contributors? EdgeX is Trying to!

By Blog, EdgeX Foundry

By Jim White, EdgeX Foundry TSC Chair

As I write this, it is Thanksgiving week in the United States.  A time for everyone to reflect, share some time with loved ones, and show appreciation for the year’s blessings.  I cannot think of a better time to also say “thank you” to the EdgeX Foundry community of volunteers.  We just released our 2nd release of the year (the Jakarta release, which is our 9th release overall).  Saying “thank you” seems insufficient, and I really wish there was a better way to express the gratitude I have for all the men and women who continually do such great work and allow us to consistently deliver EdgeX releases cycle after cycle.

I have said it before but it bears repeating: I consider myself fortunate to be associated with such a great group of people that work on EdgeX Foundry – both our volunteers and those that assist from the Linux Foundation.  Co-founding the project and serving as the project’s leader has been the highlight of my career.  Importantly, it is the people that I have had a chance to meet and interact with through the project that have made the experience so wonderful.

As anyone that has had the fortune to lead an open-source project will tell you, volunteers – that is, the contributors to the project – are the lifeblood of the project.  If you don’t have enough volunteers, or you don’t have a community of people with great attitudes and a willingness to work together to make something great, your project will soon flounder and fail.  Because organizational or personal commitments inevitably pull contributors away, an open-source project must constantly seek out new contributors who bring new ideas and energy to the project.  This is not easy, and project leaders often have to play the role of cheerleader and recruiter to find people willing to part with their most precious commodities: time and knowledge.

That is why EdgeX Foundry is excited to announce a new program meant to thank those that contribute to the project and hopefully entice new contributors to spend time on the project.

As of November, the EdgeX Developer Badge program will send a digital badge to contributors on two occasions:

  1. When a contributor submits their first pull request that gets accepted and merged into the EdgeX Foundry code base (via GitHub)
  2. When a contributor fixes two bugs that have been documented via GitHub issues in one of the project’s repositories

The new EdgeX Foundry Developer Badges for bug fixing (the “Bug Hunter” Badge) and first contribution (First Time Contributor Badge)

Additional badges may be awarded by the project in the future.  The badges will be automatically issued via the project’s CI/CD processes and through Credly.com.  Developers will receive an email notification from the project when they have made their badge-worthy contribution.   They will also receive an email from Credly that allows them to accept their digital badge and then share their verifiable credential on LinkedIn, Twitter or other social media platform of their choosing.  It is a small token of appreciation on the part of the project to say “thank you for your work” that also allows a contributor to post recognition of their effort to professional or personal media outlets.  In other words, it’s an official way to provide some “street cred” to our volunteers, which we hope will attract their peers and co-workers to seek the same.

To our knowledge, this is a first of its kind program for an open-source project.  If there is an open-source project out there that would like to copy our program, please feel free to reach out to me for more information.  We’d like to see more contributors in more open-source projects and if this can help, we would be happy to share what we have done.

While saying thanks, I’d like to thank Aaron Williams, who was the LF Edge Developer Advocate until this summer, for coming up with the initial program idea.  I’d also like to thank Ernesto Ojeda, our EdgeX DevOps Working Group Chairman, and his Intel DevOps team for implementing the program.  It’s another example of the great people and work found in the LF Edge and EdgeX communities.

On behalf of the EdgeX Foundry project, we wish everyone a joyous holiday season.

 

Blogs of The AI Edge: I-VICS

By Akraino, Blog

By Zhuming Zhange & Hao Zhongwang

The trends, key technologies and scenarios of VICS

Intelligent networking promotes the evolution of electronic vehicles and their architecture

Infrastructure required for Intelligent vehicle-infrastructure cooperation systems (I-VICS)

  • Road: Informatization, intelligence and standardization
  • Communication: Unified communication interface and protocol, coordinated vehicle-road interconnection
  • Network: Car wireless communication network, narrowband Internet of Things
  • Services: High-precision time-space reference services, vehicle emergency systems, rapid assisted positioning services
  • Maps: basic maps and geographic information systems
  • Data: big data cloud platform, software

Use case 1: Safety Of The Intended Functionality (SOTIF)  and I-VICS

SOTIF (ISO/PAS 21448) emphasizes avoiding unreasonable risks due to limitations of the expected functional performance.

The background to the birth of SOTIF is the development of intelligent driving.

Classified according to the functional chain of intelligent driving – perception, decision, execution – “functional performance limitations” are reflected in three aspects:

  • Sensor perception limitations lead to scene recognition errors (including missed recognition of driver mis-operation)
  • Insufficient deep learning causes the decision algorithm to judge the scene incorrectly (including the wrong response to the driver’s mis-operation)
  • Actuator function limitations lead to deviation from the ideal target

  • For Area 2 (known unsafe scenarios), the basic idea of SOTIF is to identify risk scenarios through safety analysis and to develop countermeasures against them.
  • For Area 3 (unknown unsafe scenarios), the various scenarios a car may encounter under various road conditions need to be identified (in theory) in the early stage of development.

Use case 2: Autonomous Valet Parking

Functions:

  • Automatically drive a car from a pre-defined drop-off zone (e.g. the entrance to a carpark) to a parking spot indicated by an external system.
  • Park a car in a parking spot, starting in a lane near that parking spot.
  • Drive out of a parking spot.
  • Drive to a pre-defined pick-up zone (e.g. the exit from a carpark).
  • Automatically stop for obstacles while achieving the above.

AVP’s New features / benefits based on I-VICS:

  • Expand the perception range of car
  • Improve the ability of perception and realize swarm intelligence
  • Solve the problem of automatic driving safety

  – Convert unsafe scenarios to safe scenarios
  – Convert unknown scenes to known scenes

EdgeX Foundry Announces EdgeX 2.1 LTS, the Project’s First Long Term Support Release

By Announcement

Community debuts Developer Badge Program to recognize and reward developer contributions as it begins planning for the Spring 2022 release, codenamed ‘Kamakura’

SAN FRANCISCO – December 1, 2021 – EdgeX Foundry, a Linux Foundation project under the LF Edge project umbrella, today announced the release of version 2.1 of EdgeX, codenamed ‘Jakarta.’  The project’s ninth release, it follows the recent Ireland release, which was the project’s second major release (version 2.0).  Jakarta is significant in that it is EdgeX’s first release to offer long term support (LTS).

Long Term Support

“Our Jakarta release is a stabilization release,” said Jim White, the EdgeX Foundry Technical Steering Committee  (TSC) Chairman and co-founder of the project.  “As such, it is our project community’s pledge to adopters that EdgeX offers you a stable version of the platform that you can expect the community to stand behind and support for a period of two years.  We stand with you in support of EdgeX in real world, commercial deployments of the platform.”

“Only a few open-source projects offer long term support; the rapid change of open source projects and the effort needed to provide LTS is significant,” said Arpit Joshipura, general manager, Networking, Edge and IoT, at the Linux Foundation. “By including LTS, EdgeX demonstrates it understands the needs of the operational technology (OT) user base, and how products in this space must work and operate over longer periods of time than traditional IT solutions. This is a big milestone for any open source community, and we are incredibly proud of EdgeX Foundry for this achievement.”

The EdgeX long term support policy states that the community will work as quickly as possible and give “best effort and development priority to fix major flaws as soon as possible.”  Major flaws are defined by the project as:

  • bugs causing the system or service to crash, where there is no workaround for the function
  • bugs for a feature/function that does not work, where there is no workaround for the function
  • a security issue deemed a critical or high-level CVE (per CVSS)

The project has further stipulated in its LTS policy that “no new major functionality (at the discretion of the TSC) will be added” to the LTS version after the release happens.

More information about the Jakarta release, including a list of new features, can be found here: https://wiki.edgexfoundry.org/display/FA/Jakarta

EdgeX Developer Badge Program

As a part of this release cycle, EdgeX also announced a new EdgeX Developer Badge program.  EdgeX created the Developer Badge program to thank those making initial impacts on the project by providing something that they can use to highlight their efforts and volunteerism on social media platforms.  Contributors have started receiving an official digital badge (awarded through Credly) when

  • they make their first contribution (their first GitHub Pull Request is accepted by the project and merged into one of the project’s code repositories)
  • they fix two documented bugs of the project

Additional badges for other work may be awarded by the community in the future.

Kamakura Release – Spring 2022

The next EdgeX release, codenamed “Kamakura,” is set for Spring 2022.  The community has held its semi-annual planning session to lay out the goals and objectives of this release.  Kamakura is likely to be another dot-release that will again be backward compatible with all EdgeX 2.x releases (Ireland and Jakarta).  Major additions currently under consideration and being developed by the community include:

  • Initial north-to-south message bus
  • Improved security secrets seeding and support for delayed service starts
  • Metrics collection
  • Dynamic device profiles
  • Better (native) Windows support
  • Improved testing – including real hardware testing
  • A second version of the EdgeX Command Line Interface (CLI), compatible with EdgeX v2.x

 Learn more about this release on the project’s Wiki site.

About the Linux Foundation

Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open-source software, open standards, open data, and open hardware. Linux Foundation’s projects are critical to the world’s infrastructure including Linux, Kubernetes, Node.js, and more.  The Linux Foundation’s methodology focuses on leveraging best practices and addressing the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.

 ###

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.