Monthly Archives

June 2021

Where the Edges Meet and Apps Land: Akraino Release 4 Public Cloud Edge Interface

By Akraino, Blog

Written by Oleg Berzin, Ph.D., Co-chair of the Akraino Technical Steering Committee and Fellow, Technology and Architecture at Equinix

Introduction

The Public Cloud Interface (PCEI) blueprint was introduced in our last blog. This blog describes the initial implementation of PCEI in Akraino Release 4. In summary, PCEI R4 is implemented based on Edge Multi-Cluster Orchestrator (EMCO, a.k.a ONAP4K8S) and demonstrates

  • Deployment of Public Cloud Edge (PCE) Apps from two Clouds: Azure (IoT Edge) and AWS (GreenGrass Core),
  • Deployment of a 3rd-Party Edge (3PE) App: an implementation of ETSI MEC Location API Server.
  • End-to-end operation of the deployed Apps using simulated Low Power Wide Area (LPWA) IoT client and Edge software.

Before describing the specifics of the implementation, it is useful to revisit PCEI architectural concepts.

The purpose of Public Cloud Edge Interface (PCEI) is to implement a set of open APIs and orchestration functionalities for enabling Multi-Domain interworking across Mobile Edge, Public Cloud Core and Edge, the 3rd-Party Edge functions as well as the underlying infrastructure such as Data Centers, Compute Hardware and Networks. Interfaces between the functional domains are shown in the figure below:

The detailed PCEI Reference Architecture is shown in the figure below. For the full description of the PCEI Reference Architecture please refer to the PCEI Architecture Document.

There are some important points to highlight regarding the relationships between the Public Cloud Core (PCC) and the Public Cloud Edge (PCE) domains. These relationships influence where the orchestration tasks apply and how the edge infrastructure is controlled. Both domains have some fundamental functions such as the underlying platform hardware, virtualization environment as well as the applications and services. In the Public Cloud context, we distinguish two main relationship types between the PCC and PCE functions: Coupled and Decoupled. The table below shows this classification and provides examples.

The PCC-PCE relationships also involve interactions such as Orchestration, Control and Data transactions, messaging and flows. As a general framework, we use the following definitions:

  • Orchestration: Automation and sequencing of deployment and/or provisioning steps. Orchestration may take place between the PCC service and PCE components and/or between an Orchestrator such as the PCEI Enabler and PCC or PCE.
  • Control: Control Plane messaging and/or management interactions between the PCC service and PCE components.
  • Data: Data Plane messaging/traffic between the PCC service and the PCE application.

The figure below illustrates PCC-PCE relationships and interactions. Note that the label “O” designates Orchestration, “C” designates Control and “D” designates Data as defined above.

The PCEI implementation in Akraino Release 4 shows examples of two App-Coupled PCC-PCE relationships: Azure IoT Edge and AWS GreenGrass Core, and one Fully Decoupled PCE-PCC relationship: an implementation of ETSI MEC Location API Server.

In Release 4, PCEI Enabler does not support Hardware (bare metal) or Kubernetes orchestration capabilities.

PCEI in Akraino R4

Public Cloud Edge Interface (PCEI) is implemented based on Edge Multi-Cluster Orchestrator (EMCO, a.k.a ONAP4K8S). PCEI Release 4 (R4) supports deployment of Public Cloud Edge (PCE) Apps from two Public Clouds: Azure and AWS, deployment of a 3rd-Party Edge (3PE) App: an implementation of ETSI MEC Location API App, as well as the end-to-end operation of the deployed PCE Apps using simulated Low Power Wide Area (LPWA) IoT client and Edge software.

Functional Roles in PCEI Architecture

Key features and implementations in Akraino Release 4:

  • Edge Multi-Cloud Orchestrator (EMCO) Deployment
    • Using ONAP4K8S upstream code
  • Deployment of Azure IoT Edge PCE App
    • Using Azure IoT Edge Helm Charts provided by Microsoft
  • Deployment of AWS Green Grass Core PCE App
    • Using AWS GGC Helm Charts provided by Akraino PCEI BP
  • Deployment of PCEI Location API App
    • Using PCEI Location API Helm Charts provided by Akraino PCEI BP
  • PCEI Location API App Implementation based on ETSI MEC Location API Spec
    • Implementation is based on the ETSI MEC ISG MEC012 Location API described using OpenAPI.
    • The API is based on the Open Mobile Alliance’s specification RESTful Network API for Zonal Presence
  • Simulated IoT Client Code for end-to-end validation of Azure IoT Edge
  • Azure IoT Edge Custom Software Module Code for end-to-end validation of Azure IoT Edge

 

End-to-End Validation

The below description shows an example of the end-to-end deployment and validation of operation for Azure IoT Edge. The PCEI R4 End-to-End Validation Guide provides examples for AWS GreenGrass Core and the ETSI MEC Location API App.

 

Description of components of the end-to-end validation environment:

  • EMCO – Edge Multi-Cloud Orchestrator deployed in the K8S cluster.
  • Edge K8S Clusters – Kubernetes clusters on which PCE Apps (Azure IoT Edge, AWS GGC), 3PE App (ETSI Location API Handler) are deployed.
  • Public Cloud – IaaS/SaaS (Azure, AWS).
  • PCE – Public Cloud Edge App (Azure IoT Edge, AWS GGC)
  • 3PE – 3rd-Party Edge App (PCEI Location API App)
  • Private Interconnect / Internet – Networking between IoT Device/Client and PCE/3PE as well as connectivity between PCE and PCC.
  • IoT Device – Simulated Low Power Wide Area (LPWA) IoT Client.
  • IP Network/vEPC/UPF) – Network providing connectivity between IoT Device and PCE/3PE.
  • MNO DC – Mobile Network Operator Data Center.
  • Edge DC – Edge Data Center.
  • Core DC – Public Cloud.
  • Developer – an individual/entity providing PCE/3PE App.
  • Operator – an individual/entity operating PCEI functions.

The end-to-end PCEI validation steps are described below (the step numbering refers to Figure 5):

  1. Deploy EMCO on K8S
  2. Deploy Edge K8S clusters
  3. Onboard Edge K8S clusters onto EMCO
  4. Provision Public Cloud Core Service and Push Custom Module for IoT Edge
  5. Package Azure IoT Edge and AWS GGC Helm Charts into EMCO application tar files
  6. Onboard Azure IoT Edge and AWS GGC as a service/application into EMCO
  7. Deploy Azure IoT Edge and AWS GGC onto the Edge K8S clusters
  8. All pods came up and register with Azure cloud IoT Hub and AWS IoT Core
  9. Deploy a custom LPWA IoT module into Azure IoT Edge on the worker cluster
  10. Successfully pass LPWA IoT messages from a simulated IoT device to Azure IoT Edge, decode messages and send Azure IoT Hub

For more information on PCEI R4: https://wiki.akraino.org/x/ECW6AQ

PCEI Release 5 Preview

PCEI Release 5 will feature expanded capabilities and introduce new features as shown below:

For a short video demonstration of PCEI Enabler Release 5 functionality based on ONAP/CDS with GIT integration, NBI API Handler, Helm Chart Processor, Terraform Plan Executor and EWBI Interconnect please refer to this link:

https://wiki.akraino.org/download/attachments/28968608/PCEI-ONAP-DDF-Terraform-Demo-Audio-v5.mp4?api=v2

Acknowledgements

Project Technical Lead: 

Oleg Berzin, Equinix

Committers:

Jian Li, China Mobile
Oleg Berzin, Equinix
Tina Tsou, Arm

Contributors: 

Mehmet Toy, Verizon
Tina Tsou, Arm
Gao Chen, China Unicom
Deepak Vij, Futurewei
Vivek Muthukrishnan, Aarna Networks
Kavitha Papanna, Aarna Networks
Amar Kapadia, Aarna Networks

Embedded IoT 2021 – Intro of the research insight from the presentation based on LF Edge HomeEdge

By Blog, Home Edge

The blog post initially appeared on the Samsung Research blog

By Suresh LC and Sunchit Sharma, of SRI-Bangalore

Introduction

Every home is flooded with consumer electronic devices which have capability to connect to internet. These devices can be accessed from anywhere in the world and thus making the home smart. As per a survey the size of smart home market would be $6 Billion by 2022. As per Gartner report 80% of all the IoT projects to include AI as a major component.

In conventional cloud based setup following two things need to be considered with utmost importance:

Latency – As per study there is latency of 0.82ms for every 100 miles travelled by data

Data Privacy – Average total cost of data breach is $3.92 Million

With devices of heterogeneous hardware capabilities the processing which were traditionally performed at cloud are being moved to devices if possible as shown below.

These lead us to work towards developing an edge platform for smart home system. As an initial step the Edge Orchestration project was developed under the umbrella of LF Edge.

Before getting into the details of Edge orchestration, let us understand few terms commonly used in this.

Edge computing – a distributed computing paradigm that brings computation and data storage closer to the location where it is needed, to improve response times and save bandwidth

Devices : Our phones, tablets, smart TVs and so that are now a part of our homes

Inferencing – act or process of reaching a conclusion from known facts.

ML perspective – act of taking decisions / giving predictions / aiding the user based on a pre trained model.

Home Edge – Target

Home Edge project defines uses cases, technical requirements for smart home platform. The project aims to develop and maintain features and APIs catering to the above needs in a manner of open source collaboration. The minimum viable features of the platform are below:

Dynamic device/service discovery at “Home Edge”

Service Offloading

Quality of Service guarantee in various dynamic conditions

Distributed Machine Learning

Multi-vendor Interoperability

User privacy

Edge Orchestration targets to achieve easy connection and efficient resource utilization among Edge devices. Currently REST based device discovery and service offloading is supported. Score is calculated based on resource information (CPU/Memory/Network/Context) by all the edge devices. The score is used for selecting the device for service offloading when more than one device offer same service. Home Edge also supports Multi Edge Communication for multi NAT device discovery.

Problem Statement

As mentioned above the service offloading is based on score of all the devices. Score is in-turn is calculated using CPU/Memory/Network/Context. These values are populated in the db initially when home edge is installed in the device. And hence these are more static values.

Static Score is a function of

Number of CPU cores in a machine

Bandwidth of each CPU core

Network bandwidth

Round trip time in communication between devices

For example say at time t1, device B has available memory ‘X’ and hence this value is stored in db. Say at time t2 device A requests device B for score and at t2 the available memory in device B is ‘X-Y’. But the score is calculated using ‘X’ which has been stored initially.

Say in another scenario device A requests device B for service offloading. Device B starts executing the service. But say in the mid of execution moves away from the network. In such cases it would be unsuccessful offloading.

We could see in the above situations that the main purpose of Home Edge is gone for a task.
And these lead to the proposal of dynamic score calculation which is more effective and efficient.
Properties of a home based Distributed Inferencing System

We propose an effective method for score calculation. The proposed method for score calculation is based on following points:

1. Model Specific Device Selection

2. Efficient Data Distribution

3. Efficient Churn Prediction

4. Non Rigid and Self Improving

Based on the above four properties we defined a feedback based system that can help us achieve optimal data distribution for parallel inferencing in a heterogeneous system.

High Level System

Following diagram shows the high level architecture of the system. Let us take a use case and see the flow of tasks. For example when a user queries to Speaker “Let me know is any visitors today?”. The User query is indigested by the query manager to understand the query. The user query is to identify the number of visitors who visited today. The Data Source needs to be identified for query execution say the Doorbell Camera/Backyard CCTV. Then the model required for this purpose is identified and the devices which have the service (model). Based on the past performance, static score, trust of the device and number times previously the model has ran on the device, the devices are selected. Priority of the devices are calculated and based on the same the data is split among the devices. The result from the devices are aggregated to calculate the final output.

Understanding Score

Total Score is a function of Static score and Memory usage of the device

Changes in Score can help us predict estimated runtimes at the current time

Understanding Time Estimates

Time Estimates are calculated based on previous time records, average and current score

Data Points which are stored :

1. Average Score

2. Average time taken per inference

3. Number of times a model has been deployed on a device

4. Success probability of a model running on a device – Trust

Estimating runtime on Devices

Runtime estimation is done post sufficient data points being collected in the system

Delta Factor – A liner relationship factor to approximate the change in runtimes when score changes is calculated

Estimated time – using the Delta factor, Average Score and Average Runtime and the Current score of the device

Understanding Churn Probability Estimation

Logical buckets for every 5 minute frames is created to know the availability of devices

In case a bucket is polling a device for nth time and the was available k times before the new probability would be

1. (K+1)/(n+1), if device is available

2. K/(n+1), if device unavailable

Calculating Priority

Post estimation of runtimes, The probability of devices moving out of the system are checked and devices whose probability falls low are not kept in the consideration list

All devices in the consideration list are sorted based on the product of the trust and inverse of runtimes

Top K devices out of N are selected and data is distributed among them

Data Distribution

Data on the devices is distributed in the inverse ratio of the estimated runtimes to minimize the total time the system takes to make the complete inference

Feedback

Post execution the actual runtimes and scores of the devices are used to adjust the average in the databases

In case of failure next device is selected and the trust value of the device is adjusted in database

Test Run on Simulator

Aim – To demonstrate heuristic based data distribution among devices
To Test – Note the devices that work best for a model, checking the selection irrespective of device scores

Web based simulator was developed to demonstrate the proposed method. Following is a sample screenshot of the simulator

 

In the due course of our work we also found few models execute better in specific devices. To check up on this we had run two use case – Camera Anomaly detection and Person identification. And we found that the models used in both the scenarios gave better execution results in different devices as shown below.