Post-Doc at Orange Labs   

A key objective of the Internet of Things (IoT) is to transform the masses of data generated by connected objects into knowledge and services useful to users. However, managing large volumes of densely distributed IoT data requires new infrastructures and solutions that can process and store the data efficiently.
Data is traditionally processed and stored in a centralized Cloud Computing infrastructure. However, by the time the data has been transferred to the cloud, the reaction to a critical event may come too late. To solve this problem and allow prompt data processing, computations need to run near the place where the data is collected.
 
In this context, where the volume of data to be manipulated is considerable, it is important to move computations to the data and execute them there, instead of transferring large volumes of data over the network. This is the motivation for applying the Computational Storage concept to IoT. The concept enables data processing and storage to take place as close as possible to connected objects, thereby reducing network traffic and improving data security and privacy.
 
This approach is in line with the Fog Computing paradigm, which extends Cloud Computing to the edge of the network. Fog Computing provides a widely distributed computing and storage infrastructure, which makes it well suited to applying the concept of Computational Storage in the context of IoT.
 
In this context, I have implemented a Computational Storage middleware platform called Prostor, which provides: (i) a programming model for IoT dataflow processing applications, and (ii) an execution environment for these applications. Prostor makes it possible to process fast data flows on short time scales on Fog Computing-type architectures. It also allows the large-scale distribution of IoT applications on a Fog Computing-type infrastructure, as close as possible to the connected objects, while optimizing the data flows of these applications.

The Internet of Things connects large numbers of smart objects that produce a deluge of heterogeneous and geographically dispersed data. Because IoT is large and densely distributed, IoT data needs to be managed in a decentralized fashion, as close to smart objects as possible, thereby reducing network traffic and latency and improving data security and privacy. To this end, the concept of Computational Storage has been put forward. It advocates running computations next to where the targeted data resides.
 
In this context, this work focuses specifically on the opportunities offered by emerging large-scale computing infrastructures such as Fog Computing. Fog Computing provides a widely and densely distributed computing, storage and networking infrastructure that is a natural foundation for applying the concept of Computational Storage in the context of IoT.
 
To this end, I implemented a middleware platform for IoT called Prostor. It provides: (i) a programming model for IoT dataflow applications, and (ii) an execution environment for these applications. Prostor allows large-scale distribution of IoT dataflows on top of Fog Computing infrastructures, as close as possible to smart objects.
 

Prostor technical overview

Prostor is a location-aware distributed dataflow computation system based on Apache Storm. Prostor makes it easy to process unbounded streams of IoT data while deploying computations at specific locations defined by the developers (instead of relying on the dynamic scheduling strategies that Apache Storm supports natively).
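
To make the idea of developer-defined placement concrete, here is a minimal sketch in plain Python. It is not Prostor's scheduler and does not use the Apache Storm API; the names `Placement` and `place`, the input shapes, and the host names are all hypothetical and chosen only for illustration. It assumes that each component instance is pinned to one named host and that each host offers a fixed number of free worker slots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    component: str  # name of a dataflow component (hypothetical)
    instance: int   # index of the component instance
    host: str       # host the developer pinned this instance to


def place(topology, supervisors):
    """Assign every component instance to the host named by the developer.

    topology:    {component name: [host for instance 0, host for instance 1, ...]}
    supervisors: {host name: number of free worker slots}
    Raises ValueError when a pinned host is unknown or has no free slot.
    """
    free = dict(supervisors)  # copy, so the caller's mapping is not modified
    placements = []
    for component, hosts in topology.items():
        for instance, host in enumerate(hosts):
            if free.get(host, 0) <= 0:
                raise ValueError(
                    f"no free slot on {host!r} for {component}[{instance}]"
                )
            free[host] -= 1
            placements.append(Placement(component, instance, host))
    return placements
```

For example, calling `place({"sensor-reader": ["gateway-1", "gateway-2"], "aggregator": ["fog-node-1"]}, {"gateway-1": 1, "gateway-2": 1, "fog-node-1": 2})` would pin the two reader instances to the two gateways and the aggregator to the fog node. The sketch fails loudly when a pinned host is full, on the assumption that a location constraint should never be silently relaxed.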
 
Prostor presents the following features:
  • It is based on Apache Storm and accordingly supports all Storm features;
  • It includes a custom location-aware scheduler (Prostor-Scheduler), which deploys each component instance (of a Storm topology) on the supervisor hosts specified by the developer;
  • The location-aware scheduler embeds a REST server and supports synchronous communication with Storm topologies (i.e., dataflow applications) at run-time; 
  • Prostor implements custom stream groupings (Prostor-Stream-Groupings), which promote (i) local streaming (between component instances running on the same host) and (ii) location-aware streaming (between components that run on specific hosts defined by the developer);
  • Prostor is a virtualized framework (thanks to Docker technologies), i.e., all Prostor components run within Docker containers;
  • Prostor implements a Storm worker that is able to run on resource-constrained devices (notably the Raspberry Pi).
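
The two routing policies named in the stream-grouping feature above can be sketched as follows. This is a plain-Python illustration, not the Prostor-Stream-Groupings code and not the Storm grouping interface; the class names, the `choose_task` method, and the task-to-host mapping are assumptions made for the example. It also assumes round-robin selection among eligible tasks and a fallback to all targets when no local target exists, neither of which is stated in the description above.

```python
from itertools import cycle


class LocalFirstGrouping:
    """Send each tuple to a target task on the emitter's own host if one exists."""

    def __init__(self, task_hosts, target_tasks):
        # task_hosts: {task id: host name}; target_tasks: downstream task ids
        self.task_hosts = task_hosts
        self.target_tasks = list(target_tasks)
        self._cursors = {}  # one round-robin iterator per emitter host

    def choose_task(self, source_task):
        host = self.task_hosts[source_task]
        if host not in self._cursors:
            local = [t for t in self.target_tasks if self.task_hosts[t] == host]
            # Assumed fallback: use every target when none runs on this host.
            self._cursors[host] = cycle(local or self.target_tasks)
        return next(self._cursors[host])


class HostPinnedGrouping:
    """Send each tuple only to target tasks on one developer-chosen host."""

    def __init__(self, task_hosts, target_tasks, host):
        pinned = [t for t in target_tasks if task_hosts[t] == host]
        if not pinned:
            raise ValueError(f"no target task runs on {host!r}")
        self._cursor = cycle(pinned)

    def choose_task(self, source_task):
        # The emitter's location is irrelevant here; only the pinned host counts.
        return next(self._cursor)
```

Keeping traffic on the emitting host avoids a network hop entirely, while pinning to a named host lets a developer steer a stream toward a particular fog node, for example one that holds the data a downstream component needs.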