Decision-making support in the DISRUPT system is based on (1) the continuous monitoring of data from enterprise operations (plant floor and supply chain), and (2) the analysis of the monitored data with Machine Learning to detect deviations from the norm.
The European project DISRUPT (http://www.disrupt-project.eu/) aims to facilitate the transition to Industry 4.0 (I4.0) smart factories by developing an adaptable system that supports horizontal and vertical integration. The DISRUPT system is being designed to support, in close to real time, the making and enactment of decisions on production and scheduling when events disrupt enterprise operations, such as delays in the supply chain or failures in a production line. Because it is developed with model-driven design principles, the DISRUPT system can easily be adapted or reconfigured to new scenarios, or to new requirements in decision making or event handling, simply by having domain experts in manufacturing change the high-level knowledge models.
We participated in the specification of the DISRUPT system software architecture and were responsible for validating the architecture against the state of the art in IIoT/I4.0 systems. This validation was carried out by contrasting the functionality of the DISRUPT system with the functionality involved in the envisioned use cases documented by the reference architectures IIRA and RAMI4.0.
IIRA, by the Industrial Internet Consortium, provides guidance for the entire development process of IIoT systems (in energy, healthcare and manufacturing, among other domains): from the evaluation of business benefits and the analysis of use cases up to the design and deployment of a generic software infrastructure.
RAMI4.0, by the German initiative Plattform Industrie 4.0, provides guidance for the design of virtual/digital representations of assets. RAMI4.0 proposes a cubic layer model and the I4.0 Component model. The I4.0 Component model “constitutes a specific case of a cyber-physical system”. An I4.0 Component comprises an asset and a (software) wrapper, called the Administration Shell, which enables remote access to the properties (data and functions) of the asset. Interaction with, and between, I4.0 Components is to take place through a service hierarchy comprising application, information and communication service layers.
The DISRUPT architecture was published thus:
Specification of a Software Architecture for an Industry 4.0 Environment. ES2018: 6th International Conference on Enterprise Systems, Limassol, Cyprus, October 1-2, 2018.
An Architecture for Disruption Management in Smart Manufacturing. Conf. SMARTCOMP 2018, Sicily, Italy, June 18-20, 2018.
A Proposal of Decentralised Architecture for Optimised Operations in Manufacturing Ecosystem Collaboration. Conf. PRO-VE 2017, Vicenza, Italy, September 18-20, 2017.
We also investigated the design of a methodology for RAMI4.0 services for different types of I4.0 components. This work can be found here:
Towards a Methodology for RAMI4.0 Service Design. ES2018: 6th International Conference on Enterprise Systems, Limassol, Cyprus, October 1-2, 2018.
Our approach to the design of RAMI4.0 services is based on Object-Oriented Analysis and Design (OOA&D) principles. The rationale for using OOA&D is the similarity between objects and assets wrapped as I4.0 Components: data and functions within objects are accessible through method invocations, while data and functions of assets are accessible through service invocations (managed by an Administration Shell). Although this similarity is apparent, and service-oriented architecture (SOA) and OOA&D are both well known, how to combine these technologies systematically for the design of systems of I4.0 Components is not obvious. The paper discusses the relevant relationships between these technologies and an approach to combining them.
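To illustrate the analogy (and only as a sketch: the names AdminShell, read_temperature and start_motor are hypothetical and do not correspond to RAMI4.0 or to any real Administration Shell API), an asset wrapper can be pictured in C as a structure that bundles the asset's data with function pointers exposing its functions, much as an object bundles attributes with methods:

/* Illustrative sketch only: an "object-like" wrapper for an asset.
 * All names are hypothetical; this is not the API of any real
 * Administration Shell or of RAMI4.0. */
#include <stdio.h>

typedef struct AdminShell AdminShell;

struct AdminShell {
    const char *asset_id;                        /* identity of the wrapped asset       */
    double      temperature;                     /* a property (data) of the asset      */
    double    (*read_temperature)(AdminShell *); /* service exposing asset data         */
    int       (*start_motor)(AdminShell *);      /* service exposing an asset function  */
};

/* Concrete "services" for one particular asset */
static double read_temperature(AdminShell *s) { return s->temperature; }
static int    start_motor(AdminShell *s)      { printf("%s: motor started\n", s->asset_id); return 0; }

int main(void)
{
    /* Wrapping an asset: analogous to instantiating an object */
    AdminShell drill = { "drill-01", 42.5, read_temperature, start_motor };

    /* Invoking services: analogous to method invocation on an object */
    printf("%s temperature: %.1f\n", drill.asset_id, drill.read_temperature(&drill));
    drill.start_motor(&drill);
    return 0;
}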
Autonomic computing refers to self-managing computing resources that adapt to unpredictable changes in order to free operators and users from low-level task management, while delivering better system behavior in terms of performance or other criteria.
The goal of an autonomic load balancing system is to distribute load across multiple processors so as to optimise system performance. Such a system continuously monitors the performance of all processors and decides whether a new policy for distributing work is warranted. For instance, if one processor becomes highly overloaded, incoming work will need to be redirected from that processor to others, and some of its existing load may need to be moved to another processor.
A key aspect of a well-performing autonomic load balancing system is the criterion for deciding when, and how, to adapt the work distribution in the presence of work imbalance.
Similar criteria are used by Machine Learning solutions designed to identify or predict process outcomes that deviate from the norm. Among other issues, the following must be defined: what the norm (or normal operation) is, and when an outcome is out of the norm.
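For illustration, the following is a minimal sketch (in C) of one simple adaptation criterion: redistribute work when the most loaded processor exceeds the mean load by a given threshold. It is not the probabilistic or competitive criteria of the work described next; the function and parameter names are ours.

/* Minimal illustrative criterion for deciding when to redistribute work.
 * Returns 1 if work should be redistributed, 0 otherwise. */
#include <stddef.h>

int should_adapt(const double load[], size_t n, double threshold)
{
    double sum = 0.0, max = load[0];
    for (size_t i = 0; i < n; i++) {
        sum += load[i];
        if (load[i] > max) max = load[i];
    }
    double mean = sum / (double)n;
    if (mean <= 0.0) return 0;          /* no work anywhere: nothing to balance    */
    /* Imbalance measured as the relative excess of the maximum over the mean. */
    return (max - mean) / mean > threshold;
}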
We designed and evaluated various such criteria to adapt the work distribution of parallel distributed queries, based on competitive algorithms, probability and mathematical expectation. This work was published thus:
Applying Probabilistic Adaptation to Improve the Efficiency of Intra-Query Load Balancing. International Journal of Adaptive, Resilient and Autonomic Systems, January-March 2013, Vol. 4, No. 1, 26-59.
Autonomic Query Parallelization using Non-dedicated Computers: An Evaluation of Adaptivity Options. The VLDB Journal, Volume 18, Number 1 / January, 2009, pp. 119-140.
Probabilistic Adaptive Load Balancing for Parallel Queries. In Proceedings of the 3rd International Workshop on Self-Managing Database Systems (SMDB-2008), held on 7th April 2008 in conjunction with the 24th International Conference on Data Engineering (ICDE 2008). Cancún, México.
To facilitate the development of eLearning solutions, we designed the Agora Framework (AF), a visual tool and a method that help generate eLearning models. An eLearning model is not a high-level design of an eLearning system. Rather, it is similar to a business model, comprising both the principal components involved in delivering (selling) a particular type of eLearning and the relationships between those components.
As shown in the figure, an eLearning model consists of a Cost Structure, Key Resources (hardware, software, teachers, courses), Goals, Learner Segments (types of course consumers who seek a qualification or training), eLearning types, the Pedagogical Basis for course design, etc.
AF is based on the ideas in the book Business Model Generation by A. Osterwalder and Y. Pigneur. The rationale behind AF (i.e., treating eLearning delivery as a business) is that the content of courses in eLearning systems is a product in and of itself: consumers/learners should learn from this content in order to be satisfied. In contrast, content in other types of web systems is only information about a product or service.
As a visual tool, AF helps to: i) visualise all the elements involved in running an eLearning model; ii) classify the options within each element, e.g., the different types of eLearning to be delivered, such as synchronous or asynchronous; and iii) reason about the relationships between elements, i.e., how the options of one element affect the options of other elements, e.g., how different resource configurations affect cost. The figures below show how to use AF.
AF should be drawn on a blackboard or printed large and pinned to a wall. Before using AF, two activities should have taken place: a) gathering and analysing information about each element of AF, particularly current solutions, methods and tools; and b) organising meetings that include the relevant people, e.g., teachers and experts in education, cloud computing solutions, eLearning platforms, etc.
Using AF is a two-step process, iterated until all the elements of AF are configured according to the Goals of the eLearning model: 1) the Goals of the eLearning model are discussed first in order to guide 2) the analysis and classification of the previously gathered information into each element of AF, as shown in the figure. In practice, goals may change once the information is analysed, and this is fine; it simply means the goals are becoming better understood.
Clearly, once an eLearning model has been configured and agreed on, its information pinned on the wall is of little further use. This information should be made available to all project stakeholders in a readable format and, of course, it should guide the development of the eLearning project.
The Agora Framework is described in more detail here:
DESIGNING ELEARNING MODELS: The Agora Framework. CSEDU 2012: 4th International Conference on Computer Supported Education. April 16-18, 2012, Porto, Portugal, pp 265–270.
Therein we also highlight that, if each element of AF is sufficiently defined, the resulting eLearning model comprises a good portion of the information typically found in the requirements engineering documents used to guide software design and development. The information in an eLearning model only needs to be organised and extended as required.
Karl E. Wiegers suggests three requirements engineering documents in his book Software Requirements: the Vision and Scope document specifies the business requirements, the Use Cases document specifies the user requirements, and the System Requirements Specification document specifies the functional and non-functional system requirements.
A method and a tool (a document template) to derive the Vision and Scope document corresponding to an AF eLearning model are described here (with an example):
From eLearning Models to eLearning Requirements Engineering – The Vision and Scope Document. IASTED Int’l. Conf. Software Engineering (SE 2012), June 18-20, 2012, Crete, Greece, pp 7–14.
Effective eLearning content must facilitate learning and engage learners, and it should be possible to determine whether learners do, or do not, learn from such content. Developing effective eLearning content is complex, as it involves subject matter experts (SMEs), instructional design experts (IDEs), technical design experts (TDEs), and production personnel (PP).
SMEs specify and write adequate content for the target learning unit, such as a topic or lesson.
IDEs design the best instructional experiences: the sequence of activities learners should carry out in order to learn that content.
TDEs (also known as techno-pedagogues) design those activities into, or around, digital resources such as graphs, plots, sound, videos, games, etc.
PPs are graphic designers and software programmers who develop those digital resources using software tools such as Flash, Photoshop and PowerPoint. They also organise all the eLearning content into web (HTML) pages within a learning management system (LMS) such as Moodle or TalentLMS.
Overall, the software tools used by PPs are rather good and flexible: PPs can develop practically whatever they can imagine as far as visual display is concerned. However, those tools do not cater for effective eLearning content development as a whole. Unfortunately, assembling and running a team of specialists is also complex and costly, and joint operation tends to be slow because the participants' expertise lies in different areas. Thus many companies and institutions outsource eLearning content development.
BP4ED is a web system that we designed to facilitate the development of effective eLearning content. BP4ED embodies some of the best practices to organise eLearning content as suggested by W. Horton in his book E-Learning by Design, and to present (display) eLearning content as suggested by R. Clark and R. Mayer in their book e-Learning and the Science of Instruction: Proven Guidelines for Consumers and Designers of Multimedia Learning.
In the figure, Curricula are academic programmes composed of related courses that lead to a degree or certificate in a subject area; Courses are composed of Lessons, and a Lesson is composed of Topics. Topics are designed to accomplish a single low-level learning objective using learning Activities, each of which provokes a specific learning experience. Learning activities are designed using multimedia: text, pictures, voice, video, etc. There are four types of activities: Absorb, Do, Connect and Tests.
Note that Topics are to be organised as Learning Objects, which Horton defines as “... a chunk of electronic content that can be accessed individually and that completely accomplishes a single learning objective and can prove it.”
The concept of Learning Objects (LOs) is powerful because it resembles that of software objects/programs. As LOs are composed of learning activities which include tests (to prove that the learning objective has been accomplished), they involve complex navigation and interactivity with learners. They are like programs whose behaviour depends on user input, with the output corresponding to test results.
BP4ED consists of a set of templates designed to correspond to the units of eLearning in the figure (curricula, courses, etc.). On selecting the desired units of eLearning, BP4ED automatically assembles the corresponding templates into the corresponding hierarchy — SMEs can then write or insert relevant content (text or multimedia resources) in the fields of the templates selected. This will tend to reduce the role of IDEs.
From the resulting structure of a course, lessons, topics and activities, BP4ED determines both navigation and interactivity with the user, thus eliminating the corresponding design and implementation work by TDEs and PPs.
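As a sketch of the underlying data organisation (type and field names are hypothetical and do not reproduce BP4ED's implementation), the hierarchy of eLearning units can be represented in C as nested structures:

/* Sketch of the eLearning-unit hierarchy as data structures.
 * All type and field names are hypothetical, for illustration only. */
typedef enum { ABSORB, DO, CONNECT, TEST } ActivityType;

typedef struct {
    ActivityType type;
    const char  *multimedia_uri;    /* text, picture, audio or video resource        */
} Activity;

typedef struct {
    const char *learning_objective; /* a single low-level learning objective         */
    Activity   *activities;         /* activities that accomplish (and test) it      */
    int         n_activities;
} Topic;                            /* a Topic is organised as a Learning Object     */

typedef struct { const char *title; Topic  *topics;  int n_topics;  } Lesson;
typedef struct { const char *title; Lesson *lessons; int n_lessons; } Course;
typedef struct { const char *title; Course *courses; int n_courses; } Curriculum;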
Clark and Mayer have analysed in detail the issues involved in presenting multimedia content, and have synthesised a few principles for doing it effectively so that learners feel more comfortable and less overloaded. Some of these principles are:
“Place printed words near corresponding graphics”. Both should appear on the same screen; it should not be necessary to scroll the screen to read the text relevant to a graphic.
“Present words as audio narration rather than on-screen text.” Thus incoming information is “split across two separate cognitive channels — words in the auditory channel and pictures in the visual channel rather than concentrating both in the visual channel.”
“Explain Visuals with Words in Audio OR Text: Not Both ... learners may try to compare and reconcile on-screen text and the narration, which requires cognitive processing extraneous to learning the content”.
BP4ED includes screen templates that organise eLearning content around Clark and Mayer's principles and facilitate formatting the content for adequate display by browsers. This will reduce the web page design and implementation work of TDEs and PPs.
There is more to BP4ED (e.g., the templates for the activities Absorb, Do, Connect and Tests) here:
BP4ED: Best Practices Online for eLearning Content Development: Development based on Learning Objects. ICSOFT-EA 2014: the 9th International Conference on Software Engineering and Applications. Vienna, Austria, 29-31 August, 2014, pp 176-182.
Parallel computing has the purpose of improving application performance, that is, reducing response/running time. To this end, the computation of an application has to be divided among several processors, which need to communicate with each other in order to coordinate their tasks and to share data and code. Communication can take place through shared memory, through message passing, or both, and is determined by the underlying hardware interconnect and the software libraries used. Distributed systems have the main purpose of sharing resources through message passing. Shared resources include almost anything digital: files, devices (such as printers), database systems, processing nodes, entire clusters as in cloud computing, etc.
Our experience in parallel and distributed computing comprises software design and development for different parallel architectures, including shared memory multiprocessors and multicores, message passing multicomputers and clusters, GPUs, and Intel Movidius Neural Compute hardware for Deep Learning. We have also worked with scalable multiprocessors which support a shared address space on top of physically distributed main memory, in order to increase memory bandwidth with the number of processors. In contrast, multiprocessors and multicores are not scalable beyond tens of processors/cores because memory is centralised and accessed by all processors through a single shared interconnect, such as a bus, which causes more contention the more processors are used.
We have combined shared memory and message passing programming in order to capitalise simultaneously on intra-node parallelism within a multicore and on inter-node parallelism between the multicore nodes of a cluster. We have developed with the Pthreads library, the OpenMP interface, the Message Passing Interface (MPI), the Hadoop platform (the free, open-source counterpart of MapReduce, Google's programming model and development environment), the CUDA library for GPUs, and Movidius.
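As a minimal illustration of this hybrid approach (a generic sketch, not code from any of the projects mentioned), the following C program combines MPI between nodes with OpenMP threads within each multicore node; it would be compiled with something like mpicc -fopenmp:

/* Hybrid parallelism sketch: MPI between nodes, OpenMP threads within a node. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double local_sum = 0.0;

    /* Intra-node parallelism: threads share local_sum through shared memory. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = rank; i < 1000000; i += nprocs)
        local_sum += (double)i;

    /* Inter-node parallelism: partial sums are combined by message passing. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %.0f (computed by %d processes)\n", global_sum, nprocs);

    MPI_Finalize();
    return 0;
}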
The first Data Diffusion Machine (DDM), with buses as the interconnect medium.
Software runs on the leaf Processing nodes only. Directory Controller nodes above the leaves keep track of all data items below them. The hierarchy helps to localise communication. If a processor accesses a data item that is not yet resident in its local memory, a request is put on the bus above, where it is snooped by all the memory controllers and the directory controller connected to that bus. If a memory controller has the item, it provides a copy on the bus; otherwise the directory controller puts the request on the bus above, and a copy will eventually be received.
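The following C fragment is a software sketch of this hierarchical lookup, with hypothetical types and names; the real DDM performs it in hardware through bus snooping:

/* Software sketch (hypothetical names, not the DDM hardware protocol
 * specification): a request climbs directory levels until some memory
 * below a bus holds the requested item. */
#include <stdbool.h>
#include <stddef.h>

typedef struct Bus Bus;
struct Bus {
    bool (*any_memory_has)(const Bus *self, unsigned long item); /* result of snooping */
    Bus  *bus_above;                                             /* next level up      */
};

/* Returns true if a copy of the item can be supplied from some level. */
bool ddm_lookup(const Bus *bus, unsigned long item)
{
    while (bus != NULL) {
        /* The request is snooped by all memory controllers on this bus. */
        if (bus->any_memory_has(bus, item))
            return true;                 /* some controller provides a copy   */
        /* Otherwise the directory controller forwards the request upward. */
        bus = bus->bus_above;
    }
    return false;                        /* item not present in the machine   */
}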
Our research on virtual memory on data diffusion architectures (DDAs) can be found here:
Virtual Memory on Data Diffusion Architectures. Parallel Computing 29 (2003) 1021-1052.
The Diffusion Space of Data Diffusion Architectures. Parallel Computing 30 (2004) 1169-1193.
Data diffusion architectures (DDAs) are scalable multiprocessors that provide a shared address space atop distributed main memory. The distinctive feature of DDAs is that data “diffuses”, i.e., migrates and replicates, in main memory (by hardware) according to whichever processors are using it; thus the effective access time tends to be the local access time. This property is key to both scalability and software portability, and is made possible by the associative organisation of main memory, which in effect decouples each address and its data item from any physical location. We investigated the design space of virtual memory organisations on DDAs, and evaluated two organisations using the Horn Data Diffusion Machine (Horn-DDM), a DDA designed and prototyped with a parallel emulation at Bristol University.
Non-uniform memory access (NUMA) machines and cache-coherent NUMAs (CC-NUMAs) also support a shared address space atop distributed main memory to cater for scalability. However, a key factor in achieving scalable performance is that each processor accesses code and data mostly locally, either in its cache or in its local memory. If a processor mostly incurs remote accesses, performance may degrade significantly. That is, the placement of data on memory nodes in these architectures is an issue that must be addressed by system software, application software, or both. In other words, NUMAs and CC-NUMAs do not support software portability.
DDAs do support software portability because placing data in memory is not a software issue: the effective access time tends to be the local access time thanks to the hardware data diffusion capability. Hence software for (non-scalable) shared memory multiprocessors or multicores can readily be used on DDAs. We proved this by porting part of the Mach operating system virtual memory to the Horn-DDM emulator with only minimal, low-level edits. In addition, DDAs offer new possibilities for organising virtual memory that are more efficient in terms of performance and that take advantage of the associative organisation of main memory.
Of course, associative main memory would be more expensive than traditional main memory. But hardware has steadily become cheaper, while software development costs have been increasing. In any case, the first commercially available scalable shared memory machine was a DDA, the KSR-1 from Kendall Square Research. Yet, for exascale computing, NUMA architectures are currently being investigated whose processing nodes each comprise a multicore and multiple GPUs.
Since the advent of early computers, parallel computing has been motivated by the need to reduce the response time of computation-intensive applications, such as the solution of mathematical models in science and engineering. It used to be so costly that it prompted the invention of the Internet, so that scientists in different locations could remotely share the most powerful parallel computers of the 1970s in the US. By the late 1990s, clusters of PCs made parallel computing readily accessible, and today they are the most widely used parallel architecture. Built with commodity hardware and free software, PC clusters offer the most convenient cost-to-computation ratio. Even their forerunners, the well-known local area networks (LANs), were affordable to small and medium enterprises. And as PCs, LANs and the Internet reached more people the world over, cluster components entered world-wide supply and demand, and just got better, faster and cheaper. In contrast, parallel architectures designed as such up to the 1990s had a relatively small market and were rather expensive, as a result of using custom-designed processors or communication hardware, or both, along with specially tailored software. Hence only large universities or governments could afford them.
We investigated data diffusion capabilities for parallel computing in clusters, where communication between processing nodes is done in software through message passing, over TCP/IP or MPI (in DDAs, NUMAs and CC-NUMAs, communication is done by hardware). This work was published thus:
Adaptive Parallel Matrix Computing through Compiler and Run-time Support. Intl. Conference ParCo 2009, 1-4 Sep 2009, Lyon, France. In Parallel Computing: From Multicores and GPU’s to Petascale. Volume 19 in Advances in Parallel Computing, pp. 359-368. B. Chapman et al. (Eds.). IOS Press, ISBN: 978-1-60750-529-7, April 2010.
The Data Diffusion Space for Parallel Computing in Clusters. Euro-Par 2005, 30 August – 2 September, Lisbon, Portugal. Lecture Notes in Computer Science (2005) 3648: pp 61–71.
Comparing Two Parallel File Systems: PVFS and FSDDS. ParCo 2005, 13-16 September 2005, Málaga, Spain. In Proceedings of Parallel Computing: Current & Future Issues of High-End Computing; edited by the John von Neumann Institute for Computing (NIC), Germany, NIC Series, Volume 33, ISBN 3-00-017352-8, October 2006.
Distributed Parallel File System for I/O Intensive Parallel Computing on Clusters. In Proceedings of International Conference on Electrical and Electronics Engineering and X Conference on Electrical Engineering, ICEEE/CIE 2004, Acapulco Guerrero, Mexico, September 8-10, 2004.
Overview of the DLML Protocol
DLML is a library for processing data lists in parallel. Users only need to organise their data into items, which they insert into and get from a list using DLML functions. DLML applications run under the SPMD (Single Program Multiple Data) model: all processors run the same program but operate on distinct data lists. When a list becomes empty, DLML refills it by fetching data items from another list, transparently to the programmer. Only when DLML_get() does not return a data item is the processing in all nodes over. DLML functions hide synchronisation and communication from users, while automatic list refilling tends to balance the workload according to the processing capacity of each processor, which is essential for good performance.
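A usage sketch of this programming model is shown below. DLML_get() is the only function named above; the remaining identifiers (Item, DLML_init, DLML_insert, DLML_end) and their prototypes are assumptions made for illustration, not DLML's actual API.

/* Usage sketch of the DLML programming model (assumed prototypes). */
typedef struct { int value; } Item;

void DLML_init(int *argc, char ***argv);  /* assumed: SPMD start-up on every node          */
void DLML_insert(Item *item);             /* assumed: put an item on the local list        */
int  DLML_get(Item *item);                /* assumed: returns 0 when every list is empty   */
void DLML_end(void);                      /* assumed: global termination                   */

static void process(Item *it) { it->value++; /* application-specific work on one item */ }

int main(int argc, char **argv)
{
    DLML_init(&argc, &argv);

    Item seed = { 0 };
    DLML_insert(&seed);          /* put initial work on the local list */

    Item it;
    /* When the local list empties, DLML refills it transparently from a
     * remote list; the get fails only when no list anywhere has items. */
    while (DLML_get(&it))
        process(&it);            /* processing may insert further items */

    DLML_end();
    return 0;
}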
The response time of a parallel computation corresponds to the time of the slowest processor participating in the computation: all other processors must idly wait for the slowest one. Hence, a key factor for a parallel computation to achieve good performance is that the workload be partitioned among the processors according to their processing capacity. This is not simple when applications generate more work dynamically, or under multiprogramming, which tends to unbalance the workload of the processors participating in a parallel computation.
For cluster computing, we investigated how to dynamically balance the workload according to the processing capacity of the processors, using lists (pools) of work from which processors take work as needed. The result of this work was a middleware for parallel computing called the Data List Management Library (DLML).
The first version of DLML was designed to run on clusters composed of uniprocessor nodes, and was based on multiprocess parallelism and message-passing communication with MPI.
We also developed MultiCore (MC) DLML, a multithreaded design of DLML that better capitalises on the intra-node parallelism of multicore nodes. MC-DLML uses shared memory (the Pthreads library) for intra-node communication between sibling threads running in the same node, and message passing (the MPI library) for inter-node communication between processes in different nodes. MC-DLML was also extended to use GPUs for processing data lists, improving performance significantly.
Our work on DLML was published thus:
Greedily Using GPU Capacity for Data List Processing in Multicore-GPU Platforms. CCE-2013: 10th International Conference on Electrical Engineering, Computing Science and Automatic Control. Mexico City, Mexico, 30 September to 4 October (2013), pp 195-200.
DLML-IO: a library for processing large data volumes. PDPTA-2013: 19th Int’l Conference on Parallel and Distributed Processing Techniques and Applications. Las Vegas, Nevada, 22-25 July, 2013, pp 699-705.
Parallel Data List Processing on Multicore-GPU Platforms. PDPTA-2012: 18th Int’l Conference on Parallel and Distributed Processing Techniques and Applications. Las Vegas, Nevada, USA, 16-19 July, 2012, pp. 324-330.
A Software Architecture for Parallel List Processing on Grids. LNCS (2012) 7203: pp 720–729. PPAM 2011: 9th Int’l Conference on Parallel Processing and Applied Mathematics. September 11-14, 2011, Torun, Poland.
Thread-Locking Work Stealing under Parallel Data List Processing in Multicores. PDCS 2011: 23rd IASTED International Conference on Parallel and Distributed Computing and Systems. Dallas, USA, December 14-16, 2011, pp 190-197.
Reducing Communication Overhead under Parallel List Processing in Multicore Clusters. In Proceedings of CCE 2011: 8th Int’l Conference on Electrical Engineering, Computing Science and Automatic Control. Mérida Yucatán, México, 26-28 October, 2011, pp. 780-785.
Low-synchronisation Work Stealing under Parallel Data-List Processing in Multicores. PDPTA-2011: 17th Int’l Conference on Parallel and Distributed Processing Techniques and Applications. Las Vegas, Nevada, USA, 18-21 July, 2011, pp 850-856.
Segmentation of Brain Image Volumes Using the Data List Management Library. In Proceedings of the 29th Annual International Conference of the IEEE EMBS, pp. 85–88. Lyon, France, August 23-26, 2007.
Simple, List-based Parallel Programming with Transparent Load Balancing. In Proceedings of PPAM 2005 (Sixth International Conference on Parallel Processing and Applied Mathematics), September 11–14, 2005, Poznan, Poland. Lecture Notes in Computer Science 3911 (2006) 920–927.
Easing Message-Passing Parallel Programming Through a Data Balancing Service. In Proceedings of Recent Advances in Parallel Virtual Machine and Message Passing Interface, 11th European PVM/MPI User’s Group Meeting, Budapest, Hungary, September 19 – 22, 2004. Lecture Notes in Computer Science 3241.
Integration of a Load Balancing Mechanism into a Parallel Evolutionary Algorithm. In Proceedings of Advanced Distributed Systems, Third International School and Symposium, ISSADS 2004, January 24-30, Guadalajara, México. Lecture Notes in Computer Science 3061/2004, pp. 219-230.
The feature extraction (FE) process is one of the most compute-intensive tasks of a search engine. Our FE implementation exploits two levels of parallelism: idle nodes in the network, using the open-source Parallel Virtual Machine (PVM) library, and the potential availability of multiprocessor nodes, using multithreaded programming.
Two applications were developed for two different platforms: a network of Transputers programmed in C, and a network of Sun SPARC workstations using PVM. The first application consisted of developing and implementing an image filter to eliminate noise; it used synchronous, deterministic cellular automata. The second application used a parametric family of genetic algorithms to provide optimal solutions for bridge structures in civil engineering. The algorithms were part of a decision-making support system for evaluating multiple configurations in the aesthetic design of bridge structures.
Further publications on parallel programming environments and GPU computing:
VPPE: a novel Visual Parallel Programming Environment. Intl J Parallel Programming (2019) 47: pp 1117-1151.
Relational Learning with GPUs: Accelerating Rule Coverage. Intl J Parallel Programming (2016) 44: pp 663–685.
Processing Markov Logic Networks with GPUs. ILP-2015: 25th Intl. Conf. on Inductive Logic Programming, Kyoto, Japan, August 20-22, 2015. LNCS (2016) 9575: pp 122–136.
A Datalog Engine for GPUs. M. Hanus and R. Rocha (Eds.): Declarative Programming and Knowledge Management, KDPD 2013, LNAI (2014) 8349: pp 152-168.
This program allows several users to hold an audio conversation. A server waits for incoming calls; when a call is received, it sends back a Session Description Protocol (SDP) message. The caller interprets it and establishes communication with the server. Audio packets are transferred using the Real-time Transport Protocol (RTP). The audio from each user is encoded and sent to the server. The server receives the audio from the connected users, mixes it, and sends the resulting stream back to all connected users, so everyone hears what everyone else is saying. This was implemented in C under Linux.
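The core of the server is the mixing step. The sketch below (in C) shows only that step, with the RTP/SDP handling omitted and the function names ours: 16-bit PCM frames from all connected users are summed, with saturation, into one outgoing frame.

/* Mixing sketch only: sums one PCM frame per connected user into a single
 * outgoing frame, clipping to the 16-bit range to avoid overflow. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

void mix_frames(const int16_t *in[], size_t n_users, int16_t *out, size_t frame_len)
{
    for (size_t s = 0; s < frame_len; s++) {
        int32_t acc = 0;
        for (size_t u = 0; u < n_users; u++)
            acc += in[u][s];                     /* sum the sample from each user */
        if (acc > INT16_MAX) acc = INT16_MAX;    /* clip to avoid overflow        */
        if (acc < INT16_MIN) acc = INT16_MIN;
        out[s] = (int16_t)acc;
    }
}

int main(void)
{
    const int16_t u1[4] = { 100, 200, 30000, -30000 };
    const int16_t u2[4] = {  50, -50, 10000, -10000 };
    const int16_t *users[2] = { u1, u2 };
    int16_t mixed[4];

    mix_frames(users, 2, mixed, 4);
    for (int s = 0; s < 4; s++)
        printf("%d ", mixed[s]);  /* prints: 150 150 32767 -32768 */
    printf("\n");
    return 0;
}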
This program allows browsing the file directory of a Windows PC from a Linux PC. It was implemented with TCP/IP sockets in a client/server architecture. The Windows side plays the role of the server and was programmed using Visual C. The Linux side plays the role of the client; it was programmed in C and its GUI was created using the GTK libraries.
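A minimal sketch of the Linux client side is shown below; the server address, port and the "LIST" request format are assumptions for illustration, not the protocol actually used by the program.

/* Minimal TCP client sketch: connect to the server and request a listing. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in srv = { 0 };
    srv.sin_family = AF_INET;
    srv.sin_port   = htons(5000);                       /* assumed server port    */
    inet_pton(AF_INET, "192.168.1.10", &srv.sin_addr);  /* assumed server address */

    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) < 0) {
        perror("connect");
        return 1;
    }

    const char *req = "LIST C:\\\n";                    /* assumed request format */
    write(fd, req, strlen(req));

    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)         /* print the directory listing */
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);
    return 0;
}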