Scaling and Self-repair of Linux Based Services Using a Novel Distributed Computing Model Exploiting Parallelism
2011 IEEE 20th International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (2011)
June 27, 2011 to June 29, 2011
This paper describes a prototype that implements a high degree of fault tolerance, reliability and resilience in distributed software systems. The prototype incorporates fault, configuration, accounting, performance and security (FCAPS) management using a signaling network overlay and allows dynamic control of a set of nodes, called Distributed Intelligent Managed Elements (DIMEs), in a network. Each DIME is a computing entity (currently implemented on Linux, with a port to Windows planned) endowed with self-management and signaling capabilities that let it collaborate with other DIMEs in the network. The prototype incorporates a new computing model proposed by Mikkilineni in 2010, in which a signaling network is overlaid on the computing network to allow parallelism in resource monitoring, analysis and reconfiguration. A workflow is implemented as a set of tasks organized in a directed acyclic graph (DAG) and executed by a managed network of DIMEs. Distributed DIME networks provide a network computing model for creating distributed computing clouds and executing distributed managed workflows with a high degree of agility, availability, reliability, performance and security.
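The idea of a workflow as a DAG of tasks run by a pool of workers can be illustrated with a minimal Python sketch. It is not the DIME prototype: worker threads stand in for DIMEs, there is no signaling overlay or FCAPS management, and the names `run_workflow`, `tasks` and `deps` are hypothetical, chosen only for this illustration. It assumes Python 3.9 or later for `graphlib`.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
from graphlib import TopologicalSorter


def run_workflow(tasks, deps, max_workers=4):
    """Run a DAG of tasks, starting each task once its predecessors finish.

    tasks: maps a task name to a callable that takes a dict of
           predecessor results (hypothetical interface for this sketch).
    deps:  maps a task name to the set of task names it depends on.
    Returns (results, completion_order).
    """
    sorter = TopologicalSorter({name: deps.get(name, ()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    results, order, pending = {}, [], {}
    # Worker threads stand in for the managed nodes that execute tasks.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose predecessors have all completed.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, ())}
                pending[pool.submit(tasks[name], inputs)] = name
            # Block until at least one running task finishes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                results[name] = future.result()
                order.append(name)
                sorter.done(name)  # unlocks the successors of this task
    return results, order


if __name__ == "__main__":
    tasks = {
        "fetch": lambda _: [3, 1, 2],
        "sort": lambda r: sorted(r["fetch"]),
        "count": lambda r: len(r["fetch"]),
        "report": lambda r: (r["sort"], r["count"]),
    }
    deps = {"sort": {"fetch"}, "count": {"fetch"}, "report": {"sort", "count"}}
    results, order = run_workflow(tasks, deps)
    print(results["report"], order)
```

Here `sort` and `count` have no dependency on each other, so they may run in parallel once `fetch` completes, while `report` waits for both. A faithful model of the paper's approach would additionally need a separate management channel for monitoring and reconfiguring the workers, which this sketch omits.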
Communication System Signaling, Distributed Computing, Distributed Software Systems, Workflow Automation, Network Centric Computing Model
G. Morana and R. Mikkilineni, "Scaling and Self-repair of Linux Based Services Using a Novel Distributed Computing Model Exploiting Parallelism," 2011 IEEE 20th International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), Paris, France, 2011, pp. 98-103.