Use of Self-Healing Techniques to Improve the Reliability of a Dynamic and Geo-Distributed Ad Delivery Service
2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW) (2018)
Memphis, TN, USA
Oct 15, 2018 to Oct 18, 2018
The advertising industry faces numerous challenges in achieving its goal of targeting a given audience dynamically and accurately in order to deliver a meaningful brand message. Near-real-time, low-latency delivery of dynamic content, the sheer volume of information processed, and the sparse geographic distribution of the intended eyeball traffic all drive the complexity of building a successful experience for the end user and the brand. Additionally, the competitiveness of the industry makes it critical to keep operational expenses low while delivering reliably at scale. In attempting to address these challenges, we have found that a distributed infrastructure that leverages public cloud providers and a private cloud with open infrastructure technologies can deliver dynamic advertising content with low latency while preserving high availability. However, network and physical utility infrastructures cannot be relied on to ensure service dependability. We show that the complexity of the networks, the sparse geographic distribution of eyeballs, the risk of data center failures, and the increase in encrypted transactions call for thoughtful architectures. The introduction of modern practices, failure injection, and self-healing mechanisms allowed us to improve the fault tolerance of the service while optimizing for latency, and to significantly improve its reliability.
advertising data processing, cloud computing, computer centres, Internet, software fault tolerance
N. Brousse and O. Mykhailov, "Use of Self-Healing Techniques to Improve the Reliability of a Dynamic and Geo-Distributed Ad Delivery Service," 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Memphis, TN, USA, 2018, pp. 1-5.