This week’s BEACON Researchers at Work post is by MSU graduate student Chad Byers.
Perhaps it is because 91% of US-based data center professionals answered “Yes” when a recent survey asked whether their company had experienced an unplanned data outage in the past 24 months. Or perhaps it is because the average outage cost roughly $7,900 per minute … and lasted for over 119 minutes. Either way, you have just been charged with designing software that can respond to unanticipated conditions and changing user objectives, online at run time, and as quickly as possible, because, as they say, “time is [truly] money.”
The vision of software systems self-adapting and self-reconfiguring in response to adverse conditions has made steady progress from the realm of wishful thinking to realization in modern-day society as computing continues to become more pervasive. Smart energy grids, telecommunication systems, smart traffic systems, and similar emerging applications necessitate dynamically adaptive systems (DASs) to cope with the various forms of uncertainty they bring with them.
The question is: how does one build these self-* characteristics (self-adaptive, self-managing, self-healing, self-optimizing, etc.) into software systems? One common approach, albeit not necessarily a good one, is to try to consider every possible scenario the system might encounter and design a set of strategies to address them. However, this prescriptive approach is often not responsive enough (e.g., 119+ minutes), and there will likely be scenarios that “slip through the cracks” of the developer’s mind. An alternative is to take a cue from a natural process that has been producing solutions well-adapted to their environments for billions of years: evolution. Rather than preloading our software system with only a static set of reconfigurations that it can switch between, why not embed the process that is capable of generating new reconfigurations? This is precisely the research I have been focusing on here at MSU with Dr. Betty Cheng, namely, mitigating uncertainty by harnessing evolutionary search within DASs.
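To illustrate the contrast, here is a minimal sketch, with entirely hypothetical condition names and handlers: a prescriptive system can only look up responses it was shipped with, while a generative one delegates novel conditions to an embedded search process (sketched concretely after the next paragraph).

```python
# Prescriptive: a fixed, hand-authored table of reconfigurations.
# Conditions absent from the table slip through the cracks.
PRELOADED_PLANS = {
    "failed_link": "activate backup link",
    "site_outage": "promote secondary mirror",
}

def respond_prescriptive(condition):
    # Returns None for anything the developers did not anticipate.
    return PRELOADED_PLANS.get(condition)

def respond_generative(monitoring_data, search):
    # Delegates to an embedded search process (e.g., a genetic algorithm)
    # that can generate a fresh configuration for a novel condition.
    return search(monitoring_data)

print(respond_prescriptive("failed_link"))        # handled
print(respond_prescriptive("regional_blackout"))  # None: slipped through
```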
Recently, we have been investigating the role that genetic algorithms play in coping with uncertainty in an industrial DAS application for remote data mirroring. Remote data mirroring is a safeguard that many businesses use to protect critical data by storing remote copies at one or more secondary sites (mirrors) across a network. In the case of a site outage or a failed network link, the system must quickly reconfigure so that data can continue to be accessed while minimizing revenue loss. However, there are many competing trade-offs to consider among candidate solutions, such as their (1) cost, (2) performance in effectively distributing data, and (3) reliability in the face of new adverse conditions. Producing these solutions by hand is too costly and time-consuming, and easily prone to human error. Instead, we represent the free variables of the network in a digital encoding (“DNA”) and allow the evolutionary processes of crossover and mutation to produce new network configurations. Using current monitoring information about the network, the genetic algorithm aims to return solutions that match the user’s desired network qualities. As new environmental conditions arise or the company’s needs change, this process is repeated and has been demonstrated to return successful solutions within minutes. Our future collaborative work with Dr. Kalyanmoy Deb aims to explore how to further mitigate uncertainty by generating a *diverse* Pareto-optimal suite of solutions.
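To make the idea concrete, here is a minimal sketch of how such a genetic algorithm might look, assuming a bit-vector encoding over candidate links between mirror sites and a toy weighted fitness over cost and reliability. The encoding, weights, and link attributes here are illustrative placeholders of my own, not the actual system’s.

```python
import random

# Hypothetical setup: NUM_SITES mirror sites and a candidate link between
# every pair. A genome is a bit vector over candidate links (1 = active).
NUM_SITES = 8
LINKS = [(i, j) for i in range(NUM_SITES) for j in range(i + 1, NUM_SITES)]

# Illustrative per-link attributes; in a deployed system these would come
# from live monitoring data rather than a random seed.
random.seed(42)
LINK_COST = {link: random.uniform(1.0, 10.0) for link in LINKS}
LINK_UPTIME = {link: random.uniform(0.90, 0.999) for link in LINKS}

def connected(genome):
    """Every mirror must be reachable from the primary site (site 0)."""
    active = [link for link, bit in zip(LINKS, genome) if bit]
    seen, frontier = {0}, [0]
    while frontier:
        node = frontier.pop()
        for a, b in active:
            if a == node and b not in seen:
                seen.add(b)
                frontier.append(b)
            elif b == node and a not in seen:
                seen.add(a)
                frontier.append(a)
    return len(seen) == NUM_SITES

def fitness(genome):
    """Toy stand-in for the user's desired network qualities: reward
    reliable, connected topologies; penalize cost (weights are placeholders)."""
    active = [link for link, bit in zip(LINKS, genome) if bit]
    if not active or not connected(genome):
        return 0.0
    cost = sum(LINK_COST[link] for link in active)
    avg_uptime = sum(LINK_UPTIME[link] for link in active) / len(active)
    return 100.0 * avg_uptime - cost

def evolve(pop_size=100, generations=50, p_mut=0.02):
    """Standard generational GA: tournament selection, one-point crossover,
    bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in LINKS] for _ in range(pop_size)]
    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1 = max(random.sample(pop, 3), key=fitness)   # tournament
            p2 = max(random.sample(pop, 3), key=fitness)
            cut = random.randrange(1, len(LINKS))          # crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()
print("active links:", sum(best), "fitness:", round(fitness(best), 2))
```

In the deployed setting, the link attributes would be fed by current monitoring information, the fitness weights would reflect the user’s priorities among cost, performance, and reliability, and the loop would simply be re-run whenever conditions or objectives change.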
So back to the initial question: how do you design a system that can dynamically respond to changing conditions and objectives as quickly as possible at run time? Evolve it!
For more information about Chad’s work, you can contact him at byerscha at msu dot edu.