This week’s BEACON Researchers at Work blog post is by UT Austin graduate student Jacob Schrum.
I often find myself running wildly through the darkened corridors of some decommissioned mining facility, rocket launcher in hand, leaping madly about the hostile arena while trying to dodge bolts of lightning and hot shards of shrapnel directed at me by my enemies. As these and other projectiles bite into my skin, I feel myself weakening, and decide to flee combat in search of medical aid or a shield to protect me. Once rejuvenated, I throw myself back into the fray, but must first seek out my opponents. I take advantage of the high ground to launch a surprise attack on several skirmishers, leaving only one with whom I have a protracted battle. We’re evenly matched, but luck is not on my side, so I eventually succumb. I find myself instantly resurrected and flung back into combat to repeat the process.
This is what it feels like to play the First-Person Shooter action game Unreal Tournament: at any given moment, some high-level strategy needs to be chosen (fight, seek aid, search for opponents) before the low-level details of that strategy can be carried out (shoot, move, jump, look). My name is Jacob Schrum, and along with Risto Miikkulainen at the University of Texas at Austin, I research methods for evolving such strategic, multimodal behavior in video games like Unreal Tournament. Specifically, I evolve neural networks, which are simple models of the brain that serve as universal function approximators, and use them as control policies for agents in video games. The networks I evolve consist of artificial neurons that can be linked together in arbitrarily complex topologies by synaptic connections of varying strengths. Knowledge of how to behave is stored in the structure of the network and the strengths of its connections. Such networks start out very simple, but gradually complexify over the course of evolution; hence this method is known as Constructive Neuroevolution. My research focuses on improving Constructive Neuroevolution methods so that they can automatically learn multiple modes of behavior in complex video games. Video games are ideal environments in which to design and test evolutionary methods, since they still contain much of the complexity of real-world environments, yet are controlled in a way that the real world is not. Such environments can serve as stepping stones to real-world applications in robotics. However, existing learning methods in both video games and robotics tend to require humans to specify some sort of task decomposition in order for them to have a chance at solving complex problems requiring multiple modes of behavior.
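To give a concrete feel for how a network can start simple and complexify, here is a rough sketch of the idea in Python. This is an illustrative toy, not the actual research code: the mutation rates, the tanh activation, and the class structure are all my own assumptions. The key point is that evolution operates on both connection weights and topology, so a network can begin as direct input-to-output links and gradually grow hidden structure.

```python
import math
import random

class Network:
    """Toy feed-forward network that starts minimal and complexifies.

    Hypothetical sketch of constructive neuroevolution: mutations
    perturb weights, add links, or split a link to grow a hidden node.
    No gradient descent is involved; only evolution changes the network.
    """

    def __init__(self, num_inputs, num_outputs):
        self.num_inputs = num_inputs
        self.next_id = num_inputs + num_outputs
        self.outputs = list(range(num_inputs, num_inputs + num_outputs))
        # Evaluation order of the non-input nodes (kept feed-forward).
        self.order = list(self.outputs)
        # Start with direct input->output links only: a minimal topology.
        self.links = {(i, o): random.uniform(-1, 1)
                      for i in range(num_inputs) for o in self.outputs}

    def activate(self, inputs):
        values = dict(enumerate(inputs))
        for node in self.order:
            total = sum(values.get(src, 0.0) * w
                        for (src, dst), w in self.links.items() if dst == node)
            values[node] = math.tanh(total)
        return [values[o] for o in self.outputs]

    def mutate(self):
        r = random.random()
        if r < 0.6:
            # Perturb an existing weight (the most common mutation).
            key = random.choice(list(self.links))
            self.links[key] += random.gauss(0.0, 0.5)
        elif r < 0.9:
            # Add a new link from an input to some later node.
            src = random.randrange(self.num_inputs)
            dst = random.choice(self.order)
            self.links.setdefault((src, dst), random.uniform(-1, 1))
        else:
            # Structural mutation: split a link to add a hidden node,
            # so the topology complexifies over generations.
            (src, dst), w = random.choice(list(self.links.items()))
            hidden = self.next_id
            self.next_id += 1
            self.order.insert(self.order.index(dst), hidden)
            del self.links[(src, dst)]
            self.links[(src, hidden)] = 1.0
            self.links[(hidden, dst)] = w
```

In an evolutionary run, a population of such networks would be evaluated as game-playing agents, with the fittest individuals copied and mutated to form the next generation.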
An agent in Unreal Tournament would generally need separate modules for combat, path-finding, and retreating. These modules would themselves be made up of smaller behaviors, such as an action for approaching items, which could be used to approach health items while retreating, or to approach weapons and ammo while exploring the level with the path-finding module. Some modules, like the combat module, could even be broken down into further modules, such as a module to use when sniping opponents from a distance and a separate module for attacking opponents with rapid-fire weapons. This tangle of modules grows complicated very quickly, and is therefore hard to construct manually. Learning how to break up a task into multiple subtasks automatically would spare humans the hassle of designing the hierarchy manually, and could also result in unexpected ways of breaking up the task that are actually more effective than what a human would do.
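To see why such a hand-designed hierarchy tangles quickly, here is a caricature of one in Python. Every module name, threshold, and rule below is an invented illustration, not an actual Unreal Tournament bot; the sketch just shows how a top-level arbiter dispatches to modules, how modules share sub-behaviors, and how modules split into sub-modules.

```python
# Hypothetical hand-designed task decomposition for a shooter bot.
# All names and thresholds are illustrative assumptions.

def approach(agent, item):
    """Shared low-level behavior: move toward any item."""
    return ("move_toward", item)

def combat(agent):
    # The combat module itself splits into sub-modules by weapon type.
    if agent["weapon"] == "sniper" and agent["enemy_distance"] > 30:
        return ("snipe", agent["enemy"])
    return ("rapid_fire", agent["enemy"])

def retreat(agent):
    # Reuses the shared approach behavior, here for health items.
    return approach(agent, "health_pack")

def path_find(agent):
    # Reuses approach again, this time for weapons and ammo.
    return approach(agent, "nearest_weapon")

def act(agent):
    """Top-level arbitration: a human-chosen rule picks the module."""
    if agent["health"] < 30:
        return retreat(agent)
    if agent["enemy"] is not None:
        return combat(agent)
    return path_find(agent)
```

Every branch in this dispatcher encodes a human decision about how to divide the task; the goal of my research is to have evolution discover such divisions on its own.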
My approach to evolving multimodal behavior involves allowing neural networks to possess multiple output modes, ideally such that one network mode corresponds to each mode of behavior required in the target task. Such modes can be manually assigned to each task, but as mentioned above, manual assignment requires lots of human engineering, and can often divide the domain in a manner that does not represent the most effective task assignment for learning. Therefore, networks are allowed to evolve new output modes as needed, and also have control over which mode to use at any given time, thus producing multimodal behavior without any knowledge about how to break up the domain into component tasks. Furthermore, because these “multimodal” networks share common sub-structures, information that is needed in multiple tasks can be shared across modes, which in turn accelerates learning. Such methods will lead to complex multimodal behavior in various domains: from classic games such as Ms. Pac-Man, all the way to complex modern games like Unreal Tournament.
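One simple way to picture a network with multiple output modes is sketched below. In this toy version (the representation and all parameters are my own assumptions for illustration), each mode has its own block of policy outputs plus one extra arbitration output; on each time step, the mode whose arbitration output is highest is the one whose policy outputs actually control the agent, and a structural mutation can grow a brand-new mode.

```python
import math
import random

class MultimodalNetwork:
    """Toy network with multiple output modes and evolved arbitration.

    Hypothetical sketch: each mode is a block of policy outputs plus
    one arbitration output; the mode with the highest arbitration
    value controls the agent on each step.  A mutation can add modes.
    """

    def __init__(self, num_inputs, outputs_per_mode, num_modes=1):
        self.num_inputs = num_inputs
        self.outputs_per_mode = outputs_per_mode
        self.num_modes = num_modes
        self.modes = [self._random_mode() for _ in range(num_modes)]

    def _random_mode(self):
        # One weight row per policy output, plus one arbitration row.
        rows = self.outputs_per_mode + 1
        return [[random.uniform(-1, 1) for _ in range(self.num_inputs)]
                for _ in range(rows)]

    def activate(self, inputs):
        best_mode, best_pref, best_policy = 0, -math.inf, None
        for m, weights in enumerate(self.modes):
            out = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
                   for row in weights]
            policy, pref = out[:-1], out[-1]
            if pref > best_pref:
                best_mode, best_pref, best_policy = m, pref, policy
        # The winning mode's policy outputs drive the agent this step.
        return best_mode, best_policy

    def mutate_add_mode(self):
        """Structural mutation: the network grows a new output mode."""
        self.modes.append(self._random_mode())
        self.num_modes += 1
```

Because the network itself decides which mode fires, evolution can discover a task division that no human specified, and because all modes read the same inputs, structure useful to several tasks can be shared rather than relearned.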
To learn more about Jacob’s work, you can contact him at schrum2 at cs dot utexas dot edu.