BEACON Researchers at Work: Evolving Robot Behavior

This week’s BEACON Researchers at Work blog post is by MSU graduate student Chad Byers.

I have always been fascinated by the mechanisms that drive both the molecular and digital systems of our world, and through the BEACON Center I have been fortunate to work in an environment that provides the resources to pursue knowledge in both. A cell, the fundamental building block of an organism, performs a host of internal functions essential to the organism’s survival, and in doing so exhibits control properties that we, as computer scientists, wish to incorporate into the design of our digital systems. Properties such as decentralization, resiliency to failure, and cooperation have been notoriously difficult to borrow from nature. One alternative is to capture several of the key components of the biological cell “model” in a digital model and allow these components to mutate freely, thereby letting evolution shape the controller of a system such as a wheeled robot. In this way, evolution naturally selects for these same properties (robustness, resiliency, etc.) because of the selective advantage they provide over competitors in a digital population.

As most biologists would agree, control within a biological cell is both massively distributed and massively parallel, arising out of complex networks of interactions. When we first started down this route toward a control system amenable to the process of evolution, we often found ourselves between a rock and a hard place. Many successful bio-inspired techniques, such as artificial neural networks, preserve the qualities of massive parallelism and distributed control; however, it is difficult to truly understand how an evolved network successfully performs its task. On the opposite end of the spectrum are techniques such as Genetic Programming, whose genomes are sequences of instructions that alter the robot itself (e.g., “move-backward”) or interact with the robot’s environment (e.g., “if-sense-color”). Evolved programs of this type, however, often end up littered with regions that endlessly sense information from the environment, or suffer from bloated instruction sets (e.g., “if-sense-red”, “if-sense-blue”, “if-sense-green”, etc.) that make control difficult.

With these difficulties in mind, we decided to design a digital model that maps components of the biological cell’s control system into the digital realm. Our inspiration was signal transduction in nature, where molecules serve as a universal medium of information: they continually pass into a cell via receptors, carrying information about the cell’s current environment, and are then manipulated, altered, and communicated internally in order to produce the cell’s response. Fortunately, the world of digital systems is not too different; its universal medium is instead sequences of bits (bitstrings) that are altered by a computer’s instructions to produce the system’s response.
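To make the analogy concrete, here is a tiny, purely illustrative sketch (in Python, which is not necessarily the language of our actual system) of a bitstring standing in for a signaling molecule and being altered by a simple internal operation. The names `flip_bit` and `signal` are invented for this example.

```python
# Purely illustrative: a bitstring as the digital analogue of a signaling
# molecule, altered by a simple operation inside the "cell".
# These names are invented for this sketch, not taken from our system.

def flip_bit(bits, index):
    """Return a copy of the bitstring with one bit inverted,
    loosely analogous to an enzyme modifying a molecule."""
    out = list(bits)
    out[index] = 1 - out[index]
    return out

signal = [0, 1, 0, 0, 1, 0, 0, 0]   # information arriving at a receptor
response = flip_bit(signal, 2)      # an internal alteration of that signal
print(response)                     # [0, 1, 1, 0, 1, 0, 0, 0]
```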

To begin with, we decomposed biological signal transduction into a three-step process, (1) Sense, (2) Compute, and (3) Respond, in order to provide control for a simulated wheeled robot. Similar to a cell’s receptors, the robot possesses various sensors for detecting stimuli such as obstacles, colors, and sounds. In the first step, Sense, the robot uses its sensors to detect nearby stimuli and sets the corresponding bit within a bitstring to True, signifying the presence of that stimulus at one of the sensors. Once all of the stimuli in the robot’s environment have been mapped to their corresponding bit(s), the second step, Compute, takes place. During this step, evolved computer programs called digital enzymes execute in parallel with one another, reading bitstrings from the environment, altering their stored bits, and using these altered bitstrings to guide the simulated robot’s behaviors. To relieve the human design bias discussed earlier, we allow both the number of unique programs and the instructions contained within each program to mutate and evolve freely. Finally, in the last step, Respond, we observe the bitstrings that were sent to guide the robot’s behaviors and determine which actions were “voted” upon by the digital enzymes during the Compute step. The end result is a majority-bit vote for how the robot should turn, move, emit color, and emit sound. This design relieves the bloated-instruction-set problem by mapping the sounds, colors, and actions of both the system and its environment into bitstrings that can be exchanged and interpreted throughout the system.
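The following is a minimal sketch of how this three-step loop could look. The sensor names, the two hand-written “enzyme” rules, and the action set are all hypothetical stand-ins; in our system the digital enzymes are evolved rather than written by hand.

```python
# Hypothetical sketch of the Sense -> Compute -> Respond loop described above.
# Real digital enzymes are evolved programs, not hand-written rules like these.

# --- Sense: map detected stimuli onto bits of a bitstring -------------------
STIMULI = ["obstacle", "color_red", "color_blue", "sound"]

def sense(detected):
    """Set the bit for each stimulus the robot currently detects."""
    return [1 if s in detected else 0 for s in STIMULI]

# --- Compute: "digital enzymes" read bitstrings and emit vote bits ----------
def enzyme_avoid(bits):
    # Example rule: if the obstacle bit is set, vote to turn.
    return {"turn": bits[0], "forward": 1 - bits[0]}

def enzyme_seek_food(bits):
    # Example rule: if blue (food) is sensed, vote to move forward.
    return {"turn": 0, "forward": bits[2]}

ENZYMES = [enzyme_avoid, enzyme_seek_food]

# --- Respond: majority-bit vote over the enzymes' outputs -------------------
def respond(votes):
    actions = {}
    for action in votes[0]:
        total = sum(v[action] for v in votes)
        actions[action] = total > len(votes) / 2   # strict majority wins
    return actions

bits = sense({"color_blue"})
votes = [enzyme(bits) for enzyme in ENZYMES]
print(respond(votes))   # {'turn': False, 'forward': True}
```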

In our preliminary work, we wanted to test this proof-of-concept design on a target problem that would allow properties such as robustness and cooperation to be selected for in evolved, simulated robot controllers. One critical task that nearly every organism on our planet faces at some point is foraging for food to sustain life. Each robot controller in the population starts as a blank slate, containing only a single copy of an empty program. To evaluate each controller, we created a clonal colony of six robots running that controller and charged them with finding and returning 8 food items to a central home region in an unbounded world, as quickly as possible. After 1000 generations of evolution, mutation and natural selection had built many surprising strategies, using both sound and color, to forage successfully in these digital environments.
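As a rough illustration of this evaluation scheme, the sketch below scores a single controller under the setup described above (a clonal colony of six robots, eight food items, a reward for finishing quickly). The simulation step is a random placeholder rather than a real wheeled-robot simulation, and every name and constant here is invented for the example.

```python
# Hedged sketch of scoring one controller: a clonal colony forages for food
# and is rewarded for returning all items quickly. The per-step behavior is a
# random placeholder; the real system simulates wheeled robots instead.

import random

COLONY_SIZE = 6     # clones sharing the same evolved controller
FOOD_ITEMS = 8      # items to return to the home region
MAX_STEPS = 10_000  # evaluation time limit

def evaluate(controller, seed=0):
    """Return a fitness score: higher is better (all food returned sooner)."""
    rng = random.Random(seed)
    returned = 0
    for step in range(MAX_STEPS):
        for _robot in range(COLONY_SIZE):
            # Placeholder for one robot running its Sense -> Compute -> Respond
            # loop with `controller`; here a small random chance stands in for
            # actually finding and returning a food item.
            if returned < FOOD_ITEMS and rng.random() < 0.0005:
                returned += 1
        if returned == FOOD_ITEMS:
            # All food returned: add a bonus for finishing early.
            return FOOD_ITEMS + (MAX_STEPS - step) / MAX_STEPS
    return returned  # partial credit if time runs out

print(evaluate(controller=None))
```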

One of these evolved strategies is shown in the video below, where the robots (arrows) make circular movements near home, mimicking home’s color (red) and effectively broadening the sense of home in the environment. As they search around home and discover an item of food (blue), they immediately switch their behavior to act as locator beacons, blinking their lights to notify the others. Over time, this strategy allows the colony to successfully find the food items in its environment and return them home.

As we move forward with this research, we are interested in which aspects of our digital model are most important for driving properties such as distributed control, interaction, memory, and consensus, and in how these properties are influenced by the environment.

For more information about Chad’s work, you can contact him at byerscha at msu dot edu.

