There have been more than four decades of computational systems inspired by natural evolution. It has become a major field of machine learning and optimization. Beyond AI, it has been used in hardware and circuit design, robotics, and more recently in industrial design and architecture. It has of course also been deeply explored in art and music.
The theory of natural evolution combines population, diversity, heredity and selection. Evolution requires a population of individuals that exhibit diversity (both similarities and variations between each other, both within and between species). These individuals can produce new individuals; offspring that exhibit similarites with the parent(s) through heredity. However not all of the population can successfully reproduce. Any factor that affects the possibility of an individual reproducing, thus also affects what characteristics are inherited in the population as a whole. Charles Darwin's theory of natural selection, proposed in 1859, is that the section of the population that can reproduce is not entirely random, but rather is regulated by interactions between inherited characteristics and environmental constraints (such as available food, populations of symbionts, predators and parasites, and so on). Accordingly, the characteristics of a species may change over time (evolution), forming a history that can be investigated through the fossil records.
Artificial evolution is a form of computational simulation whose process mirrors the abstract structure of natural evolution:
The main systematic differences are that the underlying mechanisms specified by us in advance, as are the initial populations and environmental conditions (if any). Most importantly, the mechanism of selection is usually predetermined.
This underscores a fundamental difference between artificial and natural evolution: most artificial systems are used toward a particular goal (teleological, even in artistic pursuits), natural evolution is adaptive in an open-ended, undirected, goal-less (non-teleological). In contrast, natural evolution does not imply progress, since the environmental conditions (and selective criteria) evolve along with each species. Since there is no absolute goal or progress, the most we can measure in natural evolution is the changing frequencies over time of individual species, or of individual characteristics of a population.
* A debate has arisen in recent years regarding evidence (or the lack thereof) for transcendental, teleological aspects of natural evolution; partly driven by conflicts between religous edicts and evolutionary science. Regardless of personal beliefs, in the context of this course we should approach the subject of evolution with a philosophical attitude that avoids essentialist thought.
The survival of a natural species depends on its viability; the ability of enough individuals to live long enough to reproduce within an unpredictable environment. There is no pre-defined (a priori) fitness measure. Most artificial evolutionary systems however impose a fitness measure that is extremely unnatural, gearing evolution toward a desired, fixed metric (some critics compare it more to selective breeding). Such artificial evolution will not match the creative diversity of natural evolution.
Nevertheless, an artificial, viability-oriented form of evolution may be used for more theoretical and aesthetic branches of artificial life research. In these cases the viability measure arises as an emergent property of underlying laws of the world, such as the requirement to maintain energetic/metabolic balance or to maintain structural integrity, as well as the collective effects of multiple species and non-living dynamics within the environment. For this reason a viability-oriented approach is sometimes referred to as ecosystemic selection. See discussion here
Darwin's theory is sometimes misrepresented as "survival of the fittest" or even the competitive "law of the jungle". We have already seen that the notion of "fittest" is misguided, since it implies a static absolute measure for something that is both dynamic and highly contextual. An individual or species does not need to be the fittest, merely fit enough to be viable. Nor is competition the prime mode of interaction between species; most species are relatively independent, and the ones that do closely interact are more likely to be collaborative (symbiotic, parasitic, etc.) than competitive.
Evolution does not imply that individuals display selfish, competitive behavior. When Dawkins described evolution in terms of selfish genes, it indicates a gene-centric perspective on evolution that implies selfless and sometimes altruistic behavior in organisms.
There is no reason to suppose that every variation affects viability (neither positively nor negatively). If a variation occurs that does not negatively or positively affect the reproductive capability of an individual in the environment, this variation is called neutral. Such neutral variations can tend to be accumulated over time (since there is a chance of variation at each reproduction), whose overall effect is to spread the gene pool of a population. If small changes can be accumulated in this way, over time the gene pool may even move quite far from the origin without major changes in selective fitness; this is called neutral drift.
It may be an important mechanism to escape evolutionary dead-ends (local minima in the fitness landscape). This is certainly true for many artificial evolutionary systems. It has also been hypothesized as an explanation for the long chunks of apparently unused DNA in our own genome.
Furthermore, natural evolution apperas to progress not in a smooth movement, but rather with periods of intense diversity and instability followed by extended periods of relative stability (a "punctuated equilibrium"). Neutral drift has also been proposed as a possible explanation for this behavior.
A continuous generation of novel diversity, new characteristics, is essential to the theory of natural selection. However the theory does not account for how diversity arises, simply that there must be a mechanism, which usually operates during reproduction.
In 1865 Mendel proposed that characteristics are transmitted to offspring through particles of matter (which we now call genetic material). Schroedinger conjectured that these materials must be aperiodic crystals, and the actual structure of DNA was identified several years later. The "modern synthesis" in biology today has integrated genetics with natural evolution, through the interaction of genotypes and phenotypes:
Hence the modern synthesis requires not only a model for how variation is introduced, but also how genetic material is transfered, how the phenotype accordingly emerges from the genotype (developmental models), and what other roles it plays. These mechanisms are complex and not yet fully understood, but much progress has been made.
Briefly: a biological cell contains a vast array of different proteins, whose concentrations determine structures and behaviors of the cell. The proteins are specifed by information in the DNA genetic material (grouped physically into chromosomes). When a cell reproduces by mitosis, a copy of the DNA is made in the new cell. The sections of a DNA chromosome that code for behavior are called genes. These regions are constantly being transcribed, producing a specific RNA strand for each coding gene region which is in turn used to produce a specific protein; the protein string immediately folds up (in a way we cannot yet simulate) into a particular reactive shape which specifies the protein's behavioral role in the cell. This is a one-directional flow of information: Coding DNA -> RNA -> folding -> active protein. In addition to coding regions genes may also have regulatory region which can react with specific proteins to activate or inhibit the coding-protein system, forming a complex regulatory network of interactions by which one gene can activate or inhibit another, and also specify changes of behavior of a cell according to environmental conditions such as chemical signals. These networks can be fantastically complex even in very simple organisms, according to the scientific results of functional genomics. Between the coding and regulatory regions of DNA, there are huge sections of nongenic DNA, whose role (or lack thereof) is not yet understood.
The current theory of cell replication and DNA transcription been beautifully illustrated by Drew Berry; and more of his animations here
Genetic variation can occur during replication of the genome, such as copying-error mutations (reversals of segments, insertion & removal of segments, changing individual elements in the sequence, and pair-wise substitution over whole sections) and recombination (taking sections from two different parent genes to construct a new child gene).
An artificial evolutionary system thus requires:
The system is then run by these steps:
Steps 2-5 may be run in lock-step, or asynchronously with overlapping individual life-spans.
Artificial evolution generates results with randomness, without formal proofs. It may take a long time or a lot of processing power to find a satisfactory result, or may not reach a result at all.
The representation of the genotype, and mechanisms of development, genetic transfer and variation, must be provided by the author. Many systems represent genetic information as a sequence of data, such as a string of characters or binary digits. Some systems use more elaborate structures (trees, networks) that are reducible to sequences. In artificial evolution an obvious analogy of the genotype-as-code and phenotype-as-running-program underlies most systems. Fewer systems provide full models of development and genetic transfoer, assuming instead a relatively predictable translation. Some systems explicity encode numeric values in the genotype (this is not naturalistic).
The mechanisms of variation possible partly depend on the representation chosen. The two most common principles of variation in artificial evolution are naturally inspired:
Why use reproduction for evolution? In the face of an unpredictable environment, we cannot know which strategy will be best; we can try small variations, and hedge our bets by making very many of them (population diversity). An individual loss is not catastrophic, but a few successes can be learned from. Furthermore, the face of unpredictibility implies that what was true today may not be tomorrow, so the flexibility to avoid timeless commitment is also a good strategy; but the inheritance of choices is a useful option when the environment retains some stability. If the world were fully predictable, a rational, teleological, monothematic strategy would be preferable. But the world isn't totally random either (if it was, there would be no valid strategy worth pursuing.)
See the red queen problem.
One of the central problems in evolutionary art and music is how to implement selection.
Some projects have also proposed a form of artificial co-evolution, where one population represents the candidate products, and the other population represents artificial critics.
Genetic Programming was invented by Nigel Cramer in 1985, but greatly expanded through the work of John Koza. GP evolves programs; it is an example of metaprogramming.
GP has been used to generate programs to solve hard problems, and to evolve control systems for artificial agents and robots. The central concept is that the generating a phenotype is a process of generating code. Populations of generated programs can then be selected and evolved as usual.
Typically the programs for GP follow a tree-like structure. The leaves of the tree are terminals and the branches are functions. Terminals have no inputs; typical terminals are constant numbers and global variable names. Non-terminal functions are specified according to their operator (such as mathematical addition, multiplication, cosine, etc.); they have one or more inputs, which maybe terminals or other functions. This structure is natural to LISP programs:
(* 6 (sin (+ x 2)))
The above would be represented in Lua code as follows:
return 6 * (math.sin(x + 2))
The Linear Genetic Programming variant which represents programs as sequences of instructions rather than trees. It more closely resembles the procedural nature of widely-used programming languages, virtual machines, assembly and machine code. Each instruction uses a single function, with zero or more arguments (constants or registers), and assigns to a register.
The above program could be linearized to normal form in Lua as follows:
local r1 = 2
local r2 = x
local r3 = r1 + r2
local r4 = math.sin(r3)
local r5 = 6
local r6 = r5 * r4
return r6
The linear structure may appear more flexible, since it allows branches to reconnect; however the functional programming interpretation of the tree is provably equivalent. Ultimately the choice depends on practical rather than theoretical questions.
It is convenient to represent these programs as a data structure, from which the phenotype code is generated. Doing so makes crossover and mutation much easier.
For example, integers can be used to specify which function or terminal type a node contains, and which nodes are used as arguments. Tree structures can nest their data directly, while linear structures can refer to nodes by integer register id (or register stack offset).
For example, here is the above program as a list of instruction codes:
-- operator IDs: { [1] = "constant", [2] = "global", [3] = "add", [4] = "mul", [5] = "sin" }
-- global IDs: { [1] = "x", }
local geno = {
{ 1, 2 },
{ 2, 1 },
{ 3, 1, 2 },
{ 5, 3 },
{ 1, 6 },
{ 4, 5, 4 },
}
Generating a seed tree can follow a recursive structure, starting from the root at depth 0, up to a maximum depth m
:
m
choose a terminal at random,An alternative algorithm for linear GP, with n
operations:
n
:n
is less than the maximum number of arguments (typically 2 or 3), create a random terminal node.n
. For a linearized genotype representation, the genetic mutations and crossover operators are similar to other sequence-based evolutionary systems.
For tree-like representations it becomes more complex and interesting: Mutations on a tree can include modifcation of instruction arguments, replacement of sub-trees, function mutation, etc. Cross-over can be implemented as swapping sub-trees of parents.
Karl Sims used GP for his genetic images, and for his evolving virtual creaturs.
Jurgen Schmidhuber proposed using GP to evolve GP (Meta-GP), since things like chromosomes, crossover etc. are themselves phenomena that have evolved.
A field guide to GP An overview paper Very short tutorial
See Karl Sims' Genetic Images. 1991 Siggraph Paper
Scott Draves, “Evolution and Collective Intelligence of the Electric Sheep,” The Art of Artificial Evolution, 2008.
An excellent discussion of the genetic algorithm in art and its relation to Deleuze, by Manuel Delanda