a_field_guide_to_genetic_programming.pdf

(3769 KB) Pobierz
A Field Guide to
Genetic Programming
Riccardo Poli
Department of Computing and Electronic Systems
University of Essex – UK
rpoli@essex.ac.uk
William B. Langdon
Departments of Biological and Mathematical Sciences
University of Essex – UK
wlangdon@essex.ac.uk
Nicholas F. McPhee
Division of Science and Mathematics
University of Minnesota, Morris – USA
mcphee@morris.umn.edu
with contributions by
John R. Koza
Stanford University – USA
john@johnkoza.com
March 2008
c Riccardo Poli, William B. Langdon, and Nicholas F. McPhee, 2008
This work is licensed under the
Creative Commons Attribution-
Noncommercial-No Derivative Works 2.0 UK: England & Wales License
(see
http://creativecommons.org/licenses/by-nc-nd/2.0/uk/).
That
is:
You are free:
to copy, distribute, display, and perform the work
Under the following conditions:
Attribution.
You must give the original authors credit.
Non-Commercial.
You may not use this work for commercial
purposes.
No Derivative Works.
You may not alter, transform, or build
upon this work.
For any reuse or distribution, you must make clear to others the licence
terms of this work. Any of these conditions can be waived if you get
permission from the copyright holders. Nothing in this license impairs
or restricts the authors’ rights.
Non-commercial uses are thus permitted without any further authorisation
from the copyright owners. The book may be freely downloaded in electronic
form at
http://www.gp-field-guide.org.uk.
Printed copies can also
be purchased inexpensively from
http://lulu.com.
For more information
about Creative Commons licenses, go to
http://creativecommons.org
or send a letter to Creative Commons, 171 Second Street, Suite 300, San
Francisco, California, 94105, USA.
To cite this book, please see the entry for (Poli, Langdon, and McPhee,
2008) in the bibliography.
ISBN 978-1-4092-0073-4 (softcover)
Preface
Genetic programming (GP) is a collection of evolutionary computation tech-
niques that allow computers to solve problems automatically. Since its in-
ception twenty years ago, GP has been used to solve a wide range of prac-
tical problems, producing a number of human-competitive results and even
patentable new inventions. Like many other areas of computer science, GP
is evolving rapidly, with new ideas, techniques and applications being con-
stantly proposed. While this shows how wonderfully prolific GP is, it also
makes it difficult for newcomers to become acquainted with the main ideas
in the field, and form a mental map of its different branches. Even for people
who have been interested in GP for a while, it is difficult to keep up with
the pace of new developments.
Many books have been written which describe aspects of GP. Some
provide general introductions to the field as a whole. However, no new
introductory book on GP has been produced in the last decade, and anyone
wanting to learn about GP is forced to map the terrain painfully on their
own. This book attempts to fill that gap, by providing a modern field guide
to GP for both newcomers and old-timers.
It would have been straightforward to find a traditional publisher for such
a book. However, we want our book to be as accessible as possible to every-
one interested in learning about GP. Therefore, we have chosen to make it
freely available on-line, while also allowing printed copies to be ordered in-
expensively from
http://lulu.com.
Visit
http://www.gp-field-guide.
org.uk
for the details.
The book has undergone numerous iterations and revisions. It began as
a book-chapter overview of GP (more on this below), which quickly grew
to almost 100 pages. A technical report version of it was circulated on the
GP mailing list. People responded very positively, and some encouraged us
to continue and expand that survey into a book. We took their advice and
this field guide is the result.
Acknowledgements
We would like to thank the University of Essex and the University of Min-
nesota, Morris, for their support.
Many thanks to Tyler Hutchison for the use of his cool drawing on the
cover (and elsewhere!), and for finding those scary pinks and greens.
We had the invaluable assistance of many people, and we are very grateful
for their individual and collective efforts, often on very short timelines. Rick
Riolo, Matthew Walker, Christian Gagne, Bob McKay, Giovanni Pazienza,
and Lee Spector all provided useful suggestions based on an early techni-
cal report version. Yossi Borenstein, Caterina Cinel, Ellery Crane, Cecilia
Di Chio, Stephen Dignum, Edgar Galv´n-L´pez, Keisha Harriott, David
a
o
Hunter, Lonny Johnson, Ahmed Kattan, Robert Keller, Andy Korth, Yev-
geniya Kovalchuk, Simon Lucas, Wayne Manselle, Alberto Moraglio, Oliver
Oechsle, Francisco Sepulveda, Elias Tawil, Edward Tsang, William Tozier
and Christian Wagner all contributed to the final proofreading festival.
Their sharp eyes and hard work did much to make the book better; any
remaining errors or omissions are obviously the sole responsibility of the
authors.
We would also like to thank Prof. Xin Yao and the School of Computer
Science of The University of Birmingham and Prof. Bernard Buxton of Uni-
versity College, London, for continuing support, particularly of the genetic
programming bibliography. We also thank Schloss Dagstuhl, where some of
the integration of this book took place.
Most of the tools used in the construction of this book are open source,
1
and we are very grateful to all the developers whose efforts have gone into
building those tools over the years.
As mentioned above, this book started life as a chapter. This was
for a forthcoming handbook on computational intelligence
2
edited by John
Fulcher and Lakhmi C. Jain. We are grateful to John Fulcher for his useful
comments and edits on that book chapter. We would also like to thank most
warmly John Koza, who co-authored the aforementioned chapter with us,
and for allowing us to reuse some of his original material in this book.
This book is a summary of nearly two decades of intensive research in
the field of genetic programming, and we obviously owe a great debt to all
the researchers whose hard work, ideas, and interactions ultimately made
this book possible. Their work runs through every page, from an idea made
somewhat clearer by a conversation at a conference, to a specific concept
or diagram. It has been a pleasure to be part of the GP community over
the years, and we greatly appreciate having so much interesting work to
summarise!
March 2008
Riccardo Poli
William B. Langdon
Nicholas Freitag McPhee
the colophon (page 235) for more details.
entitled
Computational Intelligence: A Compendium
and to be pub-
lished by Springer in 2008.
2
Tentatively
1
See
What’s in this book
The book is divided up into four parts.
Part I covers the basics of genetic programming (GP). This starts with a
gentle introduction which describes how a population of programs is stored
in the computer so that they can evolve with time. We explain how programs
are represented, how random programs are initially created, and how GP
creates a new generation by mutating the better existing programs or com-
bining pairs of good parent programs to produce offspring programs. This
is followed by a simple explanation of how to apply GP and an illustrative
example of using GP.
In Part II, we describe a variety of alternative representations for pro-
grams and some advanced GP techniques. These include: the evolution of
machine-code and parallel programs, the use of grammars and probability
distributions for the generation of programs, variants of GP which allow the
solution of problems with multiple objectives, many speed-up techniques
and some useful theoretical tools.
Part III provides valuable information for anyone interested in using GP
in practical applications. To illustrate genetic programming’s scope, this
part contains a review of many real-world applications of GP. These in-
clude: curve fitting, data modelling, symbolic regression, image analysis,
signal processing, financial trading, time series prediction, economic mod-
elling, industrial process control, medicine, biology, bioinformatics, hyper-
heuristics, artistic applications, computer games, entertainment, compres-
sion and human-competitive results. This is followed by a series of recom-
mendations and suggestions to obtain the most from a GP system. We then
provide some conclusions.
Part IV completes the book. In addition to a bibliography and an index,
this part includes two appendices that provide many pointers to resources,
further reading and a simple GP implementation in Java.
Zgłoś jeśli naruszono regulamin