Efficient Algorithms For Sorting and Synchronization.pdf

(419 KB) Pobierz
Efficient Algorithms for Sorting and
Synchronization
Andrew Tridgell
A thesis submitted for the degree of
Doctor of Philosophy at
The Australian National University
April 2000
Except where otherwise indicated, this thesis is my own original work.
Andrew Tridgell
25 April 2000
Till min k¨ ra fru, Susan
a
Acknowledgments
The research that has gone into this thesis has been thoroughly enjoyable. That en-
joyment is largely a result of the interaction that I have had with my supervisors, col-
leagues and, in the case of rsync, the people who have tested and used the resulting
software.
I feel very privileged to have worked with my supervisors, Richard Brent, Paul
Mackerras and Brendan McKay. To each of them I owe a great debt of gratitude for
their patience, inspiration and friendship. Richard, Paul and Brendan have taught
me a great deal about the field of Computer Science by sharing with me the joy of
discovery and investigation that is the heart of research.
I would also like to thank Bruce Millar and Iain Macleod, who supervised my
initial research in automatic speech recognition. Through Bruce, Iain and the Com-
puter Sciences Laboratory I made the transition from physicist to computer scientist,
turning my hobby into a career.
The Department of Computer Science and Computer Sciences Laboratory have
provided an excellent environment for my research. I spent many enjoyable hours
with department members and fellow students chatting about my latest crazy ideas
over a cup of coffee. Without this rich environment I doubt that many of my ideas
would have come to fruition.
The Australian Telecommunications and Electronics Research Board, the Com-
monwealth and the Australian National University were very generous in providing
me with scholarship assistance.
Thanks also to my family who have been extremely understanding and support-
ive of my studies. I feel very lucky to have a family that shares my enthusiasm for
academic pursuits.
Finally I’d like to thank my wonderful wife, Susan, who has encouraged me so
much over the years. Many people told us that having two PhD students under the
one roof is a recipe for disaster. Instead it has been fantastic.
iii
Abstract
This thesis presents efficient algorithms for internal and external parallel sorting and
remote data update. The sorting algorithms approach the problem by concentrat-
ing first on highly efficient but incorrect algorithms followed by a cleanup phase that
completes the sort. The remote data update algorithm, rsync, operates by exchang-
ing block signature information followed by a simple hash search algorithm for block
matching at arbitrary byte boundaries. The last chapter of the thesis examines a num-
ber of related algorithms for text compression, differencing and incremental backup.
iv
Zgłoś jeśli naruszono regulamin