Agile Data Science
Russell Jurney
Copyright © 2014 Data Syndrome LLC. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://my.safaribooksonline.com). For more information, contact our
corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Editors: Mike Loukides and Mary Treseler
Production Editor: Nicole Shelby
Copyeditor: Rachel Monaghan
Proofreader: Linley Dolby
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Kara Ebrahim
October 2013: First Edition
Revision History for the First Edition:
2013-10-11: First release
See http://oreilly.com/catalog/errata.csp?isbn=9781449326265 for release details.
Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly
Media, Inc. Agile Data Science and related trade dress are trademarks of O’Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a
trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions, or for damages resulting from the use of the information contained
herein.
ISBN: 978-1-449-32626-5
[LSI]
Table of Contents
Preface
Part I. Setup
1. Theory
Agile Big Data
Big Words Defined
Agile Big Data Teams
Recognizing the Opportunity and Problem
Adapting to Change
Agile Big Data Process
Code Review and Pair Programming
Agile Environments: Engineering Productivity
Collaboration Space
Private Space
Personal Space
Realizing Ideas with Large-Format Printing
2. Data
Email
Working with Raw Data
Raw Email
Structured Versus Semistructured Data
SQL
NoSQL
Serialization
Extracting and Exposing Features in Evolving Schemas
Data Pipelines
Data Perspectives