Bad Data Handbook: Cleaning Up the Data So You Can Get Back to Work

Author:   Q. Ethan McCallum
Publisher:   O'Reilly Media
ISBN:   9781449321888
Pages:   250
Publication Date:   18 December 2012
Format:   Paperback
Availability:   In Print
This item will be ordered in for you from one of our suppliers. Upon receipt, we will promptly dispatch it out to you. For in store availability, please contact us.

Our Price:   $105.57


Overview

Welcome to data science's dirty secret: real-world data is messy. Data scientists must spend a good deal of time playing software developer, writing code to clean up data before they can actually do anything constructive with it. It's a necessary evil, but you can still make the most of it. This practical book walks you through several real-world examples to demonstrate the theory and practice behind working with and cleaning up dirty data. No one tool solves all of the problems well. Wise data scientists learn many tools and learn where each one shines. To that end, this book takes a polyglot approach: most examples will involve R and Python, but expect the occasional smattering of Groovy and sed/awk fun.

Full Product Details

Author:   Q. Ethan McCallum
Publisher:   O'Reilly Media
Imprint:   O'Reilly Media
Dimensions:   Width: 17.80 cm, Height: 1.40 cm, Length: 23.30 cm
Weight:   0.426 kg
ISBN:   9781449321888
ISBN 10:   1449321887
Pages:   250
Publication Date:   18 December 2012
Audience:   Professional and scholarly, General/trade, Professional & Vocational
Format:   Paperback
Publisher's Status:   Active

Author Information

Q. Ethan McCallum is a consultant, writer, and technology enthusiast, though perhaps not in that order. His work has appeared online on The O'Reilly Network and Java.net, and in print publications such as C/C++ Users Journal, Dr. Dobb's Journal, and Linux Magazine. In his professional roles, he helps companies make smart decisions about data and technology.

Countries Available

All regions