Moving Data: How to Move, Share, and Integrate SQL, NoSQL, and Big Data


Our Price:   $76.33




Full Product Details

Author:   Martin Brown
Publisher:   Springer-Verlag Berlin and Heidelberg GmbH & Co. KG
Imprint:   APress
ISBN:   9781484201978
ISBN 10:   1484201973
Pages:   350
Publication Date:   31 March 2016
Audience:   Professional and scholarly, Professional & Vocational
Format:   Paperback
Publisher's Status:   Active
Availability:   In Print
This item will be ordered for you from one of our suppliers. Upon receipt, we will promptly dispatch it to you. For in-store availability, please contact us.

Table of Contents

Chapter 1: Understanding the Challenges of Data Migration
This chapter helps the reader understand the different components of any data migration. These include changing the format, changing the way the data is referenced and referred to internally, and the basic mechanics of getting the data in to begin with.

Chapter 2: Data Mapping and Transformations
There are two key elements to the exchange of any information between databases: the data structure used for the exchange, and the transformation required to reach that structure. Some of these considerations are driven by the source database, and others by the target database. Moving data from an RDBMS to a NoSQL database, for example, generally requires constructing documents from what might be tabular, or joined-tabular, data (see the sketch following this group of chapters). The other aspect is the difference between source and target data types: document databases and Big Data stores tend not to care about data types, whereas an RDBMS cannot live without them. In this chapter we'll examine some of the key differences and problems with transferring data that transcend the mechanics of the process, and how to deal with them effectively.

Chapter 3: Moving Data for RDBMS
RDBMS systems have a unique place in the world of data; they are universally accepted and very popular (Oracle, MySQL, Microsoft SQL Server). Their tabular structure makes them look easily exchangeable, but they require careful techniques when sharing the data. In this chapter, we'll examine some of the key issues with migrating data to and from an RDBMS, including:
* Exchanging table data
* Exchanging complex queries
* Preparing for two-way transfers
* Row, Statement, or Other

Chapter 4: Migrating for RDBMS using Export/Import
Export and import is the simplest and most readily used method for sharing and exchanging data, but there is more to it than just dumping the information. You need to consider formatting and structure, and whether you want identical tables or a more complex structure exchanged. We'll examine a variety of techniques, from basic exports to character-separated/delimited types and structured formats such as JSON. The chapter also covers choosing a file format and dealing with raw table data or joined structures.

Chapter 5: Sharing Data for RDBMS through Replication
Replication is the purest and simplest form of sharing data, but it is not without its limitations or problems. Most replication is designed to handle scale-out environments, not data sharing, but there are solutions and tricks that make replication a suitable alternative for exchanging data. But how do you cope with changes to the original schema once it reaches the target database? How do you make it usable and match the target environment? That's what this chapter explains. It will cover:
* Replicating between RDBMSs
* Replicating out of an RDBMS
* Replicating into an RDBMS

Chapter 6: Integrating Data for RDBMS through Replication
Data integration is about actively sharing and exchanging data: not just replicating it, but actually using different formats and data sources. Using the methods described in this chapter, you'll learn how to exchange data and even share information between an RDBMS and other data types, including methods for running queries and joins across the different types. We'll also look at solutions for building applications that can natively integrate and share this information according to the application's needs. This allows an application to combine the RDBMS data with an internal representation that may be suitable for document-based storage, and to merge that representation back again.
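
As a rough illustration of the tabular-to-document transformation described in Chapter 2 (the book itself may use different schemas and tools), the following Python sketch folds joined rows from a hypothetical orders/order_items schema into one JSON document per order, using the standard-library sqlite3 module as a stand-in for the source RDBMS.

    import json
    import sqlite3

    # Stand-in for the source RDBMS: a hypothetical orders/order_items schema.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, placed TEXT);
        CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER);
        INSERT INTO orders VALUES (1, 'ACME', '2016-03-01');
        INSERT INTO order_items VALUES (1, 'WIDGET-7', 3), (1, 'SPROCKET-2', 1);
    """)

    # Join the tabular data, then fold the repeated rows into one document
    # per order - the shape a document database expects.
    rows = conn.execute("""
        SELECT o.id, o.customer, o.placed, i.sku, i.qty
        FROM orders o JOIN order_items i ON i.order_id = o.id
    """)

    documents = {}
    for order_id, customer, placed, sku, qty in rows:
        doc = documents.setdefault(order_id, {
            "_id": str(order_id),
            "customer": customer,
            "placed": placed,
            "items": [],
        })
        doc["items"].append({"sku": sku, "qty": qty})

    # Each value is now a self-contained JSON document, ready for a document store.
    for doc in documents.values():
        print(json.dumps(doc))

The same folding approach applies whatever the source engine is; only the connection and query layer changes.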

Chapter 7: Moving Data for NoSQL Databases
NoSQL databases cause problems for traditional RDBMS users because the data is often not defined or stored in the same structured format. NoSQL encompasses everything from key/value stores through to document and graph databases designed to find relationships and distances between data points. Replicating data into NoSQL requires knowing what you want to keep and how to structure it, and migrating out is about organizing the data so that it is usable and recognizable by the target database. For example, is a document one table, or multiple tables? This chapter covers the basics of the NoSQL platform and the data challenges it presents before getting into the specifics.

Chapter 8: Migrating Data for NoSQL
If you are permanently migrating data into NoSQL, you must determine how that information should be transferred, transformed, and ultimately used. There are different approaches depending on whether this is a one-time move or a temporary move to make use of a special feature. Also of special note is that many NoSQL databases have very specific or very limited methods for searching and extracting the information that has been inserted; careful selection of the data as it is migrated will make it more usable in the NoSQL store. In this chapter, we will examine some of the key considerations, different environments, and limitations of each specific NoSQL database.

Chapter 9: Sharing for NoSQL
Regularly transferring or exchanging information between another database and NoSQL can be achieved in different ways, depending on your use case and environment. For example, Couchbase, CouchDB, and MongoDB all provide solutions for watching changes to the underlying database, which makes sharing the data easy. Others, like Memcached or Riak, do not provide such ready access, so different techniques need to be employed. But NoSQL is rarely used as the only solution for data storage, so we can usually take advantage of the application structure to do some of the hard work for us. This chapter examines these techniques, along with specific data-flow migrations.

Chapter 10: Integrating for NoSQL
NoSQL generally has some performance advantages over RDBMS solutions, so we can use that to our advantage and combine information from the RDBMS and NoSQL environments. For example, why not use a caching system like Memcached in front of a MySQL backing store (see the sketch following this group of chapters)? As this chapter shows, applications can handle the basics, but they can also be modified to support a more efficient workflow for writing, storing, and executing updates. The same techniques also apply to NoSQL-like databases such as object stores and large columnar stores, including BigTable, Cassandra, and HBase.

Chapter 11: Moving Data for Big Data Sources
Big Data sources encompass a very wide range of databases and stores, but many share a similar set of goals and structures. Best known is the underlying technology behind Hadoop (including HBase and Pig) and Google's BigTable, while others, such as HP's Vertica, or Hive and Impala, look more like a very large RDBMS with a SQL interface. All of these solutions require careful handling of the data and structure to make the data usable once it has been moved. You need to consider the data structure, format, and usability as the data is moved, and also how it might be used and integrated with other sources in the typical environment.
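
To make the Memcached-in-front-of-MySQL idea from Chapter 10 a little more concrete, here is a minimal cache-aside sketch; it is not taken from the book. It assumes a memcached instance on the default port and the third-party pymemcache client, and it uses the standard-library sqlite3 module with a hypothetical products table as a stand-in for the MySQL backing store.

    import json
    import sqlite3
    from pymemcache.client.base import Client  # assumption: pymemcache is installed

    cache = Client(("localhost", 11211))   # assumption: memcached on the default port
    db = sqlite3.connect("catalog.db")     # stand-in for the MySQL backing store

    def get_product(product_id):
        """Cache-aside read: try the cache first, fall back to the RDBMS on a miss."""
        key = "product:%d" % product_id
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)

        row = db.execute(
            "SELECT id, name, price FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        if row is None:
            return None

        product = {"id": row[0], "name": row[1], "price": row[2]}
        # Populate the cache so subsequent reads skip the database entirely.
        cache.set(key, json.dumps(product).encode("utf-8"), expire=300)
        return product

The other half of such a workflow is keeping the two stores in step on writes, for example by deleting or rewriting the cached entry whenever the underlying row is updated.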

Chapter 12: Migrating Data for Big Data Sources
If you are migrating data permanently into a Big Data store, you have a wide variety of considerations. With Big Data this is mostly about how you will get the data back out again and, more importantly, how to take advantage of the Big Data architecture to get the best out of the structure. For example, writing data into Hadoop is easy; writing data into Hadoop so that it can be efficiently processed and distributed across the cluster is a different matter (see the sketch following this group of chapters). In this chapter, we'll examine methods for permanently moving data into Big Data stores for archival, storage, and long-term analysis needs. We'll also look at whether a simple dump/export and import is the easiest and most efficient method, or whether there are better solutions for direct data exchange.

Chapter 13: Sharing for Big Data Sources
Sharing data with Big Data sources, for example by regularly replicating information from an existing database into a Big Data store, has specific problems. For example, Big Data encompasses both structured and unstructured storage formats. Knowing how to use these environments to your advantage, and how the data can be efficiently transferred, is critical to making the Big Data database work for you and not against you. In this chapter we'll examine specific tricks, such as using specialist tools like Sqoop, specialist replication tools like Oracle GoldenGate or Tungsten Replicator, and tools and methods for handling incremental and staged data both into and out of Big Data sources.

Chapter 14: Integrating for Big Data Sources
Big Data sources such as Hadoop are no longer distant silos at the end of an existing data chain. Frequently, Hadoop is being brought up to the same architectural level as the RDBMS stores that used to be the source of its data. In this chapter, we look at the most effective ways to integrate a Big Data store into your applications and database needs. This will enable you to use Big Data both as a data store and as a way to process short- and long-term information, by leveraging the data transformations we have already used to make the Big Data compatible with solutions such as Spark and MillWheel. These techniques enable data to be readily swapped to and from Big Data sources as a cohesive part of a heterogeneous environment.
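
As one illustration of the kind of pipeline Chapters 12 and 14 discuss (again, not taken from the book), the PySpark sketch below pulls a table out of an RDBMS over JDBC and lands it in HDFS as date-partitioned Parquet, so that later cluster jobs only read the partitions they need. The connection details, table name, and partition column are placeholders, and the matching JDBC driver is assumed to be on Spark's classpath.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdbms-to-hadoop").getOrCreate()

    # Pull the source table over JDBC (hypothetical connection details).
    orders = (spark.read.format("jdbc")
              .option("url", "jdbc:mysql://dbhost:3306/shop")
              .option("dbtable", "orders")
              .option("user", "etl")
              .option("password", "secret")
              .load())

    # Land the data in HDFS as Parquet, partitioned by order date, so that
    # downstream Spark or MapReduce jobs can prune to the dates they need.
    (orders.write
           .mode("overwrite")
           .partitionBy("order_date")
           .parquet("hdfs:///data/shop/orders"))

    spark.stop()

Reading the Parquet back with the same DataFrame API gives the two-way exchange that Chapter 14 describes.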


Author Information

A professional writer for over 15 years, Martin (MC) Brown is the author of, or a contributor to, more than 26 books covering an array of topics, including the recently published Getting Started with CouchDB. His expertise spans myriad development languages and platforms: Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows, Solaris, Linux, BeOS, Microsoft WP, Mac OS, and more. He is currently senior information architect for Continuent.




Countries Available

All regions