Apache Iceberg: The Definitive Guide: Data Lakehouse Functionality, Performance, and Scalability on the Data Lake

Author:   Tomer Shiran, Jason Hughes, Alex Merced, Dipankar Mazumdar
Publisher:   O'Reilly Media
ISBN:   9781098148621
Pages:   300
Publication Date:   29 March 2024
Format:   Paperback
Availability:   Not yet available
This item is yet to be released. You can pre-order this item and we will dispatch it to you upon its release.

Our Price:   $184.77

Overview

Traditional data architecture patterns are severely limited. To use these patterns, you have to ETL data into each tool, a cost-prohibitive process for making warehouse features available to all of your data. This lack of flexibility forces you to adjust your workflow to the tool your data is locked in, which creates data silos and data drift. This book shows you a better way.

Apache Iceberg provides the capabilities, performance, scalability, and savings that fulfill the promise of an open data lakehouse. By following the lessons in this book, you'll be able to achieve interactive, batch, machine learning, and streaming analytics with this lakehouse. Authors Tomer Shiran, Jason Hughes, Alex Merced, and Dipankar Mazumdar from Dremio guide you through the process.

With this book, you'll learn:

The architecture of Apache Iceberg tables
What happens under the hood when you perform operations on Iceberg tables
How to further optimize Apache Iceberg tables for maximum performance
How to use Apache Iceberg with popular data engines such as Apache Spark, Apache Flink, and Dremio Sonar
How Apache Iceberg can be used in streaming and batch ingestion

Discover why Apache Iceberg is a foundational technology for implementing an open data lakehouse.

Full Product Details

Author:   Tomer Shiran, Jason Hughes, Alex Merced, Dipankar Mazumdar
Publisher:   O'Reilly Media
Imprint:   O'Reilly Media
ISBN:   9781098148621
ISBN 10:   1098148622
Pages:   300
Publication Date:   29 March 2024
Audience:   Professional and scholarly, Professional & Vocational
Format:   Paperback
Publisher's Status:   Active
Availability:   Not yet available

Author Information

Tomer Shiran is the Founder and Chief Product Officer of Dremio, an open data lakehouse platform that enables companies to run analytics in the cloud without the cost, complexity, and lock-in of data warehouses. As the company's founding CEO, Tomer built a world-class organization that has raised over $400M and now serves hundreds of the world's largest enterprises, including 3 of the Fortune 5. Prior to Dremio, Tomer was the 4th employee and VP Product of MapR, a Big Data analytics pioneer. He also held numerous product management and engineering roles at Microsoft and IBM Research, founded several websites that have served millions of users and hundreds of thousands of paying customers, and is a successful author and presenter on a wide range of industry topics. He holds an MS in Computer Engineering from Carnegie Mellon University and a BS in Computer Science from Technion - Israel Institute of Technology.

Jason Hughes is the Director of Technical Advocacy at Dremio. Previously at Dremio, he has been a Product Director, Technical Director, and Senior Solutions Architect. He has been working in technology and data for over a decade, including roles as tech lead for the field at Dremio, the pre-sales and post-sales lead for Presto and QueryGrid for the Americas at Teradata, and leading the development, deployment, and management of a custom CRM system for multiple auto dealerships. He is passionate about making customers and individuals successful and self-sufficient. When he's not working, he's usually taking his dog to the dog park, playing hockey, or cooking (when he feels like it). He lives in San Diego, California.

Alex Merced is a developer advocate for Dremio and has worked as a developer and instructor for companies like GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly. Alex is passionate about technology and has published tech content through blogs, videos, and his podcasts Datanation and Web Dev 101. He has contributed a variety of libraries in the JavaScript and Python worlds, including SencilloDB, CoquitoJS, dremio-simple-query, and more.

Dipankar Mazumdar is currently a Data Eng/Science Advocate at Dremio, where his primary focus is advocacy to data practitioners on Dremio's open lakehouse platform and various open source projects, such as Apache Iceberg. Dipankar is also interested in Visual Analytics research, and his latest work was on "Explainability of ensemble models" using multidimensional projection techniques.

Countries Available

All regions