Building Better Distributed Data Pipelines

Bibliographic Details
Main Author:	McFadin, Patrick (Author)
Corporate Author:	Safari, an O'Reilly Media Company
Format:	eBook
Language:	English
Published:	O'Reilly Media, Inc., 2017.
Edition:	1st edition.
Subjects:	Electronic videos.

Description
Summary:	Patrick McFadin explains the basics of building more efficient data pipelines, using Apache Kafka to organize data, Apache Cassandra to store it, and Apache Spark to analyze it. Patrick offers an overview of how Cassandra works and why it can be a good fit for data-driven projects. He then demonstrates that, with the addition of Spark and Kafka, you can maintain a highly distributed, fault-tolerant, and scalable solution. You'll leave with a comprehensive view of the many options available, so you can make considered choices in your data pipeline projects.
Item Description:	Videorecording.
Physical Description:	1 online resource (1 video file, approximately 54 min.)
Format:	Mode of access: World Wide Web.