Building Data Streaming Applications with Apache Kafka (e-book)

Authors: Manish Kumar, Chanchal Singh

Publisher: Packt Publishing

Publication date: 2017-08-18

Word count: 346,000

Book description

Design and administer fast, reliable enterprise messaging systems with Apache Kafka

About This Book

  • Build efficient real-time streaming applications in Apache Kafka to process streams of data
  • Master the core Kafka APIs to set up Apache Kafka clusters and start writing message producers and consumers
  • A comprehensive guide to help you get a solid grasp of Apache Kafka concepts with practical examples

Who This Book Is For

If you want to learn how to use Apache Kafka and the different tools in the Kafka ecosystem in the easiest possible manner, this book is for you. Some programming experience with Java is required to get the most out of this book.

What You Will Learn

  • Learn the basics of Apache Kafka from scratch
  • Use the basic building blocks of a streaming application
  • Design effective streaming applications with Kafka using Spark, Storm, and Heron
  • Understand the importance of a low-latency, high-throughput, and fault-tolerant messaging system
  • Plan capacity effectively when deploying your Kafka application
  • Understand and implement the best security practices

In Detail

Apache Kafka is a popular distributed streaming platform that acts as a messaging queue or an enterprise messaging system. It lets you publish and subscribe to a stream of records, and process them in a fault-tolerant way as they occur. This book is a comprehensive guide to designing and architecting enterprise-grade streaming applications using Apache Kafka and other big data tools. It includes best practices for building such applications, and tackles some common challenges such as how to use Kafka efficiently and handle high data volumes with ease. This book first takes you through the different types of messaging systems and then provides a thorough introduction to Apache Kafka and its internal details.

The second part of the book takes you through designing streaming applications using various frameworks and tools such as Apache Spark, Apache Storm, and more. Once you grasp the basics, we will take you through more advanced concepts in Apache Kafka such as capacity planning and security. By the end of this book, you will have all the information you need to be comfortable with using Apache Kafka, and to design efficient streaming data applications with it.

Style and approach

A step-by-step, comprehensive guide filled with practical, real-world examples.

Table of Contents

Title Page

Copyright

Building Data Streaming Applications with Apache Kafka

Credits

About the Authors

About the Reviewer

www.PacktPub.com

Why subscribe?

Customer Feedback

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Introduction to Messaging Systems

Understanding the principles of messaging systems

Understanding messaging systems

Peeking into a point-to-point messaging system

Publish-subscribe messaging system

Advanced Message Queuing Protocol

Using messaging systems in big data streaming applications

Summary

Introducing Kafka the Distributed Messaging Platform

Kafka origins

Kafka's architecture

Message topics

Message partitions

Replication and replicated logs

Message producers

Message consumers

Role of Zookeeper

Summary

Deep Dive into Kafka Producers

Kafka producer internals

Kafka Producer APIs

Producer object and ProducerRecord object

Custom partition

Additional producer configuration

Java Kafka producer example

Common messaging publishing patterns

Best practices

Summary

Deep Dive into Kafka Consumers

Kafka consumer internals

Understanding the responsibilities of Kafka consumers

Kafka consumer APIs

Consumer configuration

Subscription and polling

Committing and polling

Additional configuration

Java Kafka consumer

Scala Kafka consumer

Rebalance listeners

Common message consuming patterns

Best practices

Summary

Building Spark Streaming Applications with Kafka

Introduction to Spark

Spark architecture

Pillars of Spark

The Spark ecosystem

Spark Streaming

Receiver-based integration

Disadvantages of receiver-based approach

Java example for receiver-based integration

Scala example for receiver-based integration

Direct approach

Java example for direct approach

Scala example for direct approach

Use case log processing - fraud IP detection

Maven

Producer

Property reader

Producer code

Fraud IP lookup

Expose hive table

Streaming code

Summary

Building Storm Applications with Kafka

Introduction to Apache Storm

Storm cluster architecture

The concept of a Storm application

Introduction to Apache Heron

Heron architecture

Heron topology architecture

Integrating Apache Kafka with Apache Storm - Java

Example

Integrating Apache Kafka with Apache Storm - Scala

Use case – log processing in Storm, Kafka, Hive

Producer

Producer code

Fraud IP lookup

Storm application

Running the project

Summary

Using Kafka with Confluent Platform

Introduction to Confluent Platform

Deep diving into Confluent architecture

Understanding Kafka Connect and Kafka Stream

Kafka Streams

Playing with Avro using Schema Registry

Moving Kafka data to HDFS

Camus

Running Camus

Gobblin

Gobblin architecture

Kafka Connect

Flume

Summary

Building ETL Pipelines Using Kafka

Considerations for using Kafka in ETL pipelines

Introducing Kafka Connect

Deep dive into Kafka Connect

Introductory examples of using Kafka Connect

Kafka Connect common use cases

Summary

Building Streaming Applications Using Kafka Streams

Introduction to Kafka Streams

Using Kafka in Stream processing

Kafka Stream - lightweight Stream processing library

Kafka Stream architecture

Integrated framework advantages

Understanding tables and Streams together

Maven dependency

Kafka Stream word count

KTable

Use case example of Kafka Streams

Maven dependency of Kafka Streams

Property reader

IP record producer

IP lookup service

Fraud detection application

Summary

Kafka Cluster Deployment

Kafka cluster internals

Role of Zookeeper

Replication

Metadata request processing

Producer request processing

Consumer request processing

Capacity planning

Capacity planning goals

Replication factor

Memory

Hard drives

Network

CPU

Single cluster deployment

Multicluster deployment

Decommissioning brokers

Data migration

Summary

Using Kafka in Big Data Applications

Managing high volumes in Kafka

Appropriate hardware choices

Producer read and consumer write choices

Kafka message delivery semantics

At least once delivery

At most once delivery

Exactly once delivery

Big data and Kafka common usage patterns

Kafka and data governance

Alerting and monitoring

Useful Kafka metrics

Producer metrics

Broker metrics

Consumer metrics

Summary

Securing Kafka

An overview of securing Kafka

Wire encryption using SSL

Steps to enable SSL in Kafka

Configuring SSL for Kafka Broker

Configuring SSL for Kafka clients

Kerberos SASL for authentication

Steps to enable SASL/GSSAPI in Kafka

Configuring SASL for Kafka broker

Configuring SASL for Kafka client - producer and consumer

Understanding ACL and authorization

Common ACL operations

List ACLs

Understanding Zookeeper authentication

Apache Ranger for authorization

Adding Kafka Service to Ranger

Adding policies

Best practices

Summary

Streaming Application Design Considerations

Latency and throughput

Data and state persistence

Data sources

External data lookups

Data formats

Data serialization

Level of parallelism

Out-of-order events

Message processing semantics

Summary
