Kafka 0.8

Introduction:

0.8 is a huge step forward in functionality from 0.7.x

 

This release includes the following major features:

  • Partitions are now replicated, protecting against data loss when a broker fails. 
    Previously the topic would remain available in the case of server failure, but individual partitions within that topic could disappear when the server hosting them stopped. If a broker failed permanently, any unconsumed data it hosted would be lost. 
    Starting with 0.8 all partitions have a replication factor and we get the prior behavior as the special case where replication factor = 1. 
    Replicas have a notion of committed messages and guarantee that committed messages won't be lost as long as at least one replica survives. Replica logs are byte-for-byte identical across replicas.
  • Producer and consumer are replication aware. 
    When running in sync mode, by default, the producer send() request blocks until the message sent is committed to the active replicas. As a result the sender can depend on the guarantee that a sent message will not be lost (see the producer/consumer sketch after this list). 
    Latency sensitive producers have the option to tune this to block only on the write to the leader broker or to run completely async if they are willing to forsake this guarantee. 
    The consumer will only see messages that have been committed. 
  • The consumer has been moved to a "long poll" model where fetch requests block until there is data available. 
    This enables low latency without frequent polling. In general end-to-end message latency from producer to broker to consumer of only a few milliseconds is now possible.
  • We now retain the key used in the producer for partitioning with each message, so the consumer knows the partitioning key. 
  • We have moved from directly addressing messages with a byte offset to using a logical offset (i.e. 0, 1, 2, 3...). 
    The offset still works exactly the same - it is a monotonically increasing number that represents a point-in-time in the log - but now it is no longer tied to byte layout. 
    This has several advantages: 
    (1) it is aesthetically nice, 
    (2) it makes it trivial to calculate the next offset or to traverse messages in reverse order, 
    (3) it fixes a corner case interaction between consumer commit() and compressed message batches. Data is still transferred using the same efficient zero-copy mechanism as before. 
  • We have removed the zookeeper dependency from the producer and replaced it with a simple cluster metadata api.
  • We now support multiple data directories (i.e. a JBOD setup).
  • We now expose both the partition and the offset for each message in the high-level consumer. 
  • We have substantially improved our integration testing, adding a new integration test framework and over 100 distributed regression and performance test scenarios that we run on every checkin.
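
These guarantees show up directly in client code. Below is a minimal sketch, assuming the modern kafka-python API (the 2013-era interface differed); the broker address, topic name, and group id are placeholder assumptions:

    from kafka import KafkaProducer, KafkaConsumer

    # Producer: acks="all" makes send() wait until the message is
    # committed to the in-sync replicas (the sync-mode guarantee above).
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",  # placeholder broker address
        acks="all",
    )
    # The partitioning key is stored with the message, so the consumer
    # can read it back.
    producer.send("events", key=b"user-42", value=b"clicked")
    producer.flush()

    # High-level consumer: iteration long-polls until data arrives, and
    # each record exposes its partition and logical offset.
    consumer = KafkaConsumer(
        "events",
        bootstrap_servers="localhost:9092",
        group_id="demo-group",          # placeholder group id
        auto_offset_reset="earliest",
        consumer_timeout_ms=10000,      # give up after 10 s of no data
    )
    for msg in consumer:
        print(msg.partition, msg.offset, msg.key, msg.value)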

 

In my view, the main changes are:

1. Improved broker durability. In the old design, a broker failure meant data loss, which was hard to justify, so the replica feature was a must.

2. Logical offsets. The advantages are listed above, but when physical offsets were originally adopted, a list of advantages was given for them too. 
    It really comes down to a balance between efficiency and usability: physical offsets were chosen in the first place for efficiency. 
    Now that physical offsets have proven too cumbersome to work with, a compromise has been made in favor of logical offsets. Nothing changes fundamentally; all that is needed is a mapping from logical offsets to physical offsets, making the physical offset transparent to the user.
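
That mapping can be kept cheap with a sparse index: record a byte position for every Nth logical offset and scan forward from the nearest entry. A toy sketch of the idea (hypothetical data and function name, not Kafka's actual implementation):

    import bisect

    # Hypothetical sparse index: (logical offset, byte position) pairs,
    # one entry per indexed batch rather than one per message.
    index = [(0, 0), (100, 4096), (200, 8192), (300, 12288)]

    def seek(logical_offset):
        """Return the indexed offset and byte position to scan from."""
        # Last entry whose logical offset is <= the target.
        i = bisect.bisect_right(index, (logical_offset, float("inf"))) - 1
        return index[i]

    # To serve offset 142, start reading at byte 4096 (offset 100) and
    # step through message headers until offset 142 is reached.
    print(seek(142))  # -> (100, 4096)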

3. Better Python support, via kafka-python:

Pure Python implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.

Maintainer: David Arthur 
License: Apache v.2.0

https://github.com/mumrah/kafka-python
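
Compression is a producer-side setting in the library's current API; a minimal sketch (broker address and topic name are placeholders):

    from kafka import KafkaProducer

    # compression_type accepts "gzip" or "snappy", matching the
    # compression support noted above.
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        compression_type="gzip",
    )
    producer.send("events", b"member of a gzip-compressed batch")
    producer.flush()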

 

Kafka Replication High-level Design

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication

Reference: Apache Kafka Replication Design – High level


This article is excerpted from 博客园 (Cnblogs); original publication date: 2013-05-08.
