Notes: Ceph: A Scalable, High-Performance Distributed File System

Summary:

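A classic paper on Ceph. Ceph is a very popular storage system today. Unlike HDFS, which mainly targets big-data applications, Ceph aims to be a general-purpose storage solution that supports object storage, block storage, and a file system equally well. Many OpenStack private clouds now use Ceph for their storage. Ceph itself grew out of the design described in this paper.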

Abstract
The abstract states Ceph's mission clearly:
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability.
It also names the key methods and techniques:
Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs).
We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system.
A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads.
It then turns to performance:
Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.
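
To make the idea of "replacing allocation tables with a pseudo-random data distribution function" concrete, here is a minimal sketch in Python. It is not CRUSH: it uses weighted rendezvous hashing as a much simpler stand-in, and the names `place`, `osd_weights`, and the `osd.a`-style identifiers are assumptions made up for this illustration. What it shares with the approach quoted above is that any party can compute where an object lives from the object name and the list of devices alone, with no lookup table, and that devices with larger weights receive proportionally more objects.

```python
import hashlib
import math


def _unit_hash(key: str) -> float:
    """Map a string to a deterministic float strictly between 0 and 1."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (value + 0.5) / 2**52


def place(object_name: str, osd_weights: dict[str, float], replicas: int = 2) -> list[str]:
    """Choose `replicas` distinct OSDs for an object.

    Weighted rendezvous hashing (an assumption for this sketch, not CRUSH):
    every OSD gets a score derived from hash(object, osd) and its weight,
    and the highest-scoring OSDs win.
    """
    def score(osd: str) -> float:
        # log() of a value in (0, 1) is negative, so the score is positive
        # and grows with the OSD's weight.
        return -osd_weights[osd] / math.log(_unit_hash(f"{object_name}:{osd}"))

    return sorted(osd_weights, key=score, reverse=True)[:replicas]


if __name__ == "__main__":
    # Hypothetical cluster: osd.c has three times the capacity of the others.
    weights = {"osd.a": 1.0, "osd.b": 1.0, "osd.c": 3.0}
    print(place("example-object", weights))
```

Because each OSD's score depends only on the object name and that OSD, adding a device moves only the objects that now score highest on the new device. This is a small-scale version of the property that matters for clusters that grow incrementally.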

Introduction:
The paper first covers the problems with NFS and with traditional OSD-based designs.
It then introduces Ceph:
We present Ceph, a distributed file system that provides excellent performance and reliability while promising unparalleled scalability.
This sentence is key: Our architecture is based on the assumption that systems at the petabyte scale are inherently dynamic: large systems are inevitably built incrementally, node failures are the norm rather than the exception, and the quality and character of workloads are constantly shifting over time.
Ceph's architecture is as follows:

System Overview:
Ceph has three main components:
the client, each instance of which exposes a near-POSIX file system interface to a host or process;
a cluster of OSDs, which collectively stores all data and metadata;
and a metadata server cluster, which manages the namespace (file names and directories) while coordinating security, consistency and coherence (see Figure 1).
As shown in the figure below:
[Figure 1: Ceph system architecture]

Main approaches:
Decoupled Data and Metadata
Dynamic Distributed Metadata Management
Reliable Autonomic Distributed Object Storage
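
The first of these points, decoupling data from metadata, can be sketched as a toy model of the three components listed earlier. Everything in it is an assumption for illustration: the classes `MetadataServer` and `Client`, the tiny `OBJECT_SIZE`, the object naming scheme, and the hash-modulo `locate` function (a placeholder, not CRUSH). The point it tries to show is that the metadata server only resolves a path to an inode number, after which the client computes object names and locations by itself and talks to the OSDs directly.

```python
import hashlib

OBJECT_SIZE = 4  # bytes per object; unrealistically small to keep the demo readable


class MetadataServer:
    """Hypothetical stand-in for the MDS cluster: knows names, not data locations."""

    def __init__(self) -> None:
        self._inodes: dict[str, int] = {}

    def open(self, path: str) -> int:
        # Allocate an inode number on first open, then keep returning it.
        return self._inodes.setdefault(path, len(self._inodes) + 1)


def object_name(ino: int, index: int) -> str:
    """Name the index-th object of a file from its inode number (assumed scheme)."""
    return f"{ino:x}.{index:08x}"


def locate(name: str, osds: list[str]) -> str:
    """Placeholder placement function (plain hash modulo, NOT CRUSH)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return osds[int.from_bytes(digest[:8], "big") % len(osds)]


class Client:
    def __init__(self, mds: MetadataServer, osd_stores: dict[str, dict[str, bytes]]) -> None:
        self.mds = mds
        self.osd_stores = osd_stores  # each OSD is modelled as a dict of objects
        self.osd_names = sorted(osd_stores)

    def write(self, path: str, data: bytes) -> None:
        ino = self.mds.open(path)  # the only metadata operation
        for offset in range(0, len(data), OBJECT_SIZE):
            name = object_name(ino, offset // OBJECT_SIZE)
            self.osd_stores[locate(name, self.osd_names)][name] = data[offset:offset + OBJECT_SIZE]

    def read(self, path: str, length: int) -> bytes:
        ino = self.mds.open(path)
        count = -(-length // OBJECT_SIZE)  # ceiling division
        chunks = []
        for index in range(count):
            name = object_name(ino, index)
            chunks.append(self.osd_stores[locate(name, self.osd_names)][name])
        return b"".join(chunks)[:length]
```

In this sketch the caller passes the file length to `read`; a real file system would track the size as part of the file's metadata. Note that `MetadataServer` never stores which OSD holds which object, which is what keeps it off the data path.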

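The remaining sections describe how each part is implemented. They contain no especially difficult formulas or theory, so most readers should be able to follow them, and they make for interesting reading.
Link to the paper: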
http://www.ece.eng.wayne.edu/~sjiang/ECE7650-winter-15/topic5B-S.pdf
If the download fails, try searching for the paper on Baidu Scholar.
