4 Things You Can Do with Alibaba Cloud PolarDB

Summary: In this article, we spoke with He Jun, Alibaba Cloud Technical Expert, to learn about the key features and common use cases of PolarDB.

During the PolarDB session of the 2017 Computing Conference, Alibaba Cloud senior technical expert He Jun delivered a speech on the features and common use cases of PolarDB. He discussed the architecture of PolarDB, introduced its features, and shared insights on some common use cases.

The following sections highlight the main points from his speech.

Product Architecture

I was pleasantly surprised when I first encountered PolarDB. As I understand it, PolarDB is a generational milestone that combines innovations in computing, storage, networking, and more. It implements a design concept called Cloud Native, which differs sharply from the database design concepts we have discussed in the past. Traditional relational databases were shaped by the computing power available in the IT era. Moving that computing capability onto the public cloud and connecting it to user businesses produced a number of innovations, but these are far from sufficient in the long term. Why? Because what is needed today is a relational database designed from the start for public cloud environments and the businesses that run in them. This is no small task.

PolarDB utilizes a structure that separates computing and storage, which is much easier said than done. The reason for combining computing and storage, after all, is to improve performance. The primary consideration in building a relational database is performance, so while separating storage and computing seems like an easy concept, actually doing it without sacrificing performance is quite difficult.

Today, the separation of computing and storage in PolarDB is a bold innovation that is no longer stuck in the concept phase; it has been built and deployed. Where does the difficulty in building a relational database lie? The database must provide ACID semantics, otherwise it cannot support businesses that depend on online transactions. Beyond ACID compliance, performance, and elasticity on the public cloud, we also have to consider the performance to cost ratio. Looking at the commercial databases on the market, most fall short on at least one of these requirements. Is it even possible to combine the required functionality, capability, and an acceptable performance to cost ratio in one architecture that supports all the necessary business scenarios? Drawing on our understanding of business applications and our accumulated experience on the public cloud, we implemented a single-write, multiple-read database architecture, which is far simpler than earlier multiple-write designs and still satisfies the vast majority of use cases. At its core is a proprietary distributed storage engine, which allows PolarDB to scale flexibly along multiple dimensions.


The system has three layers, as we can see in this figure. The top layer is the DBserver layer, which implements a single master, multiple slave architecture in which the other nodes can be added or removed as needed to handle the request load. The lowest layer consists of fast, distributed storage devices.

PolarDB Features

What makes PolarDB special? First, a relational database absolutely must have high performance. If a relational database has poor performance, it will have difficulty satisfying the need to process the explosive growth of data characteristic of the current Internet era. So when I say that PolarDB performance is high, what exactly does that mean?

  • High speed: single-node QPS can easily reach 500,000
    Because PolarDB uses shared distributed storage, a new read-only node attaches to the existing data instead of receiving its own replicated copy. This removes the overhead of replicating data, so adding a new read-only instance takes only 1-5 minutes, regardless of the size of the data in the database. What's more, with a single master, multiple read structure, we are able to keep latency down to a matter of milliseconds. We can also create backups in seconds. Each of these functions features extremely high performance.
  • Super high capacity
    In practice, once a database grows to around 2TB, most databases become difficult to use. Today, PolarDB provides capacity of up to 100TB, which is an enormous amount of data for a relational database.
  • Automatic scaling according to necessity
    The PolarDB architecture makes full use of the elasticity offered by the cloud, enabling the system to scale according to changes in the user's application.
  • MySQL compatibility
    Open source database instances, taken together, already outnumber Oracle instances, and the gap widens every year. PolarDB is already nearing 100% compatibility with MySQL, and we will continue to improve support for SQL standards as quickly as possible.
  • High reliability and availability
    PolarDB uses a one master, many slaves framework, which naturally offers high availability. If the master node crashes, requests are automatically redirected to another node. At the same time, the existence of multiple data copies means that the data is naturally more reliable.
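
The failover behavior described in the last bullet can be pictured with a small sketch. This is only an illustration under stated assumptions, not PolarDB's actual mechanism: the `Cluster` class, the node names, and the rule of promoting the first available replica are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a one-master, many-replica cluster on shared storage."""

    primary: str
    replicas: list[str] = field(default_factory=list)

    def fail_over(self) -> str:
        """Promote the first available replica after the primary is lost.

        Assumption: every node reads the same shared storage, so the
        promotion itself does not involve copying any data.
        """
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.primary = self.replicas.pop(0)
        return self.primary


# Hypothetical node names, for illustration only.
cluster = Cluster(primary="node-a", replicas=["node-b", "node-c"])
new_primary = cluster.fail_over()
```

After the call, `node-b` serves as the primary and `node-c` remains a read replica. A real system would also need failure detection and client redirection, which this sketch leaves out.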

PolarDB in Production Scenarios


When talking about the capabilities of PolarDB as a product, remember that a product's value and reputation depend on the services it provides. If users don't use it and it doesn't solve pain points in their application scenarios, then it's difficult to say that the product has any value at all. For a user on the public cloud, the first question is whether a cloud database can meet their needs. If I have a new service, or an existing service that I want to move to the cloud, then I want a next-generation database with a high performance to cost ratio. Moving my data to the cloud also carries the cost of migrating my users.

This migration cost is low if users are easy to migrate. However, if migration involves changing business procedures, the process becomes painful and introduces hidden risks that depend on what the user does. We also have to provide strong performance if we are to satisfy the needs of high-end users. A business that moves to the cloud is placing its trust in the public cloud, and in turn in Alibaba Cloud. When you provide services 24/7, you can't afford any interruptions. As the number of users grows, it becomes crucial that your database be flexible and scalable enough to satisfy the needs of every business scenario.

Finally, data must be reliable. It is only once these needs are met that a database service is able to provide real value to the user. Next I will introduce and analyze four use cases to illustrate the capabilities and services offered by PolarDB.

Use Case 1: High Throughput Processing of Big Data


The first use case is high-throughput processing of large data volumes. In its earliest days, the public cloud mainly served website users. As the public cloud matured and the software running on it evolved, it gradually became something very different. With the arrival of large customers, medium-sized customers, and smaller customers with high growth potential, the services and data running on the cloud have grown exponentially. In the mobile Internet era, data is used not only to meet users' needs; it may become even more important as a way to balance supply and demand. With today's computing power, we know how to increase production efficiency, and as production becomes more efficient, so do user service scenarios and performance to cost ratios. Because we learn about user needs by serving users and collecting their data, we have a much better understanding of what we need to provide. This allows us to react to changing needs and even to notice changes in the collected data itself. The ability of data to change the balance between supply and demand is a major contribution of the big data era. As data keeps growing, databases become the computing power that supports commerce on the backend, and the database needs more computing power to process and use that data.

We utilize an architecture that separates reads and writes in order to accommodate more user processing systems. At the same time, we implement a shared storage system that allows us to provide storage of over 100TB and respond to the explosive growth of web-scale data.
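
To make the idea of separating reads and writes concrete, here is a minimal sketch of a statement router. It is a simplified illustration, not PolarDB's proxy logic: the `ReadWriteRouter` class, the node names, and the rule of classifying a statement by its first keyword are assumptions made for this example.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the single primary and spread reads over read-only nodes."""

    READ_VERBS = {"SELECT", "SHOW"}

    def __init__(self, primary, read_nodes):
        self.primary = primary
        # Fall back to the primary when no read-only node is configured.
        self._readers = itertools.cycle(read_nodes or [primary])

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb in self.READ_VERBS:
            return next(self._readers)  # round-robin across read-only nodes
        return self.primary  # everything else is treated as a write


# Hypothetical node names, for illustration only.
router = ReadWriteRouter("primary", ["ro-1", "ro-2"])
targets = [
    router.route("SELECT * FROM orders"),
    router.route("INSERT INTO orders VALUES (1)"),
    router.route("select count(*) from orders"),
]
```

Here `targets` ends up as `["ro-1", "primary", "ro-2"]`. Classifying by the first keyword is deliberately naive; reads inside a transaction, for example, would normally need to go to the primary as well.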

Use Case 2: High Availability and Business Flexibility


A few years ago, when I was a developer, I was involved in developing high availability software. At the time, we had to install open source MySQL on two standalone nodes, purchase a separate piece of high availability software, and learn how to configure it in order to make the LAMP architecture highly available on two machines. Today, on the public cloud, we can use the same technology at a lower cost and serve more users cheaply. The value brought by the cloud is enormous.

Looking at this image, we see that when the CPU and memory on a computing node in PolarDB are insufficient, we can quickly and easily expand them. With the shared storage framework we can also scale out or scale in. When there aren't many read tasks, we can even delete some read nodes. Because of today's competition, marketing, and changes in the Internet ecosystem, the time available to respond to a change in load can shrink to a matter of hours or even minutes. For example, in e-commerce you sometimes have to deal with bid sniping, where data could surge in just an hour. However, if we're able to add a read-only node each minute, this kind of load poses much less of a problem.
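
The arithmetic behind that last point can be sketched as follows. The function names are hypothetical, and treating the 500,000 QPS figure quoted earlier as the read capacity of every node is an assumption made purely for illustration.

```python
import math


def read_nodes_needed(expected_qps, qps_per_node):
    """Smallest number of read-only nodes that covers the expected read load."""
    return max(1, math.ceil(expected_qps / qps_per_node))


def minutes_to_scale(current_nodes, target_nodes, nodes_added_per_minute=1):
    """Minutes needed to reach the target at a fixed node-adding rate."""
    missing = max(0, target_nodes - current_nodes)
    return math.ceil(missing / nodes_added_per_minute)


# Assumed surge: reads jump to 2,000,000 QPS while one read-only node is running.
target = read_nodes_needed(expected_qps=2_000_000, qps_per_node=500_000)
wait = minutes_to_scale(current_nodes=1, target_nodes=target)
```

Under these assumptions the cluster needs 4 read-only nodes and reaches that size in 3 minutes, well inside a surge that builds over an hour.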

Use Case 3: Cloudification and Migration


When something new and more advanced comes on the market, we naturally want to give it a try, but that becomes quite difficult if we have to change our business processes. With MySQL compatibility, putting our business on the cloud is quite simple. If we also use cloudification tools and perform a logical migration, the entire cloudification and cloud migration process is quite smooth.

Today we have already entered an age of cloud computing, IoT, and artificial intelligence. We used to say that the Internet would move from online to offline, that some traditional businesses might move to the cloud, and that artificial intelligence might open up new forms of business. It's possible that "industry + the Internet" will embrace a cloud that is flexible, easy to deploy, and offers a high performance to cost ratio. With these kinds of migration tools, compatibility issues are easily solved and the cost of the entire process of migrating to the cloud is greatly reduced.

Use Case 4: High Reliability and Backups for Disaster Recovery


The last point is high reliability and backups for disaster recovery. The diagram above shows the PolarDB framework, with PolarDB deployed as a cluster architecture on the DBserver layer. For a cluster architecture, network connectivity can be considered a mission critical application service. Because of its high reliability, PolarDB is well suited to backup and disaster recovery scenarios.


Looking back, as I have personally come to understand PolarDB, I see it as a database product that combines imagination with creativity and adaptability. We believe that the spirit of PolarDB is one of faith combined with hard work and effort, and that is why we are able to present such a product to you all today.
