数据库测试一些知识

简介: 1.测试覆盖率的几个衡量指标2.传统数据库测试3.新兴的NewSQL数据库

作者:亮 亮言

1 测试覆盖率的几个衡量指标
2 传统数据库测试

2.1 SQLite
2.2 Oracle
3 新兴的NewSQL数据库
3.1 比较常见的设计
3.2 Tidb
3.3 FoundationDB

1 测试覆盖率的几个衡量指标
常见的几种衡量测试覆盖率方法
函数覆盖(Function Coverage)
语句覆盖(Statement Coverage)
决策覆盖(Decision Coverage)Branch Coverage
条件覆盖(Condition Coverage)
Modified Condition/Decision Coverage (DC + CC + 每一个condition能独立影响decision结果的测试用例)

        主要用在 safety-critical system (航空航天器件)
int foo( int x ,int y)
{
   int  z = 0;
   if ( (x>0) && (y>0) ) {
        z = x;
   }

   return z;
}
if ( a or b ) and c then

condition/decision criteria 满足:

a = true , b = true, c = true

a = false, b = false, c= false

第一个Test中,b的值 ;第二个test中,c的值 都不会影响 decision的值。

a=false, b=true, c=false
a=false, b=true, c=true
a=false, b=false, c=true
a=true, b=false, c=true

2 传统数据库测试
2.1 SQLite

https://www.sqlite.org/testing.html

源码:128.9 KSLOC of C code

测试代码:91772.0 KSLOC of test code

比例:711:1

PR测试示例: https://sqlite.org/src/info/940f2adc8541a838

主要测试集合列表:
Three independently developed test harnesses
100% branch test coverage in an as-deployed configuration
Millions and millions of test cases
Out-of-memory tests
I/O error tests
Crash and power loss tests
Fuzz tests
Boundary value tests
Disabled optimization tests
Regression tests
Malformed database tests
Extensive use of assert() and run-time checks
Valgrind analysis
Undefined behavior checks

其中三块独立开发的测试用例集合:

1 TCL 脚本写的Test (最开始的测试用例):开发测试时候用

26.1 KSLOC of C code

million 级别的case

2 C语言的测试用例:100% MC/DC test coverage

792.3 KSLOC of C code

hundreds of millions of tests.

3 SQL Logic Test (逻辑测试)

SQLite同时和PostgreSQL, MySQL,Oracle 等数据库跑相同功能的SQL语句,用来确认各个语句的执行结果是一致的。
7.2 million

就算是跑了这么多的测试,还是不能阻止bug的发生,腾讯的安全平台部门发现了SQLite的一个远程代码执行漏洞。

SQLite的作者也谈到了最开始开发SQLite的一段故事:

当SQLite运行的场景越来越多的时候(million 级别的应用,billion个设备上的时候),他会稳定的收到bug报告。

当bug数累计的越来越多的时候,作者花了十个月 (2008-09-25 through 2009-07-25) 的时间编写测试用例,使

SQLite能达到100% MC/DC指标。在这之后,bug报告数目就迅速降低下来了。

SQLite是如何做测试的

https://news.ycombinator.com/item?id=18685296

Remote Code Execution vulnerability in SQLite

100% branch, line coverage means nothing. It's about logical coverage. What are you testing for? You are not testing lines of code, but logic.

Right. The actual standard is called "modified condition/decison coverage" or MC/DC. In languages like C, MC/DC and branch coverage, though not exactly the same, are very close.

Achieving 100% MC/DC does not prove that you always get the right answer. All it means is that your tests are so extensive that you managed to get every machine-code branch to go in both directions at least once. It is a high standard and is difficult to achieve. It does not mean that the software is perfect.

But it does help. A lot. When I was young, I used to think I could right flawless code. Then I wrote SQLite, and it got picked up and used by lots of applications. It will amaze you how many problems will crop up when your code runs on in millions of application on billions of devices.

I was getting a steady stream of bug reports against SQLite. Then I took 10 months (2008-09-25 through 2009-07-25) to write the 100% MC/DC tests for SQLite. And after that, the number of bug reports slowed to a trickle. There still are bugs. But the number of bugs is greatly reduced. (Note that 100% MC/DC was first obtained on 2009-07-25, but the work did not end there. I spend most of my development time adding and enhancing test cases to keep up with changes in the deliverable SQLite code.)

100% MC/DC is just an arbitrary threshold - a high threshold and one that is easy to measure and difficult to cheat - but it is just a threshold at which we say "enough". You could just as easily choose a different threshold, such as 100% line coverage. The higher the threshold, the fewer bugs will slip through. But there will always be bugs.

My experience is that the weird tests you end up having to write just to cause some obscure branch to go one way or another end up finding problems in totally unrelated parts of the system. One of the chief benefits of 100% MC/DC is not so much that every branch is tested, but rather that you have to write so many tests, and such strange, weird, convoluted, and stressful tests, that you randomly stumble across (and fix) lots of problems you would have never thought about otherwise.

Another big advantage of 100% MC/DC is that once they are in place, you can change anything, anywhere in the code, and if the tests all still pass, you have high confidence that you didn't break anything. This enables us to evolve the SQLite code much faster than we could otherwise, using relatively few eyeballs.

Yet another advantage of 100% MC/DC is that you are really testing compiled machine code, not source code. So you worry less about compiler bugs. "Undefined behavior" is a big bugbear with C. We worry less than others about UB because we have tested the output of the compiler and we know that the compiler did what we wanted, even if the official C-language spec didn't require it to. We still avoid UB, and SQLite does not currently contain any UB as far as we know. But is is nice to know that even if we missed some UB in the code someplace, it probably doesn't matter.

2.2 Oracle

https://news.ycombinator.com/item?id=18442941

Oracle Database 12.2

image.png

Oracle Database developer 的工作内容:

1 开始处理新的bug

2 花费两周时间理清引起这个bug的代码逻辑,这个涉及到需要了解20多个标志位在不同场景下的组合情况,有时候多达上百个

3 为了处理bug,新增加一个标志位和几行代码规避这个bug

4 将Oracle DB代码打包,提交到有100-200台机器组成的测试集群里测试代码

5 可以回家了,测试需要跑20-30个小时

6 3-4-5的情况要重复几次,如果不顺利的话。

7 在大约两周之后,你终于把几十个标志位的排列组合情况摸清楚了,测试用例有都OK了

8 再给自己新增加的标志位写测试用例,再来一轮测试。然后提交代码review

9 代码review过程持续2周到2个月不等。

所以大概一个bug的解决时间在 2w-2m不等。

A bug's Odyssey

Here is how the life of an Oracle Database developer is:

  • Start working on a new bug.
  • Spend two weeks trying to understand the 20 different flags
    that interact in mysterious ways to cause this bag.
  • Add one more flag to handle the new special scenario. Add a few more lines of code that
    checks this flag and works around the problematic situation and avoids the bug.
  • Submit the changes to a test farm consisting of about 100 to 200 servers that would compile the code,
    build a new Oracle DB, and run the millions of tests in a distributed fashion.
  • Go home. Come the next day and work on something else. The tests can take 20 hours to 30 hours to complete.
  • Go home. Come the next day and check your farm test results. On a good day, there would be about 100 failing tests.
    On a bad day, there would be about 1000 failing tests.

Pick some of these tests randomly and try to understand what went wrong with your assumptions.
Maybe there are some 10 more flags to consider to truly understand the nature of the bug.

  • Add a few more flags in an attempt to fix the issue. Submit the changes again for testing. Wait another 20 to 30 hours.
  • Rinse and repeat for another two weeks until you get the mysterious incantation of the combination of flags right.
  • Finally one fine day you would succeed with 0 tests failing.
  • Add a hundred more tests for your new change to ensure that the next developer who has the misfortune of touching
    this new piece of code never ends up breaking your fix.
  • Submit the work for one final round of testing. Then submit it for review.
    The review itself may take another 2 weeks to 2 months. So now move on to the next bug to work on.
  • After 2 weeks to 2 months, when everything is complete, the code would be finally merged into the main branch.

The above is a non-exaggerated description of the life of a programmer in Oracle fixing a bug. Now imagine what horror it is going to be to develop a new feature. It takes 6 months to a year (sometimes two years!) to develop a single small feature (say something like adding a new mode of authentication like support for AD authentication).

I don't work for Oracle anymore. Will never work for Oracle again!

3 新兴的NewSQL数据库

3.1 比较常见的设计

单机版的KV store 设计:

image.png

在单机KV store基础上 通过分布式一致性协议(Paxos极其变种(Raft,ZK,VR等)

3.2 Tidb

我们现在有六百多万个 Test

1 自动化测试

怎么去自动生成测试Case:

1.1 fault injection 测试异常分支

Hardware : disck,cpu,network card,

Software: os,network protocol,file system

kill -9

1.2 fuzz testing

1.3 上层协议兼容Mysql,把Mysql的测试用例直接迁移过来

1.3 通过解析语法树

https://www.pingcap.com/blog-cn/golang-failpoint/

1.4 Jespen测试:

验证分布式系统一致性的测试框架

https://pingcap.com/blog-cn/tidb-jepsen/

2 所有出现过的 bug,历史上只要出现过一次,你一定要写一个 Test 去 cover 它 目前主流开源社区都在坚持的做法

3.3 FoundationDB
FoundationDB开源产品,后来被Apple收购又不开源了,用在Apple的Icloud存储后台场景下,

现在又被重新开源了。

https://apple.github.io/foundationdb/testing.html

重要思想:Simulation

a deterministic simulation of an entire FoundationDB cluster within a single-threaded process.

用单个进程(single-thread)确定性的(deterministic)模拟出整个集群(entire cluster)

确定性:可以复现问题

同时可以方面的模拟各个层面的Failure Mode

在C++11 基础上加入了actor-based-concurrency的能力,

开发出了一个新的Flow语言

1 high performance

2 actor-based concurrency

3 Simulation

one trillion CPU-hours of simulation on FoundationDB

Reference:

https://www.infoq.cn/article/test-coverage-rate-role

https://www.sqlite.org/testing.html

https://www.sqlite.org/th3/doc/trunk/www/th3.wiki

https://www.pingcap.com/blog-cn/distributed-system-test-1/

https://www.pingcap.com/blog-cn/distributed-system-test-2/

https://www.pingcap.com/blog-cn/distributed-system-test-3/

https://apple.github.io/foundationdb/testing.html

https://apple.github.io/foundationdb/engineering.html#simulation

https://jepsen.io/

Remote code execution vulnerability in SQLite

https://news.ycombinator.com/item?id=18685296

Ask HN: What's the largest amount of bad code you have ever seen work?

目录
相关文章
|
2月前
|
JavaScript 前端开发 数据库
测试开发之路--Flask 之旅 (三):数据库
本文介绍了在 Flask 应用中实现权限管理的过程,包括使用 Flask-SQLAlchemy、Flask-MySQLdb、Flask-Security 和 Flask-Login 等扩展模块进行数据库配置与用户权限设置。首先创建数据库并定义用户、环境和角色模型,接着通过 Flask-Security 初始化用户和角色,并展示了如何便捷地管理权限。后续将深入探讨权限控制的具体应用。
65 4
测试开发之路--Flask 之旅 (三):数据库
|
2月前
|
关系型数据库 MySQL 数据库
6-2|测试连接数据库的命令
6-2|测试连接数据库的命令
|
2月前
|
关系型数据库 MySQL 测试技术
《性能测试》读书笔记_数据库优化
《性能测试》读书笔记_数据库优化
30 7
|
6月前
|
JavaScript Java 测试技术
大学生体质测试|基于Springboot+vue的大学生体质测试管理系统设计与实现(源码+数据库+文档)
大学生体质测试|基于Springboot+vue的大学生体质测试管理系统设计与实现(源码+数据库+文档)
99 0
|
3月前
|
安全 测试技术 网络安全
深入理解数据库黑盒测试
【8月更文挑战第31天】
43 0
|
4月前
|
监控 Oracle 关系型数据库
关系型数据库Oracle恢复测试
【7月更文挑战第20天】
76 7
|
4月前
|
关系型数据库 MySQL 测试技术
数据库升级是一个涉及数据备份、新版本安装、数据迁移和测试等关键环节的复杂过程
【7月更文挑战第21天】数据库升级是一个涉及数据备份、新版本安装、数据迁移和测试等关键环节的复杂过程
96 1
|
4月前
|
中间件 Java 测试技术
单元测试问题之编写单元测试时运行环境、数据库、中间件问题如何解决
单元测试问题之编写单元测试时运行环境、数据库、中间件问题如何解决
|
4月前
|
测试技术 数据库 容器
开发与运维测试问题之操作数据库进行DAO层测试如何解决
开发与运维测试问题之操作数据库进行DAO层测试如何解决
|
6月前
|
NoSQL 算法 测试技术
图数据库基准测试 LDBC SNB 系列讲解:Schema 和数据生成的机制
作为大多数图数据库性能测试标配的 LDBC SNB 它是如何保障不同系统环境之间的测评比较公平且基准测试结果可重复的呢?本文从数据和 Schema 生成入手同你讲解它的原理。
164 2
图数据库基准测试 LDBC SNB 系列讲解:Schema 和数据生成的机制