Disk access
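B-tree properties:
A B-tree T of order M satisfies the following conditions:
1. Every node has at most M subtrees.
2. The root has at least two subtrees (when it is not a leaf).
3. Every internal node other than the root has at least ceil(M/2) subtrees.
4. All leaf nodes are on the same level.
5. An internal node with k subtrees holds k-1 keys, stored in ascending order.
6. The number of keys n in a non-root node satisfies ceil(M/2)-1 <= n <= M-1.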
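As a quick check of properties 1, 3 and 6, the per-node limits for a given order M can be computed directly. This is only a minimal sketch: the struct `btree_bounds` and the function `btree_bounds_for_order` are hypothetical names and are not part of the implementation later in these notes.

```c
/* Hypothetical helper: per-node limits of an order-M B-tree. */
typedef struct {
    int max_children; /* property 1: M */
    int min_children; /* property 3: ceil(M/2), non-root internal nodes */
    int max_keys;     /* property 6: M - 1 */
    int min_keys;     /* property 6: ceil(M/2) - 1, non-root nodes */
} btree_bounds;

btree_bounds btree_bounds_for_order(int M) {
    btree_bounds b;
    int half_up = (M + 1) / 2; /* integer ceil(M/2) for M >= 1 */

    b.max_children = M;
    b.min_children = half_up;
    b.max_keys = M - 1;
    b.min_keys = half_up - 1;
    return b;
}
```

For M = 6 this gives 2 to 5 keys per non-root node, which is the M/2-1 = 2 threshold used in the deletion walkthrough.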
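What B-trees are used for:
A B-tree keeps the tree shallow, so a lookup touches fewer nodes and therefore needs fewer disk accesses.
The commonly used B-tree and B+ tree store many keys in each node, so contiguous data can be located and read quickly. This shortens lookups and improves the spatial locality of storage, which reduces I/O operations. They are widely used in file systems and databases, for example: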
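To make the height argument concrete, here is a small sketch based on the standard bound for a B-tree of minimum degree t: a tree of height h (a root-only tree has height 0) holds at least 2*t^h - 1 keys. The function name `btree_max_height` is hypothetical, and using the minimum degree t rather than the order M is an assumption chosen to match the implementation later in these notes.

```c
/* Largest height h a B-tree of minimum degree t (t >= 2) can have
 * while storing n keys (n >= 1), from  2*t^h - 1 <= n. */
int btree_max_height(unsigned long long n, unsigned long long t) {
    /* 2*t^h - 1 <= n  is equivalent to  t^h <= (n + 1) / 2 */
    unsigned long long limit = n / 2 + n % 2; /* (n + 1) / 2 without overflow */
    unsigned long long p = t;                 /* holds t^(h+1) */
    int h = 0;

    while (p <= limit) {
        h++;
        if (p > limit / t) break; /* the next power would exceed limit */
        p *= t;
    }
    return h;
}
```

With one billion keys, `btree_max_height(1000000000ULL, 512)` is 3, while the same call with t = 2 gives 28, which is why wide nodes pay off when every level costs a disk read.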
- Windows: the HPFS file system
- Mac: the HFS and HFS+ file systems
- Linux: the ReiserFS, XFS, Ext3FS and JFS file systems
- Databases: Oracle, MySQL, SQL Server and others
This passage is reproduced from: http://www.cnblogs.com/yangecnu/p/Introduce-B-Tree-and-B-Plus-Tree.html
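General multiway trees compared with B-trees:
- A general multiway tree has no balance constraint.
- A general multiway tree places no constraint on how many entries each node holds.
- The data in a B-tree is kept in order.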
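The B+ tree builds on the B-tree:
- All data is stored in the leaf nodes.
- All leaf nodes are linked to their neighbors.
- The internal nodes of a B+ tree act purely as an index; B+ trees are used for database storage.
- B+ tree use cases: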
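Because the leaves are linked, a range query can read one leaf after another without going back up through the index. The sketch below illustrates only that idea: the `bplus_leaf` layout, `LEAF_CAP` and `bplus_range_scan` are hypothetical names, and the index lookup that would find the starting leaf is left out.

```c
#include <stddef.h>

#define LEAF_CAP 4 /* hypothetical leaf capacity for this sketch */

typedef struct bplus_leaf {
    int keys[LEAF_CAP];      /* sorted keys stored in this leaf */
    int num;                 /* number of keys in use */
    struct bplus_leaf *next; /* link to the right neighbor leaf */
} bplus_leaf;

/* Copies every key in [lo, hi] into out, walking the leaf chain from
 * `start`. Returns the number of keys written, at most cap. */
size_t bplus_range_scan(const bplus_leaf *start, int lo, int hi,
                        int *out, size_t cap) {
    size_t n = 0;

    for (const bplus_leaf *leaf = start; leaf != NULL; leaf = leaf->next) {
        for (int i = 0; i < leaf->num; i++) {
            if (n == cap) return n;           /* output buffer is full */
            if (leaf->keys[i] > hi) return n; /* keys are sorted: done */
            if (leaf->keys[i] >= lo) out[n++] = leaf->keys[i];
        }
    }
    return n;
}
```

In a real B+ tree, `start` would be the leaf reached by searching the index for `lo`; here it is simply passed in.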
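B-tree implementation
The node definition does not need a parent pointer: the B-tree is processed top-down, and a full node is split as soon as it is met on the way down.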
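```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Used as the minimum degree t: a node holds at most 2*M subtrees. */
#define M 3

typedef int KEY_VALUE;

typedef struct _btree_node {
    struct _btree_node **childrens; /* property 1: at most 2*t subtrees */
    KEY_VALUE *keys;                /* property 5: keys in ascending order */
    int num;                        /* number of keys currently stored */
    int leaf;                       /* 1 = leaf, 0 = internal node */
} btree_node;

typedef struct _btree {
    btree_node *root;
    int t; /* minimum degree of the tree (not a node count) */
} btree;
```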
- Creating a node:
```c
btree_node *btree_create_node(int t, int leaf) {
    btree_node *node = (btree_node *)calloc(1, sizeof(btree_node));
    assert(node != NULL);

    node->leaf = leaf;
    /* A node holds at most 2*t - 1 keys and 2*t child pointers. */
    node->keys = (KEY_VALUE *)calloc(2 * t - 1, sizeof(KEY_VALUE));
    node->childrens = (btree_node **)calloc(2 * t, sizeof(btree_node *));
    assert(node->keys != NULL && node->childrens != NULL);
    node->num = 0;

    return node;
}
```
- Destroying a node:
```c
void btree_destory_node(btree_node *node) {
    if (node == NULL) return;

    /* free(NULL) is a no-op, so the arrays need no separate checks. */
    free(node->keys);
    free(node->childrens);
    free(node);
}
```
- Creating the btree:
```c
void btree_create(btree *T, int t) {
    T->t = t;

    /* An empty tree starts as a single leaf node. */
    btree_node *x = btree_create_node(t, 1);
    T->root = x;
}
```
- Splitting a node:
```c
/* T: the tree; x: the parent node; i: index of the full child of x to split */
void btree_split_child(btree *T, btree_node *x, int i) {
    int t = T->t;

    btree_node *y = x->childrens[i];
    btree_node *z = btree_create_node(t, y->leaf);

    /* Fill z with the upper t - 1 keys (and upper t children) of y. */
    z->num = t - 1;

    int j = 0;
    for (j = 0; j < t - 1; j++) {
        z->keys[j] = y->keys[j + t];
    }
    if (y->leaf == 0) {
        for (j = 0; j < t; j++) {
            z->childrens[j] = y->childrens[j + t];
        }
    }

    /* y keeps its lower t - 1 keys. */
    y->num = t - 1;

    /* Make room in x for the new child z at position i + 1. */
    for (j = x->num; j >= i + 1; j--) {
        x->childrens[j + 1] = x->childrens[j];
    }
    x->childrens[i + 1] = z;

    /* Make room in x for the median key of y and move it up. */
    for (j = x->num - 1; j >= i; j--) {
        x->keys[j + 1] = x->keys[j];
    }
    x->keys[i] = y->keys[t - 1];
    x->num += 1;
}
```
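The key movement in the split can be traced on plain arrays. This is an illustrative sketch only: `split_full_keys` is a hypothetical helper that ignores child pointers and the parent node, and it assumes the input holds exactly 2*t - 1 sorted keys.

```c
/* Array-only view of a split: a full node has 2*t - 1 keys.
 * The lower t - 1 keys stay, the median moves up to the parent,
 * and the upper t - 1 keys go to the new right-hand node. */
void split_full_keys(const int *full, int t,
                     int *left, int *median, int *right) {
    for (int j = 0; j < t - 1; j++) {
        left[j] = full[j];      /* stays in the original node */
        right[j] = full[j + t]; /* copied into the new node */
    }
    *median = full[t - 1];      /* promoted into the parent */
}
```

With t = 3 and keys 10, 20, 30, 40, 50, the left part is 10, 20, the promoted median is 30, and the right part is 40, 50.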
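- Inserting a key:
- The target node is found and is not full.
- The target node is found and is full: (1) if a full internal node is met, split the internal node; (2) if a full leaf node is met, split the leaf node.
Inserting into a node that is not full: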
```c
void btree_insert_nonfull(btree *T, btree_node *x, KEY_VALUE k) {
    int i = x->num - 1;

    if (x->leaf == 1) {
        /* Shift larger keys right and drop k into place. */
        while (i >= 0 && x->keys[i] > k) {
            x->keys[i + 1] = x->keys[i];
            i--;
        }
        x->keys[i + 1] = k;
        x->num += 1;
    } else {
        /* Find the child that should receive k. */
        while (i >= 0 && x->keys[i] > k) i--;

        /* Split a full child before descending into it. */
        if (x->childrens[i + 1]->num == (2 * (T->t)) - 1) {
            btree_split_child(T, x, i + 1);
            if (k > x->keys[i + 1]) i++;
        }

        btree_insert_nonfull(T, x->childrens[i + 1], k);
    }
}
```
```c
void btree_insert(btree *T, KEY_VALUE key) {
    btree_node *r = T->root;

    if (r->num == 2 * T->t - 1) {
        /* The root is full: grow the tree by one level, then split. */
        btree_node *node = btree_create_node(T->t, 0);
        T->root = node;

        node->childrens[0] = r;
        btree_split_child(T, node, 0);

        int i = 0;
        if (node->keys[0] < key) i++;
        btree_insert_nonfull(T, node->childrens[i], key);
    } else {
        btree_insert_nonfull(T, r, key);
    }
}
```
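Insertion descends by comparing the key with the keys of each node, and a lookup follows the same path, which makes it a convenient way to check that inserted keys can be found. The sketch below is an assumption-laden illustration: `demo_node`, `DEMO_T` and `demo_search` are hypothetical names, and the node uses fixed-size arrays instead of the heap-allocated arrays of `btree_node` so that a small tree can be built by hand.

```c
#include <stddef.h>

#define DEMO_T 2 /* hypothetical minimum degree for this sketch */

typedef struct demo_node {
    int keys[2 * DEMO_T - 1];
    struct demo_node *childrens[2 * DEMO_T];
    int num;  /* number of keys in use */
    int leaf; /* 1 = leaf, 0 = internal node */
} demo_node;

/* Returns the node containing k and stores its index in *pos
 * (if pos is not NULL), or returns NULL when k is absent. */
const demo_node *demo_search(const demo_node *x, int k, int *pos) {
    while (x != NULL) {
        int i = 0;
        while (i < x->num && k > x->keys[i]) i++; /* first key >= k */

        if (i < x->num && k == x->keys[i]) {
            if (pos != NULL) *pos = i;
            return x;
        }
        if (x->leaf) return NULL; /* nowhere left to descend */
        x = x->childrens[i];
    }
    return NULL;
}
```

The inner loop is the same scan that `btree_delete_key` uses to locate `idx`.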
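- Deleting from a B-tree:
Check whether the subtree on the way down holds only M/2-1 keys (the minimum):
1. If both adjacent subtrees hold M/2-1 keys, merge.
2. If the left subtree holds more than M/2-1 keys, borrow a key from it.
3. If the right subtree holds more than M/2-1 keys, borrow a key from it.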
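The three rules can be written as one small decision function. This is a sketch under stated assumptions: `fix_kind` and `choose_fix` are hypothetical names, an absent sibling is passed as -1, and the tie-break (borrow from the right only when it holds strictly more keys than the left) is taken from the `richR` logic in the deletion code of these notes.

```c
typedef enum {
    FIX_NONE,         /* child already has a spare key */
    FIX_BORROW_LEFT,  /* rule 2 */
    FIX_BORROW_RIGHT, /* rule 3 */
    FIX_MERGE         /* rule 1 */
} fix_kind;

/* child_num: keys in the child we are about to descend into.
 * left_num / right_num: keys in its siblings, or -1 when absent.
 * min_keys: the minimum key count (M/2-1 in the notation above). */
fix_kind choose_fix(int child_num, int left_num, int right_num, int min_keys) {
    if (child_num > min_keys) return FIX_NONE;

    int left_rich = left_num > min_keys;
    int right_rich = right_num > min_keys;

    if (right_rich && (!left_rich || right_num > left_num)) {
        return FIX_BORROW_RIGHT;
    }
    if (left_rich) return FIX_BORROW_LEFT;
    return FIX_MERGE; /* no sibling can spare a key */
}
```

For M = 6 (`min_keys` = 2), a child with 2 keys, no left sibling and a right sibling with 4 keys yields `FIX_BORROW_RIGHT`, and two siblings with 2 keys each yield `FIX_MERGE`.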
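Analysis, with M = 6:
Step 1: The left subtree of I holds 2 keys = M/2-1. It would merge with its own left sibling, but it has none, so look at its right sibling (LORU) instead. That sibling holds 4 keys > M/2-1, so borrow from it: move the root key I to the end of node (CF), forming node (CFI).
Step 2: Move L up into the root position.
Step 3: Move child 0 of node (LORU), which is (JK), to child position 3 of node (CFI). (This keeps the result a valid B-tree.)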
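Steps 1 and 2 above can be traced on the keys alone. This is a simplified sketch, not the tree code: `borrow_from_right` is a hypothetical helper that treats each node as a NUL-terminated string of single-letter keys and leaves out the child pointer moved in step 3.

```c
#include <string.h>

/* child: keys of the node that is short of keys (needs room for one more).
 * sep:   the parent key that separates child and right.
 * right: keys of the right sibling (must hold at least one key). */
void borrow_from_right(char *child, char *sep, char *right) {
    size_t cn = strlen(child);
    size_t rn = strlen(right);

    child[cn] = *sep;              /* step 1: parent key moves down */
    child[cn + 1] = '\0';
    *sep = right[0];               /* step 2: sibling's first key moves up */
    memmove(right, right + 1, rn); /* shift left, terminator included */
}
```

Starting from child "CF", separator 'I' and right sibling "LORU", the call leaves "CFI", 'L' and "ORU".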
```c
/* Excerpt from btree_delete_key: borrowing a key from the right sibling. */
if ((left && left->num >= T->t) || (right && right->num >= T->t)) {
    int richR = 0;
    if (right) richR = 1;
    if (left && right) richR = (right->num > left->num) ? 1 : 0;

    if (right && right->num >= T->t && richR) {
        /* Borrow from the right sibling: the separator moves down ... */
        child->keys[child->num] = node->keys[idx];
        child->childrens[child->num + 1] = right->childrens[0];
        child->num++;

        /* ... and the sibling's first key moves up. */
        node->keys[idx] = right->keys[0];
        for (i = 0; i < right->num - 1; i++) {
            right->keys[i] = right->keys[i + 1];
            right->childrens[i] = right->childrens[i + 1];
        }

        right->keys[right->num - 1] = 0;
        right->childrens[right->num - 1] = right->childrens[right->num];
        right->childrens[right->num] = NULL;
        right->num--;
    }
    /* The remaining branches are in the full listing of btree_delete_key. */
}
```
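Now look only at the subtree inside the red box, with node (CFI) as its root, and repeat the three checks above.
Child 0 of node (CFI), node (AB), holds 2 keys = M/2-1, and child 1, node (DE), also holds 2 keys = M/2-1, so merge them: C moves down, the root of this subtree becomes (FI), and its child 0 becomes (ABCDE). The subtrees to the right shift over by one and num is updated accordingly. Then delete A from the merged node, which completes the deletion of A.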
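The merge step can be traced the same way. `btree_merge` itself is not listed in these notes, so the sketch below is not its implementation: `merge_children` is a hypothetical helper that works on strings of single-letter keys and ignores child pointers.

```c
#include <string.h>

/* parent: keys of the parent node; idx: index of the separator key.
 * left:   keys of child idx (needs room for the merged result).
 * right:  keys of child idx + 1, which is absorbed into left. */
void merge_children(char *parent, size_t idx, char *left, const char *right) {
    size_t ln = strlen(left);

    left[ln] = parent[idx];       /* separator moves down */
    strcpy(left + ln + 1, right); /* right sibling's keys follow it */

    /* Close the gap in the parent, terminator included. */
    memmove(parent + idx, parent + idx + 1, strlen(parent + idx));
}
```

With parent "CFI", index 0, left "AB" and right "DE", the parent becomes "FI" and the left child becomes "ABCDE", after which A can be removed from a node that has keys to spare.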
```c
/* btree_merge is not listed in this section. As used below,
 * btree_merge(T, node, i) folds keys[i] and childrens[i+1] into childrens[i]. */
void btree_delete_key(btree *T, btree_node *node, KEY_VALUE key) {
    if (node == NULL) return;

    int idx = 0, i;

    /* Find the first key that is not smaller than key. */
    while (idx < node->num && key > node->keys[idx]) {
        idx++;
    }

    if (idx < node->num && key == node->keys[idx]) {
        if (node->leaf) {
            /* The key is in a leaf: close the gap. */
            for (i = idx; i < node->num - 1; i++) {
                node->keys[i] = node->keys[i + 1];
            }
            node->keys[node->num - 1] = 0;
            node->num--;

            if (node->num == 0) { /* only the root can become empty */
                btree_destory_node(node);
                T->root = NULL;
            }
            return;
        } else if (node->childrens[idx]->num >= T->t) {
            /* Replace the key with its predecessor, the largest key
             * in the left subtree, then delete that key below. */
            btree_node *left = node->childrens[idx];
            btree_node *p = left;
            while (!p->leaf) p = p->childrens[p->num];
            KEY_VALUE pred = p->keys[p->num - 1];

            node->keys[idx] = pred;
            btree_delete_key(T, left, pred);
        } else if (node->childrens[idx + 1]->num >= T->t) {
            /* Replace the key with its successor, the smallest key
             * in the right subtree, then delete that key below. */
            btree_node *right = node->childrens[idx + 1];
            btree_node *s = right;
            while (!s->leaf) s = s->childrens[0];
            KEY_VALUE succ = s->keys[0];

            node->keys[idx] = succ;
            btree_delete_key(T, right, succ);
        } else {
            /* Both neighbors are minimal: merge, then delete from the
             * merged child. Keep the pointer in case merge replaces node. */
            btree_node *merged = node->childrens[idx];
            btree_merge(T, node, idx);
            btree_delete_key(T, merged, key);
        }
    } else {
        btree_node *child = node->childrens[idx];
        if (child == NULL) {
            printf("Cannot del key = %d\n", key);
            return;
        }

        if (child->num == T->t - 1) {
            /* The child is minimal: give it a spare key before descending. */
            btree_node *left = NULL;
            btree_node *right = NULL;
            if (idx - 1 >= 0) left = node->childrens[idx - 1];
            if (idx + 1 <= node->num) right = node->childrens[idx + 1];

            if ((left && left->num >= T->t) || (right && right->num >= T->t)) {
                int richR = 0;
                if (right) richR = 1;
                if (left && right) richR = (right->num > left->num) ? 1 : 0;

                if (right && right->num >= T->t && richR) {
                    /* Borrow from the right sibling. */
                    child->keys[child->num] = node->keys[idx];
                    child->childrens[child->num + 1] = right->childrens[0];
                    child->num++;

                    node->keys[idx] = right->keys[0];
                    for (i = 0; i < right->num - 1; i++) {
                        right->keys[i] = right->keys[i + 1];
                        right->childrens[i] = right->childrens[i + 1];
                    }

                    right->keys[right->num - 1] = 0;
                    right->childrens[right->num - 1] = right->childrens[right->num];
                    right->childrens[right->num] = NULL;
                    right->num--;
                } else {
                    /* Borrow from the left sibling. */
                    for (i = child->num; i > 0; i--) {
                        child->keys[i] = child->keys[i - 1];
                        child->childrens[i + 1] = child->childrens[i];
                    }

                    child->childrens[1] = child->childrens[0];
                    child->childrens[0] = left->childrens[left->num];
                    child->keys[0] = node->keys[idx - 1];
                    child->num++;

                    node->keys[idx - 1] = left->keys[left->num - 1];
                    left->keys[left->num - 1] = 0;
                    left->childrens[left->num] = NULL;
                    left->num--;
                }
            } else if ((!left || (left->num == T->t - 1))
                       && (!right || (right->num == T->t - 1))) {
                /* No sibling can spare a key: merge with one of them. */
                if (left && left->num == T->t - 1) {
                    btree_merge(T, node, idx - 1);
                    child = left;
                } else if (right && right->num == T->t - 1) {
                    btree_merge(T, node, idx);
                }
            }
        }

        btree_delete_key(T, child, key);
    }
}
```
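- Questions that come up in interviews:
1. How can a B-tree be made thread-safe?
Answer: a. Take a lock at the root node. b. Take locks at the subtree level.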
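Option a amounts to one lock for the whole tree: every operation takes the lock before touching the root and releases it when done. The sketch below shows only that pattern with a POSIX mutex. The names `locked_tree`, `tree_op`, `locked_tree_apply` and `toy_insert` are hypothetical, and the "tree" is a stand-in integer so the example stays self-contained. Option b needs finer-grained locking per node or subtree and is not shown here.

```c
#include <pthread.h>

/* Hypothetical wrapper: one mutex guards the whole tree (option a). */
typedef struct {
    pthread_mutex_t lock;
    void *tree; /* would point to a btree in real code */
} locked_tree;

typedef void (*tree_op)(void *tree, int key);

int locked_tree_init(locked_tree *lt, void *tree) {
    lt->tree = tree;
    return pthread_mutex_init(&lt->lock, NULL); /* 0 on success */
}

/* Runs op on the tree while holding the tree-wide lock. */
void locked_tree_apply(locked_tree *lt, tree_op op, int key) {
    pthread_mutex_lock(&lt->lock);
    op(lt->tree, key);
    pthread_mutex_unlock(&lt->lock);
}

void locked_tree_destroy(locked_tree *lt) {
    pthread_mutex_destroy(&lt->lock);
}

/* Stand-in operation for the sketch: adds key to an int "tree". */
void toy_insert(void *tree, int key) {
    *(int *)tree += key;
}
```

A tree-wide lock is simple and correct but serializes all readers and writers, which is the usual reason to consider locking at the subtree level instead.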