首先我们生成一个随机数列，用于执行查找算法。

# 生成随机数列
import random

data = [0]*50
for i in range(50):
    data[i] = random.randint(1,100)

print('the random data is:')
for i in range(10):
    for j in range(5):
        print('%2d[%3d] ' % (i*5+j+1,data[i*5+j]), end='')
    print('\n')

the random data is:
 1[  1]  2[ 68]  3[ 14]  4[ 83]  5[ 94] 

 6[ 71]  7[ 13]  8[ 47]  9[ 43] 10[ 43] 

11[ 43] 12[ 29] 13[ 65] 14[ 47] 15[ 31] 

16[100] 17[  5] 18[ 77] 19[ 99] 20[ 46] 

21[ 17] 22[ 56] 23[ 38] 24[ 65] 25[  4] 

26[ 14] 27[  4] 28[ 68] 29[ 61] 30[ 42] 

31[ 97] 32[ 81] 33[ 68] 34[ 91] 35[ 47] 

36[ 31] 37[ 51] 38[ 16] 39[ 18] 40[ 10] 

41[ 14] 42[ 63] 43[ 47] 44[ 47] 45[ 86] 

46[ 58] 47[ 86] 48[ 23] 49[ 37] 50[ 59]

1.顺序查找算法

线性查找，从头到尾遍历。

优点：在查找前，不需要对其进行任何处理。
缺点：查找速度慢。

num = 0
while num !=-1:
    find = 0
    num = int(input("please input the number you want to search(input -1 to exit):"))
    for i in range(50):
        if data[i] == num:
            print(f'find the number in the data[{i}]')
            find += 1
    if find ==0 and num!= -1:
        print('not found the number')

please input the number you want to search(input -1 to exit):-1

2.二分查找算法

折半查找。

优点：查找速度快。
缺点：只适用于已经排序好的数列。

# 增序列查找
def search(data, num):
    low = 0
    high = len(data)-1
    print('searching...')
    while low<=high and num!=-1:
        mid = int((low+high)/2)
        print(mid)
        if num < data[mid]:
            print(f'{num} is between {data[low]} and {data[mid]}')
            high = mid - 1
        elif num > data[mid]:
            print(f'{num} is between {data[mid]} and {data[high]}')
            low = mid + 1
        else:
            return mid
    return -1

data = [12,45,56,66,77,80,97,101,120]
while True:
    loc = 0
    num = int(input("please input the number you want to search(input -1 to exit):"))
    if num == -1:
        break
    loc = search(data, num)
    if loc == -1:
        print('not found')
    else:
        print(f'find the number in data[{loc}]')

please input the number you want to search(input -1 to exit):12
searching...
4
12 is between 12 and 77
1
12 is between 12 and 45
0
find the number in data[0]
please input the number you want to search(input -1 to exit):120
searching...
4
120 is between 77 and 120
6
120 is between 97 and 120
7
120 is between 101 and 120
8
find the number in data[8]
please input the number you want to search(input -1 to exit):-1

3.插补查找算法

插值查找，是二分查找算法的改进，按照数据分布，利用公式预测键值所在位置。

middle = left+(target-data[left])/(data[right]-data[left])*(right-left)

middle:所求边界索引
left:最左侧数据的索引
target:键值（目标数据）
data[left]:最左侧数据
data[right]:最右侧数据
right:最右侧数据的索引`

优点：比二分查找算法快。

# 增序列查找
def search(data, num):
    low = 0
    high = len(data)-1
    print('searching...')
    while low<=high and num!=-1:
        mid = low+int((num-data[low])*(high-low)/(data[high]-data[low]))
        print(mid)
        if num < data[mid]:
            print(f'{num} is between {data[low]} and {data[mid]}')
            high = mid - 1
        elif num > data[mid]:
            print(f'{num} is between {data[mid]} and {data[high]}')
            low = mid + 1
        else:
            return mid
    return -1

data = [12,45,56,66,77,80,97,101,120]
while True:
    loc = 0
    num = int(input("please input the number you want to search(input -1 to exit):"))
    if num == -1:
        break
    loc = search(data, num)
    if loc == -1:
        print('not found')
    else:
        print(f'find the number in data[{loc}]')

please input the number you want to search(input -1 to exit):12
searching...
0
find the number in data[0]
please input the number you want to search(input -1 to exit):-1

4.哈希查找算法

哈希查找算法是使用哈希函数来计算一个键值所对应的地址，进而建立哈希表格，然后利用哈希函数来查找各个键值存放在表格中的地址。

特点：保密性高，存在碰撞与溢出问题。

常用哈希算法有：

除留余数法：
h(item)=item%num

item:每个数据
num:一个常数，一般会选择一个质数

对给定的数据集，哈希函数将每个元素映射为单个槽位，称为“完美哈希函数”，但是对于任意一个数据集合，没有系统能构造出完美哈希函数。除留余数法就可能遇到两个数据的哈希值相同，解决办法之一就是扩大哈希表，但是这种做法太浪费空间，因此有了折叠法。

折叠法：
将数据分成一串数据，先将数据拆成几部分，再把它们加起来作为哈希地址，设定好槽位后，用除留余数法得到这串数据的哈希值，称为“移动折叠法”。
有些折叠法多了一步，在相加前，对数据进行奇数/偶数翻转，这种称为“边界折叠法”。
平方取中法：
先将各个数据平方，将平方后数据取中间的某段数字作为索引，得到哈希地址，然后用除留余数法得到哈希值。

碰撞与溢出问题：

碰撞问题：
哈希算法的理想情况是所有数据经过哈希函数运算后得到不同的值，但是在实际情况中即使得到哈希值不同，也可能得到相同的地址，这种问题称为“碰撞问题”。
溢出问题：
使用哈希算法，将数据放到哈希表中存储数据的对应位置，若该位置满了，就会溢出，这种问题称为“溢出问题”。

class HashTable:
    def __init__(self):
        self.size = 11 # size of hash table
        self.throw = [None] * self.size # initialize the key of hash table
        self.data = [None] * self.size # initialize the value of hash table

    def put(self, key, value):
        hashvalue = self.hashfunction(key, self.size) # get the hashvalue
        # if this slot is None, set the key and value
        if self.throw[hashvalue] is None:
            self.throw[hashvalue] = key
            self.data[hashvalue] = value
        else:
            # if this slot has the same key, replace the old value with new one
            if self.throw[hashvalue] == key:
                self.data[hashvalue] = value
            else:
                # if this slot has another key, rehash the hashvalue, until find the empty slot or a slot with the same key
                nextslot = self.rehash(hashvalue, self.size)
                while self.throw[nextslot] is not None and self.throw[nextslot] != key:
                    nextslot = self.rehash(nextslot, self.size)
                # empty slot: set the key and value
                if self.throw[nextslot] is None:
                    self.throw[nextslot] = key
                    self.data[nextslot] = value
                # slot with the same key: replace the old value
                else:
                    self.data[nextslot] = value

    def hashfunction(self, key, size):
        return key % size

    def rehash(self, oldhash, size):
        return (oldhash + 1) % size

    def get(self, key):
        startslot = self.hashfunction(key, self.size)
        data = None
        found = None
        stop = None
        pos = startslot
        while pos is not None and not found and not stop:
            if self.throw[pos] == key:
                found = True
                data = self.data[pos]
            else:
                pos = self.rehash(pos, self.size)
                if pos == startslot:
                    stop = True
        return data

#     reload the __getitem__() and __setitem__() methods so that it can get and set value by []
    def __getitem__(self,key):
        return self.get(key)

    def __setitem__(self,key,value):
        return self.put(key, value)

# an instance of class HashTable
H = HashTable()
H[16] = 'red'
H[28] = 'orange'
H[32] = 'yellow'
H[14] = 'green'
H[56] = 'cyan'
H[36] = 'blue'
H[71] = 'purple'
H[71] = 'darkblue'
print(H.throw)
print(H.data)
print(H[28])
print(H[71])

[None, 56, None, 14, 36, 16, 28, 71, None, None, 32]
[None, 'cyan', None, 'green', 'blue', 'red', 'orange', 'darkblue', None, None, 'yellow']
orange
darkblue

5.分块查找算法

分块查找算法是二分查找算法和顺序查找算法的改进，要求索引表是有序的，块内节点没有排序要求，块内节点可以有序也可以无序。

特点：特别适合于节点动态变化的情况。

该算法将一个大的线性表分解成若干块，每块中的节点可以任意存放，块之间必须排序。与此同时，还要建立一个索引表，把每块中最大值作为索引表的索引值，此索引表需要按块的顺序存放到一个辅助数组中。查找时首先在索引表中进行查找，确定要找的节点所在块，可以采用二分查找算法或顺序查找算法，然后在块中采用顺序查找算法。

def search(data, key):  # 用二分查找 想要查找的数据在哪块内
    length = len(data)  # 数据列表长度
    first = 0  # 第一位数位置
    last = length - 1  # 最后一个数据位置
    print(f"长度:{length} 分块的数据是:{data}")  # 输出分块情况
    while first <= last:
        mid = (last + first) // 2  # 取中间位置
        if data[mid] > key:  # 中间数据大于想要查的数据
            last = mid - 1  # 将last的位置移到中间位置的前一位
        elif data[mid] < key:  # 中间数据小于想要查的数据
            first = mid + 1  # 将first的位置移到中间位置的后一位
        else:
            return mid  # 返回中间位置
    return False


# 分块查找
def block(data, count, key):  # 分块查找数据，data是列表，count是每块的长度，key是想要查找的数据
    length = len(data)  # 表示数据列表的长度
    block_length = length // count  # 一共分的几块
    if count * block_length != length:  # 每块长度乘以分块总数不等于数据总长度
        block_length += 1  # 块数加1
    print("一共分", block_length, "块")  # 块的多少
    print("分块情况如下：")
    for block_i in range(block_length):  # 遍历每块数据
        block_data = []  # 每块数据初始化
        for i in range(count):  # 遍历每块数据的位置
            if block_i * count + i >= length:  # 每块长度要与数据长度比较，一旦大于数据长度
                break  # 就退出循环
            block_data.append(data[block_i * count + i])  # 每块长度要累加上一块的长度
        result = search(block_data, key)  # 调用二分查找的值
        if result != False:  # 查找的结果不为False
            return block_i * count + result  # 就返回块中的索引位置
    return False


data = [23, 43, 56, 78, 97, 100, 120, 135, 147, 150, 155]  # 数据列表
result = block(data, 4, 150)  # 第二个参数是块的长度，最后一个参数是要查找的元素
print("查找的值得索引位置是:", result)  # 输出结果

一共分 3 块
分块情况如下：
长度:4 分块的数据是:[23, 43, 56, 78]
长度:4 分块的数据是:[97, 100, 120, 135]
长度:3 分块的数据是:[147, 150, 155]
查找的值得索引位置是: 9

6.斐波那契查找算法

斐波那契查找同样是查找算法家族中的一员，它要求数据是有序的（升序或降序）。斐波那契查找采用和二分查找/插值查找相似的区间分割策略，都是通过不断的分割区间缩小搜索的范围。

def fibonacci_search(data,key):
    fib = lambda n: n if n < 2 else fib(n-1) + fib(n-2)
    F = [fib(i) for i in range(20)]
    print(F)
    low = 0
    high = len(data) - 1

    k = 0
    while high > F[k] - 1:
        k += 1
    i = high
    while F[k] - 1 > i:
        data.append(data[high])
        i += 1
    print(data)

    while low <= high:
        if k < 2:
            mid = low
        else:
            mid = low + (F[k-1] -1) # 当前数列长度进行斐波那契数列分割，如当前寻找数列长度为13，则中间值为13=8+5中的8

        print("low:%s mid:%s high:%s k:%s" %(low,mid,high,k))
        if key < data[mid]:
            high = mid - 1
            k -= 1
        elif key > data[mid]:
            low = mid + 1
            k -= 2
        else:
            if mid <= high:
                return mid
            else:
                return high
    return False

data = [9,10,13,14,15,22,29,59,62]
key = int(input('input the number you want to find:'))
result = fibonacci_search(data, key)
print('find the number %s in data[%s]' % (key, result))

input the number you want to find:62
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
[9, 10, 13, 14, 15, 22, 29, 59, 62, 62, 62, 62, 62]
low:0 mid:7 high:8 k:7
low:8 mid:10 high:8 k:5
find the number 62 in data[8]

7.查找算法的时间复杂度

查找算法	时间复杂度
顺序查找算法	O(n)
二分查找算法	O(log(n))
插补查找算法	O(log log(n))
分块查找算法	O(log(m)+N/m)
斐波那契查找算法	O(log(n))
哈希查找算法	O(1)

8.如何解决散列表冲突

哈希查找算法中提到了冲突的碰撞和溢出问题，常用的解决冲突问题的方法有：

开放定址法

开放定址法就是当数据的散列地址遇到冲突，寻找下一个空的散列地址，只要散列表足够大，空的散列地址总能找到，将数据存入该散列地址。散列函数语法如下：

H(i)=(H(key)+d(i)) mod m

key: 要存入的数据
H(key): 散列函数
m: 散列表长度
d(i): 增长序列，d(i)有三种取法：线性探测法、平方探测法、伪随机探测法。
mod: 模，取余数

链地址法

链地址法就是将经过散列函数得到的相同的数值（索引值）放在同一个位置，索引值相同的数据以链表的形式存储。

再散列函数法

再散列函数法就是一开始就先设置一些散列函数（除留余数、平方取中、折叠法等），如果使用第一种散列函数有冲突就改用第二种...一直到没有冲突为止。

公共溢出法

公共溢出法就是将散列表分为基本表和溢出表，凡是与基本表发生冲突的，都存放在溢出表中。

【查找算法】6种常见的查找算法简述及Python代码实现

1.顺序查找算法

2.二分查找算法

3.插补查找算法

4.哈希查找算法

5.分块查找算法

6.斐波那契查找算法

7.查找算法的时间复杂度

8.如何解决散列表冲突

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

【查找算法】6种常见的查找算法简述及Python代码实现

1.顺序查找算法

2.二分查找算法

3.插补查找算法

4.哈希查找算法

5.分块查找算法

6.斐波那契查找算法

7.查找算法的时间复杂度

8.如何解决散列表冲突

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像