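Today is February 14, Valentine's Day once again, so let me start with a few words about it. Don't worry, this is not a show of couple bliss aimed at single readers, so you can read on safely.
Seriously: if you are single, use your free time to settle down and study to improve yourself; if you are not, do that too, and also spend more time with your wife and family. Nothing matters more than your loved ones and your own growth! Whether you are improving yourself or spending time with family, don't stay on the surface. It is like Valentine's Day itself: expressing love is not limited to this one day and a few others; it takes lasting commitment and tolerance from both partners.
To borrow a saying that others have used before:
When your talent cannot support your ambition, settle down and study; when your money cannot keep up with your dreams, sit down and work hard; when your ability cannot yet handle your goals, buckle down and gain experience!
Now for the main text: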
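This article offers a simple, easy-to-implement approach to verifying file system data integrity and detecting data tampering with a Python script (it really is simple: you only need Python basics plus hashlib and file operations).
Although there are already many more polished solutions for verifying data integrity, this still makes a good practice project for Python beginners: it exercises logical thinking and keeps the mind sharp.
So far it has been tested on Windows 10 and Ubuntu (Python 2.7). It should work on other platforms too, and help with testing is welcome.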
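The core building block mentioned above, hashlib plus ordinary file reads, can be sketched in a few lines. This is a minimal sketch written for Python 3 (the script itself targets Python 2.7); the name `file_digest` and its parameters are illustrative assumptions, not part of the script.

```python
import hashlib


def file_digest(path, algorithm="sha256", block_size=65536):
    """Return the hex digest of a file, reading it in fixed-size blocks."""
    # hashlib.new raises ValueError when the algorithm name is unknown
    checksum = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for block in iter(lambda: f.read(block_size), b""):
            checksum.update(block)
    return checksum.hexdigest()
```

Reading in blocks keeps memory use flat even for very large files, which is why the block size is a parameter rather than a whole-file read.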
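The design and the execution flow are, briefly, as follows:
1. Provide the path of the directory whose integrity is to be checked (a single file is also supported) and the path of the check file in which the files' hash values are saved. If a path does not exist, the script either raises an exception or creates it, depending on the case: a missing path to check raises an exception, while a missing directory for the check file is created;
Passing parameters (the latest version takes its parameters from the command line; the screenshots below show how parameters were passed in the old version):
In the newly updated version, argument parsing and the command help are implemented with the docopt module, which makes the script easier to use.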
Python script to check data integrity on UNIX/Linux or Windows
accept options using 'docopt' module, using 'docopt' to accept parameters and command switch

Usage:
  checkDataIntegrity.py [-g FILE HASH_FILE]
  checkDataIntegrity.py [-c FILE HASH_FILE]
  checkDataIntegrity.py [-r HASH_FILE]
  checkDataIntegrity.py generate FILE HASH_FILE
  checkDataIntegrity.py validate FILE HASH_FILE
  checkDataIntegrity.py reset HASH_FILE
  checkDataIntegrity.py (--version | -v)
  checkDataIntegrity.py --help | -h | -?

Arguments:
  FILE                  the path to single file or directory to data protect
  HASH_FILE             the path to hash data saved

Options:
  -? -h --help          show this help message and exit
  -v --version          show version and exit

Example, try:
  checkDataIntegrity.py generate /tmp /tmp/data.json
  checkDataIntegrity.py validate /tmp /tmp/data.json
  checkDataIntegrity.py reset /tmp/data.json
  checkDataIntegrity.py -g /tmp /tmp/data.json
  checkDataIntegrity.py -c /tmp /tmp/data.json
  checkDataIntegrity.py -r /tmp/data.json
  checkDataIntegrity.py --help
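For readers who do not have docopt installed, the same three subcommands can be approximated with the standard library's argparse. Note that this swaps in a different technique: the script itself uses docopt, and the function name `build_parser` is an assumption made for this sketch (Python 3).

```python
import argparse


def build_parser():
    """Build a parser with generate / validate / reset subcommands."""
    parser = argparse.ArgumentParser(
        prog="checkDataIntegrity.py",
        description="Check data integrity with saved hash values.",
    )
    sub = parser.add_subparsers(dest="command")
    sub.required = True  # a subcommand must be given

    # generate and validate take the same two positional arguments
    for name in ("generate", "validate"):
        p = sub.add_parser(name)
        p.add_argument("FILE", help="file or directory to protect")
        p.add_argument("HASH_FILE", help="where the hash data is saved")

    p = sub.add_parser("reset")
    p.add_argument("HASH_FILE", help="hash data file to delete")
    return parser


args = build_parser().parse_args(["generate", "/tmp", "/tmp/data.json"])
print(args.command, args.FILE, args.HASH_FILE)
```

argparse generates `--help` output for each subcommand on its own, whereas docopt derives the parser from the usage text in the docstring; which one reads better is largely a matter of taste.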
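Valid parameters and paths:
An exception is raised when the path does not exist:
The rest of the exception handling can be seen in the script itself.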
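The path handling described in step 1 can be sketched as follows. This is a simplified illustration for Python 3, not the script's own code; `prepare_paths` is a hypothetical helper name.

```python
import os


def prepare_paths(path_to_check, hash_file):
    """Validate the path to check and make sure the hash file's folder exists."""
    if not os.path.exists(path_to_check):
        # the data to protect must already exist
        raise RuntimeError(
            "cannot access %s: No such file or directory" % path_to_check)
    # the folder that will hold the hash file is created on demand
    parent = os.path.dirname(os.path.abspath(hash_file))
    if not os.path.isdir(parent):
        os.makedirs(parent)
    return os.path.abspath(path_to_check), os.path.abspath(hash_file)
```

Taking the absolute path first means `os.path.dirname` never returns an empty string, so a bare file name such as `data.json` does not end up being passed to `os.makedirs` as `""`.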
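2. On the first run, the script saves the hash values to the check file. On later runs it reads the saved file and compares it with the hash values of the files currently in the directory being checked. If a hash value differs, the path of that file is displayed; if everything matches, a confirmation message is printed
First run:
Second run (verification passed):
Verification failed: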
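The first-run / later-run behavior of step 2 can be sketched like this. It is a condensed illustration for Python 3 under stated assumptions: the names `snapshot` and `check` are hypothetical, and files are read whole for brevity, which is only reasonable for small files.

```python
import hashlib
import json
import os


def snapshot(root):
    """Map every regular file under root to its SHA-256 hex digest."""
    result = {}
    for top, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(top, name)
            with open(full, "rb") as f:
                # whole-file read keeps the sketch short
                result[full] = hashlib.sha256(f.read()).hexdigest()
    return result


def check(root, hash_file):
    """First run: save hashes and return None.

    Later runs: return the saved paths whose hash differs or is missing.
    """
    if not os.path.exists(hash_file):
        data = snapshot(root)  # hash first, then create the hash file
        with open(hash_file, "w") as f:
            json.dump(data, f)
        return None
    with open(hash_file) as f:
        old = json.load(f)
    new = snapshot(root)
    # new.get(p) is None for a deleted file, so it never equals the old hash
    return sorted(p for p in old if old[p] != new.get(p))
```

An empty list from `check` corresponds to the "all files are ok" message; any returned path corresponds to a line in the failure output.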
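3. When files have changed and you want to refresh the check file, you can use the remakeDataIntegrity() function to delete the saved check file; the next run then generates a new one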
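The confirm-then-delete behavior of step 3 can be sketched as below. This is an illustration for Python 3 rather than the script's code: `reset_hash_file` is a hypothetical name, and the `ask` parameter is an assumption added so the prompt can be replaced in tests.

```python
import os


def confirm(question, default=True, ask=input):
    """Ask a yes/no question; an empty answer returns the default."""
    suffix = "Y/n" if default else "y/N"
    while True:
        response = ask("%s [%s] " % (question, suffix)).strip().lower()
        if not response:
            return default
        if response in ("y", "yes"):
            return True
        if response in ("n", "no"):
            return False
        print("Please answer '(y)es' or '(n)o'.")


def reset_hash_file(hash_file, ask=input):
    """Delete the saved hash file after confirmation; return True if deleted."""
    if not os.path.exists(hash_file):
        return False
    if confirm("remake data integrity file '%s'?" % hash_file, ask=ask):
        os.remove(hash_file)
        return True
    return False
```

Returning a boolean instead of calling `sys.exit()` leaves the decision about the exit code to the caller, which makes the function easier to reuse.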
Testing on Linux:
The latest code is available on GitHub. Link: https://github.com/DingGuodong/LinuxBashShellScriptForOps/blob/master/functions/security/checkDataIntegrity.py
The code is as follows:
#!/usr/bin/python
# encoding: utf-8
# -*- coding: utf8 -*-
"""
Created by PyCharm.
File: LinuxBashShellScriptForOps:checkDataIntegrity.py
User: Guodong
Create Date: 2017/2/14
Create Time: 14:45

Python script to check data integrity on UNIX/Linux or Windows
accept options using 'docopt' module, using 'docopt' to accept parameters and command switch

Usage:
  checkDataIntegrity.py [-g FILE HASH_FILE]
  checkDataIntegrity.py [-c FILE HASH_FILE]
  checkDataIntegrity.py [-r HASH_FILE]
  checkDataIntegrity.py generate FILE HASH_FILE
  checkDataIntegrity.py validate FILE HASH_FILE
  checkDataIntegrity.py reset HASH_FILE
  checkDataIntegrity.py (--version | -v)
  checkDataIntegrity.py --help | -h | -?

Arguments:
  FILE                  the path to single file or directory to data protect
  HASH_FILE             the path to hash data saved

Options:
  -? -h --help          show this help message and exit
  -v --version          show version and exit

Example, try:
  checkDataIntegrity.py generate /tmp /tmp/data.json
  checkDataIntegrity.py validate /tmp /tmp/data.json
  checkDataIntegrity.py reset /tmp/data.json
  checkDataIntegrity.py -g /tmp /tmp/data.json
  checkDataIntegrity.py -c /tmp /tmp/data.json
  checkDataIntegrity.py -r /tmp/data.json
  checkDataIntegrity.py --help
"""
from docopt import docopt
import os
import sys
import hashlib


def get_hash_sum(filename, method="sha256", block_size=65536):
    if not os.path.exists(filename):
        raise RuntimeError("cannot open '%s' (No such file or directory)" % filename)
    if not os.path.isfile(filename):
        raise RuntimeError("'%s' :not a regular file" % filename)

    if "md5" in method:
        checksum = hashlib.md5()
    elif "sha1" in method:
        checksum = hashlib.sha1()
    elif "sha256" in method:
        checksum = hashlib.sha256()
    elif "sha384" in method:
        checksum = hashlib.sha384()
    elif "sha512" in method:
        checksum = hashlib.sha512()
    else:
        raise RuntimeError("unsupported method %s" % method)

    with open(filename, 'rb') as f:
        buf = f.read(block_size)
        while len(buf) > 0:
            checksum.update(buf)
            buf = f.read(block_size)
        if checksum is not None:
            return checksum.hexdigest()
        else:
            return checksum


def makeDataIntegrity(path):
    path = unicode(path, 'utf8')  # For Chinese Non-ASCII character

    if not os.path.exists(path):
        raise RuntimeError("Error: cannot access %s: No such file or directory" % path)
    elif os.path.isfile(path):
        dict_all = dict()
        dict_all[os.path.abspath(path)] = get_hash_sum(path)
        return dict_all
    elif os.path.isdir(path):
        dict_nondirs = dict()
        dict_dirs = dict()
        for top, dirs, nondirs in os.walk(path, followlinks=True):
            for item in nondirs:
                # Do NOT use os.path.abspath(item) here, else it will make a serious bug because of
                # os.path.abspath(item) return "os.getcwd()" + "filename" in some case.
                dict_nondirs[os.path.join(top, item)] = get_hash_sum(os.path.join(top, item))
            for item in dirs:
                dict_dirs[os.path.join(top, item)] = r""
        dict_all = dict(dict_dirs, **dict_nondirs)
        return dict_all


def saveDataIntegrity(data, filename):
    import json
    data_to_save = json.dumps(data, encoding='utf-8')
    if not os.path.exists(os.path.dirname(filename)):
        os.makedirs(os.path.dirname(filename))
    with open(filename, 'wb') as f:
        f.write(data_to_save)


def readDataIntegrity(filename):
    import json
    if not os.path.exists(filename):
        raise RuntimeError("cannot open '%s' (No such file or directory)" % filename)
    with open(filename, 'rb') as f:
        data = json.loads(f.read())
    if data:
        return data


def remakeDataIntegrity(filename):
    def confirm(question, default=True):
        """
        Ask user a yes/no question and return their response as True or False.

        :parameter question:
        ``question`` should be a simple, grammatically complete question such as
        "Do you wish to continue?", and will have a string similar to " [Y/n] "
        appended automatically. This function will *not* append a question mark for
        you.
        The prompt string, if given,is printed without a trailing newline before reading.

        :parameter default:
        By default, when the user presses Enter without typing anything, "yes" is
        assumed. This can be changed by specifying ``default=False``.

        :return True or False
        """
        # Set up suffix
        if default:
            # suffix = "Y/n, default=True"
            suffix = "Y/n"
        else:
            # suffix = "y/N, default=False"
            suffix = "y/N"
        # Loop till we get something we like
        while True:
            response = raw_input("%s [%s] " % (question, suffix)).lower()
            # Default
            if not response:
                return default
            # Yes
            if response in ['y', 'yes']:
                return True
            # No
            if response in ['n', 'no']:
                return False
            # Didn't get empty, yes or no, so complain and loop
            print("I didn't understand you. Please specify '(y)es' or '(n)o'.")

    if os.path.exists(filename):
        if confirm("[warning] remake data integrity file \'%s\'?" % filename):
            os.remove(filename)
            print "[successful] data integrity file \'%s\' has been remade." % filename
            sys.exit(0)
        else:
            print "[warning] data integrity file \'%s\'is not remade." % filename
            sys.exit(0)
    else:
        print >> sys.stderr, "[error] data integrity file \'%s\'is not exist." % filename


def checkDataIntegrity(path_to_check, file_to_save):
    from time import sleep

    if not os.path.exists(file_to_save):
        print "[info] data integrity file \'%s\' is not exist." % file_to_save
        print "[info] make a data integrity file to \'%s\'" % file_to_save
        data = makeDataIntegrity(path_to_check)
        saveDataIntegrity(data, file_to_save)
        print "[successful] make a data integrity file to \'%s\', finished!" % file_to_save,
        print "Now you can use this script later to check data integrity."
    else:
        old_data = readDataIntegrity(file_to_save)
        new_data = makeDataIntegrity(path_to_check)
        error_flag = True
        for item in old_data.keys():
            try:
                if not old_data[item] == new_data[item]:
                    print >> sys.stderr, new_data[item], item
                    sleep(0.01)
                    print "\told hash data is %s" % old_data[item], item
                    error_flag = False
            except KeyError as e:
                print >> sys.stderr, "[error]", e.message, "Not Exist!"
                error_flag = False
        if error_flag:
            print "[ successful ] passed, All files integrity is ok!"


if __name__ == '__main__':
    arguments = docopt(__doc__, version='1.0.0rc2')
    if arguments['-r'] or arguments['reset']:
        if arguments['HASH_FILE']:
            remakeDataIntegrity(arguments['HASH_FILE'])
    elif arguments['-g'] or arguments['generate']:
        if arguments['FILE'] and arguments['HASH_FILE']:
            checkDataIntegrity(arguments['FILE'], arguments['HASH_FILE'])
    elif arguments['-c'] or arguments['validate']:
        if arguments['FILE'] and arguments['HASH_FILE']:
            checkDataIntegrity(arguments['FILE'], arguments['HASH_FILE'])
    else:
        print >> sys.stderr, "bad parameters"
        sys.stderr.flush()
        print docopt(__doc__, argv="--help")
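One detail in the listing is worth noting: checkDataIntegrity() loops over the keys of the saved data only, so a file that was added after the check file was generated is not reported. A possible extension, sketched here for Python 3 with the hypothetical helper name `diff_snapshots`, compares the two path-to-hash dictionaries in both directions.

```python
def diff_snapshots(old, new):
    """Compare two {path: hash} dicts and classify the differences."""
    old_paths, new_paths = set(old), set(new)
    return {
        # present now but not in the saved data
        "added": sorted(new_paths - old_paths),
        # saved earlier but no longer present
        "removed": sorted(old_paths - new_paths),
        # present in both, with a different hash value
        "changed": sorted(p for p in old_paths & new_paths if old[p] != new[p]),
    }
```

With the script's data, `old` would be the result of readDataIntegrity() and `new` the result of makeDataIntegrity(); the check passes when all three lists are empty.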
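tag: verifying file integrity with Python, file integrity, hash verification
This world belongs to the talented,
and to the diligent,
and even more to those
who work diligently in the fields where they are talented.
Keep going, together!
--end--
This article is reposted from urey_pp's 51CTO blog. Original link: http://blog.51cto.com/dgd2010/1897799. For reprints, please contact the original author yourself.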