Quickly Batch-Delete OSS Buckets with a Python Script

Summary: To delete an OSS bucket with Python, it might seem that calling delete_bucket() is enough. In practice, however, the call frequently fails. To prevent accidental deletion, OSS requires a bucket to be completely empty before it can be deleted: all objects, object versions in versioned buckets, multipart upload parts, and LiveChannels must be removed first. For scenarios that require deleting many OSS buckets quickly, this article provides a Python script that first clears these resources from each bucket and then deletes the bucket.
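For reference, a minimal sketch of the naive approach and the error it typically triggers on a non-empty bucket (this is not part of the script below; the bucket name and endpoint are placeholders):

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Hypothetical example: deleting a non-empty bucket directly is rejected by OSS.
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

try:
    bucket.delete_bucket()
except oss2.exceptions.BucketNotEmpty:
    # OSS refuses the request while objects, versions, parts or LiveChannels remain.
    print("Bucket is not empty; clear its contents before deleting it.")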

Disclaimer

This script automates the deletion of buckets in Object Storage Service (OSS). Deleting a bucket is a high-risk operation and may permanently destroy all data stored in it. Before running the script, make sure you understand all potential risks, have made an informed decision, and have taken the necessary backup measures to avoid irreversible data loss. You bear all consequences of using this script. The author and provider of the script are not liable for any direct or indirect losses, data loss, financial loss, or damage of any kind resulting from its use or misuse. Running this script indicates that you have read and agree to this disclaimer. If you do not accept these terms, or have any doubts about the script's behavior and risks, do not run it.

Limitations

  • Buckets within the validity period of a retention (WORM) policy. They can only be deleted after the retention policy expires.
  • Buckets with access points. Delete the access points manually in the OSS console first, then run the script to delete the bucket.
  • Buckets with the OSS-HDFS service enabled. These can only be deleted in the OSS console.
  • The deletion task for a single bucket times out after 1 hour. In testing, deleting about 150 GB of data took roughly half an hour. For buckets with very large amounts of data, configure a lifecycle rule in the OSS console instead, so the data is deleted automatically on a T+1 basis (see the sketch after this list).
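The console is the recommended way to set up such a lifecycle rule. As a rough, hypothetical SDK sketch only (the bucket name and endpoint are placeholders, and versioned buckets also need rules for non-current versions), an "expire everything after one day" rule could look like this:

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
from oss2.models import LifecycleRule, LifecycleExpiration, BucketLifecycle

# Hypothetical example: let OSS expire every object one day after its last modification,
# instead of deleting the data synchronously with the script.
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-large-bucket')

rule = LifecycleRule('expire-all', '',  # empty prefix matches every object
                     status=LifecycleRule.ENABLED,
                     expiration=LifecycleExpiration(days=1))
bucket.put_bucket_lifecycle(BucketLifecycle([rule]))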

Script

import oss2
import time
import signal
from tqdm import tqdm
from oss2.credentials import EnvironmentVariableCredentialsProvider

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException()

signal.signal(signal.SIGALRM, timeout_handler)

def write_results_to_file(result_filename, bucket_count, buckets_detail, delete_success, delete_failed, delete_timings, include_buckets, exclude_buckets):
    with open(result_filename, 'w') as f:
        f.write(f"Total buckets listed: {bucket_count}\n")
        f.write("Bucket Details (Name, Region, Status):\n")
        for bucket in buckets_detail:
            status = 'Included' if bucket['bucket_name'] in include_buckets else 'Excluded' if bucket['bucket_name'] in exclude_buckets else 'Not Specified'
            f.write(f"{bucket['bucket_name']}: {bucket['region']} (status: {status})\n")
        f.write(f"\nTotal buckets successfully deleted: {len(delete_success)}\n")
        f.write("Successfully Deleted Buckets:\n")
        for bucket_name in delete_success:
            f.write(f"{bucket_name} (time taken: {delete_timings.get(bucket_name, 'N/A'):.2f}s)\n")
        total_failed = len(delete_failed)
        f.write(f"\nTotal buckets failed to delete: {total_failed}\n")
        f.write("Failed to Delete Buckets:\n")
        for failed_bucket in delete_failed:
            bucket_name = failed_bucket['bucket_name']
            error_details = failed_bucket['error']
            time_taken = delete_timings.get(bucket_name, 'N/A')
            # Customize the error message based on the error code (EC)
            if '0024-00000001' in error_details:
                reason = "Data lake storage is disabled. Please delete using the console."
            elif '0055-00000011' in error_details:
                reason = "Bucket still binding access points. Please delete access points, then delete the bucket."
            elif "WORM Locked state" in error_details:
                reason = "WORM enabled; not allowed to delete before expiration."
            else:
                reason = error_details  # Default reason if none of the specific EC codes are present
            # Write the failure reason to the file
            f.write(f"{bucket_name} (time taken: {time_taken}s, reason: {reason})\n")

def check_bucket_worm_status(bucket):
    try:
        worm_info = bucket.get_bucket_worm()
        if worm_info.state == 'Locked':
            print(f"Bucket '{bucket.bucket_name}' is in WORM state 'Locked', skipping deletion.")
            return False
    except oss2.exceptions.NoSuchWORMConfiguration:
        print(f"Bucket '{bucket.bucket_name}' does not have a WORM configuration.")
    except oss2.exceptions.RequestError as e:
        # Handle network-related exceptions
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to network error: {e}")
        return False
    except oss2.exceptions.OssError as e:
        # Handle other OSS API errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to OSS error: {e}")
        return False
    return True

def delete_all_objects(bucket):
    objects_to_delete = []
    object_count = 0  # Initialize the object counter
    for obj in oss2.ObjectIterator(bucket):
        objects_to_delete.append(obj.key)
        object_count += 1
        # Once enough objects have accumulated, delete them in one batch
        if len(objects_to_delete) >= 1000:
            print("Deleting batch of 1000 objects...")  # Show that a batch delete is running
            bucket.batch_delete_objects(objects_to_delete)
            objects_to_delete = []
            print("Deleted 1000 objects, continuing...")  # Show current progress
    # Delete the remaining objects, if any
    if object_count > 0 and objects_to_delete:
        print(f"Deleting final batch of {len(objects_to_delete)} objects...")  # Show the final batch delete
        bucket.batch_delete_objects(objects_to_delete)
    elif object_count == 0:
        print(f"No objects to delete in bucket '{bucket.bucket_name}'.")

def delete_all_live_channels(bucket):
    live_channel_count = 0
    for live_channel_info in oss2.LiveChannelIterator(bucket):
        name = live_channel_info.name
        bucket.delete_live_channel(name)
        live_channel_count += 1
    if live_channel_count > 0:
        print(f"All live channels deleted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No live channels to delete in bucket '{bucket.bucket_name}'.")

def delete_all_multipart_uploads(bucket):
    multipart_upload_count = 0
    for upload_info in oss2.MultipartUploadIterator(bucket):
        key = upload_info.key
        upload_id = upload_info.upload_id
        bucket.abort_multipart_upload(key, upload_id)
        multipart_upload_count += 1
    if multipart_upload_count > 0:
        print(f"All multipart uploads aborted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No multipart uploads to abort in bucket '{bucket.bucket_name}'.")

def delete_all_object_versions(bucket):
    next_key_marker = None
    next_versionid_marker = None
    while True:
        result = bucket.list_object_versions(key_marker=next_key_marker, versionid_marker=next_versionid_marker)
        # Check whether any versions or delete markers exist
        if not result.versions and not result.delete_marker:
            print(f"No object versions or delete markers to delete in bucket '{bucket.bucket_name}'.")
            break
        versions_to_delete = oss2.models.BatchDeleteObjectVersionList()
        # Collect the object versions and delete markers to be removed
        for version_info in result.versions:
            versions_to_delete.append(oss2.models.BatchDeleteObjectVersion(version_info.key, version_info.versionid))
        for del_maker_info in result.delete_marker:
            versions_to_delete.append(oss2.models.BatchDeleteObjectVersion(del_maker_info.key, del_maker_info.versionid))
        # Perform the batch delete
        versions_count = len(result.versions) + len(result.delete_marker)  # Number of entries to delete in this round
        print(f"Deleting {versions_count} object versions and/or delete markers...")
        bucket.delete_object_versions(versions_to_delete)
        # Update the markers for the next iteration
        next_key_marker = result.next_key_marker
        next_versionid_marker = result.next_versionid_marker
        # Exit the loop when there are no more versions or delete markers
        if not result.is_truncated:
            print("All object versions and delete markers deleted.")
            break

current_timestamp = time.strftime("%Y%m%d%H%M%S")
# exclude_buckets: buckets that must NOT be deleted.
exclude_buckets = set(['<your-bucket>'])
# include_buckets: buckets to delete. If empty, exclude_buckets applies; if non-empty, only include_buckets applies.
include_buckets = set([])
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
service = oss2.Service(auth, 'oss-cn-hangzhou.aliyuncs.com')
bucket_list = list(oss2.BucketIterator(service))
if include_buckets:
    buckets_to_process = [b for b in bucket_list if b.name in include_buckets]
else:
    buckets_to_process = [b for b in bucket_list if b.name not in exclude_buckets]
bucket_count = len(buckets_to_process)
buckets_detail = [{'bucket_name': b.name, 'region': b.location} for b in buckets_to_process]
delete_success = []
delete_failed = []
delete_timings = {b.name: 'Not Attempted' for b in buckets_to_process}
pbar = tqdm(total=bucket_count, desc="Deleting Buckets", unit="bucket", leave=False)
try:
    for bucket_info in buckets_to_process:
        bucket_name = bucket_info.name
        region = bucket_info.location
        endpoint = f'https://{region}.aliyuncs.com'
        bucket = oss2.Bucket(auth, endpoint, bucket_name)
        # Check the WORM status before attempting deletion
        can_delete = check_bucket_worm_status(bucket)
        if not can_delete:
            delete_failed.append({
                'bucket_name': bucket_name,
                'error': "Skipped due to WORM Locked state or unable to check status."
            })
            delete_timings[bucket_name] = 'Skipped'
            pbar.update(1)
            continue
        # The WORM status allows deletion; proceed
        print(f"Processing bucket '{bucket_name}'...")
        try:
            signal.alarm(3600)
            start_time = time.time()
            # Delete all live channels
            delete_all_live_channels(bucket)
            # Abort all multipart uploads
            delete_all_multipart_uploads(bucket)
            # Delete all objects
            delete_all_objects(bucket)
            # Delete all object versions
            delete_all_object_versions(bucket)
            # Try to delete the bucket itself
            print(f"Deleting bucket '{bucket_name}'...")
            bucket.delete_bucket()
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            delete_success.append(bucket_name)
            print(f"Bucket '{bucket_name}' deleted successfully in {time_taken:.2f}s")
            pbar.update(1)
        except TimeoutException:
            print(f"Deleting bucket '{bucket_name}' timed out after 3600 seconds.")
            delete_failed.append({'bucket_name': bucket_name, 'error': 'Timeout after 3600 seconds'})
            delete_timings[bucket_name] = 'Timeout'
            pbar.update(1)
        except oss2.exceptions.OssError as e:
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            error_message = str(e)
            print(f"Failed to delete bucket '{bucket_name}' in {time_taken:.2f}s: {error_message}")
            delete_failed.append({'bucket_name': bucket_name, 'error': error_message})
            pbar.update(1)
except KeyboardInterrupt:
    print("\nOperation interrupted by user.")
finally:
    pbar.close()
    result_filename = f"{current_timestamp}_bucket_delete_result.txt"
    write_results_to_file(result_filename, bucket_count, buckets_detail, delete_success, delete_failed, delete_timings, include_buckets, exclude_buckets)
    print(f"Script execution completed. Check {result_filename} for details.")

Run

  1. In the script, set the exclude_buckets variable to the set of bucket names you do not want to delete.
  2. (Optional) Set the include_buckets variable to the set of bucket names you want to delete. When include_buckets is non-empty, exclude_buckets is ignored.
  3. Run the following commands to execute the script:
pip install oss2
pip install tqdm
export OSS_ACCESS_KEY_ID=<your_ak_id>
export OSS_ACCESS_KEY_SECRET=<your_ak_secret>
python3 delete_buckets.py
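For example (the bucket names below are hypothetical), the two variables from steps 1 and 2 could be set as follows:

# Keep these two buckets and delete everything else:
exclude_buckets = set(['prod-logs', 'prod-backup'])
include_buckets = set([])

# Or delete only these two buckets and leave the rest untouched:
# include_buckets = set(['test-bucket-1', 'test-bucket-2'])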

Output

Total buckets listed: 19
Bucket Details (Name, Region, Status):
bucket1: oss-region1 (status: Not Specified)
bucket2: oss-region1 (status: Not Specified)
bucket3: oss-region1 (status: Not Specified)
bucket4: oss-region1 (status: Not Specified)
bucket5: oss-region2 (status: Not Specified)
bucket6: region1 (status: Not Specified)
bucket7: oss-region3 (status: Not Specified)
bucket8: oss-region3 (status: Not Specified)
bucket9: oss-region4 (status: Not Specified)
bucket10: oss-region2 (status: Not Specified)
bucket11: oss-region1 (status: Not Specified)
bucket12: oss-region5 (status: Not Specified)
bucket13: oss-region6 (status: Not Specified)
bucket14: oss-region3 (status: Not Specified)
bucket15: oss-region4 (status: Not Specified)
bucket16: oss-region1 (status: Not Specified)
bucket17: oss-region1 (status: Not Specified)
bucket18: oss-region4 (status: Not Specified)
bucket19: oss-region7 (status: Not Specified)
Total buckets successfully deleted: 0
Successfully Deleted Buckets:
Total buckets failed to delete: 19
Failed to Delete Buckets:
bucket1 (time taken: 0.082s, reason: Data lake storage is disabled. Please delete using the console.)
bucket2 (time taken: 0.116s, reason: Data lake storage is disabled. Please delete using the console.)
bucket3 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket4 (time taken: 0.070s, reason: Data lake storage is disabled. Please delete using the console.)
bucket5 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket6 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket7 (time taken: 1.126s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket8 (time taken: 1.024s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket9 (time taken: 0.213s, reason: Data lake storage is disabled. Please delete using the console.)
bucket10 (time taken: 0.108s, reason: Data lake storage is disabled. Please delete using the console.)
bucket11 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket12 (time taken: 0.156s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket13 (time taken: 1.533s, reason: Data lake storage is disabled. Please delete using the console.)
bucket14 (time taken: 1.021s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket15 (time taken: 0.212s, reason: Data lake storage is disabled. Please delete using the console.)
bucket16 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket17 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket18 (time taken: 0.203s, reason: Data lake storage is disabled. Please delete using the console.)
bucket19 (time taken: 0.320s, reason: Data lake storage is disabled. Please delete using the console.)


