Quickly Batch-Deleting OSS Buckets with a Python Script

Summary: To delete an OSS bucket with Python, it may seem that calling the delete_bucket() method is all it takes. In practice, however, the call frequently fails. That is because OSS, to guard against accidental deletion, requires a bucket to be completely emptied before it can be removed — including objects, multi-version objects, multipart upload parts, and LiveChannels. For scenarios that require deleting OSS buckets quickly and in bulk, this article provides a Python script that first clears all of these resources from each bucket and then deletes the bucket itself.

Disclaimer

This script automates the deletion of buckets in Object Storage Service (OSS). Deleting a bucket is a high-risk operation that can permanently destroy all data stored in it. Before running the script, make sure you understand all of the potential risks and have made an informed decision, and make sure you have taken the necessary backup measures to avoid irreversible data loss. You bear sole responsibility for all consequences of using this script. The script's author and provider accept no liability for any direct or indirect loss, data loss, financial loss, or damage of any kind arising from its use or misuse. Running the script indicates that you have read and agree to this disclaimer. If you do not agree to these terms, or have any doubts about the script's behavior or risks, do not run it.

Limitations

  • Buckets under an active retention (WORM) policy. They can be deleted only after the policy expires.
  • Buckets with access points. Delete the access points manually in the OSS console first, then run the script.
  • Buckets with the OSS-HDFS service enabled. These can be deleted only in the OSS console.
  • The deletion task for a single bucket times out after 1 hour. In testing, deleting 150 GB of data took about half an hour, so for buckets with very large amounts of data it is better to configure a lifecycle rule in the OSS console for T+1 automatic deletion.
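The 1-hour per-bucket timeout above is enforced in the script with the Unix SIGALRM signal. The following self-contained sketch illustrates the same pattern in isolation; `run_with_timeout` is an illustrative helper, not part of the script itself:

```python
import signal
import time

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException()

# Install a handler that converts SIGALRM into a Python exception (Unix only)
signal.signal(signal.SIGALRM, timeout_handler)

def run_with_timeout(func, seconds):
    """Run func(); raise TimeoutException if it runs longer than `seconds`."""
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # Always cancel any pending alarm

try:
    run_with_timeout(lambda: time.sleep(5), 1)
except TimeoutException:
    print("timed out")  # → timed out
```

The script uses exactly this mechanism with `signal.alarm(3600)` around each bucket's deletion.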

Script

import oss2
import time
import signal
from tqdm import tqdm
from oss2.credentials import EnvironmentVariableCredentialsProvider


class TimeoutException(Exception):
    pass


def timeout_handler(signum, frame):
    raise TimeoutException()


signal.signal(signal.SIGALRM, timeout_handler)


def write_results_to_file(result_filename, bucket_count, buckets_detail, delete_success,
                          delete_failed, delete_timings, include_buckets, exclude_buckets):
    with open(result_filename, 'w') as f:
        f.write(f"Total buckets listed: {bucket_count}\n")
        f.write("Bucket Details (Name, Region, Status):\n")
        for bucket in buckets_detail:
            status = ('Included' if bucket['bucket_name'] in include_buckets
                      else 'Excluded' if bucket['bucket_name'] in exclude_buckets
                      else 'Not Specified')
            f.write(f"{bucket['bucket_name']}: {bucket['region']} (status: {status})\n")
        f.write(f"\nTotal buckets successfully deleted: {len(delete_success)}\n")
        f.write("Successfully Deleted Buckets:\n")
        for bucket_name in delete_success:
            f.write(f"{bucket_name} (time taken: {delete_timings.get(bucket_name, 'N/A'):.2f}s)\n")
        total_failed = len(delete_failed)
        f.write(f"\nTotal buckets failed to delete: {total_failed}\n")
        f.write("Failed to Delete Buckets:\n")
        for failed_bucket in delete_failed:
            bucket_name = failed_bucket['bucket_name']
            error_details = failed_bucket['error']
            time_taken = delete_timings.get(bucket_name, 'N/A')
            # Customize the error message based on the error code (EC)
            if '0024-00000001' in error_details:
                reason = "Data lake storage is disabled. Please delete using the console."
            elif '0055-00000011' in error_details:
                reason = "Bucket still binding access points. Please delete access points, then delete the bucket."
            elif "WORM Locked state" in error_details:
                reason = "WORM enabled; not allowed to delete before expiration."
            else:
                reason = error_details  # Default reason if none of the specific EC codes are present
            # Write the failure reason to the file
            f.write(f"{bucket_name} (time taken: {time_taken}s, reason: {reason})\n")


def check_bucket_worm_status(bucket):
    try:
        worm_info = bucket.get_bucket_worm()
        if worm_info.state == 'Locked':
            print(f"Bucket '{bucket.bucket_name}' is in WORM state 'Locked', skipping deletion.")
            return False
    except oss2.exceptions.NoSuchWORMConfiguration:
        print(f"Bucket '{bucket.bucket_name}' does not have a WORM configuration.")
    except oss2.exceptions.RequestError as e:
        # Handle network-related errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to network error: {e}")
        return False
    except oss2.exceptions.OssError as e:
        # Handle other OSS API errors
        print(f"Failed to check WORM status for bucket '{bucket.bucket_name}' due to OSS error: {e}")
        return False
    return True


def delete_all_objects(bucket):
    objects_to_delete = []
    object_count = 0  # Initialize the object counter
    for obj in oss2.ObjectIterator(bucket):
        objects_to_delete.append(obj.key)
        object_count += 1
        # Once enough objects have accumulated, delete them in one batch
        if len(objects_to_delete) >= 1000:
            print("Deleting batch of 1000 objects...")
            bucket.batch_delete_objects(objects_to_delete)
            objects_to_delete = []
            print("Deleted 1000 objects, continuing...")
    # Delete the remaining objects, if any
    if object_count > 0 and objects_to_delete:
        print(f"Deleting final batch of {len(objects_to_delete)} objects...")
        bucket.batch_delete_objects(objects_to_delete)
    elif object_count == 0:
        print(f"No objects to delete in bucket '{bucket.bucket_name}'.")


def delete_all_live_channels(bucket):
    live_channel_count = 0
    for live_channel_info in oss2.LiveChannelIterator(bucket):
        bucket.delete_live_channel(live_channel_info.name)
        live_channel_count += 1
    if live_channel_count > 0:
        print(f"All live channels deleted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No live channels to delete in bucket '{bucket.bucket_name}'.")


def delete_all_multipart_uploads(bucket):
    multipart_upload_count = 0
    for upload_info in oss2.MultipartUploadIterator(bucket):
        bucket.abort_multipart_upload(upload_info.key, upload_info.upload_id)
        multipart_upload_count += 1
    if multipart_upload_count > 0:
        print(f"All multipart uploads aborted in bucket '{bucket.bucket_name}'.")
    else:
        print(f"No multipart uploads to abort in bucket '{bucket.bucket_name}'.")


def delete_all_object_versions(bucket):
    next_key_marker = None
    next_versionid_marker = None
    while True:
        result = bucket.list_object_versions(key_marker=next_key_marker,
                                             versionid_marker=next_versionid_marker)
        # Check whether any versions or delete markers exist
        if not result.versions and not result.delete_marker:
            print(f"No object versions or delete markers to delete in bucket '{bucket.bucket_name}'.")
            break
        versions_to_delete = oss2.models.BatchDeleteObjectVersionList()
        # Append the object versions and delete markers to be removed
        for version_info in result.versions:
            versions_to_delete.append(
                oss2.models.BatchDeleteObjectVersion(version_info.key, version_info.versionid))
        for del_marker_info in result.delete_marker:
            versions_to_delete.append(
                oss2.models.BatchDeleteObjectVersion(del_marker_info.key, del_marker_info.versionid))
        # Perform the batch deletion
        versions_count = len(result.versions) + len(result.delete_marker)
        print(f"Deleting {versions_count} object versions and/or delete markers...")
        bucket.delete_object_versions(versions_to_delete)
        # Update the markers for the next iteration
        next_key_marker = result.next_key_marker
        next_versionid_marker = result.next_versionid_marker
        # Exit the loop when no more versions or delete markers remain
        if not result.is_truncated:
            print("All object versions and delete markers deleted.")
            break


current_timestamp = time.strftime("%Y%m%d%H%M%S")
# exclude_buckets: the set of buckets that must NOT be deleted.
exclude_buckets = set(['<your-bucket>'])
# include_buckets: the set of buckets to delete. If empty, exclude_buckets applies;
# if non-empty, only include_buckets applies.
include_buckets = set([])
auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
service = oss2.Service(auth, 'oss-cn-hangzhou.aliyuncs.com')
bucket_list = list(oss2.BucketIterator(service))
if include_buckets:
    buckets_to_process = [b for b in bucket_list if b.name in include_buckets]
else:
    buckets_to_process = [b for b in bucket_list if b.name not in exclude_buckets]
bucket_count = len(buckets_to_process)
buckets_detail = [{'bucket_name': b.name, 'region': b.location} for b in buckets_to_process]
delete_success = []
delete_failed = []
delete_timings = {b.name: 'Not Attempted' for b in buckets_to_process}
pbar = tqdm(total=bucket_count, desc="Deleting Buckets", unit="bucket", leave=False)
try:
    for bucket_info in buckets_to_process:
        bucket_name = bucket_info.name
        region = bucket_info.location
        endpoint = f'https://{region}.aliyuncs.com'
        bucket = oss2.Bucket(auth, endpoint, bucket_name)
        # Call check_bucket_worm_status to check the WORM state
        can_delete = check_bucket_worm_status(bucket)
        if not can_delete:
            delete_failed.append({
                'bucket_name': bucket_name,
                'error': "Skipped due to WORM Locked state or unable to check status."
            })
            delete_timings[bucket_name] = 'Skipped'
            pbar.update(1)
            continue
        # The WORM state permits deletion; proceed
        print(f"Processing bucket '{bucket_name}'...")
        try:
            signal.alarm(3600)  # Per-bucket timeout: 1 hour
            start_time = time.time()
            # Delete all live channels
            delete_all_live_channels(bucket)
            # Abort all multipart uploads
            delete_all_multipart_uploads(bucket)
            # Delete all objects
            delete_all_objects(bucket)
            # Delete all object versions
            delete_all_object_versions(bucket)
            # Attempt to delete the bucket itself
            print(f"Deleting bucket '{bucket_name}'...")
            bucket.delete_bucket()
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            delete_success.append(bucket_name)
            print(f"Bucket '{bucket_name}' deleted successfully in {time_taken:.2f}s")
            pbar.update(1)
        except TimeoutException:
            print(f"Deleting bucket '{bucket_name}' timed out after 3600 seconds.")
            delete_failed.append({'bucket_name': bucket_name, 'error': 'Timeout after 3600 seconds'})
            delete_timings[bucket_name] = 'Timeout'
            pbar.update(1)
        except oss2.exceptions.OssError as e:
            signal.alarm(0)
            end_time = time.time()
            time_taken = end_time - start_time
            delete_timings[bucket_name] = time_taken
            error_message = str(e)
            print(f"Failed to delete bucket '{bucket_name}' in {time_taken:.2f}s: {error_message}")
            delete_failed.append({'bucket_name': bucket_name, 'error': error_message})
            pbar.update(1)
except KeyboardInterrupt:
    print("\nOperation interrupted by user.")
finally:
    pbar.close()
    result_filename = f"{current_timestamp}_bucket_delete_result.txt"
    write_results_to_file(result_filename, bucket_count, buckets_detail, delete_success,
                          delete_failed, delete_timings, include_buckets, exclude_buckets)
    print(f"Script execution completed. Check {result_filename} for details.")
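The 1000-object threshold in delete_all_objects reflects the OSS DeleteMultipleObjects limit of at most 1000 keys per request. The chunking logic can be isolated as the following pure-Python sketch; `chunks` is an illustrative helper, not a function from the script:

```python
def chunks(keys, size=1000):
    """Yield lists of at most `size` keys, mirroring the script's batching:
    the OSS DeleteMultipleObjects API accepts at most 1000 keys per request."""
    for i in range(0, len(keys), size):
        yield keys[i:i + size]

keys = [f"obj-{i}" for i in range(2500)]
print([len(b) for b in chunks(keys)])  # → [1000, 1000, 500]
```

In the script the same effect is achieved incrementally: keys accumulate while iterating, and each full batch of 1000 is flushed with batch_delete_objects before iteration continues.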

Run

  1. In the script, set the exclude_buckets variable to the set of bucket names you do not want to delete.
  2. (Optional) Set the include_buckets variable to the set of bucket names you do want to delete. When include_buckets is set, exclude_buckets is ignored.
  3. Run the following commands to execute the script:
pip install oss2
pip install tqdm
export OSS_ACCESS_KEY_ID=<your_ak_id>
export OSS_ACCESS_KEY_SECRET=<your_ak_secret>
python3 delete_buckets.py
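The precedence between the two variables in steps 1 and 2 can be sketched as a standalone function (`select_buckets` is an illustrative name; the script implements the same rule inline):

```python
def select_buckets(all_buckets, include_buckets, exclude_buckets):
    # A non-empty include_buckets takes precedence and exclude_buckets is
    # ignored; otherwise everything except exclude_buckets is selected.
    if include_buckets:
        return [b for b in all_buckets if b in include_buckets]
    return [b for b in all_buckets if b not in exclude_buckets]

names = ['logs', 'backup', 'tmp']
print(select_buckets(names, set(), {'backup'}))       # → ['logs', 'tmp']
print(select_buckets(names, {'tmp'}, {'backup'}))     # → ['tmp']
```

Note that with the default empty include_buckets, every bucket in the account except those in exclude_buckets will be deleted — double-check exclude_buckets before running.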

Output

Total buckets listed: 19
Bucket Details (Name, Region, Status):
bucket1: oss-region1 (status: Not Specified)
bucket2: oss-region1 (status: Not Specified)
bucket3: oss-region1 (status: Not Specified)
bucket4: oss-region1 (status: Not Specified)
bucket5: oss-region2 (status: Not Specified)
bucket6: region1 (status: Not Specified)
bucket7: oss-region3 (status: Not Specified)
bucket8: oss-region3 (status: Not Specified)
bucket9: oss-region4 (status: Not Specified)
bucket10: oss-region2 (status: Not Specified)
bucket11: oss-region1 (status: Not Specified)
bucket12: oss-region5 (status: Not Specified)
bucket13: oss-region6 (status: Not Specified)
bucket14: oss-region3 (status: Not Specified)
bucket15: oss-region4 (status: Not Specified)
bucket16: oss-region1 (status: Not Specified)
bucket17: oss-region1 (status: Not Specified)
bucket18: oss-region4 (status: Not Specified)
bucket19: oss-region7 (status: Not Specified)
Total buckets successfully deleted: 0
Successfully Deleted Buckets:
Total buckets failed to delete: 19
Failed to Delete Buckets:
bucket1 (time taken: 0.082s, reason: Data lake storage is disabled. Please delete using the console.)
bucket2 (time taken: 0.116s, reason: Data lake storage is disabled. Please delete using the console.)
bucket3 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket4 (time taken: 0.070s, reason: Data lake storage is disabled. Please delete using the console.)
bucket5 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket6 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket7 (time taken: 1.126s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket8 (time taken: 1.024s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket9 (time taken: 0.213s, reason: Data lake storage is disabled. Please delete using the console.)
bucket10 (time taken: 0.108s, reason: Data lake storage is disabled. Please delete using the console.)
bucket11 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket12 (time taken: 0.156s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket13 (time taken: 1.533s, reason: Data lake storage is disabled. Please delete using the console.)
bucket14 (time taken: 1.021s, reason: Bucket still binding access points. Please delete access points, then delete the bucket.)
bucket15 (time taken: 0.212s, reason: Data lake storage is disabled. Please delete using the console.)
bucket16 (time taken: 0.083s, reason: Data lake storage is disabled. Please delete using the console.)
bucket17 (time taken: Skipped, reason: WORM enabled; not allowed to delete before expiration.)
bucket18 (time taken: 0.203s, reason: Data lake storage is disabled. Please delete using the console.)
bucket19 (time taken: 0.320s, reason: Data lake storage is disabled. Please delete using the console.)


