1. Dataset Formats
1.1 YOLO
Each image has a corresponding .txt file. Each line of that file describes one bounding box of the image (as many lines as there are boxes) and has five columns: the class index, the box center x_center, the box center y_center, the box width w, and the box height h. Note that these are relative values: x_center and w are pixel values divided by the image width, while y_center and h are pixel values divided by the image height.
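For example, a label file for an image with two objects could look like this (the values are illustrative):
0 0.716797 0.395833 0.216406 0.147222
1 0.287109 0.566667 0.255469 0.226852
A minimal sketch of the normalization, assuming a pixel-space box given as (xmin, ymin, xmax, ymax):
def xyxy_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    # convert an absolute pixel box into normalized YOLO (x_center, y_center, w, h)
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return x_center, y_center, w, h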
1.2 XML
<annotation>
<folder>17</folder> # folder containing the image
<filename>77258.bmp</filename> # image file name
<path>~/frcnn-image/61/ADAS/image/frcnn-image/17/77258.bmp</path>
<source> # information about the image source
<database>Unknown</database>
</source>
<size> # image size
<width>640</width>
<height>480</height>
<depth>3</depth>
</size>
<segmented>0</segmented> # whether a segmentation label exists
<object> # an object contained in the image
<name>car</name> # object class
<pose>Unspecified</pose> # object pose
<truncated>0</truncated> # whether the object is truncated, i.e. partially outside the image / not fully visible (>15%)
<difficult>0</difficult> # whether the object is hard to recognize, mainly objects whose class can only be judged from the surrounding context; they are annotated but usually ignored
<bndbox> # the object's bounding box
<xmin>2</xmin> # left
<ymin>156</ymin> # top
<xmax>111</xmax> # right
<ymax>259</ymax> # bottom
</bndbox>
</object>
</annotation>
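As a quick sanity check, a minimal sketch that reads the fields above with the standard library (77258.xml is just an example file name):
import xml.etree.ElementTree as ET

tree = ET.parse("77258.xml")  # example file name
root = tree.getroot()
for obj in root.findall("object"):
    name = obj.find("name").text
    box = obj.find("bndbox")
    xmin, ymin = int(box.find("xmin").text), int(box.find("ymin").text)
    xmax, ymax = int(box.find("xmax").text), int(box.find("ymax").text)
    print(name, xmin, ymin, xmax, ymax)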
1.3 JSON
"version": "4.6.0",
"flags": {},
"shapes": [
{
"label": "break",
"points": [
[
988.936170212766,
297.0
],
[
1053.8297872340424,
368.27659574468083
]
],
"group_id": null,
"shape_type": "rectangle",
"flags": {}
}
],
"imagePath": "20220617_blue_h_24.jpg",
points holds two diagonally opposite corners of the rectangle.
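A minimal sketch of reading such a labelme file (the file name is an example); since the two stored corners are not guaranteed to be ordered top-left then bottom-right, min/max is used to recover (xmin, ymin, xmax, ymax):
import json

with open("20220617_blue_h_24.json", "r", encoding="utf-8") as f:  # example file name
    anno = json.load(f)
for shape in anno["shapes"]:
    (x1, y1), (x2, y2) = shape["points"]
    xmin, xmax = min(x1, x2), max(x1, x2)
    ymin, ymax = min(y1, y2), max(y1, y2)
    print(shape["label"], xmin, ymin, xmax, ymax)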
1.4 COCO
The COCO format here is the object instances format of the standard COCO dataset. COCO stores box coordinates as (xmin, ymin, w, h), where (xmin, ymin) is the top-left corner of the box; all four values are absolute pixel values. The basic structure of a COCO annotation file is as follows:
{
"info": info, #描述数据集的相关信息,内部由字典组成
"licenses": [license], #列表形式,内部由字典组成
"images": [image], #描述图片信息,列表形式,内部由字典组成,字典数量为图片数量
"annotations": [annotation], #描述bounding box信息列表形式,内部由字典组成,字典数量为bounding box数量
"categories": [category] # 描述图片类别信息,列表形式 ,内部由字典组成,字典数量为类别个数
}
Unlike YOLO labels, COCO annotations are stored as a .json file, and the annotations of all images live in one single .json file. That file is the dictionary described above, which has five keys; the value of each key is detailed below:
info{
"year": int, #年份
"version": str, #数据集版本
"description": str, #数据集描述
"contributor": str, #数据集的提供者
"url": str, #数据集的下载地址
"date_created": datetime, #数据集的创建日期
}
license{
"id": int,
"name": str,
"url": str,
}
image{
"id": int, #图片标识,相当于图片的身份证
"width": int, #图片宽度
"height": int, #图片高度
"file_name": str, #图片名称,注意不是图片的路径,仅仅是名称
"license": int,
"flickr_url": str, #flicker网络地址
"coco_url": str, #网络地址路径
"date_captured": datetime, #图片获取日期
}
annotation{
"id": int, #bounding box标识,相当于bounding box身份证
"image_id": int, #图片标识,和image中的"id"对应
"category_id": int, #类别id
"segmentation": RLE or [polygon], #描述分割信息,iscrowd=0,则segmentation是polygon格式;iscrowd=1,则segmentation就是RLE格式
"area": float, #标注框面积
"bbox": [x,y,width,height], #标注框坐标信息,前文有描述
"iscrowd": 0 or 1, #是否有遮挡,无遮挡为0,有遮挡为1
}
category{
"id": int, #类别id,注意从1开始,而不是从0开始
"name": str, #类别名称
"supercategory": str, #该类别的超类是什么
}
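A minimal sketch that inspects such a file with the standard library (the file name is an example):
import json

with open("instances_val2017.json", "r") as f:  # example file name
    coco = json.load(f)
print(list(coco.keys()))  # expect: info, licenses, images, annotations, categories
print(len(coco["images"]), "images,", len(coco["annotations"]), "annotations")
ann = coco["annotations"][0]
x, y, w, h = ann["bbox"]  # absolute pixel values: top-left corner plus width/height
print(ann["image_id"], ann["category_id"], (x, y, x + w, y + h))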
1.5 VOC Dataset Directory Layout
The JPEGImages directory holds all images (despite the name, they do not have to be JPEGs), and Annotations holds the XML annotation files with the same base names as the images (so XML labels and images correspond one-to-one; parsing these annotation files is covered later). ImageSets contains the txt files that define the dataset splits, and Segmentation holds the splits used for segmentation.
Among the four folders under ImageSets, files marked with * exist once per class, but for generic object detection only train.txt and the other overall split files are needed. Each line of these text files contains one image id, so each file is simply a list of file names that defines one split (a small reading sketch follows the directory tree below).
So, for a generic object detection task (by object class), we usually only need the four overall split files under the Main folder.
└─VOC2007
├─JPEGImages
│ ├─1.jpg
│ ├─2.jpg
│ └─3.jpg
├─Annotations
│ ├─1.xml
│ ├─2.xml
│ └─3.xml
├─ImageSets
│ ├─Layout
│ │ ├─train.txt
│ │ ├─trainval.txt
│ │ ├─test.txt
│ │ └─val.txt
│ ├─Main
│ │ ├─*_train.txt
│ │ ├─*_trainval.txt
│ │ ├─*_test.txt
│ │ └─*_val.txt
│ ├─Action
│ │ ├─*_train.txt
│ │ ├─*_trainval.txt
│ │ ├─*_test.txt
│ │ └─*_val.txt
│ └─Segmentation
│ ├─train.txt
│ ├─trainval.txt
│ ├─test.txt
│ └─val.txt
├─SegmentationClass
└─SegmentationObject
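As mentioned above, a minimal sketch (paths are examples) that reads one of the overall split files under Main and builds the matching image/annotation paths:
import os

voc_root = "VOC2007"  # example path
with open(os.path.join(voc_root, "ImageSets", "Main", "train.txt")) as f:
    image_ids = [line.strip() for line in f if line.strip()]
for image_id in image_ids[:3]:
    jpg_path = os.path.join(voc_root, "JPEGImages", image_id + ".jpg")
    xml_path = os.path.join(voc_root, "Annotations", image_id + ".xml")
    print(jpg_path, xml_path)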
1.6 YOLO Dataset Directory Layout
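A typical layout (the folder names below follow the convention used by, e.g., Ultralytics YOLOv5 and are an example rather than a fixed requirement): images and labels sit in parallel folders with matching file names:
└─YOLO_ROOT
  ├─images
  │ ├─train
  │ │ ├─1.jpg
  │ │ └─2.jpg
  │ └─val
  │   └─3.jpg
  └─labels
    ├─train
    │ ├─1.txt
    │ └─2.txt
    └─val
      └─3.txt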
1.7 COCO Dataset Directory Layout
COCO_ROOT # root directory
├── annotations # json-format annotation files
│ ├── instances_train2017.json
│ └── instances_val2017.json
└── train2017 # image files
│ ├── 000000000001.jpg
│ ├── 000000000002.jpg
│ └── 000000000003.jpg
└── val2017
├── 000000000004.jpg
└── 000000000005.jpg
Here train2017 and val2017 are called set_name, and the json annotation files in the annotations folder must match them and start with instances_, i.e. instances_{set_name}.json.
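A minimal sketch of loading such a split with pycocotools (assuming the package is installed; COCO_ROOT and set_name are the examples above):
import os
from pycocotools.coco import COCO

set_name = "val2017"
coco_root = "COCO_ROOT"  # example path
coco = COCO(os.path.join(coco_root, "annotations", "instances_{}.json".format(set_name)))
img_ids = coco.getImgIds()
img_info = coco.loadImgs(img_ids[0])[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_info["id"]))
print(img_info["file_name"], len(anns), "boxes")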
2. Visualizing Datasets for Verification
2.1 YOLO
2.1.1 Object Detection
import cv2
import os
# Read the annotation info from the txt file
def read_list(txt_path):
pos = []
with open(txt_path, 'r') as file_to_read:
while True:
lines = file_to_read.readline() # read one whole line
if not lines:
break
# Split the whole line; if the separator is whitespace, split() needs no argument, if it is a comma, pass ','.
p_tmp = [float(i) for i in lines.split(' ')]
pos.append(p_tmp) # append the newly read data
# Efield.append(E_tmp)
pass
return pos
# Convert a txt row into a pixel box
def convert(size, box):
xmin = (box[1] - box[3] / 2.) * size[1]
xmax = (box[1] + box[3] / 2.) * size[1]
ymin = (box[2] - box[4] / 2.) * size[0]
ymax = (box[2] + box[4] / 2.) * size[0]
box = (int(xmin), int(ymin), int(xmax), int(ymax))
return box
def draw_box_in_single_image(image_path, txt_path):
# Read the image
image = cv2.imread(image_path)
pos = read_list(txt_path)
for i in range(len(pos)):
label = classes[int(str(int(pos[i][0])))]
print('label is '+label)
box = convert(image.shape, pos[i])
image = cv2.rectangle(image,(box[0], box[1]),(box[2],box[3]),colores[int(str(int(pos[i][0])))],2)
cv2.putText(image, label,(box[0],box[1]-2), 0, 1, colores[int(str(int(pos[i][0])))], thickness=2, lineType=cv2.LINE_AA)
cv2.imshow("images", image)
cv2.waitKey(0)
if __name__ == '__main__':
img_folder = "D:\datasets\YOLO/images"
img_list = os.listdir(img_folder)
img_list.sort()
label_folder = "D:\datasets\YOLO/labels"
label_list = os.listdir(label_folder)
label_list.sort()
classes = {0: "cat", 1: "dog"}
colores = [(0,0,255),(255,0,255)]
for i in range(len(img_list)):
image_path = img_folder + "\\" + img_list[i]
txt_path = label_folder + "\\" + label_list[i]
draw_box_in_single_image(image_path, txt_path)
2.1.2 Image Segmentation
Single image
import cv2
import numpy as np
if __name__ == '__main__':
pic_path = r"coco128-seg\images\train2017\000000000009.jpg"
txt_path = r"coco128-seg\labels\train2017\000000000009.txt"
img = cv2.imread(pic_path)
height, width, _ = img.shape
file_handle = open(txt_path)
cnt_info = file_handle.readlines()
new_cnt_info = [line_str.replace("\n", "").split(" ") for line_str in cnt_info]
# COCO class ids used in this example: 45 bowl, 49 orange, 50 broccoli
color_map = {"49": (0, 255, 255), "45": (255, 0, 255), "50": (255, 255, 0)}
for new_info in new_cnt_info:
s = []
for i in range(1, len(new_info), 2):
b = [float(tmp) for tmp in new_info[i:i + 2]]
s.append([int(b[0] * width), int(b[1] * height)])
cv2.polylines(img, [np.array(s, np.int32)], True, color_map.get(new_info[0]))
cv2.imshow('img', img)
cv2.waitKey()
Multiple images
import os, cv2
import numpy as np
img_base_path = '../dataset/custom_dataset/images/train'
lab_base_path = '../dataset/custom_dataset/labels/train'
label_path_list = [i.split('.')[0] for i in os.listdir(img_base_path)]
for path in label_path_list:
image = cv2.imread(f'{img_base_path}/{path}.jpg')
h, w, c = image.shape
label = np.zeros((h, w), dtype=np.uint8)
with open(f'{lab_base_path}/{path}.txt') as f:
mask = np.array(list(map(lambda x:np.array(x.strip().split()), f.readlines())))
for i in mask:
i = np.array(i, dtype=np.float32)[1:].reshape((-1, 2))
i[:, 0] *= w
i[:, 1] *= h
label = cv2.fillPoly(label, [np.array(i, dtype=np.int32)], color=255)
image = cv2.bitwise_and(image, image, mask=label)
cv2.imshow('Pic', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
2.2 JSON
# -*- coding: utf-8 -*-
"""
Created on Tue Mar 29 17:42:11 2022
@author: https://blog.csdn.net/suiyingy?type=blog
"""
import cv2
import os
import json
import shutil
import numpy as np
from pathlib import Path
from glob import glob
id2cls = {0: 'fishing', 1: "no_fishing"}
cls2id = {'fishing': 0, "no_fishing": 1}
id2color = {"fishing": (0, 255, 0), "no_fishing": (0, 255, 255)}
# Read images from paths that may contain non-ASCII characters
def cv_imread(filePath):
cv_img = cv2.imdecode(np.fromfile(filePath, dtype=np.uint8), flags=cv2.IMREAD_COLOR)
return cv_img
def get_labelme_info(label_file):
anno = json.load(open(label_file, "r", encoding="utf-8"))
shapes = anno['shapes']
image_path = os.path.basename(anno['imagePath'])
labels = []
for s in shapes:
pts = s['points']
x1, y1 = pts[0]
x2, y2 = pts[1]
color = id2color[s["label"]]
labels.append([color, x1, y1, x2, y2])
return labels, image_path
def vis_labelme(labelme_label_dir, save_dir='res/'):
labelme_label_dir = str(Path(labelme_label_dir)) + '/'
save_dir = str(Path(save_dir)) + '/'
if not os.path.exists(save_dir):
os.makedirs(save_dir)
json_files = glob(labelme_label_dir + '*.json')
for ijf, jf in enumerate(json_files):
print(ijf + 1, '/', len(json_files), jf)
filename = os.path.basename(jf).rsplit('.', 1)[0]
labels, image_path = get_labelme_info(jf)
image = cv_imread(labelme_label_dir + image_path)
for label in labels:
color = label[0]
x1, y1, x2, y2 = label[1:]
x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
cv2.rectangle(image, (x1, y1), (x2, y2), color, 3)
# Display the image
# cv2.imshow(filename, image)
# cv2.waitKey(0)
# Save the image (supports non-ASCII paths)
cv2.imencode(os.path.splitext(image_path)[-1], image)[1].tofile(save_dir + image_path)
print('Completed!')
if __name__ == '__main__':
root_dir = r'D:\Python\money\data\test'
save_dir = r'D:\Python\money\data\test/t'
vis_labelme(root_dir, save_dir)
2.3 XML
import xml.etree.ElementTree as ET # parse xml
import os
from PIL import Image, ImageDraw, ImageFont
def parse_rec(filename):
tree = ET.parse(filename) # parse the xml file
objects = []
img_dir = []
for xml_name in tree.findall('filename'):
img_path = os.path.join(pic_path, xml_name.text.replace(".jpeg", "").replace(".png",""))
img_dir.append(img_path)
for obj in tree.findall('object'):
obj_struct = {}
obj_struct['name'] = obj.find('name').text
obj_struct['pose'] = obj.find('pose').text
obj_struct['truncated'] = int(obj.find('truncated').text)
obj_struct['difficult'] = int(obj.find('difficult').text)
bbox = obj.find('bndbox')
obj_struct['bbox'] = [int(bbox.find('xmin').text),
int(bbox.find('ymin').text),
int(bbox.find('xmax').text),
int(bbox.find('ymax').text)]
objects.append(obj_struct)
return objects, img_dir
# Visualization
def visualise_gt(objects, img_dir):
for id, img_path in enumerate(img_dir):
img_path = img_path + '.jpg'
img = Image.open(img_path)
draw = ImageDraw.Draw(img)
for a in objects:
xmin = int(a['bbox'][0])
ymin = int(a['bbox'][1])
xmax = int(a['bbox'][2])
ymax = int(a['bbox'][3])
label = a['name']
draw.rectangle((xmin, ymin, xmax, ymax), fill=None, outline=(0, 255, 0), width=2)
draw.text((xmin - 10, ymin - 15), label, fill=(0, 255, 0), font=font) # draw the label text on the image with ImageDraw
img.show()
if __name__ == '__main__':
fontPath = "C:\Windows\Fonts\Consolas\consola.ttf" # 字体路径
# root = 'F:/dataset/AQM'
root = 'toukui'
ann_path = os.path.join(root, 'xml') # directory containing the xml files
pic_path = os.path.join(root, 'image') # directory containing the sample images
font = ImageFont.truetype(fontPath, 16)
for filename in os.listdir(ann_path):
xml_path = os.path.join(ann_path, filename)
object, img_dir = parse_rec(xml_path)
visualise_gt(object, img_dir)
2.4 COCO
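A minimal sketch (paths are examples) that draws the COCO boxes with pycocotools and OpenCV, in the same spirit as the visualizations above:
import os
import cv2
from pycocotools.coco import COCO

img_dir = "COCO_ROOT/val2017"  # example path
coco = COCO("COCO_ROOT/annotations/instances_val2017.json")  # example path
for img_id in coco.getImgIds()[:5]:
    info = coco.loadImgs(img_id)[0]
    image = cv2.imread(os.path.join(img_dir, info["file_name"]))
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
        x, y, w, h = map(int, ann["bbox"])  # absolute (xmin, ymin, w, h)
        name = coco.loadCats(ann["category_id"])[0]["name"]
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(image, name, (x, max(y - 2, 0)), 0, 0.6, (0, 255, 0), 2)
    cv2.imshow("coco", image)
    cv2.waitKey(0)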
3. XML to COCO
Only the third- and fourth-to-last lines need to be changed before use: the fourth-to-last line is the folder containing the xml files (mind the direction of the path slashes), and the third-to-last line is the name of the output json file: write val for a val split and train for a train split.
import xml.etree.ElementTree as ET
import os
import json
coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []
category_set = dict()
image_set = set()
category_item_id = -1
image_id = 20180000000
annotation_id = 0
def addCatItem(name):
global category_item_id
category_item = dict()
category_item['supercategory'] = 'none'
category_item_id += 1
category_item['id'] = category_item_id
category_item['name'] = name
coco['categories'].append(category_item)
category_set[name] = category_item_id
return category_item_id
def addImgItem(file_name, size):
global image_id
if file_name is None:
raise Exception('Could not find filename tag in xml file.')
if size['width'] is None:
raise Exception('Could not find width tag in xml file.')
if size['height'] is None:
raise Exception('Could not find height tag in xml file.')
image_id += 1
image_item = dict()
image_item['id'] = image_id
image_item['file_name'] = file_name
image_item['width'] = size['width']
image_item['height'] = size['height']
coco['images'].append(image_item)
image_set.add(file_name)
return image_id
def addAnnoItem(object_name, image_id, category_id, bbox):
global annotation_id
annotation_item = dict()
annotation_item['segmentation'] = []
seg = []
# bbox[] is x,y,w,h
# left_top
seg.append(bbox[0])
seg.append(bbox[1])
# left_bottom
seg.append(bbox[0])
seg.append(bbox[1] + bbox[3])
# right_bottom
seg.append(bbox[0] + bbox[2])
seg.append(bbox[1] + bbox[3])
# right_top
seg.append(bbox[0] + bbox[2])
seg.append(bbox[1])
annotation_item['segmentation'].append(seg)
annotation_item['area'] = bbox[2] * bbox[3]
annotation_item['iscrowd'] = 0
annotation_item['ignore'] = 0
annotation_item['image_id'] = image_id
annotation_item['bbox'] = bbox
annotation_item['category_id'] = category_id
annotation_id += 1
annotation_item['id'] = annotation_id
coco['annotations'].append(annotation_item)
def parseXmlFiles(xml_path):
for f in os.listdir(xml_path):
if not f.endswith('.xml'):
continue
bndbox = dict()
size = dict()
current_image_id = None
current_category_id = None
file_name = None
size['width'] = None
size['height'] = None
size['depth'] = None
xml_file = os.path.join(xml_path, f)
print(xml_file)
tree = ET.parse(xml_file)
root = tree.getroot()
if root.tag != 'annotation':
raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))
# elem is <folder>, <filename>, <size>, <object>
for elem in root:
current_parent = elem.tag
current_sub = None
object_name = None
if elem.tag == 'folder':
continue
if elem.tag == 'filename':
file_name = elem.text
if file_name in category_set:
raise Exception('file_name duplicated')
# add img item only after parse <size> tag
elif current_image_id is None and file_name is not None and size['width'] is not None:
if file_name not in image_set:
current_image_id = addImgItem(file_name, size)
print('add image with {} and {}'.format(file_name, size))
else:
raise Exception('duplicated image: {}'.format(file_name))
# subelem is <width>, <height>, <depth>, <name>, <bndbox>
for subelem in elem:
bndbox['xmin'] = None
bndbox['xmax'] = None
bndbox['ymin'] = None
bndbox['ymax'] = None
current_sub = subelem.tag
if current_parent == 'object' and subelem.tag == 'name':
object_name = subelem.text
if object_name not in category_set:
current_category_id = addCatItem(object_name)
else:
current_category_id = category_set[object_name]
elif current_parent == 'size':
if size[subelem.tag] is not None:
raise Exception('xml structure broken at size tag.')
size[subelem.tag] = int(subelem.text)
# option is <xmin>, <ymin>, <xmax>, <ymax>, when subelem is <bndbox>
for option in subelem:
if current_sub == 'bndbox':
if bndbox[option.tag] is not None:
raise Exception('xml structure corrupted at bndbox tag.')
bndbox[option.tag] = int(option.text)
# only after parse the <object> tag
if bndbox['xmin'] is not None:
if object_name is None:
raise Exception('xml structure broken at bndbox tag')
if current_image_id is None:
raise Exception('xml structure broken at bndbox tag')
if current_category_id is None:
raise Exception('xml structure broken at bndbox tag')
bbox = []
# x
bbox.append(bndbox['xmin'])
# y
bbox.append(bndbox['ymin'])
# w
bbox.append(bndbox['xmax'] - bndbox['xmin'])
# h
bbox.append(bndbox['ymax'] - bndbox['ymin'])
print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id,
bbox))
addAnnoItem(object_name, current_image_id, current_category_id, bbox)
if __name__ == '__main__':
xml_path = 'C:/Users/rexmatken/Desktop/nanodettrain/T/val/xml/' # directory containing the xml files
json_file = './val.json' # the json file to generate
parseXmlFiles(xml_path) # only these two parameters need changing
json.dump(coco, open(json_file, 'w'))
4. VOC and YOLO
4.1 VOC to YOLO
The VOC dataset layout is as follows:
# VOC layout
--Root
--Annotations
--videoDir1
--0.xml
--1.xml
--videoDir2
--0.xml
--1.xml
--Images
--videoDir1
--0.jpg
--1.jpg
--videoDir2
--0.jpg
--1.jpg
pascal_voc_classes.json
{
"aeroplane": 1,
"bicycle": 2,
"bird": 3,
"boat": 4,
"bottle": 5,
"bus": 6,
"car": 7,
"cat": 8,
"chair": 9,
"cow": 10,
"diningtable": 11,
"dog": 12,
"horse": 13,
"motorbike": 14,
"person": 15,
"pottedplant": 16,
"sheep": 17,
"sofa": 18,
"train": 19,
"tvmonitor": 20
}
Conversion code
"""
This script does two things:
1. Convert VOC annotations (.xml) into YOLO labels (.txt) and copy the image files into the corresponding folders.
2. Generate the names file (my_data_label.names) from the json label file.
"""
import os
from tqdm import tqdm
from lxml import etree
import json
import shutil
# VOC dataset root directory and version
voc_root = "/data/VOCdevkit"
voc_version = "VOC2012"
# txt files of the train and val splits to convert
train_txt = "train.txt"
val_txt = "val.txt"
# directory where the converted files are saved
save_file_root = "./my_yolo_dataset"
# json file mapping label names to ids
label_json_path = './data/pascal_voc_classes.json'
# build the voc images, xml and ImageSets txt paths
voc_images_path = os.path.join(voc_root, voc_version, "JPEGImages")
voc_xml_path = os.path.join(voc_root, voc_version, "Annotations")
train_txt_path = os.path.join(voc_root, voc_version, "ImageSets", "Main", train_txt)
val_txt_path = os.path.join(voc_root, voc_version, "ImageSets", "Main", val_txt)
# check that the files/folders exist
assert os.path.exists(voc_images_path), "VOC images path not exist..."
assert os.path.exists(voc_xml_path), "VOC xml path not exist..."
assert os.path.exists(train_txt_path), "VOC train txt file not exist..."
assert os.path.exists(val_txt_path), "VOC val txt file not exist..."
assert os.path.exists(label_json_path), "label_json_path does not exist..."
if os.path.exists(save_file_root) is False:
os.makedirs(save_file_root)
def parse_xml_to_dict(xml):
"""
Parse an xml tree into a dict, following tensorflow's recursive_parse_xml_to_dict.
Args:
xml: xml tree obtained by parsing XML file contents using lxml.etree
Returns:
Python dictionary holding XML contents.
"""
if len(xml) == 0: # leaf node: return its tag and text directly
return {xml.tag: xml.text}
result = {}
for child in xml:
child_result = parse_xml_to_dict(child) # recursively parse child elements
if child.tag != 'object':
result[child.tag] = child_result[child.tag]
else:
if child.tag not in result: # there may be multiple object tags, so collect them in a list
result[child.tag] = []
result[child.tag].append(child_result[child.tag])
return {xml.tag: result}
def translate_info(file_names: list, save_root: str, class_dict: dict, train_val='train'):
"""
Convert the information in the xml files into the txt files used by YOLO.
:param file_names:
:param save_root:
:param class_dict:
:param train_val:
:return:
"""
save_txt_path = os.path.join(save_root, train_val, "labels")
if os.path.exists(save_txt_path) is False:
os.makedirs(save_txt_path)
save_images_path = os.path.join(save_root, train_val, "images")
if os.path.exists(save_images_path) is False:
os.makedirs(save_images_path)
for file in tqdm(file_names, desc="translate {} file...".format(train_val)):
# check that the image file exists
img_path = os.path.join(voc_images_path, file + ".jpg")
assert os.path.exists(img_path), "file:{} not exist...".format(img_path)
# check that the xml file exists
xml_path = os.path.join(voc_xml_path, file + ".xml")
assert os.path.exists(xml_path), "file:{} not exist...".format(xml_path)
# read xml
with open(xml_path) as fid:
xml_str = fid.read()
xml = etree.fromstring(xml_str)
data = parse_xml_to_dict(xml)["annotation"]
img_height = int(data["size"]["height"])
img_width = int(data["size"]["width"])
# write object info into txt
assert "object" in data.keys(), "file: '{}' lack of object key.".format(xml_path)
if len(data["object"]) == 0:
# skip this sample if the xml file contains no objects
print("Warning: in '{}' xml, there are no objects.".format(xml_path))
continue
with open(os.path.join(save_txt_path, file + ".txt"), "w") as f:
for index, obj in enumerate(data["object"]):
# get the box info of each object
xmin = float(obj["bndbox"]["xmin"])
xmax = float(obj["bndbox"]["xmax"])
ymin = float(obj["bndbox"]["ymin"])
ymax = float(obj["bndbox"]["ymax"])
class_name = obj["name"]
class_index = class_dict[class_name] - 1 # object ids start from 0
# extra check: some annotations have w or h equal to 0, which would make the regression loss NaN
if xmax <= xmin or ymax <= ymin:
print("Warning: in '{}' xml, there are some bbox w/h <=0".format(xml_path))
continue
# convert the box into yolo format
xcenter = xmin + (xmax - xmin) / 2
ycenter = ymin + (ymax - ymin) / 2
w = xmax - xmin
h = ymax - ymin
# absolute to relative coordinates, keeping 6 decimal places
xcenter = round(xcenter / img_width, 6)
ycenter = round(ycenter / img_height, 6)
w = round(w / img_width, 6)
h = round(h / img_height, 6)
info = [str(i) for i in [class_index, xcenter, ycenter, w, h]]
if index == 0:
f.write(" ".join(info))
else:
f.write("\n" + " ".join(info))
# copy image into save_images_path
path_copy_to = os.path.join(save_images_path, img_path.split(os.sep)[-1])
if os.path.exists(path_copy_to) is False:
shutil.copyfile(img_path, path_copy_to)
def create_class_names(class_dict: dict):
keys = class_dict.keys()
with open("./data/my_data_label.names", "w") as w:
for index, k in enumerate(keys):
if index + 1 == len(keys):
w.write(k)
else:
w.write(k + "\n")
def main():
# read class_indict
json_file = open(label_json_path, 'r')
class_dict = json.load(json_file)
# read all lines of train.txt, dropping empty lines
with open(train_txt_path, "r") as r:
train_file_names = [i for i in r.read().splitlines() if len(i.strip()) > 0]
# convert the voc info to yolo and copy images into the corresponding folder
translate_info(train_file_names, save_file_root, class_dict, "train")
# read all lines of val.txt, dropping empty lines
with open(val_txt_path, "r") as r:
val_file_names = [i for i in r.read().splitlines() if len(i.strip()) > 0]
# convert the voc info to yolo and copy images into the corresponding folder
translate_info(val_file_names, save_file_root, class_dict, "val")
# create the my_data_label.names file
create_class_names(class_dict)
if __name__ == "__main__":
main()
After running, the generated YOLO data looks like this:
--Root
--Annotations
--videoDir1
--0.xml
--1.xml
--videoDir2
--0.xml
--1.xml
--Images
--videoDir1
--0.jpg
--1.jpg
--videoDir2
--0.jpg
--1.jpg
--labels
--videoDir1
--0.txt
--1.txt
--videoDir2
--0.txt
--1.txt
--train.txt
--test.txt
import os
import xml.etree.ElementTree as ET
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
"horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
# Convert x1, y1, x2, y2 into the x, y, w, h format required by yolov5
def xyxy2xywh(size, box):
dw = 1. / size[0]
dh = 1. / size[1]
x = (box[0] + box[2]) / 2 * dw
y = (box[1] + box[3]) / 2 * dh
w = (box[2] - box[0]) * dw
h = (box[3] - box[1]) * dh
return (x, y, w, h) # all returned values are normalized
def voc2yolo(path):
# print to check that the path is correct
print(len(os.listdir(path)))
# iterate over every xml file
for file in os.listdir(path):
# full path of the xml file; os.path.join(path, file) is used to build it safely
label_file = os.path.join(path,file)
# the txt file to write; here it goes under voc2007/labels/
# note: the labels folder must already exist; create it first, otherwise this raises an error
out_file = open(path.replace('Annotations', 'labels') + file.replace('xml', 'txt'), 'w')
# print(label_file)
# start parsing the xml file
tree = ET.parse(label_file)
root = tree.getroot()
size = root.find('size') # the image shape
w = int(size.find('width').text)
h = int(size.find('height').text)
for obj in root.iter('object'):
difficult = obj.find('difficult').text
cls = obj.find('name').text
if cls not in classes or int(difficult) == 1:
continue
# convert the class name to its index
cls_id = classes.index(cls)
# get the bounding box element
bndbox = obj.find('bndbox')
# the xml stores x1, y1, x2, y2
box = [float(bndbox.find('xmin').text), float(bndbox.find('ymin').text), float(bndbox.find('xmax').text),
float(bndbox.find('ymax').text)]
# convert x1, y1, x2, y2 into the x_center, y_center, w, h format required by yolov5
bbox = xyxy2xywh((w, h), box)
# write to the output file in the format: id x y w h
out_file.write(str(cls_id) + " " + " ".join(str(x) for x in bbox) + '\n')
if __name__ == '__main__':
# change this to your own dataset path
path = '/home/lqs/Downloads/dataset/VOCdevkit/VOC2007/Annotations/'
voc2yolo(path)
4.2 YOLO to VOC
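Below is a minimal sketch of the reverse conversion, assuming a flat images/ and labels/ layout and a class list (all paths and names are examples):
import os
import cv2
from xml.dom.minidom import Document

classes = ["cat", "dog"]        # example class list, index = YOLO class id
img_dir = "yolo/images"         # example path
txt_dir = "yolo/labels"         # example path
xml_dir = "voc/Annotations"     # example output path
os.makedirs(xml_dir, exist_ok=True)

for txt_name in os.listdir(txt_dir):
    if not txt_name.endswith(".txt"):
        continue
    stem = os.path.splitext(txt_name)[0]
    img_path = os.path.join(img_dir, stem + ".jpg")
    img_h, img_w = cv2.imread(img_path).shape[:2]

    doc = Document()
    anno = doc.createElement("annotation")
    doc.appendChild(anno)

    def add(parent, tag, text):
        # helper that appends <tag>text</tag> under parent
        node = doc.createElement(tag)
        node.appendChild(doc.createTextNode(str(text)))
        parent.appendChild(node)
        return node

    add(anno, "filename", stem + ".jpg")
    size = doc.createElement("size")
    anno.appendChild(size)
    add(size, "width", img_w)
    add(size, "height", img_h)
    add(size, "depth", 3)

    with open(os.path.join(txt_dir, txt_name)) as f:
        for line in f:
            if not line.strip():
                continue
            cls_id, xc, yc, w, h = line.split()
            # denormalize back to pixel coordinates
            xc, yc = float(xc) * img_w, float(yc) * img_h
            w, h = float(w) * img_w, float(h) * img_h
            obj = doc.createElement("object")
            anno.appendChild(obj)
            add(obj, "name", classes[int(cls_id)])
            add(obj, "pose", "Unspecified")
            add(obj, "truncated", 0)
            add(obj, "difficult", 0)
            box = doc.createElement("bndbox")
            obj.appendChild(box)
            add(box, "xmin", int(xc - w / 2))
            add(box, "ymin", int(yc - h / 2))
            add(box, "xmax", int(xc + w / 2))
            add(box, "ymax", int(yc + h / 2))

    with open(os.path.join(xml_dir, stem + ".xml"), "w", encoding="utf-8") as f:
        f.write(doc.toprettyxml(indent="    "))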
5. JSON and TXT
5.1 JSON to TXT
import os
import numpy as np
import json
from glob import glob
import cv2
from sklearn.model_selection import train_test_split
from shutil import copyfile
import argparse
obj_classes = []
# Convert Labelme coordinates to YOLO v5 coordinates
def convert(size, box):
dw = 1. / (size[0])
dh = 1. / (size[1])
x = (box[0] + box[1]) / 2.0 - 1
y = (box[2] + box[3]) / 2.0 - 1
w = box[1] - box[0]
h = box[3] - box[2]
x = x * dw
w = w * dw
y = y * dh
h = h * dh
return (x, y, w, h)
# Convert the samples
def convertToYolo5(fileList, output_dir, labelImg_path):
# create the parent directory for this split
if not os.path.exists(output_dir):
os.makedirs(output_dir)
# create the images and labels subdirectories for this split
yolo_images_dir = '{}/images/'.format(output_dir)
yolo_labels_dir = '{}/labels/'.format(output_dir)
if not os.path.exists(yolo_images_dir):
os.makedirs(yolo_images_dir)
if not os.path.exists(yolo_labels_dir):
os.makedirs(yolo_labels_dir)
# convert the samples one image at a time
for num,json_file_ in enumerate(fileList):
# print('fileList',fileList)
# 1. Generate the YOLO sample image
# build the full path of the image corresponding to the json file
imagePath = labelImg_path + '/' + json_file_ + ".jpg"
print('name',imagePath, json_file_)
# print('labelme_path', labelme_path)
# build the full path of the YOLO image file
yolo_image_file_path = yolo_images_dir + "{}.jpg".format(json_file_)
print('yolo_image_file_path', yolo_image_file_path)
# copy the sample image
copyfile(imagePath, yolo_image_file_path)
# 2. Generate the YOLO sample label
# build the full path of the json label file
labelme_path_ = labelImg_path.split('image')[0]
json_filename = labelme_path_ + 'json'+'\\' + json_file_ + ".json"
# build the full path of the YOLO label file
yolo_label_file_path = yolo_labels_dir + "{}.txt".format(json_file_)
print(yolo_label_file_path)
# create the new YOLO label file
yolo_label_file = open(yolo_label_file_path, 'w')
# load the json label file of the current image
json_obj = json.load(open(json_filename, "r", encoding="utf-8"))
# get the height and width of the current image
height = json_obj['imageHeight']
width = json_obj['imageWidth']
# read the shapes info of every object in the json file
for shape in json_obj["shapes"]:
# the object's class label
label = shape["label"]
if label not in ['car_red','car_orange','car_green','car_blue','car_black','car_white','car_purple','car_grey','car_silvery','car grey','car orange','car black','car','car blue','car purple','car white','car silvery','car green','car_yellow','car red']:
if label == 'Fengtain' or label == 'FengTain' or label == 'FengtTian' or label == 'Fengtian':
label = 'FengTian'
if (label not in obj_classes):
obj_classes.append(label)
# the object's coordinates in the shape
if (shape["shape_type"] == 'rectangle'):
points = np.array(shape["points"])
xmin = min(points[:, 0]) if min(points[:, 0]) > 0 else 0
xmax = max(points[:, 0]) if max(points[:, 0]) > 0 else 0
ymin = min(points[:, 1]) if min(points[:, 1]) > 0 else 0
ymax = max(points[:, 1]) if max(points[:, 1]) > 0 else 0
# sanity-check the coordinates
if xmax <= xmin:
pass
elif ymax <= ymin:
pass
else:
# convert Labelme coordinates to YOLO v5 coordinates
bbox_labelme_float = (float(xmin), float(xmax), float(ymin), float(ymax))
bbox_yolo_normalized = convert((width, height), bbox_labelme_float)
# convert the class label into a class id
class_id = obj_classes.index(label)
# write the YOLO v5 label line
yolo_label_file.write(str(class_id) + " " + " ".join([str(a) for a in bbox_yolo_normalized]) + '\n')
yolo_label_file.close()
# except Exception as e:
# print(e)
# error +=1
# print('{} bad labels: {}'.format(labelImg_path, error))
def check_output_directory(output=""):
# create the directory for the output
save_path = output + '/'
is_exists = os.path.exists(save_path)
if is_exists:
print('Warning: path of %s already exist, please remove it firstly by manual' % save_path)
# shutil.rmtree(save_path) # avoid accidentally deleting existing files
return ""
# print('create output path %s' % save_path)
os.makedirs(save_path)
return save_path
def create_yolo_dataset_cfg(output_dir='', label_class=[]):
# create the data.yaml file
data_cfg_file = open(output_dir + '/data.yaml', 'w')
# write the file contents
data_cfg_file.write('train: ../train/images\n')
data_cfg_file.write("val: ../valid/images\n")
data_cfg_file.write("test: ../test/images\n")
data_cfg_file.write("\n")
data_cfg_file.write("# Classes\n")
data_cfg_file.write("nc: %s\n" % len(label_class))
data_cfg_file.write('names: ')
i = 0
for label in label_class:
if (i == 0):
data_cfg_file.write("[")
else:
data_cfg_file.write(", ")
if (i % 10 == 0):
data_cfg_file.write("\n ")
i += 1
data_cfg_file.write("'" + label + "'")
data_cfg_file.write('] # class names')
data_cfg_file.close()
# close the file
def labelImg2yolo(input='', output=''):
outputdir_root = output + '/'
labelImg_path = input
print(labelImg_path)
labelImg_path_imagepath = labelImg_path + '\\' + 'image'
print(labelImg_path_imagepath)
json_path = labelImg_path+'\\'+'json'
print(json_path)
print("*"*100)
# 1. get the full paths of all json label files in the input directory
files = glob(json_path + "/*.json")
print(files)
# 2. get the short (stem) names of all label files
files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
print(files)
# 3. randomly split the dataset to get the training samples
train_files, valid_test_files = train_test_split(files, test_size=0.2, random_state=55)
# 4. randomly split the remainder into validation and test samples
valid_files, test_files = train_test_split(valid_test_files, test_size=0.1, random_state=55)
# 5. build the YOLO dataset directories
train_path = outputdir_root + '/train'
valid_path = outputdir_root + '/valid'
test_path = outputdir_root + '/test'
# 6. generate the YOLO train/valid/test sets: images + labels
convertToYolo5(train_files, train_path, labelImg_path_imagepath)
convertToYolo5(valid_files, valid_path, labelImg_path_imagepath)
convertToYolo5(test_files, test_path, labelImg_path_imagepath)
print("*"*100)
# 7. create the YOLO dataset config file
create_yolo_dataset_cfg(output, obj_classes)
labelme_path = input
print("Classes:", obj_classes)
print('Finished, output path =', outputdir_root)
return 0
def parse_opt():
# define argparse object
parser = argparse.ArgumentParser()
# add argument for command line
parser.add_argument('--input', type=str, default=r'D:\datasets\cover_datasets_test\img_and_json_to_txt',help='The input LabelImg directory')
parser.add_argument('--output', type=str,default=r'D:\datasets\cover_datasets_test\img_and_json_to_txt/txt', help='The output YOLO V5 directory')
# parse arges from command line
opt = parser.parse_args()
print("input =", opt.input)
print("output =", opt.output)
# return opt
return opt
def main(opt):
labelImg2yolo(**vars(opt))
if __name__ == '__main__':
opt = parse_opt()
main(opt)
Output (which can be used directly to train YOLOv5):
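Based on the script above, the output directory looks roughly like this (data.yaml holds the train/valid/test paths, nc and the class names):
output
├─data.yaml
├─train
│ ├─images
│ └─labels
├─valid
│ ├─images
│ └─labels
└─test
  ├─images
  └─labels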
5.2 TXT to JSON
import os
import json
import base64
import cv2
def read_txt_file(txt_file):
with open(txt_file, 'r') as f:
lines = f.readlines()
data = []
for line in lines:
line = line.strip().split()
class_name = line[0]
bbox = [coord for coord in line[1:]]
data.append({'class_name': class_name, 'bbox': bbox})
return data
def convert_to_labelme(data, image_path, image_size):
labelme_data = {
'version': '4.5.6',
'flags': {},
'shapes': [],
'imagePath': json_image_path,
'imageData': None,
'imageHeight': image_size[0],
'imageWidth': image_size[1]
}
for obj in data:
dx = obj['bbox'][0]
dy = obj['bbox'][1]
dw = obj['bbox'][2]
dh = obj['bbox'][3]
w = eval(dw) * image_size[1]
h = eval(dh) * image_size[0]
center_x = eval(dx) * image_size[1]
center_y = eval(dy) * image_size[0]
x1 = center_x - w/2
y1 = center_y - h/2
x2 = center_x + w/2
y2 = center_y + h/2
if obj['class_name'] == '0': # pick the label name for this class id to write into the json
label = classes["0"]
else:
label = classes["1"]
shape_data = {
'label': label,
'points': [[x1, y1], [x2, y2]],
'group_id': None,
'shape_type': 'rectangle',
'flags': {}
}
labelme_data['shapes'].append(shape_data)
return labelme_data
def save_labelme_json(labelme_data, image_path, output_file):
with open(image_path, 'rb') as f:
image_data = f.read()
labelme_data['imageData'] = base64.b64encode(image_data).decode('utf-8')
with open(output_file, 'w') as f:
json.dump(labelme_data, f, indent=4)
# set the input and output folder paths
txt_folder = r"D:\datasets\test_mission\jiguimen_txt" # folder containing the YOLO txt label files
output_folder = r"D:\datasets\test_mission\jihuimen_json" # output folder for the LabelMe json files
img_folder = r"D:\datasets\test_mission\jiguimen" # folder containing the corresponding images
classes = {"0": str('door_open'), "1": str("door_close")}
# create the output folder
if not os.path.exists(output_folder):
os.makedirs(output_folder)
# iterate over all files in the txt folder
for filename in os.listdir(txt_folder):
if filename.endswith('.txt'):
# build the output file name
output_filename = os.path.splitext(filename)[0] + '.json'
# read the txt file
txt_file = os.path.join(txt_folder, filename)
data = read_txt_file(txt_file)
# set the image path and size
image_filename = os.path.splitext(filename)[0] + '.jpg' # the image has the same stem as the txt file, with a .jpg extension
image_path = os.path.join(img_folder, image_filename)
# image_size = (1280, 720) # adjust as needed
json_image_path = image_path.split('\\')[-1]
image_size = cv2.imread(image_path).shape
# convert to LabelMe format
labelme_data = convert_to_labelme(data, image_path, image_size)
# save as a LabelMe JSON file
output_file = os.path.join(output_folder, output_filename)
save_labelme_json(labelme_data, image_path, output_file)
6. VOC to COCO
The full code:
import xml.etree.ElementTree as ET
import os
import json
coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []
category_set = dict()
image_set = set()
category_item_id = -1
image_id = 20180000000
annotation_id = 0
def addCatItem(name):
global category_item_id
category_item = dict()
category_item['supercategory'] = 'none'
category_item_id += 1
category_item['id'] = category_item_id
category_item['name'] = name
coco['categories'].append(category_item)
category_set[name] = category_item_id
return category_item_id
def addImgItem(file_name, size):
global image_id
if file_name is None:
raise Exception('Could not find filename tag in xml file.')
if size['width'] is None:
raise Exception('Could not find width tag in xml file.')
if size['height'] is None:
raise Exception('Could not find height tag in xml file.')
image_id += 1
image_item = dict()
image_item['id'] = image_id
image_item['file_name'] = file_name
image_item['width'] = size['width']
image_item['height'] = size['height']
coco['images'].append(image_item)
image_set.add(file_name)
return image_id
def addAnnoItem(object_name, image_id, category_id, bbox):
global annotation_id
annotation_item = dict()
annotation_item['segmentation'] = []
seg = []
# bbox[] is x,y,w,h
# left_top
seg.append(bbox[0])
seg.append(bbox[1])
# left_bottom
seg.append(bbox[0])
seg.append(bbox[1] + bbox[3])
# right_bottom
seg.append(bbox[0] + bbox[2])
seg.append(bbox[1] + bbox[3])
# right_top
seg.append(bbox[0] + bbox[2])
seg.append(bbox[1])
annotation_item['segmentation'].append(seg)
annotation_item['area'] = bbox[2] * bbox[3]
annotation_item['iscrowd'] = 0
annotation_item['ignore'] = 0
annotation_item['image_id'] = image_id
annotation_item['bbox'] = bbox
annotation_item['category_id'] = category_id
annotation_id += 1
annotation_item['id'] = annotation_id
coco['annotations'].append(annotation_item)
def _read_image_ids(image_sets_file):
ids = []
with open(image_sets_file) as f:
for line in f:
ids.append(line.rstrip())
return ids
"""通过txt文件生成"""
# split ='train' 'va' 'trainval' 'test'
def parseXmlFiles_by_txt(data_dir, json_save_path, split='train'):
print("hello")
labelfile = split + ".txt"
image_sets_file = data_dir + "/ImageSets/Main/" + labelfile
ids = _read_image_ids(image_sets_file)
for _id in ids:
xml_file = data_dir + f"/Annotations/{_id}.xml"
bndbox = dict()
size = dict()
current_image_id = None
current_category_id = None
file_name = None
size['width'] = None
size['height'] = None
size['depth'] = None
tree = ET.parse(xml_file)
root = tree.getroot()
if root.tag != 'annotation':
raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))
# elem is <folder>, <filename>, <size>, <object>
for elem in root:
current_parent = elem.tag
current_sub = None
object_name = None
if elem.tag == 'folder':
continue
if elem.tag == 'filename':
file_name = elem.text
if file_name in category_set:
raise Exception('file_name duplicated')
# add img item only after parse <size> tag
elif current_image_id is None and file_name is not None and size['width'] is not None:
if file_name not in image_set:
current_image_id = addImgItem(file_name, size)
print('add image with {} and {}'.format(file_name, size))
else:
raise Exception('duplicated image: {}'.format(file_name))
# subelem is <width>, <height>, <depth>, <name>, <bndbox>
for subelem in elem:
bndbox['xmin'] = None
bndbox['xmax'] = None
bndbox['ymin'] = None
bndbox['ymax'] = None
current_sub = subelem.tag
if current_parent == 'object' and subelem.tag == 'name':
object_name = subelem.text
if object_name not in category_set:
current_category_id = addCatItem(object_name)
else:
current_category_id = category_set[object_name]
elif current_parent == 'size':
if size[subelem.tag] is not None:
raise Exception('xml structure broken at size tag.')
size[subelem.tag] = int(subelem.text)
# option is <xmin>, <ymin>, <xmax>, <ymax>, when subelem is <bndbox>
for option in subelem:
if current_sub == 'bndbox':
if bndbox[option.tag] is not None:
raise Exception('xml structure corrupted at bndbox tag.')
bndbox[option.tag] = int(option.text)
# only after parse the <object> tag
if bndbox['xmin'] is not None:
if object_name is None:
raise Exception('xml structure broken at bndbox tag')
if current_image_id is None:
raise Exception('xml structure broken at bndbox tag')
if current_category_id is None:
raise Exception('xml structure broken at bndbox tag')
bbox = []
# x
bbox.append(bndbox['xmin'])
# y
bbox.append(bndbox['ymin'])
# w
bbox.append(bndbox['xmax'] - bndbox['xmin'])
# h
bbox.append(bndbox['ymax'] - bndbox['ymin'])
print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id,
bbox))
addAnnoItem(object_name, current_image_id, current_category_id, bbox)
json.dump(coco, open(json_save_path, 'w'))
"""直接从xml文件夹中生成"""
def parseXmlFiles(xml_path, json_save_path):
for f in os.listdir(xml_path):
if not f.endswith('.xml'):
continue
bndbox = dict()
size = dict()
current_image_id = None
current_category_id = None
file_name = None
size['width'] = None
size['height'] = None
size['depth'] = None
xml_file = os.path.join(xml_path, f)
print(xml_file)
tree = ET.parse(xml_file)
root = tree.getroot()
if root.tag != 'annotation':
raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))
# elem is <folder>, <filename>, <size>, <object>
for elem in root:
current_parent = elem.tag
current_sub = None
object_name = None
if elem.tag == 'folder':
continue
if elem.tag == 'filename':
file_name = elem.text
if file_name in category_set:
raise Exception('file_name duplicated')
# add img item only after parse <size> tag
elif current_image_id is None and file_name is not None and size['width'] is not None:
if file_name not in image_set:
current_image_id = addImgItem(file_name, size)
print('add image with {} and {}'.format(file_name, size))
else:
raise Exception('duplicated image: {}'.format(file_name))
# subelem is <width>, <height>, <depth>, <name>, <bndbox>
for subelem in elem:
bndbox['xmin'] = None
bndbox['xmax'] = None
bndbox['ymin'] = None
bndbox['ymax'] = None
current_sub = subelem.tag
if current_parent == 'object' and subelem.tag == 'name':
object_name = subelem.text
if object_name not in category_set:
current_category_id = addCatItem(object_name)
else:
current_category_id = category_set[object_name]
elif current_parent == 'size':
if size[subelem.tag] is not None:
raise Exception('xml structure broken at size tag.')
size[subelem.tag] = int(subelem.text)
# option is <xmin>, <ymin>, <xmax>, <ymax>, when subelem is <bndbox>
for option in subelem:
if current_sub == 'bndbox':
if bndbox[option.tag] is not None:
raise Exception('xml structure corrupted at bndbox tag.')
bndbox[option.tag] = int(option.text)
# only after parse the <object> tag
if bndbox['xmin'] is not None:
if object_name is None:
raise Exception('xml structure broken at bndbox tag')
if current_image_id is None:
raise Exception('xml structure broken at bndbox tag')
if current_category_id is None:
raise Exception('xml structure broken at bndbox tag')
bbox = []
# x
bbox.append(bndbox['xmin'])
# y
bbox.append(bndbox['ymin'])
# w
bbox.append(bndbox['xmax'] - bndbox['xmin'])
# h
bbox.append(bndbox['ymax'] - bndbox['ymin'])
print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id,
bbox))
addAnnoItem(object_name, current_image_id, current_category_id, bbox)
json.dump(coco, open(json_save_path, 'w'))
if __name__ == '__main__':
# generate from a split txt file
# voc_data_dir="E:/VOCdevkit/VOC2007"
# json_save_path="E:/VOCdevkit/voc2007trainval.json"
# parseXmlFiles_by_txt(voc_data_dir,json_save_path,"trainval")
# generate from a folder of xml files
ann_path = "G:\dataset\VOC2007/train_Annotations"
json_save_path = "G:\dataset\VOC2007/train_Annotations.json"
parseXmlFiles(ann_path, json_save_path)
The resulting COCO layout:
--Root
--Annotations
test.json
train.json
val.json
--Images
--videoDir1
--0.jpg
--1.jpg
--videoDir2
--0.jpg
--1.jpg
7. COCO to VOC
# encoding:utf-8
import os
import json
import cv2
from lxml import etree
import xml.etree.cElementTree as ET
import time
import pandas as pd
from tqdm import tqdm
import json
import argparse
def coco2voc(anno, xml_dir):
with open(anno, 'r', encoding='utf-8') as load_f:
f = json.load(load_f)
imgs = f['images']
df_cate = pd.DataFrame(f['categories'])
_ = df_cate.sort_values(["id"], ascending=True)
df_anno = pd.DataFrame(f['annotations'])
categories = dict(zip(df_cate.id.values, df_cate.name.values))
for i in tqdm(range(len(imgs))):
xml_content = []
file_name = imgs[i]['file_name']
height = imgs[i]['height']
img_id = imgs[i]['id']
width = imgs[i]['width']
xml_content.append("<annotation>")
xml_content.append(" <folder>VOC2007</folder>")
xml_content.append(" <filename>" + file_name + "</filename>")
xml_content.append(" <size>")
xml_content.append(" <width>" + str(width) + "</width>")
xml_content.append(" <height>" + str(height) + "</height>")
xml_content.append(" </size>")
xml_content.append(" <segmented>0</segmented>")
annos = df_anno[df_anno["image_id"].isin([img_id])]
for index, row in annos.iterrows():
bbox = row["bbox"]
category_id = row["category_id"]
cate_name = categories[category_id]
xml_content.append(" <object>")
xml_content.append(" <name>" + cate_name + "</name>")
xml_content.append(" <pose>Unspecified</pose>")
xml_content.append(" <truncated>0</truncated>")
xml_content.append(" <difficult>0</difficult>")
xml_content.append(" <bndbox>")
xml_content.append(" <xmin>" + str(int(bbox[0])) + "</xmin>")
xml_content.append(" <ymin>" + str(int(bbox[1])) + "</ymin>")
xml_content.append(" <xmax>" + str(int(bbox[0] + bbox[2])) + "</xmax>")
xml_content.append(" <ymax>" + str(int(bbox[1] + bbox[3])) + "</ymax>")
xml_content.append(" </bndbox>")
xml_content.append(" </object>")
xml_content.append("</annotation>")
x = xml_content
xml_content = [x[i] for i in range(0, len(x)) if x[i] != "\n"]
xml_path = os.path.join(xml_dir, file_name.split('.')[-2] + '.xml')
with open(xml_path, 'w+', encoding="utf8") as f:
f.write('\n'.join(xml_content))
xml_content[:] = []
if __name__ == '__main__':
parser = argparse.ArgumentParser(description="convert coco .json annotation to voc .xml annotation")
parser.add_argument('--json_path', type=str, help='path to json file.', default="./annotaions/train.json")
parser.add_argument('--output', type=str, help='path to output xml files.', default="./train_xml")
args = parser.parse_args()
if not os.path.exists(args.output):
os.mkdir(args.output)
coco2voc(args.json_path, args.output)
8. JSON to TXT (Segmentation)
Method 1: label the images yourself with labelme, which produces one small json file per image, so every image ends up with a corresponding json file. Put the images in one folder and all the annotations in another folder, or keep the images and JSONs together in one folder. Then the code below converts them into TXT label files.
import json
import os
import cv2
json_filepath="D:\Shanmh\DeskTop/vase" # directory containing the json files
image_filepath="D:\Shanmh\DeskTop/vase" # directory containing the original images
txt_savapath="D:\Shanmh\DeskTop/vase" # directory where the txt files are saved
json_files=[i for i in os.listdir(json_filepath) if i.endswith("json")]
for i in json_files:
# list that will hold this file's rows
info_list=[]
json_path=os.path.join(json_filepath,i)
with open(json_path,"r") as r:
json_info=json.load(r)
r.close()
imagePath=json_info["imagePath"]
# get the image height and width
h,w=cv2.imread(os.path.join(image_filepath,imagePath)).shape[:2]
shapes=json_info["shapes"]
for shape in shapes: # each region
row_str="" # initialize one row
label=shape["label"]
row_str+=label
points=shape["points"]
for point in points:
x=round(float(point[0])/w,6) # keep 6 decimal places
y = round(float(point[1]) / h, 6) # keep 6 decimal places
row_str+=" "+str(x)+" "+str(y)
row_str+="\n"
info_list.append(row_str)
with open(os.path.join(txt_savapath,i.replace(".json",".txt")),"w") as w:
w.writelines(info_list)
w.close()
print(f"已保存文件{i.replace('.json','.txt')}")
Alternatively, if the images and JSONs sit in the same folder, the code below will generate the corresponding txt files in that same folder.
import os, cv2, json
import numpy as np
classes = ['square', 'triangle'] # change to your own class names
base_path = '../dataset/labelme_dataset' # location of the json files and images
path_list = [i.split('.')[0] for i in os.listdir(base_path)]
for path in path_list:
image = cv2.imread(f'{base_path}/{path}.jpg')
h, w, c = image.shape
with open(f'{base_path}/{path}.json') as f:
masks = json.load(f)['shapes']
with open(f'{base_path}/{path}.txt', 'w+') as f:
for idx, mask_data in enumerate(masks):
mask_label = mask_data['label']
if '_' in mask_label:
mask_label = mask_label.split('_')[0]
mask = np.array([np.array(i) for i in mask_data['points']], dtype=np.float64) # use float64 (np.float was removed in newer NumPy)
mask[:, 0] /= w
mask[:, 1] /= h
mask = mask.reshape((-1))
if idx != 0:
f.write('\n')
f.write(f'{classes.index(mask_label)} {" ".join(list(map(lambda x:f"{x:.6f}", mask)))}')
8.1 Large JSON to Per-Image Small JSONs
If the downloaded dataset is in COCO format, there is only one very large JSON file plus the image files. In that case we need to split the large JSON into one small JSON per image and then process them with method 1 above. The conversion code is as follows:
import json
import os
def coco_to_labelme(coco_file, output_dir):
with open(coco_file, 'r') as f:
data = json.load(f)
images = data['images']
annotations = data['annotations']
categories = {category['id']: category['name'] for category in data['categories']}
for image in images:
image_id = image['id']
image_file = image['file_name']
print(image['file_name'].rsplit('\\', 1))
# dir, image_file_1 = image['file_name'].rsplit('\\', 1) # if file_name contains a path, unpack it this way to get the name
image_file_1 = image['file_name'].rsplit('\\', 1)[-1] # keep only the file name (rsplit returns a list)
image_width = image['width']
image_height = image['height']
labelme_data = {
"version": "5.0.1",
"flags": {},
"shapes": [],
"imagePath": image_file_1,
"imageData": None,
"imageHeight": image_height,
"imageWidth": image_width
}
for annotation in annotations:
if annotation['image_id'] == image_id:
category_id = annotation['category_id']
category_name = categories[category_id]
bbox = annotation['bbox']
segmentation = annotation['segmentation'][0]
# Convert segmentation to polygon format
polygon = []
for i in range(0, len(segmentation), 2):
x = segmentation[i]
y = segmentation[i + 1]
polygon.append([x, y])
shape_data = {
"label": category_name,
"points": polygon,
"group_id": None,
"shape_type": "polygon",
"flags": {}
}
labelme_data['shapes'].append(shape_data)
image_name = os.path.splitext(os.path.basename(image_file))[0]
labelme_output_file = os.path.join(output_dir, image_name + '.json')
with open(labelme_output_file, 'w') as f:
json.dump(labelme_data, f, indent=4)
print(f"Converted {image_file} to {labelme_output_file}")
# Usage example
coco_file = r'annotations/instances_train2014.json' # the original large COCO JSON file
output_dir = r'labelme/train2014' # where the per-image json files are saved
coco_to_labelme(coco_file, output_dir)
9. PNG to JSON
Thanks to this blogger for the code: link
First, get the gray value corresponding to each class from the masks:
import os
import cv2
from tqdm import tqdm
mask_path = 'mask'
data_files = os.listdir(mask_path)
# color_list = []
# for data_file in tqdm(data_files):
# img_file_path = os.path.join(data_path,data_file)
# img = cv2.imread(img_file_path)
# for x in range(img.shape[0]):
# for y in range(img.shape[1]):
# color = img[x,y]
# color = list(color)
# if color not in color_list:
# color_list.append(color)
#
#
# print(color_list)
gray_list = []
for data_file in tqdm(data_files):
img_file_path = os.path.join(mask_path,data_file)
img = cv2.imread(img_file_path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for x in range(img.shape[0]):
for y in range(img.shape[1]):
value = gray[x,y]
if value not in gray_list:
gray_list.append(value)
print(gray_list)
Then write those gray values into the code below, which extracts the contour of each class from the grayscale mask and converts it into a json file.
import cv2
import os
import json
from PIL import Image
import io
import base64
# class_dict = {
# "sky": 10,
# "building": 0,
# "column pole": 1,
# "road": 2,
# "sidewalk": 3,
# "tree": 4,
# "sign symbol": 5,
# "fence": 6,
# "car": 7,
# "pedestrian": 8,
# "bicyclist": 9
# }
color_list = [128, 0, 0], [0, 0, 244], [255, 96, 0], [240, 0, 0], [255, 212, 0], [0, 212, 255], [0, 100, 255], [74, 255,
182]
gray_list = [15, 73, 85, 27, 154, 201, 135, 213]
# def rgb_to_gray_value(RGB):
# R = RGB[0]
# G = RGB[1]
# B = RGB[2]
# Gray = (R * 299 + G * 587 + B * 114) / 1000
# return round(Gray)
#
#
# def bgr_2_rgb(color):
# color[0], color[2] = color[2], color[0]
# return color
# class_dict = {
# "A1 尾胶面破损": rgb_to_gray_value(bgr_2_rgb(color_list[0])),
# "B1 尾胶少胶": rgb_to_gray_value(bgr_2_rgb(color_list[1])),
# "C1 尾胶裂": rgb_to_gray_value(bgr_2_rgb(color_list[2])),
# "E1 骨架破损": rgb_to_gray_value(bgr_2_rgb(color_list[3])),
# "F1 骨架裂纹": rgb_to_gray_value(bgr_2_rgb(color_list[4])),
# "G1 尾胶溢胶": rgb_to_gray_value(bgr_2_rgb(color_list[5])),
# "H1 磁芯": rgb_to_gray_value(bgr_2_rgb(color_list[6])),
# }
class_dict = {
"A1 尾胶面破损": gray_list[0],
"B1 尾胶少胶": gray_list[1],
"C1 尾胶裂": gray_list[2],
"E1 骨架破损": gray_list[3],
"F1 骨架裂纹": gray_list[4],
"G1 尾胶溢胶": gray_list[5],
"H1 磁芯": gray_list[6]
}
def img_tobyte(img_pil):
# type conversion; key step
# img_pil = Image.fromarray(roi)
ENCODING = 'utf-8'
img_byte = io.BytesIO()
img_pil.save(img_byte, format='PNG')
binary_str2 = img_byte.getvalue()
imageData = base64.b64encode(binary_str2)
base64_string = imageData.decode(ENCODING)
return base64_string
def func(file: str) -> dict:
if os.path.basename(file) == "0016E5_07959.png":
print('t')
png = cv2.imread(file)
gray = cv2.cvtColor(png, cv2.COLOR_BGR2GRAY)
img_file_path = os.path.join(img_path, os.path.basename(file).split('.')[0] + '.jpg')
img = Image.open(img_file_path)
imgData = img_tobyte(img)
dic = {"version": "5.1.1", "flags": {}, "shapes": list(), "imagePath": os.path.basename(file), "imageData": imgData,
"imageHeight": png.shape[0], "imageWidth": png.shape[1]}
#
# cv2.imshow("mask", gray)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
for k, v in class_dict.items():
# _, binary = cv2.threshold(gray, v + 1 , 255, cv2.THRESH_TOZERO_INV)
# _, binary = cv2.threshold(binary, v , 255, cv2.THRESH_BINARY_INV)
binary = gray.copy()
binary[binary != v] = 0
binary[binary == v] = 255
# _, binary = cv2.threshold(gray, i+1, 255, cv2.THRESH_BINARY_INV)
# _, binary = cv2.threshold(binary, i, 255, cv2.THRESH_TOZERO_INV)
# _, binary = cv2.threshold(binary, 125, 255, cv2.THRESH_BINARY_INV)
# if os.path.basename(file) == "16729150388540_class3.png":
# print('t')
# cv2.imshow('bin', binary)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
# detect only external contours and keep all contour points
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
# img = cv2.imread(img_file_path)
# cv2.drawContours(img, contour, -1, (0, 0, 255), 3)
# cv2.imshow("img", img)
# cv2.waitKey(0)
# cv2.destroyWindow("img")
temp = list()
if len(contour) < 4:
continue
for point in contour:
# if (point[0][0] < edge_th and point[0][1] < edge_th) or (point[0][0] < edge_th and point[0][1] > png.shape[0] -edge_th) \
# or (point[0][0] > png.shape[1] - edge_th and point[0][1] < edge_th) or (point[0][0] > png.shape[1] - edge_th and point[0][1] > png.shape[0] -edge_th):
# continue
# if len(temp) > 1 and temp[-2][0] * temp[-2][1] * int(point[0][0]) * int(point[0][1]) != 0 and (
# int(point[0][0]) - temp[-2][0]) * (
# temp[-1][1] - temp[-2][1]) == (int(point[0][1]) - temp[-2][1]) * (temp[-1][0] - temp[-1][0]):
# temp[-1][0] = int(point[0][0])
# temp[-1][1] = int(point[0][1])
# else:
temp.append([float(point[0][0]), float(point[0][1])])
dic["shapes"].append({"label": k, "points": temp, "group_id": None,
"shape_type": "polygon", "flags": {}})
return dic
if __name__ == "__main__":
# print(rgb_to_gray_value(bgr_2_rgb(color_list[0])))
# print(class_dict)
# edge_th = 2
img_path = 'image'
mask_path = 'mask'
save_path = 'json'
os.makedirs(save_path, exist_ok=True)
mask_files = os.listdir(mask_path)
for mask_file in mask_files:
mask_file_path = os.path.join(mask_path, mask_file)
save_file = mask_file.split('.')[0] + '.json'
save_file_path = os.path.join(save_path, save_file)
with open(save_file_path, mode='w', encoding='utf-8') as f:
json.dump(func(mask_file_path), f)
10. YOLO to COCO
###########################################################################################################################
import json
import os
import cv2
"""
Step 1: first create classes.txt under root_path.
Step 2: the images folder (original images) must also be copied into root_path.
Step 3: modify the paths or names of the first five variables below.
"""
###################################### Step 1 #################################################
# Convert yolo labels (classId, xCenter, yCenter, w, h) into coco-style classId, xMin, yMin,   #
# xMax, yMax. COCO ids start from 1, so classId starts from 1 here. Each line of the resulting #
# annos.txt is: imageName, classId, xMin, yMin, xMax, yMax, one line per bbox.                 #
################################################################################################
# original label path: txt files with cls + box center + width/height
originLabelsDir = r'D:\Python\company\Instance_segmentation\datasets\yolo\coco128-seg\labels\train2017'
# save path of the converted output
saveDir = r'D:\Python\company\Instance_segmentation\datasets\test\annos.txt'
# path of the images corresponding to the original labels
originImagesDir = r'D:\Python\company\Instance_segmentation\datasets\yolo\coco128-seg\images\train2017'
# root path, which will also hold the annotations folder (created automatically if missing) for the final json
root_path = r'D:\Python\company\Instance_segmentation\datasets\test'
# used to create the train or val set
phase = 'train' # change as needed
txtFileList = os.listdir(originLabelsDir)
i = 0
with open(saveDir, 'w') as fw:
for txtFile in txtFileList:
with open(os.path.join(originLabelsDir, txtFile), 'r') as fr:
labelList = fr.readlines()
i += 1
if i == len(txtFileList) -1:
break
for label in labelList:
label = label.strip().split()
x = float(label[1])
y = float(label[2])
w = float(label[3])
h = float(label[4])
# convert x,y,w,h to x1,y1,x2,y2
imagePath = os.path.join(originImagesDir,
txtFile.replace('txt', 'jpg'))
image = cv2.imread(imagePath)
H, W, _ = image.shape
x1 = (x - w / 2) * W
y1 = (y - h / 2) * H
x2 = (x + w / 2) * W
y2 = (y + h / 2) * H
# to match the coco convention, class ids start from 1
fw.write(txtFile.replace('txt', 'jpg') + ' {} {} {} {} {}\n'.format(int(label[0]) + 1, x1, y1, x2, y2))
print('{} done {} {}'.format(txtFile, i, len(txtFileList)))
###################################### Step 2 #################################################
# Convert the labels to coco format and save them as json. The root path root_path contains   #
# images (the image folder), annos.txt (bbox labels), classes.txt (one class name per line),  #
# and the annotations folder (created automatically if missing, used to save the final json). #
################################################################################################
# ------------ use os to collect the image names in the images folder and read in all the BBoxes ------------
# root path, containing images (image folder), annos.txt (bbox labels), classes.txt (class labels),
# dataset holds the image info and annotation info of all the data
dataset = {'categories': [], 'annotations': [], 'images': []}
# open the class label file
with open(os.path.join(root_path, 'classes.txt')) as f:
classes = f.read().strip().split()
# build the mapping between class labels and numeric ids
for i, cls in enumerate(classes, 1):
dataset['categories'].append({'id': i, 'name': cls, 'supercategory': 'mark'})
# read the image names in the images folder
indexes = os.listdir(os.path.join(root_path, 'images'))
# count the number of processed images
global count
count = 0
# read the bbox info
with open(os.path.join(root_path, 'annos.txt')) as tr:
annos = tr.readlines()
# --------------- then convert the data above into the format COCO needs ---------------
for k, index in enumerate(indexes):
count += 1
# read the image with opencv to get its width and height
im = cv2.imread(os.path.join(root_path, 'images/') + index)
height, width, _ = im.shape
# add the image info to dataset
dataset['images'].append({'file_name': index,
'id': k,
'width': width,
'height': height})
for ii, anno in enumerate(annos):
parts = anno.strip().split()
# if the image name matches the name in the annotation, add the annotation
if parts[0] == index:
# class
cls_id = parts[1]
# x_min
x1 = float(parts[2])
# y_min
y1 = float(parts[3])
# x_max
x2 = float(parts[4])
# y_max
y2 = float(parts[5])
width = max(0, x2 - x1)
height = max(0, y2 - y1)
dataset['annotations'].append({
'area': width * height,
'bbox': [x1, y1, width, height],
'category_id': int(cls_id),
'id': len(dataset['annotations']) + 1, # unique, incrementing annotation id
'image_id': k,
'iscrowd': 0,
# mask: the rectangle's four corners, clockwise from the top-left point
'segmentation': [[x1, y1, x2, y1, x2, y2, x1, y2]]
})
print('{} images handled'.format(count))
# folder to save the result
folder = os.path.join(root_path, 'annotations')
if not os.path.exists(folder):
os.makedirs(folder)
json_name = os.path.join(root_path, 'annotations/{}.json'.format(phase))
with open(json_name, 'w') as f:
json.dump(dataset, f)