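Preface
As the "Watermelon Book" (西瓜书) puts it: data and features determine the upper limit of machine learning, while models and algorithms only approach that limit. Machine learning therefore usually needs a large, high-quality dataset, and we have tried a range of data augmentation techniques to get one, for example:
- Increasing or decreasing image brightness
- Adding noise or filtering noise out
- Mirroring the image ......
So how can we efficiently reuse an existing dataset to expand the data quickly (image data plus XML annotations)? The work splits into two parts: scaling and stitching the images, and parsing and rewriting the XML files. This post proposes a shrink-and-stitch method for quickly expanding a dataset, increasing both the quantity and the quality of the data. (It targets VOC-style datasets.)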
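Image shrinking and stitching
For example: suppose we have an image A.jpg of 1920x1080 pixels and want to shrink it to fit a 3x3 layout. We scale image A down by a factor of 3 in each dimension (to 640x360), make 9 copies, and stitch the 9 copies into a new image with the same size as the original image A. The process is as follows: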
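Unannotated original image sample:
Annotated original image sample:
From the annotated sample we can see the following bounding boxes and label numbers, together with the accompanying XML file:
label1 : eyes
label2 : nose
label3 : blush
label4 : mouth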
<annotation>
  <folder>11</folder>
  <filename>s.jpg</filename>
  <path>C:\Users\kiven\Desktop\11\s.jpg</path>
  <source> <database>Unknown</database> </source>
  <size> <width>1920</width> <height>1080</height> <depth>3</depth> </size>
  <segmented>0</segmented>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>365</xmin> <ymin>467</ymin> <xmax>676</xmax> <ymax>660</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1221</xmin> <ymin>414</ymin> <xmax>1530</xmax> <ymax>610</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>938</xmin> <ymin>714</ymin> <xmax>1055</xmax> <ymax>769</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>149</xmin> <ymin>819</ymin> <xmax>535</xmax> <ymax>964</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1401</xmin> <ymin>697</ymin> <xmax>1785</xmax> <ymax>848</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>782</xmin> <ymin>905</ymin> <xmax>1226</xmax> <ymax>1042</ymax> </bndbox> </object>
</annotation>
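As a rough illustration, an annotation like the one above could be read with Python's standard `xml.etree.ElementTree` module. This is only a sketch: `SAMPLE_XML` is a trimmed copy holding two of the six objects, and `read_boxes` is a hypothetical helper name, not something from the original project.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the annotation above (two of the six objects), for illustration.
SAMPLE_XML = """<annotation>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object><name>1</name><bndbox><xmin>365</xmin><ymin>467</ymin><xmax>676</xmax><ymax>660</ymax></bndbox></object>
  <object><name>4</name><bndbox><xmin>782</xmin><ymin>905</ymin><xmax>1226</xmax><ymax>1042</ymax></bndbox></object>
</annotation>"""


def read_boxes(root):
    """Return (name, xmin, ymin, xmax, ymax) for every <object> in the annotation."""
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes


root = ET.fromstring(SAMPLE_XML)
width = int(root.findtext("size/width"))
height = int(root.findtext("size/height"))
boxes = read_boxes(root)
print(width, height)  # 1920 1080
print(boxes)  # [('1', 365, 467, 676, 660), ('4', 782, 905, 1226, 1042)]
```

For a file on disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`.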
Image after shrinking and stitching:
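The shrink-and-stitch step can be sketched with NumPy alone. The code below is a minimal sketch under some assumptions: the image is already decoded into an array of shape (height, width, channels), both sides are divisible by 3, and shrinking is done by keeping every third pixel (nearest-neighbour). A real pipeline would probably use an image library's resize for smoother results. The names `shrink_and_tile` and `image_a` are made up for this example.

```python
import numpy as np


def shrink_and_tile(image: np.ndarray, n: int = 3) -> np.ndarray:
    """Shrink `image` by a factor of n per side and tile it into an n x n grid."""
    height, width = image.shape[:2]
    if height % n or width % n:
        raise ValueError("image size must be divisible by n for an exact fit")
    # Nearest-neighbour downscale: keep every n-th pixel in each direction.
    small = image[::n, ::n]
    # Repeat the small image n times down and n times across; channels are untouched.
    reps = (n, n) + (1,) * (image.ndim - 2)
    return np.tile(small, reps)


# Stand-in for A.jpg: a random 1080 x 1920 RGB array (height, width, channels).
rng = np.random.default_rng(0)
image_a = rng.integers(0, 256, size=(1080, 1920, 3), dtype=np.uint8)
mosaic = shrink_and_tile(image_a, 3)
print(mosaic.shape)  # (1080, 1920, 3)
```

Each tile is 640x360, so the tile in grid column c and row r starts at pixel (640 * c, 360 * r). The same offsets are what the bounding boxes need later.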
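XML parsing and reconstruction
Parsing the XML file of the original image makes it clear that only the object entries need to change in the new XML; everything else stays the same. When modifying the object entries, note the following:
- A given object name may appear once or several times in the original image, so in the stitched image every name must stay paired with its own coordinates;
- The pose, truncated and difficult values are the same for every object, so they do not need to change;
- The coordinates for each name must be converted to integers for the scale-and-stitch calculation, and written back as integers as well.
Program logic: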
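The points above might be implemented roughly as follows, using only the standard library. This is a sketch, not the original program: `expand_annotation` and `SOURCE_XML` are hypothetical names, the sample holds only the mouth object from the annotation shown earlier, and floor division is an assumption that happens to reproduce the integer values in the expanded XML shown later in this post.

```python
import copy
import xml.etree.ElementTree as ET

# One object (the mouth, label 4) from the original annotation, for illustration.
SOURCE_XML = """<annotation>
  <filename>s.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>4</name><pose>Unspecified</pose><truncated>0</truncated><difficult>0</difficult>
    <bndbox><xmin>782</xmin><ymin>905</ymin><xmax>1226</xmax><ymax>1042</ymax></bndbox>
  </object>
</annotation>"""


def expand_annotation(root, n=3):
    """Return a copy of the annotation with every <object> repeated once per tile."""
    new_root = copy.deepcopy(root)
    tile_w = int(new_root.findtext("size/width")) // n
    tile_h = int(new_root.findtext("size/height")) // n
    for obj in list(new_root.findall("object")):
        new_root.remove(obj)
        for row in range(n):
            for col in range(n):
                # Cloning keeps name, pose, truncated and difficult unchanged.
                clone = copy.deepcopy(obj)
                box = clone.find("bndbox")
                offsets = (("xmin", col * tile_w), ("xmax", col * tile_w),
                           ("ymin", row * tile_h), ("ymax", row * tile_h))
                for tag, offset in offsets:
                    node = box.find(tag)
                    # Integer in, integer out: shrink by n, then shift to the tile.
                    node.text = str(int(node.text) // n + offset)
                new_root.append(clone)
    return new_root


expanded = expand_annotation(ET.fromstring(SOURCE_XML))
for obj in expanded.iter("object"):
    coords = [int(child.text) for child in obj.find("bndbox")]
    print(obj.findtext("name"), coords)
# first line: 4 [260, 301, 408, 347]
# last line:  4 [1540, 1021, 1688, 1067]
```

Copying each object element whole, rather than rebuilding it field by field, keeps every name attached to its own coordinates. `ET.ElementTree(expanded).write(path)` would then save the result.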
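The annotated image after the shrink transform looks like this:
We can see that the eyes, nose, blush and mouth are annotated on every small tile, with none missed. A check in LabelImg also showed no errors, and every label number matches its box, so this round of data expansion worked as intended.
The expanded XML file: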
<annotation>
  <folder>11</folder>
  <filename>s.jpg</filename>
  <path>C:\Users\kiven\Desktop\11\s.jpg</path>
  <size> <width>1920</width> <height>1080</height> <depth>3</depth> </size>
  <segmented>0</segmented>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>121</xmin> <ymin>155</ymin> <xmax>225</xmax> <ymax>220</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>761</xmin> <ymin>155</ymin> <xmax>865</xmax> <ymax>220</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1401</xmin> <ymin>155</ymin> <xmax>1505</xmax> <ymax>220</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>121</xmin> <ymin>515</ymin> <xmax>225</xmax> <ymax>580</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>761</xmin> <ymin>515</ymin> <xmax>865</xmax> <ymax>580</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1401</xmin> <ymin>515</ymin> <xmax>1505</xmax> <ymax>580</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>121</xmin> <ymin>875</ymin> <xmax>225</xmax> <ymax>940</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>761</xmin> <ymin>875</ymin> <xmax>865</xmax> <ymax>940</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1401</xmin> <ymin>875</ymin> <xmax>1505</xmax> <ymax>940</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>407</xmin> <ymin>138</ymin> <xmax>510</xmax> <ymax>203</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1047</xmin> <ymin>138</ymin> <xmax>1150</xmax> <ymax>203</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1687</xmin> <ymin>138</ymin> <xmax>1790</xmax> <ymax>203</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>407</xmin> <ymin>498</ymin> <xmax>510</xmax> <ymax>563</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1047</xmin> <ymin>498</ymin> <xmax>1150</xmax> <ymax>563</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1687</xmin> <ymin>498</ymin> <xmax>1790</xmax> <ymax>563</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>407</xmin> <ymin>858</ymin> <xmax>510</xmax> <ymax>923</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1047</xmin> <ymin>858</ymin> <xmax>1150</xmax> <ymax>923</ymax> </bndbox> </object>
  <object> <name>1</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1687</xmin> <ymin>858</ymin> <xmax>1790</xmax> <ymax>923</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>312</xmin> <ymin>238</ymin> <xmax>351</xmax> <ymax>256</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>952</xmin> <ymin>238</ymin> <xmax>991</xmax> <ymax>256</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1592</xmin> <ymin>238</ymin> <xmax>1631</xmax> <ymax>256</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>312</xmin> <ymin>598</ymin> <xmax>351</xmax> <ymax>616</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>952</xmin> <ymin>598</ymin> <xmax>991</xmax> <ymax>616</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1592</xmin> <ymin>598</ymin> <xmax>1631</xmax> <ymax>616</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>312</xmin> <ymin>958</ymin> <xmax>351</xmax> <ymax>976</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>952</xmin> <ymin>958</ymin> <xmax>991</xmax> <ymax>976</ymax> </bndbox> </object>
  <object> <name>2</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1592</xmin> <ymin>958</ymin> <xmax>1631</xmax> <ymax>976</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>49</xmin> <ymin>273</ymin> <xmax>178</xmax> <ymax>321</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>689</xmin> <ymin>273</ymin> <xmax>818</xmax> <ymax>321</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1329</xmin> <ymin>273</ymin> <xmax>1458</xmax> <ymax>321</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>49</xmin> <ymin>633</ymin> <xmax>178</xmax> <ymax>681</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>689</xmin> <ymin>633</ymin> <xmax>818</xmax> <ymax>681</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1329</xmin> <ymin>633</ymin> <xmax>1458</xmax> <ymax>681</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>49</xmin> <ymin>993</ymin> <xmax>178</xmax> <ymax>1041</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>689</xmin> <ymin>993</ymin> <xmax>818</xmax> <ymax>1041</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1329</xmin> <ymin>993</ymin> <xmax>1458</xmax> <ymax>1041</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>467</xmin> <ymin>232</ymin> <xmax>595</xmax> <ymax>282</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1107</xmin> <ymin>232</ymin> <xmax>1235</xmax> <ymax>282</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1747</xmin> <ymin>232</ymin> <xmax>1875</xmax> <ymax>282</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>467</xmin> <ymin>592</ymin> <xmax>595</xmax> <ymax>642</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1107</xmin> <ymin>592</ymin> <xmax>1235</xmax> <ymax>642</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1747</xmin> <ymin>592</ymin> <xmax>1875</xmax> <ymax>642</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>467</xmin> <ymin>952</ymin> <xmax>595</xmax> <ymax>1002</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1107</xmin> <ymin>952</ymin> <xmax>1235</xmax> <ymax>1002</ymax> </bndbox> </object>
  <object> <name>3</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1747</xmin> <ymin>952</ymin> <xmax>1875</xmax> <ymax>1002</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>260</xmin> <ymin>301</ymin> <xmax>408</xmax> <ymax>347</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>900</xmin> <ymin>301</ymin> <xmax>1048</xmax> <ymax>347</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1540</xmin> <ymin>301</ymin> <xmax>1688</xmax> <ymax>347</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>260</xmin> <ymin>661</ymin> <xmax>408</xmax> <ymax>707</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>900</xmin> <ymin>661</ymin> <xmax>1048</xmax> <ymax>707</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1540</xmin> <ymin>661</ymin> <xmax>1688</xmax> <ymax>707</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>260</xmin> <ymin>1021</ymin> <xmax>408</xmax> <ymax>1067</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>900</xmin> <ymin>1021</ymin> <xmax>1048</xmax> <ymax>1067</ymax> </bndbox> </object>
  <object> <name>4</name> <pose>Unspecified</pose> <truncated>0</truncated> <difficult>0</difficult> <bndbox> <xmin>1540</xmin> <ymin>1021</ymin> <xmax>1688</xmax> <ymax>1067</ymax> </bndbox> </object>
</annotation>
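Besides opening the result in LabelImg, a quick programmatic check is possible. The sketch below counts the boxes per label and flags any box that is empty or falls outside the image. `check_annotation` is a hypothetical helper, and `EXPANDED_XML` is a short excerpt holding only the first and last objects of the file above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Excerpt of the expanded annotation above: the first and last <object> only.
EXPANDED_XML = """<annotation>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object><name>1</name><bndbox><xmin>121</xmin><ymin>155</ymin><xmax>225</xmax><ymax>220</ymax></bndbox></object>
  <object><name>4</name><bndbox><xmin>1540</xmin><ymin>1021</ymin><xmax>1688</xmax><ymax>1067</ymax></bndbox></object>
</annotation>"""


def check_annotation(root):
    """Count boxes per label and collect boxes that are empty or outside the image."""
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    counts, problems = Counter(), []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        counts[name] += 1
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append((name, xmin, ymin, xmax, ymax))
    return counts, problems


counts, problems = check_annotation(ET.fromstring(EXPANDED_XML))
print(dict(counts))  # {'1': 1, '4': 1}
print(problems)  # []
```

Run against the full expanded file, the counts should be 9 times those of the original annotation: 18 boxes for label 1, 9 for label 2, 18 for label 3 and 9 for label 4.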
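Summary
Shrinking and stitching images lets us expand a dataset quickly. It also yields smaller targets, which lays the data groundwork for small-object detection.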