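Converting a folder of images into a binary dataset in the MNIST (IDX) format is a common requirement, especially when a framework or tool expects data in that format. The step-by-step guide below shows how to convert the images in a folder, together with their labels, into binary files.
Step 1: Prepare the data
Assume your dataset is laid out as follows: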
dataset/
├── train/
│   ├── 0/
│   ├── 1/
│   ├── 2/
│   └── ...
└── val/
    ├── 0/
    ├── 1/
    ├── 2/
    └── ...
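The images in each subfolder belong to one class.
Step 2: Read the images and labels
Use Python's PIL or OpenCV library to read the images, then hold the images and their labels in memory. The code below uses PIL.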
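import os
from PIL import Image
import numpy as np

def load_images_and_labels(data_dir, target_size=(28, 28)):
    """Load every image under data_dir/<class>/ as a grayscale uint8 array."""
    images = []
    labels = []
    # Only subdirectories count as classes; labels follow their sorted order
    class_dirs = sorted(
        d for d in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, d))
    )
    for label, folder in enumerate(class_dirs):
        folder_path = os.path.join(data_dir, folder)
        for filename in sorted(os.listdir(folder_path)):
            # Match extensions case-insensitively (.PNG, .JPG, ...)
            if filename.lower().endswith(('.png', '.jpg')):
                image_path = os.path.join(folder_path, filename)
                image = Image.open(image_path).convert('L')  # convert to grayscale
                image = image.resize(target_size)  # resize to (width, height)
                images.append(np.array(image, dtype=np.uint8))
                labels.append(label)
    # IDX label files store one unsigned byte per label
    return np.array(images), np.array(labels, dtype=np.uint8)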

# Example
train_images, train_labels = load_images_and_labels('dataset/train')
val_images, val_labels = load_images_and_labels('dataset/val')
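The loader assigns labels by position in the sorted list of class folders. Plain string sorting places "10" before "2", so with more than ten numerically named classes the label would no longer match the folder name. The following is a minimal sketch of one way to build the mapping explicitly; `build_label_map` is a hypothetical helper introduced here for illustration, not part of the original code.

```python
def build_label_map(folder_names):
    """Map class-folder names to integer labels.

    Purely numeric names are ordered by numeric value;
    anything else falls back to plain string order.
    """
    names = list(folder_names)
    if all(name.isdecimal() for name in names):
        ordered = sorted(names, key=int)
    else:
        ordered = sorted(names)
    return {name: index for index, name in enumerate(ordered)}


# Numeric folder names: "10" ends up after "2", as intended
numeric = build_label_map(["0", "1", "10", "2"])
print(numeric)

# Non-numeric folder names: alphabetical order
named = build_label_map(["dog", "cat"])
print(named)
```

Whichever ordering is used, it should be the same for the train and val folders so that a given label means the same class in both files.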
Step 3: Save as binary files
Write the images and labels to binary files using the same layout as the MNIST dataset.
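import struct

def save_mnist(images, labels, image_file_path, label_file_path):
    # IDX stores every pixel and every label as one unsigned byte
    images = np.ascontiguousarray(images, dtype=np.uint8)
    labels = np.ascontiguousarray(labels, dtype=np.uint8)
    with open(image_file_path, 'wb') as image_file, open(label_file_path, 'wb') as label_file:
        # Image file header: magic number 2051, image count, rows, columns (big-endian)
        image_file.write(struct.pack('>IIII', 2051, len(images), images.shape[1], images.shape[2]))
        # Label file header: magic number 2049, label count
        label_file.write(struct.pack('>II', 2049, len(labels)))
        # Image data, row-major, one image after another
        image_file.write(images.tobytes())
        # Label data, one byte per label
        label_file.write(labels.tobytes())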
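
A point worth checking before writing labels: MNIST label files hold one byte per label, while a NumPy integer array built from a Python list is usually wider than that. The sketch below uses an explicit `np.int64` array (an assumption made for the demonstration) to show how the byte count differs, and a simple range check before narrowing to `uint8`.

```python
import numpy as np

labels = [0, 1, 2, 9]

# 64-bit integers, chosen explicitly for this demonstration
wide = np.array(labels, dtype=np.int64)
# One unsigned byte per label, which is what an IDX label file expects
narrow = wide.astype(np.uint8)

wide_bytes = wide.tobytes()
narrow_bytes = narrow.tobytes()
print(len(wide_bytes), len(narrow_bytes))

# Casting to uint8 is only safe when every label is in 0..255
fits_in_byte = bool(wide.min() >= 0 and wide.max() <= 255)
print(fits_in_byte)
```

If the wide array were written as-is, a reader that interprets the payload as `uint8` would see several bytes per label and report the wrong label count.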

# Example
save_mnist(train_images, train_labels, 'train-images-idx3-ubyte', 'train-labels-idx1-ubyte')
save_mnist(val_images, val_labels, 't10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte')
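
To make the header values less opaque, here is a small sketch of what the two `struct.pack` calls produce. The magic numbers 2051 and 2049 are 0x00000803 and 0x00000801: two zero bytes, a type code (0x08 means unsigned byte), and the number of dimensions. The counts and sizes used below (2 images of 28x28) are made up for illustration.

```python
import struct

# Image header: magic, image count, rows, columns (four big-endian uint32 values)
image_header = struct.pack(">IIII", 2051, 2, 28, 28)
# Label header: magic, label count (two big-endian uint32 values)
label_header = struct.pack(">II", 2049, 2)

print(len(image_header), image_header[:4].hex())
print(len(label_header), label_header[:4].hex())

# Split the image magic number into its four bytes
zero_a, zero_b, dtype_code, ndim = struct.unpack(">BBBB", image_header[:4])
print(dtype_code, ndim)
```

The `>` prefix selects big-endian byte order, which the format requires regardless of the machine writing the file.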
Step 4: Verify the saved files
Read the files back with MNIST-reading code, such as the functions below, to confirm that they were written correctly.
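def read_mnist_images(file_path):
    with open(file_path, 'rb') as f:
        # 16-byte header: magic number, image count, rows, columns
        magic, num, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError(f"unexpected image magic number: {magic}")
        images = np.frombuffer(f.read(), dtype=np.uint8).reshape(num, rows, cols)
    return images

def read_mnist_labels(file_path):
    with open(file_path, 'rb') as f:
        # 8-byte header: magic number, label count
        magic, num = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError(f"unexpected label magic number: {magic}")
        labels = np.frombuffer(f.read(), dtype=np.uint8)
    if len(labels) != num:
        raise ValueError(f"header says {num} labels, file holds {len(labels)}")
    return labels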

# Example
train_images = read_mnist_images('train-images-idx3-ubyte')
train_labels = read_mnist_labels('train-labels-idx1-ubyte')
val_images = read_mnist_images('t10k-images-idx3-ubyte')
val_labels = read_mnist_labels('t10k-labels-idx1-ubyte')
print(train_images.shape, train_labels.shape)
print(val_images.shape, val_labels.shape)
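
Printing shapes confirms the counts but not the contents. A stricter check is a round trip: write arrays, read them back, and compare element by element. The sketch below does this with synthetic data in a temporary directory; `write_idx` and `read_idx` are hypothetical, generalized helpers written for this example (they derive the magic number from the array's dimensions) and are not the functions defined above.

```python
import os
import struct
import tempfile

import numpy as np


def write_idx(path, array):
    """Write a uint8 array as an IDX file (magic = 0x0800 | number of dimensions)."""
    array = np.ascontiguousarray(array, dtype=np.uint8)
    with open(path, "wb") as f:
        f.write(struct.pack(">I", 0x0800 | array.ndim))
        f.write(struct.pack(">" + "I" * array.ndim, *array.shape))
        f.write(array.tobytes())


def read_idx(path):
    """Read a uint8 IDX file back into an array of the stored shape."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack(">I", f.read(4))
        ndim = magic & 0xFF
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)


# Synthetic stand-ins for real images and labels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
labels = np.array([0, 1, 2, 1, 0], dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    image_path = os.path.join(tmp, "images-idx3-ubyte")
    label_path = os.path.join(tmp, "labels-idx1-ubyte")
    write_idx(image_path, images)
    write_idx(label_path, labels)
    images_back = read_idx(image_path)
    labels_back = read_idx(label_path)
    image_file_size = os.path.getsize(image_path)

round_trip_ok = np.array_equal(images, images_back) and np.array_equal(labels, labels_back)
# Expected size: 16-byte header plus 5 * 28 * 28 pixel bytes
print(round_trip_ok, image_file_size)
```

The same comparison can be applied to the real arrays by keeping the in-memory `train_images` and `train_labels` under different names before reading the files back.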