sklearn SVM for Image Recognition
We apply SVM to image recognition, a classic problem with a very high-dimensional feature space (the value of every pixel of the image is treated as a feature).
- Given an image of a face, predict which of the people on our list it belongs to.
- SVM models can be very computationally expensive to train, and by default they do not return a numerical score indicating how confident they are about a prediction (see the sketch after this list).
- Techniques such as K-fold cross-validation can be used to obtain reliable performance estimates, at the price of extra computation.
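For example, although SVC does not report prediction confidence by default, its decision_function method exposes the signed distance of each sample to the separating hyperplane, and constructing it with probability=True enables (slower) Platt-scaled probability estimates. A minimal sketch on a toy dataset, not part of the original example:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy binary problem, used only to illustrate the API.
X, y = make_classification(n_samples=100, n_features=20, random_state=0)

# probability=True adds an extra (and costly) calibration step during fitting.
clf = SVC(kernel='linear', probability=True)
clf.fit(X, y)

print(clf.decision_function(X[:3]))  # signed distances to the separating hyperplane
print(clf.predict_proba(X[:3]))      # Platt-scaled class probability estimates
```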
1. Importing the dataset
Our dataset comes bundled with scikit-learn, so let's start by importing it and printing its description.
>>> import sklearn as sk
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from sklearn.datasets import fetch_olivetti_faces
>>> faces = fetch_olivetti_faces()
>>> print faces.DESCR
The dataset contains 400 images of the faces of 40 different people. The photos were taken under different lighting conditions and with different facial expressions (open/closed eyes, smiling/not smiling, glasses/no glasses). For additional information about the dataset, refer to its description page.
Looking at the contents of the faces object, we find the following attributes: images, data, and target. images holds the 400 images, each represented as a 64 x 64 pixel matrix. data holds the same 400 images, but each flattened into an array of 4096 pixels. As expected, target is an array with the target classes, ranging from 0 to 39.
>>> print faces.keys()
['images', 'data', 'target', 'DESCR']
>>> print faces.images.shape
(400, 64, 64)
>>> print faces.data.shape
(400, 4096)
>>> print faces.target.shape
(400,)
2. Normalizing pixel values
Before training, let's check the range of the pixel values:
>>> print np.max(faces.data)
1.0
>>> print np.min(faces.data)
0.0
>>> print np.mean(faces.data)
0.547046432495
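The output shows that fetch_olivetti_faces already returns pixel values scaled to the [0, 1] range, so no further normalization is needed here. If the images arrived as raw 0-255 intensities instead, a minimal sketch of the scaling step (with hypothetical data, not part of the original example) could look like this:

```python
import numpy as np

# Hypothetical raw data: 10 images of 64x64 pixels stored as 0-255 intensities.
raw_pixels = np.random.randint(0, 256, size=(10, 64 * 64)).astype(np.float64)

# Rescale every pixel to the [0, 1] range, matching what fetch_olivetti_faces returns.
scaled_pixels = raw_pixels / 255.0
print(scaled_pixels.min(), scaled_pixels.max())
```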
3. Plotting the faces
Let's define a helper function that plots a grid of face images:
>>> def print_faces(images, target, top_n):
>>>     # set up the figure size in inches
>>>     fig = plt.figure(figsize=(12, 12))
>>>     fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
>>>     for i in range(top_n):
>>>         # plot the images in a matrix of 20x20
>>>         p = fig.add_subplot(20, 20, i + 1, xticks=[], yticks=[])
>>>         p.imshow(images[i], cmap=plt.cm.bone)
>>>         # label the image with the target value
>>>         p.text(0, 14, str(target[i]))
>>>         p.text(0, 60, str(i))
If we plot the first 20 images, we can see the faces of two different people.
>>> print_faces(faces.images, faces.target, 20)
4. Support Vector Machines
Import the SVC class from the sklearn.svm module:
>>> from sklearn.svm import SVC
The Support Vector Classifier (SVC) will be used for classification.
The SVC implementation has a number of important parameters; we start with the simplest kernel, the linear one.
>>> svc_1 = SVC(kernel='linear')
5. Splitting the dataset
Split the data into training and testing sets:
>>> from sklearn.cross_validation import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(
        faces.data, faces.target, test_size=0.25, random_state=0)
6. K-fold cross-validation
We will define a function to evaluate the classifier with K-fold cross-validation.
>>> from sklearn.cross_validation import cross_val_score, KFold
>>> from scipy.stats import sem
>>> def evaluate_cross_validation(clf, X, y, K):
>>>     # create a k-fold cross validation iterator
>>>     cv = KFold(len(y), K, shuffle=True, random_state=0)
>>>     # by default the score used is the one returned by the score method of the estimator (accuracy)
>>>     scores = cross_val_score(clf, X, y, cv=cv)
>>>     print scores
>>>     print ("Mean score: {0:.3f} (+/-{1:.3f})").format(np.mean(scores), sem(scores))
>>> evaluate_cross_validation(svc_1, X_train, y_train, 5)
[ 0.93333333  0.91666667  0.95        0.95        0.91666667]
Mean score: 0.933 (+/-0.007)
Five-fold cross-validation gives quite good results (a mean accuracy of 0.933).
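Note that the sklearn.cross_validation module used above was removed in later scikit-learn releases; from version 0.18 onwards the same helpers live in sklearn.model_selection, and KFold takes the number of splits rather than the number of samples. A sketch of the equivalent setup on a recent installation:

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.svm import SVC

faces = fetch_olivetti_faces()
svc_1 = SVC(kernel='linear')

# Same split as above, using the post-0.18 module path.
X_train, X_test, y_train, y_test = train_test_split(
    faces.data, faces.target, test_size=0.25, random_state=0)

# KFold now takes n_splits instead of the number of samples.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(svc_1, X_train, y_train, cv=cv)
print("Mean score: %.3f (std %.3f)" % (scores.mean(), scores.std()))
```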
7. Training and evaluation
We will also define a function that trains the model on the training set and evaluates its performance on the test set.
>>> from sklearn import metrics
>>> def train_and_evaluate(clf, X_train, X_test, y_train, y_test):
>>>     clf.fit(X_train, y_train)
>>>     print "Accuracy on training set:"
>>>     print clf.score(X_train, y_train)
>>>     print "Accuracy on testing set:"
>>>     print clf.score(X_test, y_test)
>>>     y_pred = clf.predict(X_test)
>>>     print "Classification Report:"
>>>     print metrics.classification_report(y_test, y_pred)
>>>     print "Confusion Matrix:"
>>>     print metrics.confusion_matrix(y_test, y_pred)
>>> train_and_evaluate(svc_1, X_train, X_test, y_train, y_test)
Accuracy on training set:
1.0
Accuracy on testing set:
0.99
8. Faces with glasses
Now let's use the same classifier to tell whether a face is wearing glasses. First, we list the index ranges of the images showing people with glasses:
>>> # the index ranges of images of people with glasses
>>> glasses = [(10, 19), (30, 32), (37, 38), (50, 59), (63, 64),
               (69, 69), (120, 121), (124, 129), (130, 139), (160, 161),
               (164, 169), (180, 182), (185, 185), (189, 189), (190, 192),
               (194, 194), (196, 199), (260, 269), (270, 279), (300, 309),
               (330, 339), (358, 359), (360, 369)]
We label faces with glasses as 1 and faces without glasses as 0:
>>> def create_target(segments):
>>>     # create a new y array of target size initialized with zeros
>>>     y = np.zeros(faces.target.shape[0])
>>>     # put 1 in the specified segments
>>>     for (start, end) in segments:
>>>         y[start:end + 1] = 1
>>>     return y
>>> target_glasses = create_target(glasses)
Split into training and testing sets with this new target:
>>> X_train, X_test, y_train, y_test = train_test_split(
        faces.data, target_glasses, test_size=0.25, random_state=0)
>>> svc_2 = SVC(kernel='linear')
>>> evaluate_cross_validation(svc_2, X_train, y_train, 5)
[ 0.98333333  0.98333333  0.93333333  0.96666667  0.96666667]
Mean score: 0.967 (+/-0.009)
Cross-validation gives a mean accuracy of 0.967.
>>> train_and_evaluate(svc_2, X_train, X_test, y_train, y_test)
Accuracy on training set:
1.0
Accuracy on testing set:
0.99
Classification Report:
             precision    recall  f1-score   support

          0       1.00      0.99      0.99        67
          1       0.97      1.00      0.99        33

avg / total       0.99      0.99      0.99       100

Confusion Matrix:
[[66  1]
 [ 0 33]]
To check how well the classifier generalizes, we hold out all ten images of one person (indexes 30 to 39), who appears both with and without glasses, train on the remaining 390 images, and evaluate on those ten:
>>> X_test = faces.data[30:40]
>>> y_test = target_glasses[30:40]
>>> print y_test.shape[0]
10
>>> select = np.ones(target_glasses.shape[0])
>>> select[30:40] = 0
>>> X_train = faces.data[select == 1]
>>> y_train = target_glasses[select == 1]
>>> print y_train.shape[0]
390
>>> svc_3 = SVC(kernel='linear')
>>> train_and_evaluate(svc_3, X_train, X_test, y_train, y_test)
Accuracy on training set:
1.0
Accuracy on testing set:
0.9
Classification Report:
             precision    recall  f1-score   support

          0       0.83      1.00      0.91         5
          1       1.00      0.80      0.89         5

avg / total       0.92      0.90      0.90        10

Confusion Matrix:
[[5 0]
 [1 4]]
Out of the 10 images, there is only one error, which is still a very good result. Let's check which image was misclassified. First, reshape the data from flat arrays back into 64 x 64 matrices:
>>> y_pred = svc_3.predict(X_test)
>>> eval_faces = [np.reshape(a, (64, 64)) for a in X_test]
Then plot them with our print_faces function:
>>> print_faces(eval_faces, y_pred, 10)
In the figure above, the image with index 8 is wearing glasses but was classified as not wearing glasses. If we look at that example, we can see that it differs from the other images with glasses (the frame of the glasses is not clearly visible and the person's eyes are closed), which is probably why it was misclassified.
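One way to look into this (a sketch, not part of the original example; it assumes svc_3, X_test, and y_test from the previous step are still in scope) is to check how far each of the ten evaluated samples lies from the decision boundary:

```python
# Signed distance of each test sample to the separating hyperplane;
# values close to zero mean the classifier was not confident about that face.
distances = svc_3.decision_function(X_test)
for i, (d, label) in enumerate(zip(distances, y_test)):
    print("image %d: distance = %+.3f, true label = %d" % (i, d, int(label)))
```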
You can also try the polynomial or RBF kernels. In addition, the C and gamma parameters may affect the results. See the scikit-learn documentation for a description of these parameters and their possible values.
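As a sketch of that kind of exploration (not part of the original example, and requiring a recent scikit-learn version), a small grid search over the kernel and the C and gamma parameters could look like this:

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

faces = fetch_olivetti_faces()
X_train, X_test, y_train, y_test = train_test_split(
    faces.data, faces.target, test_size=0.25, random_state=0)

# Hypothetical grid; the values are chosen only for illustration.
param_grid = {
    'kernel': ['linear', 'poly', 'rbf'],
    'C': [0.1, 1, 10],
    'gamma': ['scale', 0.001, 0.01],
}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)  # best parameter combination found on the training folds
print(search.best_score_)   # its mean cross-validated accuracy
```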