This case study trains an 8-layer convolutional neural network on the Olivetti Faces dataset for face recognition. We find that data augmentation significantly reduces the overall loss and improves the network's performance.

This case study covers the following:

1. Data exploration
2. Data augmentation
3. Building the convolutional neural network
    3.1 Defining the evaluation function
    3.2 Dimension conversion
    3.3 Designing the network architecture
    3.4 Model training and evaluation
4. Analysis of results
In [1]:
# If a GPU is available, you can select it with this code
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '7'

Face Recognition

1 Data Exploration

Olivetti Faces is a face dataset collected by AT&T Laboratories Cambridge (the Olivetti Research Laboratory); the copy used here is the MATLAB version hosted by Sam Roweis at NYU, as described in the dataset's DESCR below. The dataset contains 400 images of 40 subjects, with the 10 images of each subject taken at different times and under varying lighting and facial expressions. Each image is 64×64 with 8-bit grayscale; the original pixel values lie between 0 and 255, and scikit-learn rescales them to floating-point values in [0, 1].

First, load the Olivetti Faces dataset from sklearn's datasets module:

In [1]:
%config InlineBackend.figure_format='retina'
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load the face dataset
from sklearn.datasets import fetch_olivetti_faces
faces=fetch_olivetti_faces()

Inspect the structure of the dataset:

In [2]:
faces
Out[2]:
{'DESCR': 'Modified Olivetti faces dataset.\n\nThe original database was available from\n\n    http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html\n\nThe version retrieved here comes in MATLAB format from the personal\nweb page of Sam Roweis:\n\n    http://www.cs.nyu.edu/~roweis/\n\nThere are ten different images of each of 40 distinct subjects. For some\nsubjects, the images were taken at different times, varying the lighting,\nfacial expressions (open / closed eyes, smiling / not smiling) and facial\ndetails (glasses / no glasses). All the images were taken against a dark\nhomogeneous background with the subjects in an upright, frontal position (with\ntolerance for some side movement).\n\nThe original dataset consisted of 92 x 112, while the Roweis version\nconsists of 64x64 images.\n',
 'data': array([[ 0.30991736,  0.36776859,  0.41735536, ...,  0.15289256,
          0.16115703,  0.1570248 ],
        [ 0.45454547,  0.47107437,  0.51239669, ...,  0.15289256,
          0.15289256,  0.15289256],
        [ 0.31818181,  0.40082645,  0.49173555, ...,  0.14049587,
          0.14876033,  0.15289256],
        ..., 
        [ 0.5       ,  0.53305787,  0.60743803, ...,  0.17768595,
          0.14876033,  0.19008264],
        [ 0.21487603,  0.21900827,  0.21900827, ...,  0.57438016,
          0.59090906,  0.60330576],
        [ 0.5165289 ,  0.46280992,  0.28099173, ...,  0.35950413,
          0.35537189,  0.38429752]], dtype=float32),
 'images': array([[[ 0.30991736,  0.36776859,  0.41735536, ...,  0.37190083,
           0.33057851,  0.30578512],
         [ 0.3429752 ,  0.40495867,  0.43801653, ...,  0.37190083,
           0.33884299,  0.3140496 ],
         [ 0.3429752 ,  0.41735536,  0.45041323, ...,  0.38016528,
           0.33884299,  0.29752067],
         ..., 
         [ 0.21487603,  0.20661157,  0.22314049, ...,  0.15289256,
           0.16528925,  0.17355372],
         [ 0.20247933,  0.2107438 ,  0.2107438 , ...,  0.14876033,
           0.16115703,  0.16528925],
         [ 0.20247933,  0.20661157,  0.20247933, ...,  0.15289256,
           0.16115703,  0.1570248 ]],
 
        [[ 0.45454547,  0.47107437,  0.51239669, ...,  0.19008264,
           0.18595041,  0.18595041],
         [ 0.44628099,  0.48347107,  0.52066118, ...,  0.21487603,
           0.2107438 ,  0.2107438 ],
         [ 0.49586776,  0.5165289 ,  0.53305787, ...,  0.20247933,
           0.20661157,  0.20661157],
         ..., 
         [ 0.77272725,  0.78099173,  0.79338843, ...,  0.14462809,
           0.14462809,  0.14462809],
         [ 0.77272725,  0.77685952,  0.78925622, ...,  0.13636364,
           0.13636364,  0.13636364],
         [ 0.76446283,  0.78925622,  0.78099173, ...,  0.15289256,
           0.15289256,  0.15289256]],
 
        [[ 0.31818181,  0.40082645,  0.49173555, ...,  0.40082645,
           0.35537189,  0.30991736],
         [ 0.30991736,  0.39669421,  0.47933885, ...,  0.40495867,
           0.37603307,  0.30165288],
         [ 0.26859504,  0.34710744,  0.45454547, ...,  0.39669421,
           0.37190083,  0.30991736],
         ..., 
         [ 0.1322314 ,  0.09917355,  0.08264463, ...,  0.13636364,
           0.14876033,  0.15289256],
         [ 0.11570248,  0.09504132,  0.0785124 , ...,  0.14462809,
           0.14462809,  0.1570248 ],
         [ 0.11157025,  0.09090909,  0.0785124 , ...,  0.14049587,
           0.14876033,  0.15289256]],
 
        ..., 
        [[ 0.5       ,  0.53305787,  0.60743803, ...,  0.28512397,
           0.23966943,  0.21487603],
         [ 0.49173555,  0.54132229,  0.60330576, ...,  0.29752067,
           0.20247933,  0.20661157],
         [ 0.46694216,  0.55785125,  0.61983472, ...,  0.29752067,
           0.17768595,  0.18595041],
         ..., 
         [ 0.03305785,  0.46280992,  0.5289256 , ...,  0.17355372,
           0.17355372,  0.16942149],
         [ 0.1570248 ,  0.52479339,  0.53305787, ...,  0.16528925,
           0.1570248 ,  0.18595041],
         [ 0.45454547,  0.52066118,  0.53305787, ...,  0.17768595,
           0.14876033,  0.19008264]],
 
        [[ 0.21487603,  0.21900827,  0.21900827, ...,  0.71487606,
           0.71487606,  0.69421488],
         [ 0.20247933,  0.20661157,  0.20661157, ...,  0.71074378,
           0.70661157,  0.69421488],
         [ 0.2107438 ,  0.20661157,  0.20661157, ...,  0.6859504 ,
           0.69008267,  0.69421488],
         ..., 
         [ 0.2644628 ,  0.25619835,  0.26033059, ...,  0.54132229,
           0.57438016,  0.59090906],
         [ 0.26859504,  0.2644628 ,  0.26859504, ...,  0.56198347,
           0.58264464,  0.59504133],
         [ 0.27272728,  0.26859504,  0.27272728, ...,  0.57438016,
           0.59090906,  0.60330576]],
 
        [[ 0.5165289 ,  0.46280992,  0.28099173, ...,  0.57851237,
           0.54132229,  0.60330576],
         [ 0.5165289 ,  0.45041323,  0.29338843, ...,  0.58264464,
           0.55371898,  0.57851237],
         [ 0.5165289 ,  0.44214877,  0.29338843, ...,  0.59917355,
           0.57851237,  0.54545456],
         ..., 
         [ 0.39256197,  0.41322315,  0.38842976, ...,  0.33471075,
           0.37190083,  0.39669421],
         [ 0.39256197,  0.38429752,  0.40495867, ...,  0.33057851,
           0.35950413,  0.37603307],
         [ 0.36776859,  0.40495867,  0.39669421, ...,  0.35950413,
           0.35537189,  0.38429752]]], dtype=float32),
 'target': array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  3,  3,  3,  3,
         3,  3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  5,
         5,  5,  5,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,  6,  6,  6,
         6,  6,  7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8,
         8,  8,  8,  8,  8,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9, 10, 10,
        10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11,
        11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13,
        13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15,
        15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
        17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18,
        18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20,
        20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 22,
        22, 22, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23,
        23, 23, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25,
        25, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 27, 27,
        27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 28,
        28, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30,
        30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 32, 32, 32,
        32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33,
        34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 35, 35, 35, 35, 35, 35, 35,
        35, 35, 35, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 37, 37, 37, 37,
        37, 37, 37, 37, 37, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 39,
        39, 39, 39, 39, 39, 39, 39, 39, 39])}

The dataset has four parts:
1) DESCR describes the provenance of the data;
2) data stores the 400 images as flattened one-dimensional vectors;
3) images stores the same 400 images as two-dimensional matrices;
4) target stores the class label of each of the 400 images, with 0-39 denoting the 40 subjects.
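
As a quick sanity check (an addition for illustration, not part of the original run), data is simply the row-flattened form of images: each 64×64 matrix corresponds to one 4096-element vector.

In [ ]:
# data is the flattened version of images: 64*64 = 4096
print(np.allclose(faces.data[0], faces.images[0].ravel()))   # expected: True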

Next, examine the shapes and types of the data:

In [4]:
print("The shape of data:",faces.data.shape, "The data type of data:",type(faces.data))
print("The shape of images:",faces.images.shape, "The data type of images:",type(faces.images))
print("The shape of target:",faces.target.shape, "The data type of target:",type(faces.target))
The shape of data: (400, 4096) The data type of data: <class 'numpy.ndarray'>
The shape of images: (400, 64, 64) The data type of images: <class 'numpy.ndarray'>
The shape of target: (400,) The data type of target: <class 'numpy.ndarray'>

All of the data is stored as numpy.ndarray, which is convenient to work with. Since the next step is to build a convolutional neural network (CNN) for face recognition, we use the images stored as two-dimensional matrices as features, so that the spatial structure of the images can be fully exploited.

Let us look at a few sample images:

In [10]:
for i in range(3):
    fig=plt.figure(figsize=(20,5))
    count=1
    if i==0:
        print("This is the first person:")
    if i==1:
        print("This is the second person:")
    if i==2:
        print("This is the third person:")
    for j in range(5):
        ax1=fig.add_subplot(1,5,count)
        count += 1
        plt.imshow(faces.images[10*i+j],cmap="Greys_r")     # show the image
        plt.axis('off')                                     # hide the axes
    plt.show()      
This is the first person:
This is the second person:
This is the third person:

Each subject's images differ in angle, expression, and lighting. This within-class variability makes classification harder, but it also forces the model to extract higher-level facial features, which improves its ability to generalize.

2 Data Augmentation

First, split the data into training and test sets:

In [14]:
# Define the features and labels
x=faces.images
y=faces.target

# Randomly split into training and test sets at an 8:2 ratio
# (sklearn.cross_validation is deprecated; use sklearn.model_selection)
from sklearn.model_selection import train_test_split
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.2, random_state=0)

# Record the classes that appear in the test set; needed later when plotting the confusion matrix
index=set(test_y)

Before augmenting the data, the two-dimensional grayscale images need to be converted to three-channel images, because the version of Keras used here expects three-channel input to ImageDataGenerator's flow function. An ordinary RGB image has three color channels (red, green, blue), while a grayscale image has only one; to convert a grayscale image to a three-channel RGB image, simply replicate the original gray values across all three channels.

In the loop below, we use the tqdm progress-bar library to display the loop's progress in real time; this tells us whether the program is running smoothly and is very useful for computationally expensive loops.

In [4]:
import tqdm

# List train_x_RGB to hold the converted three-channel images
train_x_RGB=[]

# Use tqdm to display progress
for k in tqdm.tqdm(range(len(train_x))):
    
    # Three-dimensional array to hold one converted three-channel image
    image_RGB=np.empty((64,64,3))
    
    # Fill all three channels with the gray value
    # (the scalar is broadcast across the length-3 last axis)
    for i in range(64):
        for j in range(64):
            image_RGB[i][j]=train_x[k][i][j]
    train_x_RGB.append(image_RGB)
train_x_RGB=np.array(train_x_RGB)
100%|████████████████████████████████████████| 320/320 [00:06<00:00, 38.80it/s]
In [5]:
print(train_x_RGB.shape)
(320, 64, 64, 3)

Each image now has shape (64, 64, 3), confirming that the conversion to three channels succeeded.
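
The pixel-by-pixel loop above works but is slow. As a sketch (an addition, not in the original notebook), the same conversion can be done in one vectorized step with np.repeat:

In [ ]:
# Vectorized alternative to the loop above (illustrative addition):
# add a channel axis, then repeat the gray channel three times.
train_x_RGB_fast = np.repeat(train_x[..., np.newaxis], 3, axis=-1)
print(train_x_RGB_fast.shape)   # expected: (320, 64, 64, 3)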

In deep learning, preventing overfitting usually requires plenty of data. When a sufficiently large dataset is not available, the training set can be enlarged through geometric transformations of the images. To make the most of our limited training set (only 320 samples), we augment the data with a series of random transformations, reducing overfitting and improving the model's ability to generalize.

In [6]:
from keras.preprocessing.image import ImageDataGenerator

# Define the types and magnitudes of the random transformations
datagen = ImageDataGenerator(
        rotation_range=0,            # range of random rotations, in degrees
        width_shift_range=0.01,      # fraction of width for random horizontal shifts
        height_shift_range=0.01,     # fraction of height for random vertical shifts
        shear_range=0.01,            # shear angle, counterclockwise
        zoom_range=0.01,             # range of random zoom
        horizontal_flip=True,
        fill_mode='nearest')

for i in tqdm.tqdm(range(len(train_x_RGB))):
    img = train_x_RGB[i]
    img = img.reshape((1,) + img.shape)  # add a batch dimension
    count = 0
    for batch in datagen.flow(img,batch_size=1,
                        save_to_dir="G:\\data\\face",    # directory for the generated images
                        save_prefix=str(train_y[i]),     # name each image by its label so the class can be recovered
                        save_format='png'):
        count += 1
        if count >= 3:                  # keep only the first 3 augmented images
            break  
Using TensorFlow backend.
100%|████████████████████████████████████████| 320/320 [00:11<00:00, 29.05it/s]

Read back the augmented training set:

In [15]:
import os
import tqdm
import matplotlib.image as mpimg

# Read back the augmented training data
train_x_enhance=[]               
train_y_enhance=[]

fileDir = r"G:\\data\\face"                  # folder containing the augmented images
for root, dirs, files in os.walk(fileDir): 

    for name in tqdm.tqdm(files):
        path=str(root+ '/' + name)           # full path of the image
        photo=mpimg.imread(path)             # read the image
        photo_2D=photo[:,:,0]                # the saved images are three-channel; slice one channel back to a 2-D matrix
        train_x_enhance.append(photo_2D)     # append the training image
        train_y_enhance.append(int(name.split("_")[0])) # recover the class label from the filename prefix

train_x_enhance=np.array(train_x_enhance)
train_y_enhance=np.array(train_y_enhance)      
100%|███████████████████████████████████████| 960/960 [00:01<00:00, 760.79it/s]

3 Building the Convolutional Neural Network

3.1 Defining the Evaluation Function

In [24]:
import seaborn as sns
from sklearn import metrics
from sklearn.metrics import classification_report 

"""
函数evaluate(pred,test_y)用来对分类结果进行评价;
输入:真实的分类、预测的分类结果
输出:分类的准确率、混淆矩阵等
"""
def evaluate(pred,test_y):
    # 将三维的分类结果转换成一维数组
    pred_1D=[]
    for i in range(len(pred)):
        some=list(pred[i])
        max_index=some.index(max(some))  #记录最大值的index
        pred_1D.append(max_index)
    pred_1D=np.array(pred_1D)
    
    # 将三维的分类结果转换成一维数组
    testy_1D=[]
    for i in range(len(test_y)):
        some=list(test_y[i])
        max_index=some.index(max(some))  #记录最大值的index
        testy_1D.append(max_index)
    testy_1D=np.array(testy_1D)
    
    # 输出分类的准确率
    print("Accuracy: %.4f"  % (metrics.accuracy_score(testy_1D,pred_1D)))
    # 输出衡量分类效果的各项指标
    print(classification_report(testy_1D, pred_1D)) 
    # 更直观的,我们通过seaborn画出混淆矩阵
    %matplotlib inline
    plt.figure(figsize=(9,6))
    colorMetrics = metrics.confusion_matrix(testy_1D,pred_1D)
    # 坐标y代表test_y,即真实的类别,坐标x代表估计出的类别pred
    sns.heatmap(colorMetrics,annot=True,fmt='d',xticklabels=list(index),yticklabels=list(index))
    sns.plt.show()

3.2 Dimension Conversion

Keras has strict requirements on input format:
1) a 1-D convolutional layer (Conv1D) takes 3-D arrays, while a 2-D convolutional layer (Conv2D) takes 4-D arrays;
2) the labels must be one-hot encoded.

In [16]:
from keras.utils import np_utils

# Convert the features from 3-D to 4-D by appending a channel axis
train_x=train_x.reshape(train_x.shape + (1,))
train_x_enhance=train_x_enhance.reshape(train_x_enhance.shape + (1,))
test_x=test_x.reshape(test_x.shape + (1,))

# Convert the labels to one-hot encoding
train_y= np_utils.to_categorical(train_y, 40)
train_y_enhance= np_utils.to_categorical(train_y_enhance, 40)
test_y= np_utils.to_categorical(test_y, 40)
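
To confirm the conversion, the shapes can be printed (a check added for illustration; expected values follow from the shapes reported earlier):

In [ ]:
# Shape check (illustrative addition, not in the original run)
print(train_x.shape, train_y.shape)                   # expected: (320, 64, 64, 1) (320, 40)
print(train_x_enhance.shape, train_y_enhance.shape)   # expected: (960, 64, 64, 1) (960, 40)
print(test_x.shape, test_y.shape)                     # expected: (80, 64, 64, 1) (80, 40)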

3.3 Designing the Network Architecture

The architecture of the convolutional neural network is as follows:

[Figure: Structure for CNN]

Import the required objects from the corresponding Keras modules:
In [7]:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten 
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adam

Define the network architecture:

In [28]:
# Fix the random seed so that parameter initialization is reproducible
seed = 100
np.random.seed(seed)

# Start defining the model
model = Sequential()  

# Add a convolutional layer with 32 kernels of size (3,3); 'valid' padding means
# the kernel only visits positions where it fully overlaps the input
# (Keras 2 syntax; the Keras 1 form was Conv2D(32, 3, 3, border_mode='valid'))
# The last dimension of input_shape is 1 for grayscale images, 3 for RGB
model.add(Conv2D(32, (3, 3), padding='valid', input_shape=(64, 64, 1)))  
model.add(Activation('relu')) 

# Add a max-pooling layer with a (2,2) window
model.add(MaxPooling2D(pool_size=(2, 2)))  
  
# Add a convolutional layer with 64 kernels of size (3,3)
model.add(Conv2D(64, (3, 3)))  
model.add(Activation('relu')) 

# Add a max-pooling layer with a (2,2) window
model.add(MaxPooling2D(pool_size=(2, 2))) 

# Add a Dropout layer to reduce overfitting
# Dropout randomly disables a fraction of the input units at each parameter update during training
model.add(Dropout(0.25)) 

# Add a Flatten layer to collapse the multi-dimensional input to one dimension
# Flatten is commonly used in the transition from convolutional to fully connected layers
model.add(Flatten())  

# Add a fully connected layer with 512 units
model.add(Dense(512))  
model.add(Activation('tanh'))

# Add the softmax output layer
model.add(Dense(40))  
model.add(Activation('softmax'))  

# Define the loss function and optimizer:
# 'categorical_crossentropy' for multi-class classification, Adam as the optimizer
model.compile(loss='categorical_crossentropy', optimizer="Adam", metrics=['accuracy'])
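
To inspect the layer stack and parameter counts (a step added for illustration, not in the original run), Keras provides model.summary():

In [ ]:
# Print the architecture and parameter counts (illustrative addition)
model.summary()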

3.4 Model Training and Evaluation

Train the model on the original data:

In [18]:
# Train the model (nb_epoch in Keras 1 is epochs in Keras 2)
model.fit(train_x, train_y, batch_size=200, epochs=20, validation_data=(test_x,test_y))

# Evaluate the model
score = model.evaluate(test_x, test_y, batch_size=20)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Train on 320 samples, validate on 80 samples
Epoch 1/20
320/320 [==============================] - 7s - loss: 3.7650 - acc: 0.0156 - val_loss: 3.7613 - val_acc: 0.0125
Epoch 2/20
320/320 [==============================] - 3s - loss: 3.6639 - acc: 0.0531 - val_loss: 3.7195 - val_acc: 0.0125
Epoch 3/20
320/320 [==============================] - 3s - loss: 3.6573 - acc: 0.0281 - val_loss: 3.7069 - val_acc: 0.0125
Epoch 4/20
320/320 [==============================] - 3s - loss: 3.6426 - acc: 0.0281 - val_loss: 3.7197 - val_acc: 0.0125
Epoch 5/20
320/320 [==============================] - 3s - loss: 3.5903 - acc: 0.0719 - val_loss: 3.7676 - val_acc: 0.0125
Epoch 6/20
320/320 [==============================] - 3s - loss: 3.5403 - acc: 0.0656 - val_loss: 3.7530 - val_acc: 0.0250
Epoch 7/20
320/320 [==============================] - 3s - loss: 3.4466 - acc: 0.1375 - val_loss: 3.6554 - val_acc: 0.1125
Epoch 8/20
320/320 [==============================] - 4s - loss: 3.3132 - acc: 0.3281 - val_loss: 3.4630 - val_acc: 0.2375
Epoch 9/20
320/320 [==============================] - 3s - loss: 3.1425 - acc: 0.4906 - val_loss: 3.2436 - val_acc: 0.4375
Epoch 10/20
320/320 [==============================] - 3s - loss: 2.8672 - acc: 0.6688 - val_loss: 2.9554 - val_acc: 0.3875
Epoch 11/20
320/320 [==============================] - 3s - loss: 2.5646 - acc: 0.6656 - val_loss: 2.6144 - val_acc: 0.5000
Epoch 12/20
320/320 [==============================] - 3s - loss: 2.1859 - acc: 0.7031 - val_loss: 2.2983 - val_acc: 0.5375
Epoch 13/20
320/320 [==============================] - 4s - loss: 1.8077 - acc: 0.7844 - val_loss: 1.8777 - val_acc: 0.6875
Epoch 14/20
320/320 [==============================] - 4s - loss: 1.4232 - acc: 0.8344 - val_loss: 1.6579 - val_acc: 0.6625
Epoch 15/20
320/320 [==============================] - 4s - loss: 1.1274 - acc: 0.8375 - val_loss: 1.3932 - val_acc: 0.7125
Epoch 16/20
320/320 [==============================] - 4s - loss: 0.8326 - acc: 0.8937 - val_loss: 1.1181 - val_acc: 0.7750
Epoch 17/20
320/320 [==============================] - 4s - loss: 0.6718 - acc: 0.9062 - val_loss: 0.9172 - val_acc: 0.8125
Epoch 18/20
320/320 [==============================] - 3s - loss: 0.4851 - acc: 0.9281 - val_loss: 0.7827 - val_acc: 0.8250
Epoch 19/20
320/320 [==============================] - 3s - loss: 0.3612 - acc: 0.9687 - val_loss: 0.6832 - val_acc: 0.8625
Epoch 20/20
320/320 [==============================] - 4s - loss: 0.2873 - acc: 0.9750 - val_loss: 0.5710 - val_acc: 0.9000
80/80 [==============================] - 0s     
Test loss: 0.571005783975
Test accuracy: 0.899999991059

A more detailed evaluation:

In [25]:
pred=model.predict(test_x)
evaluate(pred,test_y)
Accuracy: 0.9000
C:\Users\dell\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1113: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
             precision    recall  f1-score   support

          0       0.80      0.80      0.80         5
          1       1.00      1.00      1.00         4
          2       0.67      1.00      0.80         2
          3       1.00      1.00      1.00         1
          4       1.00      1.00      1.00         1
          5       1.00      1.00      1.00         3
          6       1.00      1.00      1.00         3
          7       0.67      0.67      0.67         3
          9       1.00      1.00      1.00         1
         10       0.75      1.00      0.86         3
         11       1.00      1.00      1.00         1
         12       1.00      1.00      1.00         1
         13       1.00      1.00      1.00         2
         14       1.00      0.75      0.86         4
         15       1.00      1.00      1.00         3
         17       1.00      0.67      0.80         6
         19       1.00      1.00      1.00         3
         20       1.00      1.00      1.00         1
         21       1.00      1.00      1.00         1
         22       0.33      1.00      0.50         1
         23       1.00      1.00      1.00         1
         24       1.00      1.00      1.00         2
         25       0.00      0.00      0.00         1
         26       1.00      1.00      1.00         3
         27       1.00      1.00      1.00         1
         28       1.00      1.00      1.00         2
         29       1.00      1.00      1.00         3
         30       1.00      1.00      1.00         3
         31       1.00      1.00      1.00         3
         32       1.00      1.00      1.00         2
         33       1.00      1.00      1.00         1
         34       1.00      0.33      0.50         3
         35       1.00      1.00      1.00         1
         36       1.00      1.00      1.00         2
         37       1.00      1.00      1.00         2
         39       0.33      1.00      0.50         1

avg / total       0.93      0.90      0.90        80

Trained on the original data, the model reaches 90% accuracy. The most common errors are images of subject 17 misclassified as subject 22 and images of subject 34 misclassified as subject 39.

Train the model on the augmented data:

Note: before retraining, the model's parameters must be re-initialized (here, by re-running the cell that defines the model); otherwise training would continue from the previous weights and the comparison would be invalid.
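
A less error-prone pattern (a sketch added here, not the author's original code; build_model is a hypothetical helper mirroring the architecture defined above) is to wrap the network definition in a function, so each experiment starts from freshly initialized weights:

In [ ]:
# Illustrative sketch: build a fresh, identically configured model on demand
def build_model():
    m = Sequential()
    m.add(Conv2D(32, (3, 3), padding='valid', input_shape=(64, 64, 1)))
    m.add(Activation('relu'))
    m.add(MaxPooling2D(pool_size=(2, 2)))
    m.add(Conv2D(64, (3, 3)))
    m.add(Activation('relu'))
    m.add(MaxPooling2D(pool_size=(2, 2)))
    m.add(Dropout(0.25))
    m.add(Flatten())
    m.add(Dense(512))
    m.add(Activation('tanh'))
    m.add(Dense(40))
    m.add(Activation('softmax'))
    m.compile(loss='categorical_crossentropy', optimizer="Adam", metrics=['accuracy'])
    return m

model = build_model()   # fresh weights for the augmented-data run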

In [30]:
# Train the model
model.fit(train_x_enhance, train_y_enhance, batch_size=200, epochs=20, validation_data=(test_x,test_y))

# Evaluate the model
score = model.evaluate(test_x, test_y, batch_size=20)
print('Test score:', score[0])
print('Test accuracy:', score[1])
Train on 960 samples, validate on 80 samples
Epoch 1/20
960/960 [==============================] - 13s - loss: 3.7088 - acc: 0.0271 - val_loss: 3.7028 - val_acc: 0.0125
Epoch 2/20
960/960 [==============================] - 10s - loss: 3.6507 - acc: 0.0375 - val_loss: 3.6979 - val_acc: 0.0125
Epoch 3/20
960/960 [==============================] - 10s - loss: 3.5633 - acc: 0.0760 - val_loss: 3.6605 - val_acc: 0.0250
Epoch 4/20
960/960 [==============================] - 10s - loss: 3.3563 - acc: 0.2146 - val_loss: 3.4547 - val_acc: 0.1500
Epoch 5/20
960/960 [==============================] - 10s - loss: 2.9537 - acc: 0.4844 - val_loss: 3.0044 - val_acc: 0.2500
Epoch 6/20
960/960 [==============================] - 10s - loss: 2.2424 - acc: 0.7240 - val_loss: 2.1256 - val_acc: 0.5750
Epoch 7/20
960/960 [==============================] - 10s - loss: 1.3697 - acc: 0.8448 - val_loss: 1.5327 - val_acc: 0.6750
Epoch 8/20
960/960 [==============================] - 10s - loss: 0.7111 - acc: 0.9135 - val_loss: 1.2560 - val_acc: 0.6000
Epoch 9/20
960/960 [==============================] - 10s - loss: 0.4081 - acc: 0.9333 - val_loss: 0.7134 - val_acc: 0.8375
Epoch 10/20
960/960 [==============================] - 10s - loss: 0.2575 - acc: 0.9615 - val_loss: 0.6534 - val_acc: 0.7875
Epoch 11/20
960/960 [==============================] - 10s - loss: 0.1540 - acc: 0.9760 - val_loss: 0.7246 - val_acc: 0.7500
Epoch 12/20
960/960 [==============================] - 10s - loss: 0.1003 - acc: 0.9885 - val_loss: 0.6008 - val_acc: 0.8125
Epoch 13/20
960/960 [==============================] - 10s - loss: 0.0636 - acc: 0.9958 - val_loss: 0.2695 - val_acc: 0.9000
Epoch 14/20
960/960 [==============================] - 10s - loss: 0.0436 - acc: 0.9990 - val_loss: 0.4828 - val_acc: 0.8625
Epoch 15/20
960/960 [==============================] - 10s - loss: 0.0321 - acc: 1.0000 - val_loss: 0.2997 - val_acc: 0.9250
Epoch 16/20
960/960 [==============================] - 10s - loss: 0.0226 - acc: 1.0000 - val_loss: 0.4018 - val_acc: 0.9000
Epoch 17/20
960/960 [==============================] - 10s - loss: 0.0189 - acc: 1.0000 - val_loss: 0.2541 - val_acc: 0.9250
Epoch 18/20
960/960 [==============================] - 11s - loss: 0.0158 - acc: 1.0000 - val_loss: 0.3042 - val_acc: 0.9250
Epoch 19/20
960/960 [==============================] - 11s - loss: 0.0118 - acc: 1.0000 - val_loss: 0.3262 - val_acc: 0.9125
Epoch 20/20
960/960 [==============================] - 10s - loss: 0.0111 - acc: 1.0000 - val_loss: 0.2909 - val_acc: 0.9375
80/80 [==============================] - 0s     
Test score: 0.290891885757
Test accuracy: 0.9375

A more detailed evaluation:

In [31]:
pred=model.predict(test_x)
evaluate(pred,test_y)
Accuracy: 0.9375
C:\Users\dell\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1113: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
C:\Users\dell\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1115: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)
             precision    recall  f1-score   support

          0       1.00      1.00      1.00         5
          1       1.00      1.00      1.00         4
          2       0.67      1.00      0.80         2
          3       1.00      1.00      1.00         1
          4       1.00      1.00      1.00         1
          5       1.00      1.00      1.00         3
          6       1.00      1.00      1.00         3
          7       1.00      1.00      1.00         3
          9       1.00      1.00      1.00         1
         10       1.00      1.00      1.00         3
         11       1.00      1.00      1.00         1
         12       1.00      1.00      1.00         1
         13       1.00      1.00      1.00         2
         14       0.80      1.00      0.89         4
         15       1.00      1.00      1.00         3
         17       1.00      0.50      0.67         6
         18       0.00      0.00      0.00         0
         19       1.00      1.00      1.00         3
         20       0.50      1.00      0.67         1
         21       1.00      1.00      1.00         1
         22       0.50      1.00      0.67         1
         23       1.00      1.00      1.00         1
         24       1.00      1.00      1.00         2
         25       0.00      0.00      0.00         1
         26       1.00      1.00      1.00         3
         27       1.00      1.00      1.00         1
         28       1.00      1.00      1.00         2
         29       1.00      1.00      1.00         3
         30       1.00      1.00      1.00         3
         31       1.00      1.00      1.00         3
         32       1.00      1.00      1.00         2
         33       1.00      1.00      1.00         1
         34       1.00      1.00      1.00         3
         35       0.00      0.00      0.00         1
         36       1.00      1.00      1.00         2
         37       1.00      1.00      1.00         2
         39       1.00      1.00      1.00         1

avg / total       0.94      0.94      0.93        80

With the same architecture, training on the augmented data reaches 93.75% accuracy. The remaining errors are concentrated on a few subjects; per the report above, subject 17 accounts for most of them (recall 0.50).

4 Analysis of Results

The results of the two models are as follows:

model           loss   accuracy  precision  recall  f1-score
model_original  0.571  0.900     0.93       0.90    0.90
model_enhance   0.291  0.938     0.94       0.94    0.93

Plot a bar chart to visualize the difference between the two models:

In [33]:
index=["loss","accuracy","precision","recall","f1-score"]
data=np.array([[0.571,0.291],[0.900,0.938],[0.93,0.94],[0.90,0.94],[0.90,0.93]])
df=pd.DataFrame(data,columns=["original","enhance"],index=index)
plt.figure(figsize=(8,6))
df.plot(kind="bar")
plt.ylim([0.2,1])
plt.show()

The model trained on augmented data clearly outperforms the model trained on the original data, because additional training data reduces overfitting. When working with a small dataset like this one, data augmentation should be among the first techniques to consider.
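
As a closing design note (a sketch under Keras 2-style API assumptions, not the author's original workflow; build_model is the hypothetical helper sketched in section 3.4), the round-trip through the disk can be avoided by feeding the generator directly to training, so augmented batches are produced on the fly:

In [ ]:
# Illustrative sketch: train directly from the augmenting generator
# instead of saving augmented images to disk and reading them back.
model = build_model()                                   # fresh weights
gen = datagen.flow(train_x, train_y, batch_size=32)     # train_x: (320, 64, 64, 1), train_y: one-hot
model.fit_generator(gen, steps_per_epoch=len(train_x) // 32, epochs=20,
                    validation_data=(test_x, test_y))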