Keras fit_generator tutorial

Although TensorFlow Keras already ships with VGG16, in this post we will rewrite VGG16 ourselves in TensorFlow with Keras to understand the network structure, and then try it on the Kaggle Dogs vs. Cats dataset to classify dogs and cats.


I will present this post like a Jupyter Notebook, with the results of each step included so it is easy to follow. The post first prepares the dogs-and-cats dataset, then builds a VGG16 classifier in Keras and trains it to classify dog and cat images. I recommend running the code in Jupyter Notebook or JupyterLab.

Dependencies: the packages used in this post:

Python 3.7

```
jupyterlab=1.1.4
scipy=1.3.1
matplotlib=3.1.1
pillow=6.2.0
tensorflow-gpu=2.0.0
cudnn=7.6.0
```

First, we import all the libraries we need.

```python
import os
import tensorflow as tf
import numpy as np
import math
import timeit
import matplotlib.pyplot as plt

%matplotlib inline
```

I. Preparing the dataset

Download the Kaggle Dogs vs. Cats dataset from its Kaggle page and save it as dogs-vs-cats.zip.

Next, we unzip the archive and split the data.

```python
import os
import zipfile

# Remove old dataset folders
!rm -rf dataset_dogs_vs_cats/
!rm -rf /tmp/dogs-vs-cats

# Create temporary folder
!mkdir -p /tmp/dogs-vs-cats

# Extract the archive downloaded from Kaggle
local_zip = 'dogs-vs-cats.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp/dogs-vs-cats')
zip_ref.close()

# Extract train folder
train_zip = '/tmp/dogs-vs-cats/train.zip'
zip_ref = zipfile.ZipFile(train_zip, 'r')
zip_ref.extractall('/tmp/dogs-vs-cats')
zip_ref.close()

# Dataset
dataset_folder = '/tmp/dogs-vs-cats/train/'
```

Check the extracted data by displaying a few dog images.

```python
from matplotlib import pyplot
from matplotlib.image import imread

# Define location of dataset
folder = dataset_folder
# Plot first few images
for i in range(9):
    # Define subplot (3x3 grid)
    pyplot.subplot(330 + 1 + i)
    # Define filename
    filename = folder + 'dog.' + str(i) + '.jpg'
    # Load image pixels
    image = imread(filename)
    # Plot raw pixel data
    pyplot.imshow(image)
# Show the figure
pyplot.show()
```

[Figure: the first nine dog images from the dataset]
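As an extra sanity check before splitting, one could count how many files each label has. The sketch below is not part of the original notebook: `count_by_label` is a hypothetical helper, and it assumes the file naming seen above, where the label is the text before the first dot (for example `dog.0.jpg`).

```python
import os
from collections import Counter

def count_by_label(folder):
    """Count files per label, taking the label from the text before the first dot."""
    counts = Counter()
    for name in os.listdir(folder):
        label = name.split('.')[0]
        counts[label] += 1
    return dict(counts)

if __name__ == "__main__":
    # Path used earlier in this post; skipped if the dataset is not extracted yet
    folder = '/tmp/dogs-vs-cats/train/'
    if os.path.isdir(folder):
        print(count_by_label(folder))
```

If the two counts differ a lot, the classes are imbalanced and plain accuracy becomes a less reliable metric.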

Splitting the data

This step creates the dataset_dogs_vs_cats directory, which holds all the data we need, organized into the subdirectories shown below. The train directory holds the training data, val holds the validation data, and test holds the test data. The data is split 60%, 20%, and 20% respectively.

```
dataset_dogs_vs_cats
├── train
│   ├── cats
│   └── dogs
├── val
│   ├── cats
│   └── dogs
└── test
    ├── cats
    └── dogs
```

```python
from os import listdir, makedirs
from os.path import join
from shutil import copyfile

import numpy as np

# Create directories
dataset_home = 'dataset_dogs_vs_cats/'
subdirs = ['train/', 'val/', 'test/']
for subdir in subdirs:
    # Create label subdirectories
    labeldirs = ['dogs/', 'cats/']
    for labeldir in labeldirs:
        newdir = dataset_home + subdir + labeldir
        makedirs(newdir, exist_ok=True)

# Collect the file names of each class from the extracted train folder
src_directory = dataset_folder
dog_files = []
cat_files = []
for file in listdir(src_directory):
    if file.startswith('cat'):
        cat_files.append(file)
    elif file.startswith('dog'):
        dog_files.append(file)

def train_validate_test_split(data, train_percent=.6, validate_percent=.2, seed=None):
    """Shuffle data and split it into train, validation, and test arrays."""
    np.random.seed(seed)
    perm = np.random.permutation(np.arange(len(data)))
    m = len(data)
    train_end = int(train_percent * m)
    validate_end = int(validate_percent * m) + train_end

    train = np.array(data)[perm[:train_end]].copy()
    validate = np.array(data)[perm[train_end:validate_end]].copy()
    test = np.array(data)[perm[validate_end:]].copy()
    return train, validate, test

train_cats, val_cats, test_cats = train_validate_test_split(cat_files, seed=42, train_percent=.6, validate_percent=.2)
train_dogs, val_dogs, test_dogs = train_validate_test_split(dog_files, seed=42, train_percent=.6, validate_percent=.2)

# Copy each split into its subdirectory
splits = {
    'train/cats/': train_cats, 'train/dogs/': train_dogs,
    'val/cats/': val_cats, 'val/dogs/': val_dogs,
    'test/cats/': test_cats, 'test/dogs/': test_dogs,
}
for subdir, files in splits.items():
    for file in files:
        copyfile(join(src_directory, file), join(dataset_home, subdir, file))

print("Done!")
```
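To see what the 60/20/20 split means in numbers, the following sketch repeats only the index arithmetic of the split function above. `split_sizes` is a hypothetical helper, and the figure of 12,500 images per class is an assumption about the Kaggle training archive.

```python
def split_sizes(n, train_percent=0.6, validate_percent=0.2):
    """Return (train, val, test) sizes using the same index arithmetic as the split above."""
    train_end = int(train_percent * n)
    validate_end = int(validate_percent * n) + train_end
    return train_end, validate_end - train_end, n - validate_end

# Assumption: 12,500 images per class in the Kaggle training archive
per_class = split_sizes(12500)
both_classes = tuple(2 * size for size in per_class)

print(per_class)      # (7500, 2500, 2500)
print(both_classes)   # (15000, 5000, 5000)
```

Because the test set takes whatever remains after the train and validation slices, the three sizes always add up to the full dataset, even when the percentages do not divide it evenly.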

Creating the ImageDataGenerators

Keras provides ImageDataGenerator to preprocess the data and to feed it to both training and testing. For the training data, we additionally set width_shift_range=0.1, height_shift_range=0.1, and horizontal_flip=True for data augmentation. This creates extra training data by shifting images and flipping them horizontally (a dog or cat is still a dog or cat after a horizontal flip, so the meaning of the image does not change). The batch size is also set here.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1.0/255.0,
    width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1.0/255.0)
test_datagen = ImageDataGenerator(rescale=1.0/255.0)
# prepare iterators
train_it = train_datagen.flow_from_directory('dataset_dogs_vs_cats/train/',
    class_mode='binary', batch_size=20, target_size=(224, 224))
val_it = val_datagen.flow_from_directory('dataset_dogs_vs_cats/val/',
    class_mode='binary', batch_size=20, target_size=(224, 224))
test_it = test_datagen.flow_from_directory('dataset_dogs_vs_cats/test/',
    class_mode='binary', batch_size=20, target_size=(224, 224))
```

```
Found 15000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
```
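To build some intuition for the three transforms configured above, here is a rough NumPy sketch of rescaling, a horizontal flip, and a width shift on a tiny image. It is only an illustration, not the Keras implementation: `rescale`, `horizontal_flip`, and `shift_width` are hypothetical helpers, the shift is fixed and one-directional here while Keras draws a random shift within the given range, and filling the gap by repeating the edge column is meant to mimic the default fill_mode='nearest'.

```python
import numpy as np

def rescale(image, factor=1.0 / 255.0):
    """Map uint8 pixel values in [0, 255] to floats in [0, 1]."""
    return image.astype(np.float32) * factor

def horizontal_flip(image):
    """Mirror a (height, width, channels) image left to right."""
    return image[:, ::-1, :]

def shift_width(image, fraction):
    """Shift an image to the right by a fraction of its width, repeating the edge column."""
    pixels = int(round(fraction * image.shape[1]))
    if pixels == 0:
        return image.copy()
    shifted = np.empty_like(image)
    shifted[:, pixels:, :] = image[:, :-pixels, :]
    shifted[:, :pixels, :] = image[:, :1, :]  # fill the gap with the left edge
    return shifted

# A tiny 2x4 single-channel "image" with values 0, 30, ..., 210
image = (np.arange(8, dtype=np.uint8) * 30).reshape(2, 4, 1)

print(image[0, :, 0])                      # original first row
print(horizontal_flip(image)[0, :, 0])     # mirrored first row
print(shift_width(image, 0.25)[0, :, 0])   # shifted right by one pixel
print(rescale(image).max())                # largest value after rescaling, below 1
```

None of these transforms changes the label, which is why they are safe to apply to the training data only; the validation and test generators just rescale.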

II. VGG16

With the data prepared, we build the VGG16 network from scratch in Keras and train the model to classify dogs and cats.

You can read the VGG paper at https://arxiv.org/abs/1409.1556.

[Figure: VGG architecture configurations, from https://arxiv.org/abs/1409.1556]

We will focus on the architecture diagram. The figure above, taken from the paper, describes the VGG network configurations. We will implement VGG16, the configuration with 16 weight layers. However, because we only classify two classes (dog and cat) and use binary cross-entropy as the loss function, we replace FC-1000 with a single fully connected layer of size 1 with a sigmoid activation. See the source code that follows.
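Before the full model, a small pure-Python sketch may help show why a single sigmoid unit pairs naturally with binary cross-entropy. It is an illustration, not Keras code: `sigmoid` and `binary_cross_entropy` are hypothetical helpers, and the label convention (cats = 0, dogs = 1) assumes that flow_from_directory assigns class indices in alphabetical order of the folder names.

```python
import math

def sigmoid(z):
    """Squash a raw score (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p):
    """Loss for one sample: y_true is 0 or 1, p is the predicted probability of class 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# An undecided model outputs logit 0, so p = 0.5 and the loss is ln(2) for either label
p_undecided = sigmoid(0.0)
print(p_undecided, binary_cross_entropy(1, p_undecided))

# A confident and correct prediction gives a small loss ...
print(binary_cross_entropy(1, sigmoid(3.0)))
# ... and the same confident prediction for the wrong label is penalized heavily
print(binary_cross_entropy(0, sigmoid(3.0)))
```

The value ln(2), roughly 0.693, is a useful reference point: a loss near it means the model is doing no better than guessing.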

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D

# Let TensorFlow allocate GPU memory on demand instead of all at once
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

model = Sequential()

# Block 1
model.add(Conv2D(64, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block1_conv1', input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block1_conv2'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), name='block1_maxpool'))

# Block 2
model.add(Conv2D(128, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block2_conv1'))
model.add(Conv2D(128, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block2_conv2'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), name='block2_maxpool'))

# Block 3
model.add(Conv2D(256, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block3_conv1'))
model.add(Conv2D(256, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block3_conv2'))
model.add(Conv2D(256, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block3_conv3'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), name='block3_maxpool'))

# Block 4
model.add(Conv2D(512, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block4_conv1'))
model.add(Conv2D(512, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block4_conv2'))
model.add(Conv2D(512, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block4_conv3'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), name='block4_maxpool'))

# Block 5
model.add(Conv2D(512, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block5_conv1'))
model.add(Conv2D(512, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block5_conv2'))
model.add(Conv2D(512, (3,3), activation="relu", padding="same", kernel_initializer='he_uniform', name='block5_conv3'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), name='block5_maxpool'))

# Classifier: FC-1000 is replaced by a single sigmoid unit for binary classification
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

def optimizer_init_fn():
    learning_rate = 1e-4
    return tf.keras.optimizers.Adam(learning_rate)

model.compile(optimizer=optimizer_init_fn(),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Fit model (training): 20 epochs, matching the training log below.
# fit_generator is the TensorFlow 2.0 API; from TensorFlow 2.1 on it is deprecated
# and model.fit accepts generators directly.
history = model.fit_generator(train_it, steps_per_epoch=len(train_it),
    validation_data=val_it, validation_steps=len(val_it), epochs=20, verbose=1)
```
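As a rough cross-check of the architecture above, the sketch below counts the weight layers and trainable parameters by hand, and follows the feature-map size through the five pooling steps. It is plain Python written for this post, not a Keras API: `count_vgg16_params` is a hypothetical helper, and it assumes every layer has a bias term, as the Keras layers do by default.

```python
# Convolution blocks of VGG16: (number of conv layers, output channels)
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def count_vgg16_params(input_size=224, in_channels=3, dense_units=(4096, 4096, 1)):
    """Return (final feature-map size, weight layer count, parameter count)."""
    size, channels = input_size, in_channels
    layers, total = 0, 0
    for n_convs, filters in BLOCKS:
        for _ in range(n_convs):
            total += (3 * 3 * channels + 1) * filters  # 3x3 kernel weights plus one bias per filter
            channels = filters
            layers += 1
        size //= 2  # 2x2 max pooling with stride 2 halves height and width
    features = size * size * channels  # length of the Flatten output
    for units in dense_units:
        total += (features + 1) * units  # dense weights plus one bias per unit
        features = units
        layers += 1
    return size, layers, total

size, layers, total = count_vgg16_params()
print(size)    # 7, so Flatten yields 7 * 7 * 512 = 25088 features
print(layers)  # 16 weight layers: 13 convolutional + 3 fully connected
print(total)   # 134264641
```

The total should agree with what model.summary() reports for the model above. Note that the first fully connected layer alone accounts for about 103 million of these parameters, which is one reason the model is slow to train and prone to overfitting.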

```
Epoch 1/20
750/750 [==============================] - 274s 365ms/step - loss: 0.6539 - accuracy: 0.6133 - val_loss: 0.5673 - val_accuracy: 0.7050
Epoch 2/20
750/750 [==============================] - 272s 362ms/step - loss: 0.5619 - accuracy: 0.7081 - val_loss: 0.5001 - val_accuracy: 0.7540
Epoch 3/20
750/750 [==============================] - 272s 362ms/step - loss: 0.4555 - accuracy: 0.7885 - val_loss: 0.4330 - val_accuracy: 0.7998
Epoch 4/20
750/750 [==============================] - 273s 363ms/step - loss: 0.3532 - accuracy: 0.8419 - val_loss: 0.3022 - val_accuracy: 0.8756
Epoch 5/20
750/750 [==============================] - 273s 364ms/step - loss: 0.2720 - accuracy: 0.8849 - val_loss: 0.2449 - val_accuracy: 0.8976
Epoch 6/20
750/750 [==============================] - 274s 365ms/step - loss: 0.2193 - accuracy: 0.9089 - val_loss: 0.2297 - val_accuracy: 0.9058
Epoch 7/20
750/750 [==============================] - 273s 363ms/step - loss: 0.1929 - accuracy: 0.9210 - val_loss: 0.2055 - val_accuracy: 0.9112
Epoch 8/20
750/750 [==============================] - 272s 363ms/step - loss: 0.1651 - accuracy: 0.9317 - val_loss: 0.2105 - val_accuracy: 0.9174
Epoch 9/20
750/750 [==============================] - 275s 367ms/step - loss: 0.1487 - accuracy: 0.9402 - val_loss: 0.1920 - val_accuracy: 0.9192
Epoch 10/20
750/750 [==============================] - 272s 362ms/step - loss: 0.1421 - accuracy: 0.9423 - val_loss: 0.1536 - val_accuracy: 0.9370
Epoch 11/20
750/750 [==============================] - 271s 361ms/step - loss: 0.1211 - accuracy: 0.9508 - val_loss: 0.1670 - val_accuracy: 0.9334
Epoch 12/20
750/750 [==============================] - 270s 361ms/step - loss: 0.1151 - accuracy: 0.9533 - val_loss: 0.1756 - val_accuracy: 0.9322
Epoch 13/20
750/750 [==============================] - 270s 360ms/step - loss: 0.1077 - accuracy: 0.9581 - val_loss: 0.1545 - val_accuracy: 0.9398
Epoch 14/20
750/750 [==============================] - 271s 361ms/step - loss: 0.0991 - accuracy: 0.9605 - val_loss: 0.1451 - val_accuracy: 0.9418
Epoch 15/20
750/750 [==============================] - 270s 361ms/step - loss: 0.0908 - accuracy: 0.9640 - val_loss: 0.1666 - val_accuracy: 0.9382
Epoch 16/20
750/750 [==============================] - 271s 361ms/step - loss: 0.0861 - accuracy: 0.9647 - val_loss: 0.1828 - val_accuracy: 0.9374
Epoch 17/20
750/750 [==============================] - 270s 360ms/step - loss: 0.0831 - accuracy: 0.9689 - val_loss: 0.1423 - val_accuracy: 0.9406
Epoch 18/20
750/750 [==============================] - 270s 361ms/step - loss: 0.0726 - accuracy: 0.9713 - val_loss: 0.1512 - val_accuracy: 0.9404
Epoch 19/20
750/750 [==============================] - 271s 361ms/step - loss: 0.0693 - accuracy: 0.9716 - val_loss: 0.1506 - val_accuracy: 0.9454
Epoch 20/20
750/750 [==============================] - 272s 362ms/step - loss: 0.0695 - accuracy: 0.9732 - val_loss: 0.1505 - val_accuracy: 0.9496
```
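The 750/750 in the log above is the number of batches per epoch, which is what len(train_it) returns. The small sketch below reproduces that number from the dataset size and the batch size; `steps_per_epoch` is a hypothetical helper that uses ceiling division so that a final, smaller batch still counts as a step.

```python
def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (ceiling division)."""
    return (num_images + batch_size - 1) // batch_size

train_steps = steps_per_epoch(15000, 20)
val_steps = steps_per_epoch(5000, 20)

print(train_steps)  # 750, matching the progress bar in the training log
print(val_steps)    # 250

# Rough time estimate from the logged speed of about 362 ms per step
print(train_steps * 0.362)  # about 271.5 seconds per epoch
```

Changing batch_size in the generators changes the step count accordingly, so it is simplest to keep passing len(train_it) and len(val_it) rather than hard-coding the numbers.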

Results

In this step, we visualize the whole training process and test the model on the test set (the 20% held out from the original dataset). Once we have the training curves, we can use them to tune parameters such as the batch size and the learning rate by going back to the earlier steps.

```python
# Baseline model with data augmentation for the dogs vs cats dataset
from matplotlib import pyplot as plt

# Plot diagnostic learning curves
def summarize_diagnostics(history):
    # Accuracy per epoch for the training and validation sets
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper left')
    plt.show()

    # Loss per epoch for the training and validation sets
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper left')
    plt.show()

# Evaluate model
_, acc = model.evaluate_generator(test_it, steps=len(test_it), verbose=1)
print('> %.3f' % (acc * 100.0))

# Learning curves
summarize_diagnostics(history)
```

```
250/250 [==============================] - 25s 99ms/step - loss: 0.1411 - accuracy: 0.9534
> 95.340
```

[Figure: training history (accuracy and loss curves for the train and val sets)]
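The curves can also be read numerically. The sketch below copies the loss values from the training log above into plain lists (in a live notebook they would come from history.history['loss'] and history.history['val_loss']) and reports the epoch with the lowest validation loss and the final gap between validation and training loss. The variable names are chosen for this example only.

```python
# Values copied from the training log above, one entry per epoch
train_loss = [0.6539, 0.5619, 0.4555, 0.3532, 0.2720, 0.2193, 0.1929, 0.1651, 0.1487, 0.1421,
              0.1211, 0.1151, 0.1077, 0.0991, 0.0908, 0.0861, 0.0831, 0.0726, 0.0693, 0.0695]
val_loss = [0.5673, 0.5001, 0.4330, 0.3022, 0.2449, 0.2297, 0.2055, 0.2105, 0.1920, 0.1536,
            0.1670, 0.1756, 0.1545, 0.1451, 0.1666, 0.1828, 0.1423, 0.1512, 0.1506, 0.1505]

# Epoch (1-based) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i]) + 1
print(best_epoch, val_loss[best_epoch - 1])   # 17 0.1423

# Gap between validation and training loss at the last epoch
final_gap = val_loss[-1] - train_loss[-1]
print(round(final_gap, 4))                    # 0.081
```

In this run the training loss keeps falling while the validation loss flattens out at around 0.15 from roughly epoch 13 onward, so the gap between the two widens. That pattern is a common sign that the model is starting to overfit.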

We have now re-implemented VGG16 and applied it to dog-vs-cat classification, reaching 95.34% accuracy on the test set. You can apply further techniques, such as adding a few Dropout layers, to improve accuracy and reduce overfitting.

Please leave any feedback on this post in the comments below. Thank you!