Real Data Analysis

FASHION MNIST with Python (DAY 7) - MLP

딥스탯 2018. 8. 22. 23:51
FASHION_MNIST_DAY7_with_Python

FASHION MNIST with Python (DAY 7)

DATA SOURCE : https://www.kaggle.com/zalando-research/fashionmnist (Kaggle, Fashion MNIST)

FASHION MNIST with Python (DAY 1) : http://deepstat.tistory.com/35

FASHION MNIST with Python (DAY 2) : http://deepstat.tistory.com/36

FASHION MNIST with Python (DAY 3) : http://deepstat.tistory.com/37

FASHION MNIST with Python (DAY 4) : http://deepstat.tistory.com/38

FASHION MNIST with Python (DAY 5) : http://deepstat.tistory.com/39

FASHION MNIST with Python (DAY 6) : http://deepstat.tistory.com/40

Datasets

Importing numpy, pandas, pyplot

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Loading datasets

In [2]:
data_train = pd.read_csv("..\\datasets\\fashion-mnist_train.csv")
data_test = pd.read_csv("..\\datasets\\fashion-mnist_test.csv")
In [3]:
data_train_y = data_train.label
y_test = data_test.label
In [4]:
data_train_x = data_train.drop("label",axis=1)/256
x_test = data_test.drop("label",axis=1)/256

Spliting valid and training

In [5]:
np.random.seed(0)
valid2_idx = np.random.choice(60000,10000,replace = False)
valid1_idx = np.random.choice(list(set(range(60000)) - set(valid2_idx)),10000,replace=False)
train_idx = list(set(range(60000))-set(valid1_idx)-set(valid2_idx))

x_train = data_train_x.iloc[train_idx,:]
y_train = data_train_y.iloc[train_idx]

x_valid1 = data_train_x.iloc[valid1_idx,:]
y_valid1 = data_train_y.iloc[valid1_idx]

x_valid2 = data_train_x.iloc[valid2_idx,:]
y_valid2 = data_train_y.iloc[valid2_idx]

Multilayer Perceptron (MLP)

Importing TensorFlow

In [6]:
import tensorflow as tf
from sklearn.metrics import confusion_matrix

Defining weight_variables and bias_variables

In [7]:
def weight_variables(shape):
    initial = tf.truncated_normal(shape)
    return tf.Variable(initial)

def bias_variables(shape):
    initial = tf.truncated_normal(shape)
    return tf.Variable(initial)    

Constructing the MLP

leaky ReLU, Dropout, Maxout, Batch Normalization, softmax, cross entropy, Adam

  • Model : input -> [inner product -> leaky_relu -> dropout] -> [batch normalization -> inner product -> reshape -> maxout -> dropout] -> [inner product -> leaky_relu -> dropout] -> [batch Normalization -> inner product -> reshape -> maxout -> softmax] -> output

  • Loss : cross entropy

  • Optimizer : Adam

Inputs

In [8]:
x = tf.placeholder("float", [None,784])
y = tf.placeholder("int64", [None,])
y_dummies = tf.one_hot(y,depth = 10)

drop_prob = tf.placeholder("float")
training = tf.placeholder("bool")

Layer1

[inner product -> leaky_relu -> dropout]

In [9]:
l1_w = weight_variables([784,640])
l1_b = bias_variables([640])
l1_inner_product = tf.matmul(x, l1_w) + l1_b
l1_leaky_relu = tf.nn.leaky_relu(l1_inner_product)
l1_dropout = tf.layers.dropout(l1_leaky_relu,rate = drop_prob, training = training)

Layer2

[batch normalization -> inner product -> reshape -> maxout -> dropout]

In [10]:
l2_w = weight_variables([640,640])
l2_b = bias_variables([640])
l2_batch_normalization = tf.layers.batch_normalization(l1_dropout, training = training)
l2_inner_product = tf.matmul(l2_batch_normalization, l2_w) + l2_b
l2_reshape = tf.reshape(l2_inner_product,[-1,80,8])
l2_maxout = tf.reshape(
    tf.contrib.layers.maxout(l2_reshape,num_units=1),
    [-1,80])
l2_dropout = tf.layers.dropout(l2_maxout,rate = drop_prob, training = training)

Layer3

[inner product -> leaky_relu -> dropout]

In [11]:
l3_w = weight_variables([80,80])
l3_b = bias_variables([80])
l3_inner_product = tf.matmul(l2_dropout, l3_w) + l3_b
l3_leaky_relu = tf.nn.leaky_relu(l3_inner_product)
l3_dropout = tf.layers.dropout(l3_leaky_relu,rate = drop_prob, training = training)

Layer4

[batch normalization -> inner product -> reshape -> maxout -> softmax]

In [12]:
l4_w = weight_variables([80,80])
l4_b = bias_variables([80])
l4_batch_normalization = tf.layers.batch_normalization(l3_dropout, training = training)
l4_inner_product = tf.matmul(l4_batch_normalization, l4_w) + l4_b
l4_reshape = tf.reshape(l4_inner_product,[-1,10,8])
l4_maxout = tf.reshape(
    tf.contrib.layers.maxout(l4_reshape,num_units=1),
    [-1,10])
l4_log_softmax = tf.nn.log_softmax(l4_maxout)

Cross-entropy

In [13]:
xent_loss = -tf.reduce_sum( tf.multiply(y_dummies,l4_log_softmax) )

Accuracy

In [14]:
pred_labels = tf.argmax(l4_log_softmax,axis=1)
acc = tf.reduce_mean(tf.cast(tf.equal(y, pred_labels),"float"))

Training the Model

In [15]:
lr = tf.placeholder("float")
train_step = tf.train.AdamOptimizer(lr).minimize(xent_loss)
In [16]:
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
In [17]:
batch_size = 64
for i in range(200001):
    batch_obs = np.random.choice(x_train.shape[0],batch_size,replace=False)
    batch_train_x = x_train.iloc[batch_obs]
    batch_train_y = y_train.iloc[batch_obs]
    feed_dict = {x : batch_train_x, y : batch_train_y, drop_prob : .15, training : True, lr : 0.01}
    _, tmp = sess.run([train_step,xent_loss], feed_dict = feed_dict)
    
    if i % 10000 == 0:
        print("step " + str(i) + " training cross-entropy : " + str(tmp))
    
    if i % 40000 == 0:
        feed_dict = {x : x_train, y : y_train, drop_prob : .15, training : False}
        train_acc = sess.run(acc, feed_dict = feed_dict)
        feed_dict = {x : x_valid1, y : y_valid1, drop_prob : .15, training : False}
        valid1_acc = sess.run(acc, feed_dict = feed_dict)
        print("step " + str(i) + " training_acc = " + str(train_acc) + " valid_acc = " + str(valid1_acc))
        save_path = saver.save(sess, "./MLP/model.ckpt")
        print("Model saved in path: " + save_path)
step 0 training cross-entropy : 627.51556
step 0 training_acc = 0.102275 valid_acc = 0.1025
Model saved in path: ./MLP/model.ckpt
step 10000 training cross-entropy : 16.393414
step 20000 training cross-entropy : 3.653322
step 30000 training cross-entropy : 1.637823
step 40000 training cross-entropy : 11.408547
step 40000 training_acc = 0.9137 valid_acc = 0.8486
Model saved in path: ./MLP/model.ckpt
step 50000 training cross-entropy : 5.169942
step 60000 training cross-entropy : 1.0102936
step 70000 training cross-entropy : 7.9847817
step 80000 training cross-entropy : 1.5986788
step 80000 training_acc = 0.8801 valid_acc = 0.8151
Model saved in path: ./MLP/model.ckpt
step 90000 training cross-entropy : 6.7976327
step 100000 training cross-entropy : 0.26217932
step 110000 training cross-entropy : 1.1882036
step 120000 training cross-entropy : 0.4843976
step 120000 training_acc = 0.955025 valid_acc = 0.8604
Model saved in path: ./MLP/model.ckpt
step 130000 training cross-entropy : 2.9634778
step 140000 training cross-entropy : 0.64811546
step 150000 training cross-entropy : 0.048579566
step 160000 training cross-entropy : 0.21984404
step 160000 training_acc = 0.96965 valid_acc = 0.8702
Model saved in path: ./MLP/model.ckpt
step 170000 training cross-entropy : 1.0292068
step 180000 training cross-entropy : 0.08035797
step 190000 training cross-entropy : 0.5936426
step 200000 training cross-entropy : 1.7777281
step 200000 training_acc = 0.961925 valid_acc = 0.8643
Model saved in path: ./MLP/model.ckpt
In [18]:
batch_size = 64
for i in range(200001):
    batch_obs = np.random.choice(x_train.shape[0],batch_size,replace=False)
    batch_train_x = x_train.iloc[batch_obs]
    batch_train_y = y_train.iloc[batch_obs]
    feed_dict = {x : batch_train_x, y : batch_train_y, drop_prob : .15, training : True, lr : 0.001}
    _, tmp = sess.run([train_step,xent_loss], feed_dict = feed_dict)
    
    if i % 10000 == 0:
        print("step " + str(i) + " training cross-entropy : " + str(tmp))
    
    if i % 40000 == 0:
        feed_dict = {x : x_train, y : y_train, drop_prob : .15, training : False}
        train_acc = sess.run(acc, feed_dict = feed_dict)
        feed_dict = {x : x_valid1, y : y_valid1, drop_prob : .15, training : False}
        valid1_acc = sess.run(acc, feed_dict = feed_dict)
        print("step " + str(i) + " training_acc = " + str(train_acc) + " valid_acc = " + str(valid1_acc))
        save_path = saver.save(sess, "./MLP/model.ckpt")
        print("Model saved in path: " + save_path)
step 0 training cross-entropy : 0.19749565
step 0 training_acc = 0.961825 valid_acc = 0.8644
Model saved in path: ./MLP/model.ckpt
step 10000 training cross-entropy : 0.0027275973
step 20000 training cross-entropy : 0.010455683
step 30000 training cross-entropy : 0.015472678
step 40000 training cross-entropy : 0.21760906
step 40000 training_acc = 0.981975 valid_acc = 0.8739
Model saved in path: ./MLP/model.ckpt
step 50000 training cross-entropy : 8.4638505e-06
step 60000 training cross-entropy : 0.0029786504
step 70000 training cross-entropy : 0.050034236
step 80000 training cross-entropy : 0.0038360865
step 80000 training_acc = 0.979325 valid_acc = 0.8714
Model saved in path: ./MLP/model.ckpt
step 90000 training cross-entropy : 0.012144463
step 100000 training cross-entropy : 0.00048363706
step 110000 training cross-entropy : 0.00011801363
step 120000 training cross-entropy : 0.0042042355
step 120000 training_acc = 0.982 valid_acc = 0.875
Model saved in path: ./MLP/model.ckpt
step 130000 training cross-entropy : 2.1696038e-05
step 140000 training cross-entropy : 0.32488036
step 150000 training cross-entropy : 3.3378572e-06
step 160000 training cross-entropy : 0.006305409
step 160000 training_acc = 0.984975 valid_acc = 0.8764
Model saved in path: ./MLP/model.ckpt
step 170000 training cross-entropy : 0.00022015534
step 180000 training cross-entropy : 0.028006688
step 190000 training cross-entropy : 8.439751e-05
step 200000 training cross-entropy : 2.217285e-05
step 200000 training_acc = 0.989475 valid_acc = 0.8823
Model saved in path: ./MLP/model.ckpt

Training Accuracy

In [19]:
feed_dict = {x : x_train, y : y_train, drop_prob : .15, training : False}
MLP_predict_train, MLP_train_acc = sess.run([pred_labels,acc], feed_dict = feed_dict)
In [20]:
print(confusion_matrix(MLP_predict_train,y_train))
print("TRAINING ACCURACY =",MLP_train_acc)
[[3989    0    0    1    2    0  145    1    0    0]
 [   0 3990    0   18    1    0    1   20    0    0]
 [   4    0 3935    5    4    0    5    1    0    0]
 [   0    0    0 3895    0    0    1    0    0    0]
 [   0    0  119    7 4009    0   61    0    0    0]
 [   0    0    0    0    0 3932    0    9    0    0]
 [   0    0    0    0    0    0 3789    0    0    0]
 [   0    0    0    0    0    0    0 4065    0    0]
 [   1    0    2    3    0    0    3    7 3946    0]
 [   0    0    0    0    0    0    0    0    0 4029]]
TRAINING ACCURACY = 0.989475

Validation Accuracy

In [21]:
feed_dict = {x : x_valid1, y : y_valid1, drop_prob : .15, training : False}
MLP_predict_valid1, MLP_valid1_acc = sess.run([pred_labels,acc], feed_dict = feed_dict)
In [22]:
print(confusion_matrix(MLP_predict_valid1,y_valid1))
print("VALIDATION ACCURACY =",MLP_valid1_acc)
[[ 921    3   16   56   12    0  191    1    7    0]
 [   4 1012    1   26    5    0    2    7    0    0]
 [  18    1  734   10   61    0   74    0    3    0]
 [  17    5   10  867   19    0   15    0    3    0]
 [   2    0  142   32  858    0  122    0    3    0]
 [   0    0    0    0    2 1043    0   34    2   11]
 [  45    2   33   10   26    0  558    0    1    0]
 [   0    0    0    0    0    5    0  869    0   17]
 [   8    2    9   10   12    5   25    6 1015    4]
 [   0    1    0    1    0    7    0   31    0  946]]
VALIDATION ACCURACY = 0.8823
In [23]:
{"TRAIN_ACC" : MLP_train_acc , "VALID_ACC" : MLP_valid1_acc}
Out[23]:
{'TRAIN_ACC': 0.989475, 'VALID_ACC': 0.8823}