Flux in Julia

Autoencoder (ver. Flux in Julia)

딥스탯 2018. 12. 5. 12:13

Data set source

References

https://deepstat.tistory.com/15 (Autoencoder ver.R)

https://deepstat.tistory.com/31 (Autoencoder ver.Python)

http://fluxml.ai/ (Flux in Julia)


Iris

In [1]:
using RDatasets

iris = dataset("datasets", "iris")
first(iris, 6)
Out[1]:

6 rows × 5 columns

   SepalLength  SepalWidth  PetalLength  PetalWidth  Species
   Float64      Float64     Float64      Float64     Categorical…
 1 5.1          3.5         1.4          0.2         setosa
 2 4.9          3.0         1.4          0.2         setosa
 3 4.7          3.2         1.3          0.2         setosa
 4 4.6          3.1         1.5          0.2         setosa
 5 5.0          3.6         1.4          0.2         setosa
 6 5.4          3.9         1.7          0.4         setosa
┌ Warning: In the future eachcol will have names argument set to false by default
│   caller = getmaxwidths(::DataFrame, ::UnitRange{Int64}, ::UnitRange{Int64}, ::Symbol) at show.jl:105
└ @ DataFrames /home/yt/.julia/packages/DataFrames/5Rg4Y/src/abstractdataframe/show.jl:105
In [2]:
describe(iris)
Out[2]:

5 rows × 8 columns

   variable     mean     min     median  max        nunique  nmissing  eltype
   Symbol       Union…   Any     Union…  Any        Union…   Nothing   DataType
 1 SepalLength  5.84333  4.3     5.8     7.9                           Float64
 2 SepalWidth   3.05733  2.0     3.0     4.4                           Float64
 3 PetalLength  3.758    1.0     4.35    6.9                           Float64
 4 PetalWidth   1.19933  0.1     1.3     2.5                           Float64
 5 Species               setosa          virginica  3                  CategoricalString{UInt8}

Handling the data set

In [3]:
using DataFrames, Random, StatsBase

Random.seed!(1)
test_obs = sample(vcat(repeat([false],100), repeat([true],50)),150;replace = false)

training_set = iris[.!test_obs,:]
testing_set = iris[test_obs,:]

training_X = training_set[:,1:(end-1)]
testing_X = testing_set[:,1:(end-1)]

describe(training_X)
┌ Warning: Indexing with colon as row will create a copy in the future. Use `df[col_inds]` to get the columns without copying
│   caller = top-level scope at In[3]:8
└ @ Core In[3]:8
Out[3]:

4 rows × 8 columns

   variable     mean     min      median   max      nunique  nmissing  eltype
   Symbol       Float64  Float64  Float64  Float64  Nothing  Nothing   DataType
 1 SepalLength  5.797    4.3      5.7      7.7                         Float64
 2 SepalWidth   3.038    2.0      3.0      4.4                         Float64
 3 PetalLength  3.637    1.0      4.15     6.9                         Float64
 4 PetalWidth   1.14     0.1      1.3      2.5                         Float64
In [4]:
describe(testing_X)
Out[4]:

4 rows × 8 columns

   variable     mean     min      median   max      nunique  nmissing  eltype
   Symbol       Float64  Float64  Float64  Float64  Nothing  Nothing   DataType
 1 SepalLength  5.936    4.4      6.0      7.9                         Float64
 2 SepalWidth   3.096    2.3      3.1      4.1                         Float64
 3 PetalLength  4.0      1.2      4.5      6.7                         Float64
 4 PetalWidth   1.318    0.1      1.5      2.5                         Float64
In [5]:
t_train_X = transpose(Matrix(training_X))
t_test_X = transpose(Matrix(testing_X));
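
The split above draws a length-150 boolean mask containing exactly 50 `true` entries, so the train/test sizes are fixed rather than random. A minimal base-Julia sketch of the same idea, using `Random.shuffle` as a stand-in for `StatsBase.sample(...; replace = false)`:

```julia
using Random

Random.seed!(1)
# 100 `false` (training) and 50 `true` (testing) flags, shuffled.
mask = shuffle(vcat(falses(100), trues(50)))

train_idx = findall(.!mask)   # indices of the 100 training rows
test_idx  = findall(mask)     # indices of the 50 testing rows

println(length(train_idx), " ", length(test_idx))   # 100 50
```

The final `transpose(Matrix(...))` step then puts each observation in a column, which is the layout Flux layers expect.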

Autoencoder

In [6]:
using Flux
loaded
In [7]:
Encoder = Chain(
    BatchNorm(4), Dense(4,8,relu),
    BatchNorm(8), Dense(8,8,relu),
    Dense(8,3),softmax);
In [8]:
Decoder = Chain(
    BatchNorm(3), Dense(3,8,relu),
    BatchNorm(8), Dense(8,8,relu),
    Dense(8,4));
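
The encoder compresses each 4-feature flower to a 3-dimensional code and ends in `softmax`, so every code is a probability vector (non-negative entries summing to 1); this is what later makes a per-column argmax a sensible cluster label. A minimal sketch of that property, with `softmax_` as a hypothetical stand-in for Flux's `softmax`:

```julia
# Numerically stable softmax over a vector: subtract the max, exponentiate, normalize.
softmax_(v) = (e = exp.(v .- maximum(v)); e ./ sum(e))

code = softmax_([0.5, 2.0, -1.0])
println(sum(code))   # sums to (approximately) 1.0
```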

Loss function : mean squared error (MSE)

In [9]:
loss(x,y) = Flux.mse(Decoder(Encoder(x)),y)
Out[9]:
loss (generic function with 1 method)
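
`Flux.mse` is the mean of the squared elementwise differences between the reconstruction and the input, so `loss(x, x)` measures how well `Decoder ∘ Encoder` reproduces its own input. A minimal base-Julia sketch, with `mse_` as a stand-in for `Flux.mse`:

```julia
# Mean squared error: sum of squared differences divided by the number of elements.
mse_(ŷ, y) = sum(abs2, ŷ .- y) / length(y)

println(mse_([1.0, 2.0], [0.0, 2.0]))   # 0.5
```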
In [10]:
Decoder(Encoder(t_train_X))
Out[10]:
Tracked 4×100 Array{Float64,2}:
 -0.162099   -0.372377  -0.454352  …  -0.232662   -0.251359   -0.0967168
  0.251919    0.58286    0.068945      0.260969   -0.168124   -0.0112085
  1.75081     2.89218    1.41309       0.0518009   0.375284    0.153732 
  0.0789694   0.472873   0.294052      0.208443   -0.0506767  -0.0204917
In [11]:
loss(t_train_X,t_train_X)
Out[11]:
15.332198141583635 (tracked)

Optimizer : ADAM

In [12]:
PARS = params(Encoder, Decoder)

function my_opt(n, lr)
    train_mse_vec = repeat([Inf],3)
    for i = 0:n
        Flux.testmode!(Encoder, false)
        Flux.testmode!(Decoder, false)
        Flux.train!(loss, [(t_train_X,t_train_X)], ADAM(PARS, lr))
        Flux.testmode!(Encoder)
        Flux.testmode!(Decoder)
        train_mse_vec = vcat(train_mse_vec[2:3],loss(t_train_X,t_train_X).data)
        
        if minimum(train_mse_vec) == train_mse_vec[1]
            lr = lr*7/8
        end
        
        if i % 100 == 0
            train_mse = train_mse_vec[3]
            test_mse = loss(t_test_X,t_test_X).data
            println("step:",i,"  train_mse:" ,train_mse,"  test_mse:" ,test_mse)
        end
    end
end
Out[12]:
my_opt (generic function with 1 method)
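
The schedule inside `my_opt` keeps a rolling window of the last three training losses and shrinks the learning rate by 7/8 whenever the oldest loss in the window is the minimum, i.e. no improvement over the last two steps. A minimal sketch of just that decay rule (with `decayed_lr` as a hypothetical helper, not part of the notebook):

```julia
# Shrink the learning rate when the oldest of the last three losses is
# still the smallest (training has stalled); otherwise keep it.
function decayed_lr(losses::Vector{Float64}, lr::Float64)
    window = losses[end-2:end]
    minimum(window) == window[1] ? lr * 7/8 : lr
end

decayed_lr([1.0, 2.0, 3.0], 0.1)   # loss got worse → lr shrinks by 7/8
decayed_lr([3.0, 2.0, 1.0], 0.1)   # loss still improving → lr unchanged
```

This explains the long flat tail in the log above: once the loss stops improving, the step size decays geometrically and the parameters effectively freeze.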
In [13]:
my_opt(0,0.1)
step:0  train_mse:13.796226442073687  test_mse:14.961258597556458
In [14]:
my_opt(2000,0.1)
step:0  train_mse:8.40278102746566  test_mse:8.990449973517801
step:100  train_mse:0.10336679399328083  test_mse:0.12418334456307464
step:200  train_mse:0.06439854270562317  test_mse:0.07924742245307638
step:300  train_mse:0.04932304110081396  test_mse:0.06359110517905284
step:400  train_mse:0.046078532170465356  test_mse:0.06042383663352165
step:500  train_mse:0.04317289761957186  test_mse:0.057873533114553705
step:600  train_mse:0.04189446208133944  test_mse:0.05678281717037904
step:700  train_mse:0.04182211592346098  test_mse:0.05672693205736499
step:800  train_mse:0.04181875954114598  test_mse:0.056724347017934044
step:900  train_mse:0.041818615590076845  test_mse:0.056724244811515376
step:1000  train_mse:0.041818607228812345  test_mse:0.05672423855979516
step:1100  train_mse:0.041818606905030586  test_mse:0.0567242383105123
step:1200  train_mse:0.04181860689034915  test_mse:0.056724238299333186
step:1300  train_mse:0.04181860688970179  test_mse:0.056724238298832864
step:1400  train_mse:0.04181860688966948  test_mse:0.05672423829880755
step:1500  train_mse:0.0418186068896679  test_mse:0.05672423829880658
step:1600  train_mse:0.04181860688966787  test_mse:0.05672423829880649
step:1700  train_mse:0.04181860688966787  test_mse:0.05672423829880649
step:1800  train_mse:0.04181860688966787  test_mse:0.05672423829880649
step:1900  train_mse:0.04181860688966787  test_mse:0.05672423829880649
step:2000  train_mse:0.04181860688966787  test_mse:0.05672423829880649

Comparing original and reconstructed values

In [15]:
t_train_X[:,1:5]
Out[15]:
4×5 Array{Float64,2}:
 4.7  4.6  5.0  4.6  5.0
 3.2  3.1  3.6  3.4  3.4
 1.3  1.5  1.4  1.4  1.5
 0.2  0.2  0.2  0.3  0.2
In [16]:
Decoder(Encoder(t_train_X[:,1:5]))
Out[16]:
Tracked 4×5 Array{Float64,2}:
 4.73134   4.53522   5.06883   4.74072   4.92413 
 3.19971   3.03111   3.48908   3.20775   3.36502 
 1.36919   1.32415   1.44573   1.37132   1.41294 
 0.270484  0.287205  0.241016  0.269669  0.253668
In [17]:
t_test_X[:,1:5]
Out[17]:
4×5 Array{Float64,2}:
 5.1  4.9  5.4  4.4  5.4
 3.5  3.0  3.9  2.9  3.7
 1.4  1.4  1.7  1.4  1.5
 0.2  0.2  0.4  0.2  0.2
In [18]:
Decoder(Encoder(t_test_X[:,1:5]))
Out[18]:
Tracked 4×5 Array{Float64,2}:
 5.04881   4.64263   5.33464   4.44048   5.30766 
 3.47192   3.1235    3.71694   2.94921   3.69381 
 1.4412    1.34889   1.50596   1.3018    1.49985 
 0.242767  0.278101  0.217763  0.294856  0.220123

About the encoder

In [19]:
Encoder(t_train_X)
Out[19]:
Tracked 3×100 Array{Float64,2}:
 3.59821e-5  0.00030409  7.04934e-7  …  0.204828  0.186202  0.201288
 0.251023    0.293336    0.177192       0.674414  0.64017   0.675401
 0.748941    0.70636     0.822807       0.120758  0.173628  0.123311
In [20]:
tmp = DataFrame(Enc = Flux.onecold(Encoder(t_train_X)), spe = training_set[:,5])

by(tmp, [:Enc, :spe], nrow)
Out[20]:

4 rows × 3 columns

   Enc    spe           x1
   Int64  Categorical…  Int64
 1 3      setosa        36
 2 2      versicolor    33
 3 2      virginica     29
 4 1      virginica     2
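
`Flux.onecold` maps each column of the 3×N encoder output to the index of its largest entry, turning the softmax codes into one cluster label per flower. A minimal sketch with a hypothetical `onecold_` stand-in:

```julia
# Per-column argmax: the index of the largest entry in each code vector.
onecold_(v) = argmax(v)

codes = [0.00003 0.2;    # two example 3-dimensional softmax codes, one per column
         0.25    0.67;
         0.75    0.12]
println([onecold_(codes[:, j]) for j in 1:2])   # [3, 2]
```

The cross-tabulation above then shows how these labels line up with the true species: setosa maps almost entirely to one code, while versicolor and virginica share another.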
In [21]:
tmp2 = DataFrame(Enc = Flux.onecold(Encoder(t_test_X)), spe = testing_set[:,5])

by(tmp2, [:Enc, :spe], nrow)
Out[21]:

4 rows × 3 columns

   Enc    spe           x1
   Int64  Categorical…  Int64
 1 3      setosa        14
 2 2      versicolor    17
 3 2      virginica     17
 4 1      virginica     2