I ported a Python MNIST neural-network tutorial to J.

J has a library for reading NumPy arrays from disk, so we first export the relevant data from Python:

```python
import numpy as np
import tensorflow_datasets as tfds

# as_supervised and batch_size are cargo-culted to make it export to numpy
train = tfds.load('mnist', split='train', as_supervised=True, batch_size=-1)
train_image_data, train_labels = tfds.as_numpy(train)

# normalize + convert to float64 for J
train_images = train_image_data.astype('float64') / 255

np.save("data/train-images.npy", train_images)
np.save("data/train-labels.npy", train_labels)

test = tfds.load('mnist', split='test', as_supervised=True, batch_size=-1)
test_image_data, test_labels = tfds.as_numpy(test)
test_images = test_image_data.astype('float64') / 255

np.save("data/test-images.npy", test_images)
np.save("data/test-labels.npy", test_labels)
```

Then, in J:

```j
load 'convert/numpy'

NB. raze each 28x28x1 image into a vector of 784 pixels,
NB. giving a 60000x784 array
train_images =: ,"3 readnpy_pnumpy_ 'data/train-images.npy'

NB. convert labels (1, 2, etc.) to one-hot vectors (0 1 0 0 ...)
vectorize =: 3 : '=&y (i.10)'
train_labels =: vectorize"0 readnpy_pnumpy_ 'data/train-labels.npy'

test_images =: ,"3 readnpy_pnumpy_ 'data/test-images.npy'
test_labels =: vectorize"0 readnpy_pnumpy_ 'data/test-labels.npy'
```
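For readers more comfortable with NumPy, `vectorize` is just one-hot encoding: it compares the label against `i.10` (the integers 0 through 9). A sketch of the same idea in Python (the function name mirrors the J verb; it is not part of the original export script):

```python
import numpy as np

def vectorize(label, n_classes=10):
    # Compare the label against 0..9, like  =&y (i.10)  in J,
    # yielding a one-hot float vector.
    return (np.arange(n_classes) == label).astype(np.float64)
```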

```j
X =: train_images
Y =: train_labels

NB. map (0,1) -> (-1,1)
scale =: (-&1)@:(*&2)

NB. initialize weights uniformly at random in (-1,1)
init_weights =: 3 : 'scale"0 y ?@$ 0'
w_hidden =: init_weights 784 128
w_output =: init_weights 128 10
weights_init =: w_hidden;w_output
```
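In NumPy terms the initialization is uniform noise rescaled from [0,1) into (-1,1), which is what `scale"0 y ?@$ 0` does. A sketch (the `seed` parameter is mine, added so the sketch is reproducible):

```python
import numpy as np

def init_weights(rows, cols, seed=0):
    # ?@$ 0 draws uniforms in [0,1); scale doubles and shifts to (-1,1).
    rng = np.random.default_rng(seed)
    return rng.random((rows, cols)) * 2 - 1

w_hidden = init_weights(784, 128)
w_output = init_weights(128, 10, seed=1)
```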

```j
dot =: +/ . *

NB. subtract the row max so exp can't overflow
mmax =: (]->./)

NB. numerically stable softmax https://cs231n.github.io/linear-classify/#softmax
softmax =: ((^ % (+/@:^)) @: mmax)
d_softmax =: (([*(1&-)) @: softmax @: mmax)
```
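The `mmax` shift is the standard stability trick from the linked cs231n notes: subtracting the max before exponentiating keeps `exp` from overflowing, and the result is unchanged because the common factor cancels in the ratio. The same thing in NumPy:

```python
import numpy as np

def softmax(x):
    # Shift by the max first; exp(x - m) stays bounded by 1, and
    # exp(x - m) / sum(exp(x - m)) == exp(x) / sum(exp(x)).
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)
```

Without the shift, `np.exp(1000.0)` overflows to infinity and the ratio becomes NaN.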

```j
sigmoid =: monad define
    % 1 + ^ - y
)
sigmoid_ddx =: 3 : '(^-y) % ((1+^-y)^2)'
```
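`sigmoid_ddx` writes the derivative directly as e^-y / (1 + e^-y)^2, which is algebraically the familiar identity σ'(y) = σ(y)(1 - σ(y)). A quick NumPy check of that equivalence:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_ddx(x):
    # e^-x / (1 + e^-x)^2, same as sigmoid(x) * (1 - sigmoid(x))
    return np.exp(-x) / (1.0 + np.exp(-x)) ** 2
```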

```j
NB. forward propagation
forward =: dyad define
    'l1 l2' =. x
    X =. y
    x_l1 =. X dot l1
    x_sigmoid =. sigmoid x_l1
    x_l2 =. x_sigmoid dot l2
    prediction =. softmax"1 x_l2
    (x_l1;x_l2;x_sigmoid;prediction)
)
```
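The forward pass is input → hidden matrix product → sigmoid → output matrix product → row-wise softmax, returning all intermediates because the training step needs them. A NumPy sketch of the same computation (inlined activations, names matching the J locals):

```python
import numpy as np

def forward(weights, X):
    # Hidden pre-activation, sigmoid activation, output pre-activation,
    # and a row-wise (numerically stable) softmax prediction.
    l1, l2 = weights
    x_l1 = X @ l1
    x_sigmoid = 1.0 / (1.0 + np.exp(-x_l1))
    x_l2 = x_sigmoid @ l2
    z = np.exp(x_l2 - x_l2.max(axis=1, keepdims=True))
    prediction = z / z.sum(axis=1, keepdims=True)
    return x_l1, x_l2, x_sigmoid, prediction
```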

```j
train =: dyad define
    'X Y' =. x
    'l1 l2' =. y
    'x_l1 x_l2 x_sigmoid prediction' =. y forward X
    l2_err =. (2 * (Y - prediction) % {.$prediction) * (d_softmax"1 x_l2)
    l1_err =. (|: l2 dot (|: l2_err)) * (sigmoid_ddx"1 x_l1)
    l2_adj =. l2 + (|: x_sigmoid) dot l2_err
    l1_adj =. l1 + (|: X) dot l1_err
    (l1_adj;l2_adj)
)
```
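To make the backward pass easier to follow, here is my NumPy reading of the same update (a sketch, not the original tutorial's code): the output error is the mean-squared-error gradient times the tutorial's simplified elementwise softmax "derivative" s(1-s), that error is pushed back through `l2` and the sigmoid derivative, and because the error points toward `Y` the update is an addition rather than a subtraction.

```python
import numpy as np

def train_step(X, Y, l1, l2):
    # Forward pass (same shapes as the J verb: X is Nx784, Y is Nx10).
    x_l1 = X @ l1
    x_sigmoid = 1.0 / (1.0 + np.exp(-x_l1))
    x_l2 = x_sigmoid @ l2
    z = np.exp(x_l2 - x_l2.max(axis=1, keepdims=True))
    prediction = z / z.sum(axis=1, keepdims=True)
    # Output error: MSE gradient times the elementwise s * (1 - s).
    l2_err = (2 * (Y - prediction) / X.shape[0]) * (prediction * (1 - prediction))
    # Backpropagate through l2, then through the sigmoid.
    l1_err = (l2_err @ l2.T) * (x_sigmoid * (1 - x_sigmoid))
    # Error points toward Y, so the weights move by addition.
    return l1 + X.T @ l1_err, l2 + x_sigmoid.T @ l2_err
```

Note that `(|: l2 dot (|: l2_err))` in the J is the same matrix as `l2_err @ l2.T` here, just built transposed.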

```j
train_mnist =: (X;Y) & train

NB. smooth out a guess into a canonical (one-hot) estimate
pickmax =: monad define
    max =. >./ y
    =&max y
)

NB. 1 if two vectors are identical, else 0
eq_arr1 =: */ @: =

point_accuracy =: monad define
    (+/ (pickmax"1 y) eq_arr1"1 test_labels) % {.$ test_labels
)
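`point_accuracy` one-hots each prediction row and counts exact matches against the one-hot test labels. Up to tie-breaking (the J version can mark several maxima, argmax picks one), this is the usual argmax comparison in NumPy:

```python
import numpy as np

def point_accuracy(predictions, labels_onehot):
    # Fraction of rows whose predicted argmax matches the label's argmax.
    return float(np.mean(predictions.argmax(axis=1) == labels_onehot.argmax(axis=1)))
```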

```j
NB. to store weights
encode =: 3!:1

w_train =: (train_mnist ^: 10000) weights_init

NB. guess =: >3 { w_train forward test_images
NB. point_accuracy guess

(encode w_train) fwrite 'weights.jdat'
```

This comes out at about 95% accuracy on the test set, which is a good result, even if training a neural net on the CPU is not especially satisfying.

It is also substantially slower than plain NumPy, presumably because J only uses one core here.