I ported a Python tutorial to J.
J has libraries for reading NumPy arrays from file; we first export the relevant data, viz.
import numpy as np
import tensorflow_datasets as tfds
train = tfds.load('mnist', split='train', as_supervised=True, batch_size=-1) # as_supervised and batch_size are cargo-culted to make it export to numpy
train_image_data, train_labels = tfds.as_numpy(train)
train_images = train_image_data.astype('float64')/255  # normalize to [0,1] and convert to float64 for J
np.save("data/train-images.npy", train_images)
np.save("data/train-labels.npy", train_labels)
test = tfds.load('mnist', split='test', as_supervised=True, batch_size=-1)
test_image_data, test_labels = tfds.as_numpy(test)
test_images = test_image_data.astype('float64')/255
np.save("data/test-images.npy", test_images)
np.save("data/test-labels.npy", test_labels)
Then:
load 'convert/numpy'
NB. flatten each 28x28x1 image to a 784-vector, giving a 60000x784 array
train_images =: ,"3 readnpy_pnumpy_ 'data/train-images.npy'
NB. convert labels (1, 2, etc.) to one-hot vectors (0 1 0 0 ...)
vectorize =: 3 : '=&y (i.10)'
train_labels =: vectorize"0 readnpy_pnumpy_ 'data/train-labels.npy'
test_images =: ,"3 readnpy_pnumpy_ 'data/test-images.npy'
test_labels =: vectorize"0 readnpy_pnumpy_ 'data/test-labels.npy'
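As a sanity check, the shapes should look like this (assuming the standard MNIST split sizes):
$ train_images   NB. 60000 784
$ train_labels   NB. 60000 10
$ test_images    NB. 10000 784
$ test_labels    NB. 10000 10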
X =: train_images
Y =: train_labels
NB. move (0,1) -> (-1, 1)
scale =: (-&1)@:(*&2)
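For example, scale maps the endpoints and midpoint as expected:
scale 0 0.5 1   NB. _1 0 1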
NB. initialize weights randomly
init_weights =: 3 : 'scale"0 y ?@$ 0'
w_hidden =: init_weights 784 128
w_output =: init_weights 128 10
weights_init =: w_hidden;w_output
dot =: +/ . *
NB. subtract the max so ^ can't overflow
mmax =: (]->./)
NB. softmax that won't blow up https://cs231n.github.io/linear-classify/#softmax
softmax =: ((^ % (+/@:^)) @: mmax)
d_softmax =: (([*(1&-)) @: softmax @: mmax)
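Two quick checks: a softmax row sums to 1, and d_softmax is the elementwise s*(1-s) shortcut for the softmax derivative:
+/ softmax 1 2 3   NB. 1
(d_softmax 1 2 3) -: (* 1&-) softmax 1 2 3   NB. 1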
sigmoid =: monad define
% 1 + ^ - y
)
sigmoid_ddx =: 3 : '(^-y) % ((1+^-y)^2)'
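Since e^-y % (1+e^-y)^2 is just sigmoid(y) * (1 - sigmoid(y)), an equivalent tacit spelling (sigmoid_ddx_alt is a hypothetical name, not part of the port) would be:
sigmoid_ddx_alt =: sigmoid * 1&-@:sigmoid
sigmoid_ddx_alt 0   NB. 0.25, same as sigmoid_ddx 0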
NB. forward prop
forward =: dyad define
'l1 l2' =. x
X =. y
x_l1 =. X dot l1
x_sigmoid =. sigmoid x_l1
x_l2 =. x_sigmoid dot l2
prediction =. softmax"1 x_l2
(x_l1;x_l2;x_sigmoid;prediction)
)
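To sanity-check forward, push a small batch through and look only at the shapes of the four boxed results:
$&.> weights_init forward 5 {. X   NB. boxes holding 5 128, 5 10, 5 128, 5 10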
train =: dyad define
'X Y' =. x
'l1 l2' =. y
'x_l1 x_l2 x_sigmoid prediction' =. y forward X
l2_err =. (2 * (Y - prediction) % {.$prediction) * (d_softmax"1 x_l2)
l1_err =. (|: l2 dot (|: l2_err)) * (sigmoid_ddx"1 x_l1)
l2_adj =. l2 + (|: x_sigmoid) dot l2_err
l1_adj =. l1 + (|: X) dot l1_err
(l1_adj;l2_adj)
)
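Note the argument order: data on the left, weights on the right, so a single full-batch step would be (w1 is a hypothetical name):
w1 =: (X;Y) train weights_init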
train_mnist =: (X;Y) & train
NB. smooth out a guess into a canonical estimate
pickmax =: monad define
max =. >./ y
=&max y
)
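For example:
pickmax 0.1 0.7 0.2   NB. 0 1 0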
eq_arr1 =: */ @: =
point_accuracy =: monad define
(+/ (pickmax"1 y) eq_arr1"1 test_labels) % {.$ test_labels
)
NB. to store weights
encode =: 3!:1
w_train =: (train_mnist ^: 10000) weights_init
NB. guess =: >3 { w_train forward test_images
NB. point_accuracy guess
NB.
(encode w_train) fwrite 'weights.jdat'
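The weights can be restored later with the inverse foreign 3!:2 (w_restored is a hypothetical name):
w_restored =: 3!:2 fread 'weights.jdat'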
This happens to be 95% accurate! A good result, even if training a neural net on a CPU is not especially satisfying.
It is also substantially slower than the plain NumPy version, presumably because J uses only one core.