I ported a Python tutorial to J.
J has libraries to read numpy arrays from file; we first export the relevant data, viz.
import numpy as np
import tensorflow_datasets as tfds
train = tfds.load('mnist', split='train', as_supervised=True, batch_size=-1) # as_supervised and batch_size are cargo-culted to make it export to numpy
train_image_data, train_labels = tfds.as_numpy(train)
train_images = train_image_data.astype('float64')/255
# normalize and convert to float64 for J
np.save("data/train-images.npy", train_images)
np.save("data/train-labels.npy", train_labels)
test = tfds.load('mnist', split='test', as_supervised=True, batch_size=-1)
test_image_data, test_labels = tfds.as_numpy(test)
test_images = test_image_data.astype('float64')/255
np.save("data/test-images.npy", test_images)
np.save("data/test-labels.npy", test_labels)
Then:
load 'convert/numpy'
NB. ravel to a 60000x784 array: instead of 28x28x1 images, they're vectors of 784
train_images =: ,"3 readnpy_pnumpy_ 'data/train-images.npy'
NB. convert labels (1, 2, etc.) to one-hot vectors (0 1 0 0 ...)
vectorize =: 3 : '=&y (i.10)'
train_labels =: vectorize"0 readnpy_pnumpy_ 'data/train-labels.npy'
test_images =: ,"3 readnpy_pnumpy_ 'data/test-images.npy'
test_labels =: vectorize"0 readnpy_pnumpy_ 'data/test-labels.npy'
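For reference, the same one-hot conversion can be sketched in NumPy (my own restatement, not part of the original pipeline):

```python
import numpy as np

def vectorize(labels):
    # Compare each label against the index vector 0..9, as the J verb
    # does with i.10; the comparison broadcasts to a one-hot matrix.
    return (labels[:, None] == np.arange(10)).astype('float64')
```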
X =: train_images
Y =: train_labels
NB. move (0,1) -> (-1, 1)
scale =: (-&1)@:(*&2)
NB. initialize weights randomly
init_weights =: 3 : 'scale"0 y ?@$ 0'
w_hidden =: init_weights 784 128
w_output =: init_weights 128 10
weights_init =: w_hidden;w_output
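In NumPy terms the initialization is just uniform noise rescaled from (0,1) to (-1,1) — a sketch, with the fixed generator seed being my own addition for reproducibility:

```python
import numpy as np

def init_weights(rows, cols, rng=np.random.default_rng(0)):
    # ? in J draws uniform (0,1); scale maps that to (-1,1).
    return rng.random((rows, cols)) * 2 - 1
```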
dot =: +/ . *
mmax =: (]->./)
NB. softmax that won't blow up https://cs231n.github.io/linear-classify/#softmax
softmax =: ((^ % (+/@:^)) @: mmax)
d_softmax =: (([*(1&-)) @: softmax @: mmax)
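The max-subtraction trick leaves the softmax unchanged but stops the exponentials from overflowing for large inputs. A NumPy sketch of the same idea, including the elementwise derivative used here:

```python
import numpy as np

def softmax(x):
    # Subtracting the row max (the mmax step) leaves the result
    # unchanged but keeps exp from overflowing for large inputs.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def d_softmax(x):
    # Diagonal of the softmax Jacobian: s * (1 - s)
    s = softmax(x)
    return s * (1 - s)
```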
sigmoid =: monad define
% 1 + ^ - y
)
sigmoid_ddx =: 3 : '(^-y) % ((1+^-y)^2)'
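Written out in NumPy, the sigmoid and its derivative are (a sketch):

```python
import numpy as np

def sigmoid(y):
    return 1 / (1 + np.exp(-y))

def sigmoid_ddx(y):
    # e^-y / (1 + e^-y)^2, as in the J verb above;
    # equivalently sigmoid(y) * (1 - sigmoid(y)).
    return np.exp(-y) / (1 + np.exp(-y)) ** 2
```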
NB. forward prop
forward =: dyad define
'l1 l2' =. x
X =. y
x_l1 =. X dot l1
x_sigmoid =. sigmoid x_l1
x_l2 =. x_sigmoid dot l2
prediction =. softmax"1 x_l2
(x_l1;x_l2;x_sigmoid;prediction)
)
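The forward pass translates almost line for line to NumPy — a self-contained sketch of the same computation, input → hidden (sigmoid) → output (softmax):

```python
import numpy as np

def sigmoid(y):
    return 1 / (1 + np.exp(-y))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(weights, X):
    # Returns the intermediates backpropagation needs, like the J verb.
    l1, l2 = weights
    x_l1 = X @ l1
    x_sigmoid = sigmoid(x_l1)
    x_l2 = x_sigmoid @ l2
    prediction = softmax(x_l2)
    return x_l1, x_l2, x_sigmoid, prediction
```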
train =: dyad define
'X Y' =. x
'l1 l2' =. y
'x_l1 x_l2 x_sigmoid prediction' =. y forward X
l2_err =. (2 * (Y - prediction) % {.$prediction) * (d_softmax"1 x_l2)
l1_err =. (|: l2 dot (|: l2_err)) * (sigmoid_ddx"1 x_l1)
l2_adj =. l2 + (|: x_sigmoid) dot l2_err
l1_adj =. l1 + (|: X) dot l1_err
(l1_adj;l2_adj)
)
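The training step is gradient descent on the mean-squared error of the softmax output (note the `+` in the updates: the error term Y - prediction already points downhill on the loss). A self-contained NumPy sketch of the same update:

```python
import numpy as np

def sigmoid(y):
    return 1 / (1 + np.exp(-y))

def sigmoid_ddx(y):
    return np.exp(-y) / (1 + np.exp(-y)) ** 2

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def d_softmax(x):
    s = softmax(x)
    return s * (1 - s)

def train(data, weights):
    X, Y = data
    l1, l2 = weights
    # forward pass
    x_l1 = X @ l1
    x_sigmoid = sigmoid(x_l1)
    x_l2 = x_sigmoid @ l2
    prediction = softmax(x_l2)
    # backpropagate the scaled error through both layers
    l2_err = (2 * (Y - prediction) / len(prediction)) * d_softmax(x_l2)
    l1_err = (l2 @ l2_err.T).T * sigmoid_ddx(x_l1)
    return (l1 + X.T @ l1_err, l2 + x_sigmoid.T @ l2_err)
```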
train_mnist =: (X;Y) & train
NB. smooth out a guess into a canonical estimate
pickmax =: monad define
max =. >./ y
=&max y
)
eq_arr1 =: */ @: =
point_accuracy =: monad define
(+/ (pickmax"1 y) eq_arr1"1 test_labels) % {.$ test_labels
)
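`pickmax` marks every entry equal to the row maximum, so a tied guess can match in more than one place; the NumPy equivalent I'd reach for is a plain argmax comparison — a sketch that ignores ties:

```python
import numpy as np

def point_accuracy(pred, labels_onehot):
    # Fraction of rows where the predicted class (argmax) matches the
    # labelled class; argmax takes the first maximum, so exact ties are
    # resolved differently than pickmax + eq_arr1 in the J version.
    return np.mean(pred.argmax(axis=1) == labels_onehot.argmax(axis=1))
```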
NB. to store weights
encode =: 3!:1
w_train =: (train_mnist ^: 10000) weights_init
NB. guess =: >3 { w_train forward test_images
NB. point\_accuracy guess
NB.
(encode w_train) fwrite 'weights.jdat'
This happens to be 95% accurate! A good result, even if training a neural net on the CPU is not so satisfying.
It is also substantially slower than the plain NumPy version, presumably because J uses only one core.
