Recently, a post touting Numba was published. However, its example, a for loop, is unconvincing.
The example:

    def monotonically_increasing(a):
        max_value = 0
        for i in range(len(a)):
            if a[i] > max_value:
                max_value = a[i]
            a[i] = max_value
This converts an array such as
[1, 2, 1, 3, 3, 5, 4, 6]
to
[1, 2, 2, 3, 3, 5, 5, 6]
The code above will be hopelessly slow. To see what we're up against, consider the problem: since Python (like R) is dynamically typed, the > comparison needs to pick an implementation on every iteration. This overhead is avoided when using vector operations instead of element-wise operations; in J the type is associated with the array as a whole, so we need not extract and inspect each element, viz.

    >. /\ 1 2 1 3 3 5 4 6
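For readers who don't know J, the same scan can be sketched in NumPy. This is a minimal sketch, not necessarily the code behind the NumPy timing below; the helper name `monotonically_increasing_scan` is my own. It relies on `np.maximum.accumulate`, which applies the element-wise maximum cumulatively along the array.

```python
import numpy as np

def monotonically_increasing_scan(a):
    """Running maximum of `a`, written as a scan (hypothetical helper name)."""
    # The maximum ufunc is applied cumulatively, so the dtype is resolved
    # once for the whole array rather than once per element.
    return np.maximum.accumulate(a)

a = np.array([1, 2, 1, 3, 3, 5, 4, 6])
print(monotonically_increasing_scan(a))  # [1 2 2 3 3 5 5 6]
```

Note two differences from the loop: this returns a new array instead of mutating its argument, and it does not start from `max_value = 0`, so the two disagree on inputs that begin with negative values.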
| Language/Library | Time |
|---|---|
| J (Linux) | 30 ms |
| J (Windows) | 20 ms |
| NumPy | 24 ms |
| BQN | 6 ms |
| Apple | 37 ms |
| Python | 1.32 s |
| Numba | 170 ms |
I'm not really sure what Numba does under the hood, nor what it would take for it to perform as well as J. I don't really see the point though: >. /\ is far more concise than monotonically_increasing anyway. Numba does in fact allow us to write for loops instead of dropping down to a low-level language, but we can get even better performance with clearer code via scans.
In general, one should avoid for loops when working with arrays.
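Even on plain Python lists the explicit loop can be dropped. A minimal sketch using `itertools.accumulate` from the standard library with the built-in `max` as the combining function (the name `running_max` is my own). I have not timed it, so take it as an illustration of clarity, not of speed.

```python
from itertools import accumulate

def running_max(values):
    """Running maximum of an iterable, expressed as a scan (hypothetical name)."""
    # accumulate(values, max) yields max(previous result, next value) at each step.
    return list(accumulate(values, max))

print(running_max([1, 2, 1, 3, 3, 5, 4, 6]))  # [1, 2, 2, 3, 3, 5, 5, 6]
```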