### Install Arraymancer using Nimble

Source: https://github.com/mratsim/arraymancer/blob/master/docs/index.rst

Instructions for installing the Arraymancer library and all its dependencies using the Nim package manager, nimble, after Nim has been installed via choosenim.

```Shell
nimble install arraymancer
```

--------------------------------

### Arraymancer Tensor Pretty Print Example

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.first_steps.rst

This example demonstrates how a multi-dimensional tensor with shape [2, 3, 4, 3, 2] is pretty-printed in Arraymancer. It illustrates the horizontal and vertical stacking of dimensions, the use of separators ('|' and '-'), and the indexing of each dimension layer for clarity.

```Nim
import arraymancer, sequtils

let t1 = toSeq(1..144).toTensor().reshape(2,3,4,3,2)
echo t1
# Tensor[system.int] of shape "[2, 3, 4, 3, 2]" on backend "Cpu"
#                           0                            |                            1
#        0            1            2            3        |         0            1            2            3
#   |1        2| |7        8| |13      14| |19      20|  |    |73      74| |79      80| |85      86| |91      92|
# 0 |3        4| |9       10| |15      16| |21      22|  |  0 |75      76| |81      82| |87      88| |93      94|
#   |5        6| |11      12| |17      18| |23      24|  |    |77      78| |83      84| |89      90| |95      96|
#   ---------------------------------------------------  |    ---------------------------------------------------
#        0            1            2            3        |         0            1            2            3
#   |25      26| |31      32| |37      38| |43      44|  |    |97      98| |103    104| |109    110| |115    116|
# 1 |27      28| |33      34| |39      40| |45      46|  |  1 |99     100| |105    106| |111    112| |117    118|
#   |29      30| |35      36| |41      42| |47      48|  |    |101    102| |107    108| |113    114| |119    120|
#   ---------------------------------------------------  |    ---------------------------------------------------
#        0            1            2            3        |         0            1            2            3
#   |49      50| |55      56| |61      62| |67      68|  |    |121    122| |127    128| |133    134| |139    140|
# 2 |51      52| |57      58| |63      64| |69      70|  |  2 |123    124| |129    130| |135    136| |141    142|
#   |53      54| |59      60| |65      66| |71      72|  |    |125    126| |131    132| |137    138| |143    144|
#   ---------------------------------------------------  |    ---------------------------------------------------
```

--------------------------------

### Accessing and Modifying Arraymancer Tensor Elements

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.first_steps.rst

This example demonstrates how to access and modify individual elements within an Arraymancer tensor using array bracket notation. It shows how to retrieve a specific value and then update it, illustrating the mutable nature of tensor elements.

```nim
var a = toSeq(1..24).toTensor().reshape(2,3,4)

echo a
# Tensor[system.int] of shape "[2, 3, 4]" on backend "Cpu"
#           0                      1
# |1      2     3     4| |13    14    15    16|
# |5      6     7     8| |17    18    19    20|
# |9     10    11    12| |21    22    23    24|

echo a[1, 1, 1]
# 18

a[1, 1, 1] = 999
echo a
# Tensor[system.int] of shape "[2, 3, 4]" on backend "Cpu"
#             0                          1
# |1        2      3      4| |13      14     15     16|
# |5        6      7      8| |17     999     19     20|
# |9       10     11     12| |21      22     23     24|
```

--------------------------------

### Arraymancer Tensor Slicing Syntax Examples

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.slicing.rst

Illustrates various tensor slicing techniques in Arraymancer, including basic range slicing, exclusive end ranges, full dimension selection, partial spans, negative indexing from the end, stepping, negative steps, and reversing dimensions.

```nim
import arraymancer

let foo = vandermonde(arange(1, 6), arange(1, 6)).asType(int)

echo foo

# Tensor[int] of shape "[5, 5]" on backend "Cpu"
# |1       1        1       1        1|
# |2       4        8      16       32|
# |3       9       27      81      243|
# |4      16       64     256     1024|
# |5      25      125     625     3125|

echo foo[1..2, 3..4] # slice

# Tensor[int] of shape "[2, 2]" on backend "Cpu"
# |16      32|
# |81     243|

echo foo[1..<3, 3..<5] # use "..<" if you do not want to include the end in the slice

# Tensor[int] of shape "[2, 2]" on backend "Cpu"
# |16      32|
# |81     243|

echo foo[_, 3..4] # Span slice (i.e. "_") means "all items" in the dimension (in this case "all rows")
                  # Note that "_" is equivalent (and preferred) to "_.._"

# Tensor[system.int] of shape "[5, 2]" on backend "Cpu"
# |1          1|
# |16        32|
# |81       243|
# |256     1024|
# |625     3125|

echo foo[3.._, _] # Partial span slice (".._" means "until the end")

# Tensor[system.int] of shape "[2, 5]" on backend "Cpu"
# |4         16      64     256    1024|
# |5         25     125     625    3125|

echo foo[_..2, _] # Partial span slice ("_.." means "from the beginning" and is rarely useful)

# Tensor[system.int] of shape "[3, 5]" on backend "Cpu"
# |1        1      1      1      1|
# |2        4      8     16     32|
# |3        9     27     81    243|

echo foo[1..^3, _] # Slice until the 3rd element from the end (inclusive, consistent with Nim,
                   # cannot be combined with "..<")

# Tensor[system.int] of shape "[2, 5]" on backend "Cpu"
# |2        4      8     16     32|
# |3        9     27     81    243|

echo foo[_|2, _] # Take steps of 2 to get all the rows in the even positions

# Tensor[system.int] of shape "[3, 5]" on backend "Cpu"
# |1          1       1       1       1|
# |3          9      27      81     243|
# |5         25     125     625    3125|

echo foo[1.._|2, _] # Take steps of 2 starting on the second element (i.e. index 1)
                    # to get all the rows in the odd positions

# Tensor[system.int] of shape "[2, 5]" on backend "Cpu"
# |2          4       8      16      32|
# |4         16      64     256    1024|

echo foo[3..1|-2, _] # Negative steps are also supported,
                     # but require a slice start that is higher than the slice end

# Tensor[system.int] of shape "[2, 5]" on backend "Cpu"
# |4         16      64     256    1024|
# |2          4       8      16      32|

echo foo[^1..^3|-1, _] # Combining "^" with negative steps is supported,
                       # and makes it easy to go through a tensor from the back,
                       # but note the offset of 1 compared to positive steps
                       # (i.e. ^1 points to the last element, not the second to last)

# Tensor[system.int] of shape "[3, 5]" on backend "Cpu"
# |5         25     125     625    3125|
# |4         16      64     256    1024|
# |3          9      27      81     243|

echo foo[_|-1, _] # Combining "_" with a -1 step is the easiest way to reverse a tensor

# Tensor[int] of shape "[5, 5]" on backend "Cpu"
# |5      25      125     625     3125|
# |4      16       64     256     1024|
# |3       9       27      81      243|
# |2       4        8      16       32|
# |1       1        1       1        1|

# Note that while "_" and "_.._" are equivalent to "^1..0"
# partial slices currently do not work with negative steps
```
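The "inclusive, consistent with Nim" remark refers to how Nim's built-in sequences already treat `..`, `..<`, and `^`. As a minimal sketch using only the standard library (no Arraymancer involved; the sequence `s` is purely illustrative), the same three forms select the same two elements:

```nim
# Plain Nim sequences, standard library only.
let s = @[10, 20, 30, 40, 50]

doAssert s[1..2] == @[20, 30]    # ".." includes the end index
doAssert s[1..<3] == @[20, 30]   # "..<" excludes the end index
doAssert s[1..^3] == @[20, 30]   # "^3" is the 3rd element from the end (index 2), inclusive
doAssert s[^1] == 50             # "^1" is the last element

echo s[1..^3]
```

This mirrors why `foo[1..^3, _]` on a 5-row tensor keeps rows 1 and 2.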

--------------------------------

### Iterating Arraymancer Tensors with Items and Pairs

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.iterators.rst

This snippet demonstrates how to iterate over an Arraymancer tensor using both the default `items` iterator (implicitly used when iterating directly) to get values, and the `pairs` iterator to get both coordinates and values. It shows the setup of a 3D tensor and the output format for each iteration type.

```nim
import arraymancer, sequtils

let a = toSeq(1..24).toTensor.reshape(2,3,4)
# Tensor[system.int] of shape "[2, 3, 4]" on backend "Cpu"
#           0                      1
# |1      2     3     4| |13    14    15    16|
# |5      6     7     8| |17    18    19    20|
# |9     10    11    12| |21    22    23    24|

for v in a:
  echo v

for coord, v in a:
  echo coord
  echo v
```

--------------------------------

### Tensor Broadcasting in Arraymancer (Nim)

Source: https://github.com/mratsim/arraymancer/blob/master/README.md

Illustrates the broadcasting mechanism in Arraymancer, allowing operations between tensors of different shapes by automatically expanding the smaller tensor. This example shows element-wise addition of a column vector and a row vector to produce a matrix.

```Nim
import arraymancer

let j = [0, 10, 20, 30].toTensor.reshape(4,1)
let k = [0, 1, 2].toTensor.reshape(1,3)

echo j +. k
# Tensor[system.int] of shape "[4, 3]" on backend "Cpu"
# |0      1     2|
# |10    11    12|
# |20    21    22|
# |30    31    32|
```

--------------------------------

### Arraymancer Logistic Sigmoid with Explicit Broadcasting

Source: https://github.com/mratsim/arraymancer/blob/master/docs/uth.speed.rst

This Nim code demonstrates Arraymancer's explicit broadcasting operators (`/.` for division, `+.` for addition) to perform a logistic sigmoid on a tensor. Although it uses Arraymancer's syntax, this approach still implicitly results in multiple loops over the data, similar to the naive Numpy example, primarily gaining parallelism.

```nim
import arraymancer

proc customSigmoid[T: SomeFloat](t: Tensor[T]): Tensor[T] =
  result = 1 /. (1 +. exp(-t))
```

--------------------------------

### Perform Accelerated Matrix-Matrix Multiplication in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.linear_algebra.rst

This snippet demonstrates how to perform accelerated matrix-matrix multiplication for float tensors using the `*` operator. The operation benefits from BLAS acceleration, which is specifically available for float data types, resulting in optimized performance. The example shows the output of a 5x5 resulting matrix.

```nim
# foo_float is the 5x5 tensor `foo` from the slicing examples, converted to float
echo foo_float * foo_float # Accelerated Matrix-Matrix multiplication (needs float)
# Tensor[float] of shape "[5, 5]" on backend "Cpu"
# |15.0         55.0      225.0       979.0       4425.0|
# |258.0      1146.0     5274.0     24810.0     118458.0|
# |1641.0     7653.0    36363.0    174945.0     849171.0|
# |6372.0    30340.0   146244.0    710980.0    3478212.0|
# |18555.0   89355.0   434205.0   2123655.0   10436805.0|
```

--------------------------------

### Permuting Tensor Dimensions in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.shapeshifting.rst

The `permute` procedure reorders the dimensions of a tensor. It takes the tensor and the desired new dimension order. This example swaps dimensions 1 and 2.

```nim
import arraymancer, sequtils

let a = toSeq(1..24).toTensor.reshape(2,3,4)
echo a

# dim 0 stays at 0, dim 1 becomes dim 2 and dim 2 becomes dim 1
echo a.permute(0,2,1)
```

--------------------------------

### Reshaping a Tensor in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.shapeshifting.rst

The `reshape` function changes the shape of a tensor. The number of elements in the new and old shape must be the same. This example demonstrates reshaping a 1D sequence into a 3D tensor.

```nim
let a = toSeq(1..24).toTensor().reshape(2,3,4)
```
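As a small sketch of the element-count rule, the same 24 values can be viewed under any shape whose dimensions multiply to 24. The variable names below are illustrative only; `size` and `rank` are assumed to behave as in recent Arraymancer releases.

```nim
import arraymancer, sequtils

let flat = toSeq(1..24).toTensor()  # 1D tensor with 24 elements
let cube = flat.reshape(2, 3, 4)    # 2 * 3 * 4 == 24
let grid = flat.reshape(4, 6)       # 4 * 6 == 24

# The element count is preserved; only the shape changes.
doAssert cube.size == flat.size
doAssert grid.size == flat.size
doAssert cube.rank == 3
doAssert grid.rank == 2

echo cube.shape
echo grid.shape
```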

--------------------------------

### Arraymancer Fused Multiply-Add with apply3_inline

Source: https://github.com/mratsim/arraymancer/blob/master/docs/uth.speed.rst

This Nim example illustrates the `apply3_inline` template for performing a fused multiply-add operation (`C += A *. B`) on three tensors. The template allows the operation to be executed in a single loop, with `x`, `y`, and `z` corresponding to the elements of the input tensors `c`, `a`, and `b` respectively. This minimizes memory allocations and improves cache efficiency.

```nim
import arraymancer

proc fusedMultiplyAdd[T: SomeNumber](c: var Tensor[T], a, b: Tensor[T]) =
  ## Implements C += A *. B, *. is the element-wise multiply
  apply3_inline(c, a, b):
    x += y * z
```

--------------------------------

### Concatenating Tensors Along an Axis in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.shapeshifting.rst

Tensors can be concatenated along a specified axis using the `concat` procedure, provided their sizes match along every other axis. This example concatenates `a`, `b`, and `c0` along axis 0 (giving shape [7, 2]), and `a`, `b`, and `c1` along axis 1 (giving shape [2, 7]).

```nim
import arraymancer, sequtils

let a = toSeq(1..4).toTensor.reshape(2,2)

let b = toSeq(5..8).toTensor.reshape(2,2)

let c = toSeq(11..16).toTensor
let c0 = c.reshape(3,2)
let c1 = c.reshape(2,3)

echo concat(a,b,c0, axis = 0)

echo concat(a,b,c1, axis = 1)
```

--------------------------------

### Mutate Arraymancer Tensor with Another Tensor Slice

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.slicing.rst

Illustrates mutating a tensor slice by assigning values from another slice of the same tensor. This example highlights in-place modification where the source and destination slices might overlap or interact in a non-obvious way due to the lack of temporary copies.

```Nim
foo[^2..^1,2..4] = foo[^1..^2|-1, 4..2|-1]

echo foo
# Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
# |111    222       1      1       1|
# |333    444       8    999     999|
# |3        9      27    999     999|
# |4       16    3125    625     125|
# |5       25     125    625    3125|
```

--------------------------------

### Define Fizzbuzz Neural Network with Arraymancer

Source: https://github.com/mratsim/arraymancer/blob/master/README.md

This snippet defines a simple fully-connected neural network, `FizzBuzzNet`, using Arraymancer. It demonstrates the use of `Linear` layers and `relu` activation for a Fizzbuzz prediction task, along with basic model initialization and optimization setup.

```Nim
import arraymancer

const
  NumDigits = 10
  NumHidden = 100

network FizzBuzzNet:
  layers:
    hidden: Linear(NumDigits, NumHidden)
    output: Linear(NumHidden, 4)
  forward x:
    x.hidden.relu.output

let
  ctx = newContext Tensor[float32]
  model = ctx.init(FizzBuzzNet)
  optim = model.optimizer(SGD, 0.05'f32)
# ....
echo answer
# @["1", "2", "fizz", "4", "buzz", "6", "7", "8", "fizz", "10",
#   "11", "12", "13", "14", "15", "16", "17", "fizz", "19", "buzz",
#   "fizz", "22", "23", "24", "buzz", "26", "fizz", "28", "29", "30",
#   "31", "32", "fizz", "34", "buzz", "36", "37", "38", "39", "40",
#   "41", "fizz", "43", "44", "fizzbuzz", "46", "47", "fizz", "49", "50",
#   "fizz", "52","53", "54", "buzz", "56", "fizz", "58", "59", "fizzbuzz",
#   "61", "62", "63", "64", "buzz", "fizz", "67", "68", "fizz", "buzz",
#   "71", "fizz", "73", "74", "75", "76", "77","fizz", "79", "buzz",
#   "fizz", "82", "83", "fizz", "buzz", "86", "fizz", "88", "89", "90",
#   "91", "92", "fizz", "94", "buzz", "fizz", "97", "98", "fizz", "buzz"]
```

--------------------------------

### Custom Fold Function for Checking All Odd Elements in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.map_reduce.rst

Provides an example of a custom Nim procedure designed for Arraymancer's `fold` operation. This function accumulates a boolean result, checking if all elements in a tensor are odd, demonstrating `fold`'s flexibility with different accumulator types.

```Nim
proc was_a_odd_and_what_about_b[T: SomeInteger](a: bool, b: T): bool =
  return a and (b mod 2 == 1) # a is the result of previous computations, b is the new integer to check.
```
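A hedged usage sketch follows, assuming `fold` takes the tensor, a start value, and then the accumulator function, in that order. `allOddSoFar` is a hypothetical, non-generic variant of the procedure above, written that way only to keep the example easy to type-check.

```nim
import arraymancer

# Hypothetical non-generic accumulator: `acc` carries the result so far,
# `x` is the next element to check.
proc allOddSoFar(acc: bool, x: int): bool =
  acc and (x mod 2 == 1)

let odds = [1, 3, 5, 7].toTensor()
let mixed = [1, 2, 3].toTensor()

# Start from `true` and combine with every element in turn.
let oddsResult = odds.fold(true, allOddSoFar)
let mixedResult = mixed.fold(true, allOddSoFar)

echo oddsResult
echo mixedResult
```

The start value is `true` because it is the neutral element of `and`: an empty check should not fail.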

--------------------------------

### Initializing Arraymancer Tensors with Various Methods

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.first_steps.rst

This section illustrates multiple ways to create and initialize Arraymancer tensors. It covers converting nested sequences/arrays using `toTensor`, creating tensors with default values using `newTensor`, and generating tensors filled with zeros or ones using `zeros`, `ones`, `zeros_like`, and `ones_like`.

```nim
import arraymancer

let c = [
              [
                [1,2,3],
                [4,5,6]
              ],
              [
                [11,22,33],
                [44,55,66]
              ],
              [
                [111,222,333],
                [444,555,666]
              ],
              [
                [1111,2222,3333],
                [4444,5555,6666]
              ]
            ].toTensor()
echo c

# Tensor[system.int] of shape "[4, 2, 3]" on backend "Cpu"
#           0                      1                      2                      3
# |1          2       3| |11        22      33| |111      222     333| |1111    2222    3333|
# |4          5       6| |44        55      66| |444      555     666| |4444    5555    6666|

let e = newTensor[bool]([2, 3])
# Tensor[bool] of shape "[2, 3]" on backend "Cpu"
# |false  false   false|
# |false  false   false|

let f = zeros[float]([4, 3])
# Tensor[float] of shape "[4, 3]" on backend "Cpu"
# |0.0    0.0     0.0|
# |0.0    0.0     0.0|
# |0.0    0.0     0.0|
# |0.0    0.0     0.0|

let g = ones[float]([4, 3])
# Tensor[float] of shape "[4, 3]" on backend "Cpu"
# |1.0    1.0     1.0|
# |1.0    1.0     1.0|
# |1.0    1.0     1.0|
# |1.0    1.0     1.0|

let tmp = [[1,2],[3,4]].toTensor()
let h = tmp.zeros_like
# Tensor[int] of shape "[2, 2]" on backend "Cpu"
# |0      0|
# |0      0|

let i = tmp.ones_like
# Tensor[int] of shape "[2, 2]" on backend "Cpu"
# |1      1|
# |1      1|
```

--------------------------------

### Accessing Arraymancer Tensor Properties

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.first_steps.rst

This snippet demonstrates how to inspect the fundamental properties of an Arraymancer tensor, including its rank, shape, strides, and offset. It shows how a tensor created from a nested sequence is represented and how these properties can be queried.

```nim
import arraymancer

let d = [[1, 2, 3], [4, 5, 6]].toTensor()

echo d
# Tensor[int] of shape "[2, 3]" on backend "Cpu"
# |1      2       3|
# |4      5       6|

echo d.rank # 2
echo d.shape # @[2, 3]
echo d.strides # @[3, 1] => Next row is 3 elements away in memory while next column is 1 element away.
echo d.offset # 0
```
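To make the strides comment concrete, here is a minimal standard-library sketch of how coordinates map to a position in flat storage. It illustrates the arithmetic only and is not Arraymancer's internal code; `flatIndex` is a hypothetical helper.

```nim
# Flat row-major storage of [[1, 2, 3], [4, 5, 6]]
let data = [1, 2, 3, 4, 5, 6]
let strides = [3, 1]  # same values as d.strides above
let offset = 0        # same value as d.offset above

proc flatIndex(idx: openArray[int]): int =
  ## Hypothetical helper: offset + sum(idx[axis] * strides[axis])
  result = offset
  for axis, i in idx:
    result += i * strides[axis]

doAssert data[flatIndex([0, 1])] == 2  # next column: 1 element further
doAssert data[flatIndex([1, 0])] == 4  # next row: 3 elements further
doAssert data[flatIndex([1, 2])] == 6

echo flatIndex([1, 2])
```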

--------------------------------

### Mutate Arraymancer Tensor with Nested Array or Sequence

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.slicing.rst

Demonstrates how to mutate a section of an Arraymancer tensor using a nested array or sequence. The example shows assigning a 2x2 nested array to a slice of the tensor, modifying its values in place.

```Nim
foo[0..1,0..1] = [[111, 222], [333, 444]]

echo foo
# Tensor[int] of shape "[5, 5]" on backend "Cpu"
# |111    222       1       1       1|
# |333    444       8     999     999|
# |3        9      27     999     999|
# |4       16      64     256    1024|
# |5       25     125     625    3125|
```

--------------------------------

### Implementing XOR with a Multilayer Perceptron in Arraymancer (Nim)

Source: https://github.com/mratsim/arraymancer/blob/master/docs/howto.perceptron.rst

This Nim code demonstrates how to build and train a simple multilayer perceptron (MLP) using the Arraymancer library to learn the XOR logical function. It defines a two-layer network with a ReLU activation in the hidden layer and a sigmoid output, trained using Stochastic Gradient Descent (SGD) with cross-entropy loss. The snippet illustrates data preparation, context creation, variable initialization, forward pass, loss calculation, backpropagation, and weight updates.

```Nim
import arraymancer

# Learning XOR function with a neural network.

# Autograd context / neuralnet graph
let ctx = newContext Tensor[float32]

let bsz = 32 # batch size

# We will create a tensor of size 3200 (100 batches of size 32)
# We create it as int between [0, 2[ and convert to bool
let x_train_bool = randomTensor([bsz * 100, 2], 2).astype(bool)

# Let's build our truth labels. We need to apply xor between the 2 columns of the tensors
let y_bool = x_train_bool[_,0] xor x_train_bool[_,1]

# Convert to float
let x_train = ctx.variable(x_train_bool.astype(float32), requires_grad = true)
let y = y_bool.astype(float32)

# We will build the following network:
# Input --> Linear(out_features = 3) --> relu --> Linear(out_features = 1) --> Sigmoid --> Cross-Entropy Loss

# First hidden layer of 3 neurons, shape [3 out_features, 2 in_features]
# We initialize with random weights between -1 and 1
let layer_3neurons = ctx.variable(
  randomTensor(3, 2, 2.0f) -. 1.0f,
  requires_grad = true
)

# Classifier layer with 1 neuron per feature. (In our case only one neuron overall)
# We initialize with random weights between -1 and 1
let classifier_layer = ctx.variable(
  randomTensor(1, 3, 2.0f) -. 1.0f,
  requires_grad = true
)

# Stochastic Gradient Descent
let optim = newSGD[float32](
  layer_3neurons, classifier_layer, 0.01f # 0.01 is the learning rate
)

# Learning loop
for epoch in 0..5:
  for batch_id in 0..<100:

    # minibatch offset in the Tensor
    let offset = batch_id * 32
    let x = x_train[offset ..< offset + 32, _]
    let target = y[offset ..< offset + 32, _]

    # Building the network
    let n1 = relu linear(x, layer_3neurons)
    let n2 = linear(n1, classifier_layer)
    let loss = n2.sigmoid_cross_entropy(target)

    echo "Epoch is:" & $epoch
    echo "Batch id:" & $batch_id
    echo "Loss is:" & $loss.value

    # Compute the gradient (i.e. contribution of each parameter to the loss)
    loss.backprop()

    # Correct the weights now that we have the gradient information
    optim.update()
```

--------------------------------

### Understanding In-Place Tensor Slicing Mutation in Arraymancer

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.slicing.rst

Provides a detailed breakdown of how tensor slicing mutation works in Arraymancer, demonstrating that no temporary copy is made. It shows the left-hand side (LHS) and right-hand side (RHS) slices before assignment, and then breaks down the single complex mutation into two sequential row-wise mutations to explain the final state.

```Nim
# first let's print the LHS we write to
echo foo[^2..^1, 2..4]
# Tensor[system.int] of shape "[2, 3]" on backend "Cpu"
# |64     256     1024|
# |125    625     3125|

# now print the RHS we read from
echo foo[^1..^2|-1, 4..2|-1]
# Tensor[system.int] of shape "[2, 3]" on backend "Cpu"
# |3125   625     125|
# |1024   256      64|

# this means we first perform this:
foo[^2, 2..4] = foo[^1, 4..2|-1]
echo foo
# Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
# |111    222       1      1       1|
# |333    444       8    999     999|
# |3        9      27    999     999|
# |4       16    3125    625     125|
# |5       25     125    625    3125|

# and then the following. At this step (compare the output of `foo` above),
# row ^2 has already been overwritten, so the RHS reads the new values:
foo[^1, 2..4] = foo[^2, 4..2|-1]
echo foo
# Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
# |111    222       1      1       1|
# |333    444       8    999     999|
# |3        9      27    999     999|
# |4       16    3125    625     125|
# |5       25     125    625    3125|
```

--------------------------------

### Normal Matrix Multiplication Syntax Comparison

Source: https://github.com/mratsim/arraymancer/blob/master/docs/Linear algebra notation comparison.md

Compares the syntax for standard matrix multiplication (dot product) across various numerical computing libraries and languages.

```Arraymancer
A * B
```

```Nim
A * B
```

```Julia
A * B
```

```Matlab
A * B
```

```Python
np.dot(A, B) or np.matmul(A, B) or A @ B
```

```R
A %*% B
```

```TensorFlow
tf.matmul(A, B) or A @ B
```

```Torch (Lua/C)
torch.mm(A,B) or torch.matmul(A,B)
```

```Theano
theano.tensor.dot(A, B)
```

--------------------------------

### Element-wise Matrix Multiplication (Hadamard Product) Syntax

Source: https://github.com/mratsim/arraymancer/blob/master/docs/Linear algebra notation comparison.md

Illustrates the syntax for element-wise matrix multiplication, also known as the Hadamard product, in different numerical libraries.

```Arraymancer
A *. B
```

```Nim
A |*| B
```

```Julia
A .* B
```

```Matlab
A .* B
```

```Python
np.multiply(A, B) or A * B
```

```R
A * B
```

```TensorFlow
tf.multiply(A, B)
```

```Torch (Lua/C)
torch.cmul(A, B)
```

```Theano
A * B
```
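For the Arraymancer column, a short sketch contrasting the two products may help. It assumes a recent Arraymancer release in which the element-wise operator is spelled `*.` (matching the `+.` and `/.` operators used elsewhere in this document); the tensors `a` and `b` are illustrative.

```nim
import arraymancer

let a = [[1.0, 2.0], [3.0, 4.0]].toTensor()
let b = [[5.0, 6.0], [7.0, 8.0]].toTensor()

let matProduct = a * b  # matrix-matrix product
let hadamard = a *. b   # element-wise (Hadamard) product

doAssert matProduct[0, 0] == 19.0  # 1*5 + 2*7
doAssert hadamard[0, 0] == 5.0     # 1*5

echo matProduct
echo hadamard
```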

--------------------------------

### Vector-Vector Dot Product Syntax Comparison

Source: https://github.com/mratsim/arraymancer/blob/master/docs/Linear algebra notation comparison.md

Shows how to perform a dot product between two vectors using various numerical computing libraries and languages.

```Arraymancer
dot(A, B)
```

```Nim
A * B
```

```Julia
dot(A, B)
```

```Matlab
dot(A, B)
```

```Python
np.dot(A, B) or np.inner(A, B)
```

```R
A %*% B or dot(A, B)
```

```TensorFlow
tf.matmul(a, b, transpose_a=False, transpose_b=True) or tf.tensordot(a, b, 1) or tf.einsum('i,i->', a, b)
```

```Torch (Lua/C)
torch.dot(A, B) or torch.matmul(A, B)
```

```Theano
dot(A, B) or vdot(A, B) ?
```

--------------------------------

### Generate OpenCL Kernel for Element-wise Binary Operations in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/uth.opencl_cuda_nim.rst

This Nim template generates an OpenCL C kernel for element-wise binary infix operations (e.g., +, -). It demonstrates how Nim's metaprogramming can reduce boilerplate for OpenCL C, which lacks C++'s generics. The generated kernel processes elements using a grid-stride loop and handles tensor indexing.

```Nim
template gen_cl_apply3*(kern_name, ctype, op: string): string =
  ## Generates an OpenCL kernel for an elementwise binary infix operation (like +, -, ...)
  ## Input:
  ##   - The C kernel name (this only helps debugging the C code)
  ##   - The C type
  ##   - The C operation (+, -, ...)


  opencl_getIndexOfElementID() & """
  __kernel
  void """ & kern_name &
          """(const int rank,
              const int len,
              __global const int * restrict dst_shape,
              __global const int * restrict dst_strides,
              const int dst_offset,
              __global       """ & ctype & """ * restrict const dst_data,
              __global const int * restrict A_shape,
              __global const int * restrict A_strides,
              const int A_offset,
              __global const """ & ctype & """ * restrict const A_data,
              __global const int * restrict B_shape,
              __global const int * restrict B_strides,
              const int B_offset,
              __global const """ & ctype & """ * restrict const B_data)
  {
    // Grid-stride loop
    for (int elemID = get_global_id(0);
    elemID < len;
    elemID += get_global_size(0)) {
      const int dst_real_idx = opencl_getIndexOfElementID(rank, dst_shape, dst_strides, dst_offset, elemID);
      const int A_real_idx = opencl_getIndexOfElementID(rank, A_shape, A_strides, A_offset, elemID);
      const int B_real_idx = opencl_getIndexOfElementID(rank, B_shape, B_strides, B_offset, elemID);

      dst_data[dst_real_idx] = A_data[A_real_idx] """ & op & """ B_data[B_real_idx];
    }
  }
  """
```

--------------------------------

### Apply Element-wise Functions to Tensors using map and Universal Functions in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/howto.ufunc.rst

This Nim code demonstrates how to apply element-wise functions to tensors using the `map` function and universal functions. It illustrates mapping a boolean check (`isPowerOfTwo`) and converting tensor elements to floats before applying a universal function like `ln`. The output tensors show the results of these transformations.

```nim
echo foo.map(x => x.isPowerOfTwo) # map a function (`=>` comes from the `sugar` module, formerly `future`)

# Tensor of shape 5x5 of type "bool" on backend "Cpu"
# |true   true    true    true    true|
# |true   true    true    true    true|
# |false  false   false   false   false|
# |true   true    true    true    true|
# |false  false   false   false   false|

let foo_float = foo.map(x => x.float)
echo ln foo_float # universal function (convert first to float for ln)

# Tensor of shape 5x5 of type "float" on backend "Cpu"
# |0.0    0.0     0.0     0.0     0.0|
# |0.6931471805599453     1.386294361119891       2.079441541679836       2.772588722239781       3.465735902799727|
# |1.09861228866811       2.19722457733622        3.295836866004329       4.394449154672439       5.493061443340548|
# |1.386294361119891      2.772588722239781       4.158883083359671       5.545177444479562       6.931471805599453|
# |1.6094379124341        3.218875824868201       4.828313737302302       6.437751649736401       8.047189562170502|
```

--------------------------------

### Demonstrate Implicit Broadcasting for Element-wise Addition in Arraymancer

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.broadcasting.rst

This Nim code snippet illustrates implicit broadcasting in Arraymancer using the `+.` operator for element-wise addition. It initializes two tensors, `j` (4x1) and `k` (1x3), and shows how their shapes are automatically broadcasted to perform the addition, resulting in a 4x3 tensor.

```nim
let j = [0, 10, 20, 30].toTensor.reshape(4,1)
let k = [0, 1, 2].toTensor.reshape(1,3)

echo j +. k
# Tensor[int] of shape "[4, 3]" on backend "Cpu"
# |0       1       2|
# |10     11      12|
# |20     21      22|
# |30     31      32|
```
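The shape rule behind this result can be sketched with the standard library alone. This is an illustration of the rule for two shapes of equal rank, not Arraymancer's implementation; `broadcastShape` is a hypothetical helper.

```nim
proc broadcastShape(a, b: openArray[int]): seq[int] =
  ## Hypothetical helper: along each axis the sizes must be equal,
  ## or one of them must be 1 (that side is then stretched).
  doAssert a.len == b.len, "this sketch only handles shapes of equal rank"
  for i in 0 ..< a.len:
    if a[i] == b[i] or b[i] == 1:
      result.add a[i]
    elif a[i] == 1:
      result.add b[i]
    else:
      raise newException(ValueError, "incompatible shapes")

# j has shape [4, 1] and k has shape [1, 3], so j +. k has shape [4, 3]
doAssert broadcastShape([4, 1], [1, 3]) == @[4, 3]
echo broadcastShape([4, 1], [1, 3])
```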

--------------------------------

### Mapping a Tensor with a Named Function in Nim

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.map_reduce.rst

Illustrates how to apply a pre-defined, named Nim procedure to each element of a tensor using Arraymancer's `map` function. This approach promotes code reusability for common transformations.

```Nim
proc plusone[T](x: T): T =
  x + 1
a.map(plusone) # Map the function plusone
```

--------------------------------

### Matrix-Vector Multiplication Syntax Comparison

Source: https://github.com/mratsim/arraymancer/blob/master/docs/Linear algebra notation comparison.md

Presents the syntax for multiplying a matrix by a vector across different numerical computing libraries and languages.

```Arraymancer
A * B
```

```Nim
A * B
```

```Julia
A * B
```

```Matlab
A * B
```

```Python
np.dot(A, B)
```

```R
A %*% B
```

```Torch (Lua/C)
torch.mv(A, B) or torch.dot(A, B)
```

```Theano
dot(A, B) or tensordot(A,B) ?
```

--------------------------------

### Understanding Tensor Copy Semantics in Arraymancer

Source: https://github.com/mratsim/arraymancer/blob/master/docs/tuto.first_steps.rst

This snippet highlights a crucial aspect of Arraymancer tensor handling: direct assignment (`var b = a`) results in shared data, meaning modifications to `b` will also affect `a`. It emphasizes the need to explicitly use the `clone` function for a full, independent copy, aligning with behavior seen in libraries like Numpy and Julia.

```nim
let a = toSeq(1..24).toTensor().reshape(2,3,4)
var b = a # b shares a's data; use a.clone() for an independent copy
```
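A minimal sketch of the difference, assuming the reference semantics described above and the `clone` procedure as exposed by recent Arraymancer releases. The names `shared` and `independent` are illustrative.

```nim
import arraymancer, sequtils

let a = toSeq(1..24).toTensor().reshape(2,3,4)

var shared = a               # plain assignment: shares a's data
var independent = a.clone()  # explicit deep copy

# Writing to the clone leaves the original untouched.
independent[0, 0, 0] = -1
doAssert a[0, 0, 0] == 1

# Writing through `shared` is, per the text above, visible through `a` too.
shared[0, 0, 0] = 999
echo a[0, 0, 0]
```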

--------------------------------

### Implementing FizzBuzz with Neural Networks in Nim using Arraymancer

Source: https://github.com/mratsim/arraymancer/blob/master/docs/uth.opencl_cuda_nim.rst

This Nim code snippet demonstrates how to implement the classic FizzBuzz problem using a simple neural network built with the Arraymancer library. It includes functions for binary encoding numbers, encoding FizzBuzz outputs, generating training data (numbers 101-1023), defining a neural network architecture, training the model with SGD, and finally, applying the trained model to predict FizzBuzz for numbers 1-100.

```Nim
# A port to Arraymancer of Joel Grus's hilarious FizzBuzz in Tensorflow:
# http://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow/

# Interviewer: Welcome, can I get you a coffee or anything? Do you need a break?
# ...
# Interviewer: OK, so I need you to print the numbers from 1 to 100,
#              except that if the number is divisible by 3 print "fizz",
#              if it's divisible by 5 print "buzz", and if it's divisible by 15 print "fizzbuzz".

# Let's start with standard imports
import arraymancer, math, strformat

# We want to input a number and output the correct "fizzbuzz" representation.
# Ideally the input is represented by a vector of real values between 0 and 1.
# One way to do that is to use the binary representation of the number.
func binary_encode(i: int, num_digits: int): Tensor[float32] =
  result = newTensor[float32](1, num_digits)
  for d in 0 ..< num_digits:
    result[0, d] = float32(i shr d and 1)

# For the output, we distinguish 4 cases: nothing, fizz, buzz and fizzbuzz.
func fizz_buzz_encode(i: int): int =
  if   i mod 15 == 0: return 3 # fizzbuzz
  elif i mod  5 == 0: return 2 # buzz
  elif i mod  3 == 0: return 1 # fizz
  else              : return 0

# Next, let's generate training data. We don't want to train on 1..100: those are our test values.
# We can't give the neural net the true answers for them; it must discover the logic by itself,
# so we train on values from 101 up to (but excluding) 1024 (2^10).
const NumDigits = 10

var x_train = newTensor[float32](2^NumDigits - 101, NumDigits)
var y_train = newTensor[int](2^NumDigits - 101)

for i in 101 ..< 2^NumDigits:
  x_train[i - 101, _] = binary_encode(i, NumDigits)
  y_train[i - 101] = fizz_buzz_encode(i)

# How many neurons do we need to change a light bulb, sorry do a division? let's pick ...
const NumHidden = 100

# Let's setup our neural network context, variables and model
let
  ctx = newContext Tensor[float32]
  X   = ctx.variable x_train

network ctx, FizzBuzzNet:
  layers:
    hidden: Linear(NumDigits, NumHidden)
    output: Linear(NumHidden, 4)
  forward x:
    x.hidden.relu.output

let model = ctx.init(FizzBuzzNet)
let optim = model.optimizer(SGD, 0.05'f32)

func fizz_buzz(i: int, prediction: int): string =
  [$i, "fizz", "buzz", "fizzbuzz"][prediction]

# Phew, finally ready to train, let's pick the batch size and number of epochs
const BatchSize = 128
const Epochs    = 2500

# And let's start training the network
for epoch in 0 ..< Epochs:
  # Here I should probably shuffle the input data.
  for start_batch in countup(0, x_train.shape[0]-1, BatchSize):

    # Pick the minibatch (`..<` excludes end_batch, so cap it at the row count to keep the last sample)
    let end_batch = min(x_train.shape[0], start_batch + BatchSize)
    let X_batch = X[start_batch ..< end_batch, _]
    let target = y_train[start_batch ..< end_batch]

    # Go through the model
    let clf = model.forward(X_batch)

    # Go through our cost function
    let loss = clf.sparse_softmax_cross_entropy(target)

    # Backpropagate the errors and let the optimizer fix them.
    loss.backprop()
    optim.update()

  # Let's see how we fare:
  ctx.no_grad_mode:
    echo &"\nEpoch #{epoch} done. Testing accuracy"

    let y_pred = model
                  .forward(X)
                  .value
                  .softmax
                  .argmax(axis = 1)
                  .squeeze

    let score = y_pred.accuracy_score(y_train)
    echo &"Accuracy: {score * 100:.3f}%"
    echo "\n"


# Our network is trained, let's see if it's well behaved

# Now let's use what we really want to fizzbuzz, numbers from 1 to 100
var x_buzz = newTensor[float32](100, NumDigits)
for i in 1 .. 100:
  x_buzz[i - 1, _] = binary_encode(i, NumDigits)

# Wrap them for neural net
let X_buzz = ctx.variable x_buzz

# Pass it through the network
ctx.no_grad_mode:
  let y_buzz = model
                .forward(X_buzz)
                .value
                .softmax
                .argmax(axis = 1)
                .squeeze

# Extract the answer
var answer: seq[string] = @[]

for i in 1..100:
  answer.add fizz_buzz(i, y_buzz[i - 1])

echo answer
# @["1", "2", "fizz", "4", "buzz", "6", "7", "8", "fizz", "10",
#   "11", "12", "13", "14", "15", "16", "17", "fizz", "19", "buzz",
```
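
The two encoders used above can be tried in isolation. The sketch below uses plain `seq` values instead of tensors so it runs without Arraymancer; `binaryEncodeSeq` and `fizzBuzzClass` are hypothetical stand-ins for `binary_encode` and `fizz_buzz_encode`, not library functions.

```Nim
# Plain-seq stand-ins for the encoders above (illustrative names).
func binaryEncodeSeq(i, numDigits: int): seq[float32] =
  ## One float per bit, least-significant bit first.
  result = newSeq[float32](numDigits)
  for d in 0 ..< numDigits:
    result[d] = float32((i shr d) and 1)

func fizzBuzzClass(i: int): int =
  ## 0 = the number itself, 1 = fizz, 2 = buzz, 3 = fizzbuzz
  if i mod 15 == 0: 3
  elif i mod 5 == 0: 2
  elif i mod 3 == 0: 1
  else: 0

echo binaryEncodeSeq(5, 4)   # 5 is 0b0101 -> @[1.0, 0.0, 1.0, 0.0]
echo fizzBuzzClass(30)       # 3
```

Checking divisibility by 15 first matters: a number divisible by 15 is also divisible by 3 and by 5, so the most specific case has to win.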

--------------------------------

### Tensor Creation and Slicing in Arraymancer

Source: https://github.com/mratsim/arraymancer/blob/master/README.md

This Nim code demonstrates how to create a tensor from a sequence of sequences (representing a Vandermonde matrix) and perform basic slicing operations using Arraymancer. It shows tensor initialization, population, and how to extract sub-tensors or reverse dimensions, along with the expected output.

```Nim
import math, arraymancer

const
    x = @[1, 2, 3, 4, 5]
    y = @[1, 2, 3, 4, 5]

var
    vandermonde = newSeq[seq[int]]()
    row: seq[int]

for i, xx in x:
    row = newSeq[int]()
    vandermonde.add(row)
    for j, yy in y:
        vandermonde[i].add(xx^yy)

let foo = vandermonde.toTensor()

echo foo

# Tensor[system.int] of shape "[5, 5]" on backend "Cpu"
# |1          1       1       1       1|
# |2          4       8      16      32|
# |3          9      27      81     243|
# |4         16      64     256    1024|
# |5         25     125     625    3125|

echo foo[1..2, 3..4] # slice

# Tensor[system.int] of shape "[2, 2]" on backend "Cpu"
# |16      32|
# |81     243|

echo foo[_|-1, _] # reverse the order of the rows

# Tensor[int] of shape "[5, 5]" on backend "Cpu"
# |5      25      125     625     3125|
# |4      16      64      256     1024|
# |3      9       27      81      243|
# |2      4       8       16      32|
# |1      1       1       1       1|
```
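
The slicing rules can also be checked on a smaller tensor. This is a minimal sketch with illustrative names (`t`, `sub`, `row`); it relies only on range slices and the `_` span shown above.

```Nim
import arraymancer

let t = [[1, 2, 3],
         [4, 5, 6]].toTensor()

let sub = t[0..1, 1..2]   # rows 0 to 1, columns 1 to 2 (both bounds inclusive)
let row = t[1, _]         # second row; `_` spans the whole dimension

echo sub
echo row
```

Selecting a single index keeps the dimension, so `row` is a tensor of shape [1, 3] rather than a rank-1 vector.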