nillanet module
- class nillanet.model.NN(features, architecture, activation, derivative1, resolver, derivative2, loss, derivative3, learning_rate, scheduler=None, dtype=cupy.float32, backup='/tmp/nn.pkl', initializer=None)[source]
Bases:
object
Minimal feedforward neural network using CuPy.
This class implements batched SGD with configurable activation, resolver, and loss functions. Inputs/targets are kept on device to avoid host↔device copies.
- Parameters:
features (int) – Number of input features (columns) per sample.
architecture (list[int]) – Units per layer, including output layer.
activation (Callable[[cupy.ndarray], cupy.ndarray]) – Hidden-layer activation function.
derivative1 (Callable[[cupy.ndarray], cupy.ndarray]) – Derivative of activation evaluated at pre-activations.
resolver (Callable[[cupy.ndarray], cupy.ndarray]) – Output-layer transfer function (e.g., identity, sigmoid, softmax).
derivative2 (Callable[[cupy.ndarray], cupy.ndarray]) – Derivative of resolver evaluated at pre-activations.
loss (Callable[..., cupy.ndarray]) – Loss function that accepts named arguments (e.g., yhat, y) and returns per-sample losses or their average.
derivative3 (Callable[..., cupy.ndarray]) – Derivative of loss with respect to predictions (same signature as loss).
learning_rate (float) – SGD step size.
scheduler (Scheduler) – Learning rate scheduler.
dtype (cupy.dtype, optional) – Floating point dtype for parameters and data. Defaults to cupy.float32.
backup (str) – Path for saving the highest performing model during training.
initializer (Initializer) – Function for initializing weights.
- W
Layer weight matrices; W[i] has shape (in_features_i, out_features_i).
- Type:
- batch(x, y)[source]
Run a single forward/backward/update step.
- Parameters:
x (cupy.ndarray | numpy.ndarray) – Inputs, shape (B, D) or (D,).
y (cupy.ndarray | numpy.ndarray) – Targets, shape (B, K) or (K,).
- Returns:
The predictions as a tensor.
- Return type:
tensor
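The forward/backward/update sequence can be sketched for a single linear layer. This is a minimal illustration in pure Python, not the class's implementation: the helper name sgd_step, the nested-list layout, and the squared-error loss are all assumptions made for the example.

```python
def sgd_step(W, x, y, learning_rate):
    """One forward/backward/update step for a single linear layer (illustrative only).

    W has shape (D, K), x has shape (B, D), y has shape (B, K); all are nested lists.
    """
    B, D, K = len(x), len(W), len(W[0])
    # Forward: yhat = x @ W, with an identity output function.
    yhat = [[sum(x[b][d] * W[d][k] for d in range(D)) for k in range(K)]
            for b in range(B)]
    # Backward: for the loss mean_b(0.5 * sum_k (yhat - y)^2),
    # the gradient with respect to yhat is (yhat - y) / B.
    delta = [[(yhat[b][k] - y[b][k]) / B for k in range(K)] for b in range(B)]
    # Update: W <- W - learning_rate * (x^T @ delta), applied in place.
    for d in range(D):
        for k in range(K):
            grad = sum(x[b][d] * delta[b][k] for b in range(B))
            W[d][k] -= learning_rate * grad
    # Like the documented method, return the predictions from the forward pass.
    return yhat
```

The predictions returned are those computed before the weight update, which is one reasonable reading of "the predictions" for a combined step.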
- predict(input)[source]
Run a forward pass to produce predictions.
- Parameters:
input (cupy.ndarray | numpy.ndarray) – Inputs of shape (n_samples, n_features). If NumPy, it will be moved to device.
- Returns:
Model outputs of shape (n_samples, n_outputs).
- Return type:
- train(input, output, epoch=1, epochs=1, batch=0, verbose=False, step=1000, autosave=False, minloss=[999999999])[source]
Train the model for one epoch using simple SGD.
Each epoch is a full pass over the training data. Note that your own external training loop is expected.
- Parameters:
input (cupy.ndarray | numpy.ndarray) – Training inputs of shape (n_samples, n_features). If NumPy, it will be moved to device.
output (cupy.ndarray | numpy.ndarray) – Training targets of shape (n_samples, n_outputs). If NumPy, it will be moved to device.
epoch (int) – Index of the current epoch.
epochs (int) – Expected number of SGD steps that will be run.
batch (int) – One of:
- 1: sample a single example per step (pure SGD)
- 0: use all samples per step (full batch)
- >1 and < len(Y): use that mini-batch size per step
verbose (bool) – Print progress to stdout.
step (int) – Print progress every step epochs.
autosave (bool) – Save the model with the lowest loss to disk.
- Raises:
SystemExit – If batch is invalid.
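The three batch settings can be read as a rule for choosing which samples take part in one step. The sketch below is an assumption about that rule, not the library's code: select_batch_indices is a hypothetical helper, sampling without replacement is assumed, and a ValueError stands in for the documented SystemExit.

```python
import random

def select_batch_indices(n_samples, batch, rng=random):
    """Return the sample indices used for one SGD step (hypothetical helper)."""
    if batch == 0:
        # Full batch: every sample takes part in the step.
        return list(range(n_samples))
    if batch == 1:
        # Pure SGD: one randomly chosen sample.
        return [rng.randrange(n_samples)]
    if 1 < batch < n_samples:
        # Mini-batch: `batch` distinct samples drawn without replacement (assumed).
        return rng.sample(range(n_samples), batch)
    # The documented method raises SystemExit; a ValueError is used in this sketch.
    raise ValueError(f"invalid batch size: {batch}")
```

Since train runs a single epoch, a caller would typically wrap it in its own loop, passing the loop counter as epoch and the planned total as epochs.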
- class nillanet.activations.Activations[source]
Bases:
object
Nonlinearities for NN class. When multiple forms are available, the activation is paired with the most compatible derivative.
- linear(x)[source]
Computes the identity function for the given input.
- Parameters:
x (tensor) – The values for which the identity function needs to be computed.
- Returns:
The values corresponding to the input x.
- Return type:
tensor
- linear_derivative(x)[source]
Computes the derivative of the identity function for the given input.
- Parameters:
x (tensor) – The values for which the derivative of the identity function needs to be computed.
- Returns:
The derivative values corresponding to the input x.
- Return type:
tensor
- relu(x)[source]
Computes the rectified linear unit (ReLU) function for the given input.
- Parameters:
x (tensor) – The values for which the ReLU function needs to be computed.
- Returns:
The ReLU values corresponding to the input x.
- Return type:
tensor
- relu_derivative(x)[source]
Computes the derivative of the rectified linear unit (ReLU) function for the given input.
- Parameters:
x (tensor) – The values for which the derivative of the ReLU function needs to be computed.
- Returns:
The derivative ReLU values corresponding to the input x.
- Return type:
tensor
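As a point of reference, ReLU and its derivative can be written elementwise as follows. This is a pure-Python sketch over lists rather than the CuPy implementation, and treating the derivative at exactly zero as 0 is an assumed convention.

```python
def relu(x):
    """Elementwise max(0, v) over a list of floats."""
    return [max(0.0, v) for v in x]

def relu_derivative(x):
    """1 where the input is positive, 0 elsewhere (0 at v == 0 by convention)."""
    return [1.0 if v > 0 else 0.0 for v in x]
```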
- sigmoid(x)[source]
Computes the sigmoid function for the given input.
- Parameters:
x (tensor) – The values for which the sigmoid function needs to be computed.
- Returns:
The computed sigmoid values corresponding to the input x.
- Return type:
tensor
- sigmoid_derivative(x)[source]
Computes the derivative of the sigmoid function for the given input.
- Parameters:
x (tensor) – The values for which the derivative of the sigmoid function needs to be computed.
- Returns:
The computed derivative values corresponding to the input x.
- Return type:
tensor
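The sigmoid and its derivative are related by s'(x) = s(x) * (1 - s(x)). A scalar sketch in pure Python, using a branch to avoid overflow for large negative inputs (the branching is a choice made here, not something the source states):

```python
import math

def sigmoid(v):
    """Logistic function, written to avoid overflow in exp for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def sigmoid_derivative(v):
    """Derivative of the logistic function evaluated at the pre-activation v."""
    s = sigmoid(v)
    return s * (1.0 - s)
```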
- softmax(x)[source]
Computes the softmax function for the given input.
- Parameters:
x (tensor) – The input values for which the softmax function needs to be computed.
- Returns:
The computed softmax values corresponding to the input x.
- Return type:
tensor
- softmax_derivative(x)[source]
Computes the derivative of the sigmoid function as a proxy for the softmax derivative.
- Parameters:
x (tensor) – The input values for which the derivative of the softmax function needs to be computed.
- Returns:
The computed sigmoid derivative values corresponding to the input x.
- Return type:
tensor
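One way to see why a sigmoid-style derivative can stand in for the softmax derivative: the diagonal of the softmax Jacobian is s_i * (1 - s_i), the same form as the sigmoid derivative. The sketch below is pure Python over a single row; the function names are illustrative, and whether the library evaluates the proxy on softmax or sigmoid outputs is not stated in the source.

```python
import math

def softmax(row):
    """Softmax over one row, shifted by the row maximum for numerical stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian_diagonal(row):
    """Diagonal entries s_i * (1 - s_i) of the softmax Jacobian."""
    return [s * (1.0 - s) for s in softmax(row)]
```

The off-diagonal terms -s_i * s_j are dropped by any elementwise proxy, so this is an approximation of the full derivative.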
- tanh(x)[source]
Computes the hyperbolic tangent function for the given input.
- Parameters:
x (tensor) – The values for which the hyperbolic tangent function needs to be computed.
- Returns:
The computed hyperbolic tangent values corresponding to the input x.
- Return type:
tensor
- tanh_derivative(x)[source]
Computes the derivative of the hyperbolic tangent function for the given input.
- Parameters:
x (tensor) – The values for which the derivative of the hyperbolic tangent function needs to be computed.
- Returns:
The computed derivative values corresponding to the input x.
- Return type:
tensor
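The hyperbolic tangent has the derivative 1 - tanh(x)^2. A scalar sketch using only the standard library:

```python
import math

def tanh_derivative(v):
    """Derivative of tanh evaluated at the pre-activation v."""
    t = math.tanh(v)
    return 1.0 - t * t
```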
- class nillanet.loss.Loss[source]
Bases:
object
Loss functions for NN class
- binary_crossentropy(y, yhat, epsilon=1e-15)[source]
Evaluates the distance between the true labels and the predicted probabilities by evaluating the logarithmic loss.
Parameters
- y: tensor
The true labels.
- yhat: tensor
The predicted probabilities.
Returns
- float:
The logarithmic loss between the true labels and the predicted probabilities.
- binary_crossentropy_derivative(y, yhat, epsilon=1e-15)[source]
Derivative of the binary cross-entropy loss function.
Parameters
- y: tensor
The true labels.
- yhat: tensor
The predicted probabilities.
Returns
- tensor:
The derivative of the binary cross-entropy loss function with respect to the predictions yhat.
Raises
- RuntimeWarning: divide by zero or invalid value encountered in divide.
Fix afterward.
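The divide-by-zero warning arises when a predicted probability is exactly 0 or 1. A sketch of the loss and its derivative in pure Python, assuming epsilon is used to clip predictions away from those endpoints and that the loss is averaged over samples (neither is confirmed by the source):

```python
import math

def binary_crossentropy(y, yhat, epsilon=1e-15):
    """Mean log loss over paired labels y and probabilities yhat."""
    total = 0.0
    for t, p in zip(y, yhat):
        # Clip so that log(p) and log(1 - p) stay finite.
        p = min(max(p, epsilon), 1.0 - epsilon)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y)

def binary_crossentropy_derivative(y, yhat, epsilon=1e-15):
    """Gradient of the mean log loss with respect to each prediction."""
    n = len(y)
    grads = []
    for t, p in zip(y, yhat):
        # The same clipping keeps the denominator p * (1 - p) away from zero.
        p = min(max(p, epsilon), 1.0 - epsilon)
        grads.append((p - t) / (p * (1.0 - p)) / n)
    return grads
```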
- mae(y, yhat)[source]
Mean Absolute Error (MAE) between predicted values and actual values.
Parameters
- yhat: tensor
The predicted values.
- y: tensor
The actual values.
Returns
- float:
The Mean Absolute Error between the predicted and actual values.
- mae_derivative(y, yhat, epsilon=1e-15)[source]
Derivative of the Mean Absolute Error (MAE) loss function.
Parameters
- yhat: tensor
The predicted values.
- y: tensor
The actual values.
Returns
- tensor
The derivative of the MAE loss function with respect to the predictions yhat.
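The MAE derivative is the sign of the error, which is undefined at zero error. The sketch below shows one plausible role for epsilon, smoothing the sign as diff / (|diff| + epsilon); this is an assumption about how the parameter is used, not a description of the library's code.

```python
def mae(y, yhat):
    """Mean absolute error between targets y and predictions yhat."""
    return sum(abs(p - t) for t, p in zip(y, yhat)) / len(y)

def mae_derivative(y, yhat, epsilon=1e-15):
    """Smoothed sign of the error, averaged over samples (assumed form)."""
    n = len(y)
    # diff / (|diff| + epsilon) is close to sign(diff) and equals 0 when diff == 0.
    return [(p - t) / (abs(p - t) + epsilon) / n for t, p in zip(y, yhat)]
```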
- class nillanet.distributions.Distributions[source]
Bases:
object
Random training distributions for test modules.
- arithmetic_distribution(depth, mode)[source]
Generates distributions of two input values for predicting an arithmetic result.
- Parameters:
- Returns:
tuple of (generated matrix, expected output)
- Raises:
SystemExit – If the provided mode is not “summation” or “one_hot”.
- linear_distribution(depth)[source]
Generates data for a linear regression that predicts y from x, with x-values on a random line defined by a slope and an intercept.
- Parameters:
depth (int) – The number of x-values to generate.
- Returns:
tuple of (generated vector of x-values, vector of expected y-values)
- summation(rows, cols, mode='one_hot')[source]
Generates distributions of binary vectors for testing binary cross-entropy (one-hot mode only).
- Parameters:
rows (int) – The number of rows for the generated binary matrix.
cols (int) – The number of columns for the generated binary matrix.
mode (str) – The mode of operation. Accepts either “summation” or “one_hot”. Defaults to “one_hot”.
- “summation”: Produces a scalar count of the number of ones in each x vector.
- “one_hot”: Produces a one-hot encoded representation of the count of ones in each x vector.
- Returns:
tuple of (generated binary matrix, expected output)
- Raises:
SystemExit – If the provided mode is not “summation” or “one_hot”.
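The two modes can be illustrated with a small generator. This is a sketch under stated assumptions: it uses nested lists instead of device arrays, it assumes the one-hot vector has cols + 1 entries (one per possible count), and it raises ValueError where the documented method raises SystemExit.

```python
import random

def summation(rows, cols, mode="one_hot", rng=random):
    """Return (binary matrix, targets) for counting ones in each row (illustrative)."""
    x = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    counts = [sum(row) for row in x]
    if mode == "summation":
        # Scalar target: the number of ones in each row.
        y = [[c] for c in counts]
    elif mode == "one_hot":
        # One-hot target over the possible counts 0..cols (assumed width).
        y = [[1 if k == c else 0 for k in range(cols + 1)] for c in counts]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return x, y
```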
- class nillanet.initializer.Initializer(distribution=None, low=0.0, high=1.0, mean=0.0, std=1.0)[source]
Bases:
object
Weight distributions for custom initializations.
- __init__(distribution=None, low=0.0, high=1.0, mean=0.0, std=1.0)[source]
Models a configurable statistical distribution for initializing weights.
- Parameters:
distribution (function) – A function representing the desired distribution. If None, a default normal distribution will be used.
low (float) –
- normal distribution:
the lower boundary for standard deviations away from the mean
- otherwise:
the lower boundary of the abscissae of the distribution
Defaults to 0.0.
high (float) –
- normal distribution:
the upper boundary for standard deviations away from the mean
- otherwise:
the upper boundary of the abscissae of the distribution
Defaults to 1.0.
mean (float) – The mean value for the distribution. Defaults to 0.0.
std (float) – The standard deviation for the distribution. Defaults to 1.0.
- he(shape)[source]
Generates random numbers with variance equal to 2 over n, where n is the number of input units.
He demonstrated its usefulness for the relu activation function. The low and high parameters are interpreted as the lower and upper bounds of the plateau.
- Parameters:
shape (tuple) – The desired shape of the output array containing samples.
- Returns:
- numpy.ndarray
An array of random samples drawn from the truncated normal distribution.
- normal(shape)[source]
Generates random numbers from a bell-shaped distribution within the defined range.
Note that the low and high parameters are interpreted as standard deviations away from the mean.
- Parameters:
shape (tuple) – The desired shape of the output array containing samples.
- Returns:
- numpy.ndarray
An array of random samples drawn from the truncated normal distribution.
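A truncated normal with bounds given in standard deviations can be drawn by rejection sampling. The sketch below is one simple way to do this in pure Python; the function name and the rejection approach are assumptions, and it returns a flat list rather than an array of the requested shape.

```python
import random

def truncated_normal(n, low=0.0, high=1.0, mean=0.0, std=1.0, rng=random):
    """Draw n samples from a normal truncated to [mean + low*std, mean + high*std]."""
    samples = []
    while len(samples) < n:
        # Draw a standard normal deviate and keep it only if it lies within bounds.
        z = rng.gauss(0.0, 1.0)
        if low <= z <= high:
            samples.append(mean + std * z)
    return samples
```

Rejection sampling becomes slow when the interval [low, high] covers little probability mass, so narrow or far-tail bounds would call for a different method.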
- uniform(shape)[source]
Generates random numbers from a plateau-shaped distribution within the defined range.
The low and high parameters are interpreted as the lower and upper bounds of the plateau.
- Parameters:
shape (tuple) – The desired shape of the output array containing samples.
- Returns:
- numpy.ndarray
An array of random samples drawn from the uniform distribution.
- xavier(shape)[source]
Generates random numbers with variance equal to 1 over n, where n is the number of input units.
Xavier demonstrated its usefulness for tanh or sigmoid activation functions. The low and high parameters are interpreted as the lower and upper bounds of the plateau.
- Parameters:
shape (tuple) – The desired shape of the output array containing samples.
- Returns:
- numpy.ndarray
An array of random samples drawn from the truncated normal distribution.
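The he and xavier methods differ only in how the variance scales with layer width. The sketch below assumes the variance is gain / n with n the number of input units (gain 2 for He, gain 1 for the Xavier variant described here). It draws from an untruncated normal and returns nested lists, both simplifications; scaled_normal is a hypothetical name.

```python
import math
import random

def scaled_normal(shape, gain, rng=random):
    """Sample a (fan_in, fan_out) matrix with variance gain / fan_in (illustrative)."""
    fan_in, fan_out = shape
    std = math.sqrt(gain / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

# He-style weights for a ReLU layer and Xavier-style weights for a tanh layer.
he_weights = scaled_normal((64, 32), gain=2.0)
xavier_weights = scaled_normal((64, 32), gain=1.0)
```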
- class nillanet.scheduler.Scheduler(mode, lr, lowbound=1e-08, scaler=0, warmup=0, interval=1, maxsteps=0, custom=None)[source]
Bases:
object
Learning rate scheduler for NN class
- __init__(mode, lr, lowbound=1e-08, scaler=0, warmup=0, interval=1, maxsteps=0, custom=None)[source]
Read parameters for initializing the scheduler.
- Parameters:
mode (str) – The mode of learning rate decay. Required.
lr (float) – The initial learning rate. Required.
lowbound (float) – The lower bound for the learning rate. Default: 1e-8.
scaler (float) –
- Mode:
The scaling factor for the constant mode only.
- Range:
{ x | 0 < x < 1 }.
- Optional:
Set zero to skip.
warmup (int) – The number of epochs for an optional warmup period. Optional, set zero to skip.
interval (int) – The interval at which a step is applied. Default: 1.
maxsteps (int) – The maximum number of updates applied to the learning rate. Optional, set zero to skip.
custom (function) – A custom function for updating the learning rate. Optional, set None to skip.
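How these parameters might interact can be sketched with a simple multiplicative decay. Everything about the rule below is an assumption made for illustration (the source does not list the available modes or their formulas): scheduled_lr is a hypothetical function, warmup is taken to be linear, and scaler is applied once per interval.

```python
def scheduled_lr(step, lr, scaler, lowbound=1e-8, warmup=0, interval=1, maxsteps=0):
    """Learning rate at a given step under an assumed multiplicative decay."""
    if warmup and step < warmup:
        # Assumed linear warmup from lr / warmup up to lr.
        return lr * (step + 1) / warmup
    # Number of decay updates applied so far, one per `interval` steps.
    updates = (step - warmup) // interval
    if maxsteps:
        # Zero means no cap; otherwise stop decaying after maxsteps updates.
        updates = min(updates, maxsteps)
    if scaler:
        # Zero means no scaling; otherwise shrink by `scaler` per update.
        lr = lr * scaler ** updates
    # Never drop below the lower bound.
    return max(lr, lowbound)
```

For example, with lr=0.1 and scaler=0.5 the rate halves at every step until it reaches lowbound, while scaler=0 leaves it unchanged.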