nn
Neural Network Modules
Regressors (see also kernel regressors)
- linear_operator_learning.nn.ridge_least_squares(cov_X, tikhonov_reg=0.0)
Fit the ridge least squares estimator for the transfer operator.
- Parameters:
cov_X (Tensor) – Covariance matrix of the input data.
tikhonov_reg (float, optional) – Ridge regularization. Defaults to 0.0.
- Return type:
FitResult
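For orientation, the textbook ridge estimator of a transfer operator can be sketched in plain PyTorch. This is a minimal sketch under stated assumptions, not the library's implementation: the helper name ridge_estimator, the normalization of the empirical covariances by N, and the explicit use of a cross-covariance cov_XY are all illustrative, and the contents of the FitResult returned by the documented function are not reproduced here.

```python
import torch


def ridge_estimator(cov_X, cov_XY, tikhonov_reg=0.0):
    """Hypothetical helper: solve (cov_X + reg * I) G = cov_XY for G."""
    D = cov_X.shape[0]
    reg_cov = cov_X + tikhonov_reg * torch.eye(D, dtype=cov_X.dtype)
    return torch.linalg.solve(reg_cov, cov_XY)


torch.manual_seed(0)
X = torch.randn(200, 4, dtype=torch.float64)  # input features, shape (N, D)
Y = torch.randn(200, 4, dtype=torch.float64)  # output features, shape (N, D)
cov_X = X.T @ X / X.shape[0]                  # empirical (uncentered) covariance
cov_XY = X.T @ Y / X.shape[0]                 # empirical cross-covariance
G = ridge_estimator(cov_X, cov_XY, tikhonov_reg=1e-3)
```

A positive tikhonov_reg keeps the linear system well conditioned when cov_X is close to singular, at the cost of shrinking the estimator.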
- linear_operator_learning.nn.eig(fit_result, cov_XY)
Computes the eigendecomposition of a regressor.
- Parameters:
fit_result (FitResult) – Fit result as defined in linear_operator_learning.nn.structs.
cov_XY (Tensor) – Cross-covariance matrix between the input and output data.
- Return type:
EigResult
- Shape:
cov_XY: \((D, D)\), where \(D\) is the number of features.
Output: U, V of shape \((D, R)\), svals of shape \(R\), where \(D\) is the number of features and \(R\) is the rank of the regressor.
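As a sanity check on what the eigendecomposition of a regressor represents, the sketch below fits an unregularized least squares estimator to data generated by a known diagonal matrix and recovers its eigenvalues with torch.linalg.eig. It assumes the estimator can be written as a dense \((D, D)\) matrix; the variable names and the synthetic data are illustrative, and the structure of the EigResult returned by the library is not reproduced.

```python
import torch

torch.manual_seed(0)
N, D = 500, 3
true_eigs = torch.tensor([0.9, 0.5, 0.1], dtype=torch.float64)
X = torch.randn(N, D, dtype=torch.float64)
Y = X @ torch.diag(true_eigs)           # outputs are a linear map of the inputs

cov_X = X.T @ X / N
cov_XY = X.T @ Y / N
G = torch.linalg.solve(cov_X, cov_XY)   # least squares estimator (no ridge term)

evals, right_vecs = torch.linalg.eig(G)  # complex-valued in general
recovered = torch.sort(evals.real, descending=True).values
```

Because the data are exactly linear, recovered matches true_eigs up to floating-point error; with noisy data or a nonzero ridge term the eigenvalues would only be approximated.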
- linear_operator_learning.nn.evaluate_eigenfunction(eig_result, which, X)
Evaluates left or right eigenfunctions of a regressor.
- Parameters:
eig_result (EigResult) – EigResult object containing eigendecomposition results
which (Literal['left', 'right']) – String indicating “left” or “right” eigenfunctions.
X (Tensor) – Feature map of the input data
- Shape:
eig_result: U, V of shape \((D, R)\), svals of shape \(R\), where \(D\) is the number of features and \(R\) is the rank of the regressor.
X: \((N_0, D)\), where \(N_0\) is the number of inputs to predict and \(D\) is the number of features.
Output: \((N_0, R)\)
Loss Functions
- class linear_operator_learning.nn.L2ContrastiveLoss
NCP/Contrastive/Mutual Information Loss based on the \(L^{2}\) error by Kostic et al.[1].
\[\mathcal{L}(x, y) = \frac{1}{N(N-1)}\sum_{i \neq j}\langle x_{i}, y_{j} \rangle^2 - \frac{2}{N}\sum_{i=1}^{N}\langle x_{i}, y_{i} \rangle.\]
- Parameters:
gamma (float, optional) – Regularization strength. Defaults to 1e-3.
regularizer (literal, optional) – Regularizer. Either orthn_fro or orthn_logfro. Defaults to orthn_fro.
- forward(x, y)
Forward pass of the L2 contrastive loss.
- Parameters:
x (Tensor) – Input features.
y (Tensor) – Output features.
- Return type:
Tensor
- Shape:
x: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
y: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
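The formula for the \(L^{2}\) contrastive loss can be evaluated directly from the matrix of pairwise inner products. The following is a minimal sketch of the displayed formula only: the function name l2_contrastive is hypothetical, and the gamma-weighted regularizer applied by the class is deliberately left out.

```python
import torch


def l2_contrastive(x, y):
    """Unregularized L2 contrastive loss for (N, D) feature batches."""
    N = x.shape[0]
    scores = x @ y.T                    # scores[i, j] = <x_i, y_j>
    diag = torch.diagonal(scores)       # paired terms <x_i, y_i>
    off_diag_sq = (scores ** 2).sum() - (diag ** 2).sum()
    return off_diag_sq / (N * (N - 1)) - 2.0 * diag.sum() / N


x = torch.eye(2)
y = torch.eye(2)
loss = l2_contrastive(x, y)  # off-diagonal terms vanish, paired terms sum to 2
```

For this toy batch the mismatched pairs are orthogonal, so only the second term contributes and the loss equals -2.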
- class linear_operator_learning.nn.KLContrastiveLoss
NCP/Contrastive/Mutual Information Loss based on the KL divergence.
\[\mathcal{L}(x, y) = \frac{1}{N(N-1)}\sum_{i \neq j}\langle x_{i}, y_{j} \rangle - \frac{2}{N}\sum_{i=1}^{N}\log\big(\langle x_{i}, y_{i} \rangle\big).\]
- Parameters:
gamma (float, optional) – Regularization strength. Defaults to 1e-3.
regularizer (literal, optional) – Regularizer. Either orthn_fro or orthn_logfro. Defaults to orthn_fro.
- forward(x, y)
Forward pass of the KL contrastive loss.
- Parameters:
x (Tensor) – Input features.
y (Tensor) – Output features.
- Return type:
Tensor
- Shape:
x: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
y: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
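The KL variant differs from the \(L^{2}\) loss in using unsquared off-diagonal terms and the logarithm of the paired terms. A minimal sketch of the displayed formula follows; the function name kl_contrastive is hypothetical, the regularizer is omitted, and the sketch assumes every paired inner product \(\langle x_{i}, y_{i} \rangle\) is strictly positive so that the logarithm is defined.

```python
import torch


def kl_contrastive(x, y):
    """Unregularized KL contrastive loss for (N, D) feature batches."""
    N = x.shape[0]
    scores = x @ y.T                    # scores[i, j] = <x_i, y_j>
    diag = torch.diagonal(scores)       # paired terms, assumed positive
    off_diag = scores.sum() - diag.sum()
    return off_diag / (N * (N - 1)) - 2.0 * torch.log(diag).sum() / N


x = torch.ones(2, 2)
y = torch.eye(2)
loss = kl_contrastive(x, y)  # all scores equal 1, so the log term is zero
```

In practice the features fed to such a loss would need to be constrained (for example by a non-negative output activation) to keep the paired inner products positive; how the library handles this is not stated here.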
- class linear_operator_learning.nn.VampLoss
Variational Approach for learning Markov Processes (VAMP) score by Wu and Noé[2].
\[\mathcal{L}(x, y) = -\sum_{i} \sigma_{i}(A)^{p} \qquad \text{where}~A = \big(x^{\top}x\big)^{\dagger/2}x^{\top}y\big(y^{\top}y\big)^{\dagger/2}.\]
- Parameters:
schatten_norm (int, optional) – Computes the VAMP-p score with p = schatten_norm. Defaults to 2.
center_covariances (bool, optional) – Use centered covariances to compute the VAMP score. Defaults to True.
gamma (float, optional) – Regularization strength. Defaults to 1e-3.
regularizer (literal, optional) – Regularizer. Either orthn_fro or orthn_logfro. Defaults to orthn_fro.
- forward(x, y)
Forward pass of VAMP loss.
- Parameters:
x (Tensor) – Features for x.
y (Tensor) – Features for y.
- Raises:
NotImplementedError – If schatten_norm is not 1 or 2.
- Return type:
Tensor
- Shape:
x: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
y: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
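The matrix \(A\) in the VAMP formula can be formed with a pseudo-inverse square root computed from a symmetric eigendecomposition. The sketch below follows the displayed formula with uncentered covariances and no regularizer; the helper names inv_sqrt_psd and vamp_score_loss and the eigenvalue cutoff rtol are assumptions made for illustration, not part of the library.

```python
import torch


def inv_sqrt_psd(M, rtol=1e-10):
    """Pseudo-inverse square root of a symmetric PSD matrix."""
    evals, evecs = torch.linalg.eigh(M)
    keep = evals > rtol * evals.max()    # drop numerically zero directions
    inv_sqrt = torch.zeros_like(evals)
    inv_sqrt[keep] = evals[keep].rsqrt()
    return (evecs * inv_sqrt) @ evecs.T  # V diag(1 / sqrt(evals)) V^T


def vamp_score_loss(x, y, p=2):
    """Negative VAMP-p score for (N, D) feature batches."""
    A = inv_sqrt_psd(x.T @ x) @ (x.T @ y) @ inv_sqrt_psd(y.T @ y)
    svals = torch.linalg.svdvals(A)
    return -(svals ** p).sum()


torch.manual_seed(0)
x = torch.randn(50, 3, dtype=torch.float64)
loss = vamp_score_loss(x, x)  # identical features: A is the identity
```

When y equals x and x has full column rank, every singular value of \(A\) is 1, so the loss attains its minimum of \(-D\) (here -3).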
- class linear_operator_learning.nn.DPLoss
Deep Projection Loss by Kostic et al.[3].
\[\mathcal{L}(x, y) = -\frac{\|x^{\top}y\|^{2}_{{\rm F}}}{\|x^{\top}x\|^{2}\|y^{\top}y\|^{2}}.\]
- Parameters:
relaxed (bool, optional) – Whether to use the relaxed (more numerically stable) or the full deep-projection loss. Defaults to True.
center_covariances (bool, optional) – Use centered covariances to compute the Deep Projection loss. Defaults to True.
gamma (float, optional) – Regularization strength. Defaults to 1e-3.
regularizer (literal, optional) – Regularizer. Either orthn_fro or orthn_logfro. Defaults to orthn_fro.
- forward(x, y)
Forward pass of DPLoss.
- Parameters:
x (Tensor) – Features for x.
y (Tensor) – Features for y.
- Return type:
Tensor
- Shape:
x: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
y: \((N, D)\), where \(N\) is the batch size and \(D\) is the number of features.
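The displayed Deep Projection formula can be evaluated with torch.linalg.matrix_norm. Note that the formula does not say which norm is used in the denominator; this sketch assumes the spectral norm (ord=2), and that choice, the function name dp_loss_formula, the use of uncentered covariances, and the omission of the regularizer are all assumptions rather than a description of the library's behavior.

```python
import torch


def dp_loss_formula(x, y):
    """Evaluate the displayed DP loss formula for (N, D) feature batches."""
    num = torch.linalg.matrix_norm(x.T @ y, ord="fro") ** 2
    # Assumption: unsubscripted norms in the denominator are spectral norms.
    den = (
        torch.linalg.matrix_norm(x.T @ x, ord=2) ** 2
        * torch.linalg.matrix_norm(y.T @ y, ord=2) ** 2
    )
    return -num / den


x = torch.eye(2, dtype=torch.float64)
loss = dp_loss_formula(x, x)  # numerator is 2, both spectral norms are 1
```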
Modules
- class linear_operator_learning.nn.MLP
Multi Layer Perceptron.
- Parameters:
input_shape (int) – Input shape of the MLP.
n_hidden (int) – Number of hidden layers.
layer_size (int or list of ints) – Number of neurons in each layer. If an int is provided, it is used as the number of neurons for all hidden layers. Otherwise, the list of ints defines the number of neurons for each layer.
output_shape (int) – Output shape of the MLP.
dropout (float) – Dropout probability between layers. Defaults to 0.0.
activation (torch.nn.Module) – Activation function. Defaults to ReLU.
iterative_whitening (bool) – Whether to add an IterNorm layer at the end of the network. Defaults to False.
bias (bool) – Whether to include bias in the layers. Defaults to False.
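To make the interplay between n_hidden and layer_size concrete, the sketch below expands these parameters into a torch.nn.Sequential. The builder build_mlp is hypothetical and only mirrors the documented parameters: the exact layer ordering is an assumption, and the optional IterNorm layer (iterative_whitening) is left out because it is not part of PyTorch.

```python
import torch
from torch import nn


def build_mlp(input_shape, n_hidden, layer_size, output_shape,
              dropout=0.0, activation=nn.ReLU, bias=False):
    """Hypothetical builder mirroring the documented MLP parameters."""
    if isinstance(layer_size, int):
        layer_size = [layer_size] * n_hidden  # same width for every hidden layer
    if len(layer_size) != n_hidden:
        raise ValueError("layer_size must provide one width per hidden layer")
    layers, in_features = [], input_shape
    for width in layer_size:
        layers += [nn.Linear(in_features, width, bias=bias),
                   activation(),
                   nn.Dropout(dropout)]
        in_features = width
    layers.append(nn.Linear(in_features, output_shape, bias=bias))
    return nn.Sequential(*layers)


mlp = build_mlp(input_shape=8, n_hidden=2, layer_size=16, output_shape=4)
out = mlp(torch.randn(32, 8))  # batch of 32 inputs with 8 features each
```

Passing layer_size=[32, 16] instead of an int would give the two hidden layers different widths.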
- class linear_operator_learning.nn.ResNet
ResNet model from He et al.[4].
- Parameters:
block (Type[Union[BasicBlock, Bottleneck]]) – Block type.
layers (List[int]) – Number of layers.
channels_in (int) – Number of input channels.
num_features (int) – Number of features.
zero_init_residual (bool) – Zero initialization of residual.
groups (int) – Number of groups.
width_per_group (int) – Width per group.
replace_stride_with_dilation (Optional[List[bool]]) – Replace stride with dilation.
padding_mode (str) – Padding mode for the convolutional layers.
norm_layer (Optional[Callable[..., nn.Module]]) – Normalization layer.
- class linear_operator_learning.nn.SimNorm
Simplicial normalization from Lavoie et al.[5].
Simplicial normalization splits the input into chunks of dimension dim, applies a softmax transformation to each chunk separately, and concatenates them back together.
- Parameters:
dim (int) – Dimension of the simplicial groups.
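The chunk-wise softmax described above can be written with a reshape. This is a minimal functional sketch, assuming the chunks are taken along the last axis and that its size is divisible by dim; the function name simnorm is illustrative and is not the library's module.

```python
import torch


def simnorm(x, dim):
    """Softmax over consecutive chunks of size `dim` along the last axis."""
    chunks = x.reshape(*x.shape[:-1], -1, dim)  # (..., D // dim, dim)
    chunks = torch.softmax(chunks, dim=-1)      # each chunk lies on a simplex
    return chunks.reshape(x.shape)              # concatenate chunks back


x = torch.randn(2, 6)
out = simnorm(x, dim=3)  # two chunks of size 3 per row
```

Each chunk of the output is non-negative and sums to 1, so a row of size \(D\) sums to \(D / \text{dim}\).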
- class linear_operator_learning.nn.EMACovariance
Exponential moving average of the covariance matrices.
Gives an online estimate \(C\) of the covariances and means, incorporating the batch covariance \(\hat{C}\) via the following update formula, where \(m\) is the momentum:
\[C \leftarrow (1 - m)C + m \hat{C}\]
- Parameters:
feature_dim – The number of features in the input and output tensors.
momentum – The momentum for the exponential moving average.
center – Whether to center the data before computing the covariance matrices.
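The update rule can be illustrated on a toy example. This sketch applies the formula to a single matrix; the function name ema_update is hypothetical, and the bookkeeping of means and of separate input, output, and cross covariances performed by the class is not shown.

```python
import torch


def ema_update(C, C_batch, momentum):
    """One exponential-moving-average step: C <- (1 - m) C + m C_batch."""
    return (1.0 - momentum) * C + momentum * C_batch


C = torch.zeros(2, 2)      # running estimate, initialized to zero for the demo
C_batch = torch.eye(2)     # covariance computed from the current batch
C = ema_update(C, C_batch, momentum=0.5)  # 0.5 * I
C = ema_update(C, C_batch, momentum=0.5)  # 0.75 * I
```

A larger momentum makes the estimate track recent batches more closely, while a smaller one averages over a longer history.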