Commit f0b5dea

Merge pull request #178 from ENSTA-U2IS-AI/dev
⚡ Version 0.5.1
2 parents 3fa0899 + 61791aa commit f0b5dea


55 files changed (+592, -702 lines)

auto_tutorial_source/Bayesian_Methods/tutorial_bayesian.py

Lines changed: 44 additions & 40 deletions
@@ -1,36 +1,37 @@
 # ruff: noqa: E402, E703, D212, D415, T201
 """
-Train a Bayesian Neural Network in Three Minutes
+Training a Bayesian Neural Network in 20 seconds
 ================================================

-In this tutorial, we will train a variational inference Bayesian Neural Network (BNN) LeNet classifier on the MNIST dataset.
+In this tutorial, we will train a variational inference Bayesian Neural Network (viBNN) LeNet classifier on the MNIST dataset.

 Foreword on Bayesian Neural Networks
 ------------------------------------

 Bayesian Neural Networks (BNNs) are a class of neural networks that estimate the uncertainty on their predictions via uncertainty
 on their weights. This is achieved by considering the weights of the neural network as random variables, and by learning their
-posterior distribution. This is in contrast to standard neural networks, which only learn a single set of weights, which can be
-seen as Dirac distributions on the weights.
+posterior distribution. This is in contrast to standard neural networks, which only learn a single set of weights (this can be
+seen as Dirac distributions on the weights).

-For more information on Bayesian Neural Networks, we refer the reader to the following resources:
+For more information on Bayesian Neural Networks, we refer to the following resources:

 - Weight Uncertainty in Neural Networks `ICML2015 <https://arxiv.org/pdf/1505.05424.pdf>`_
-- Hands-on Bayesian Neural Networks - a Tutorial for Deep Learning Users `IEEE Computational Intelligence Magazine <https://arxiv.org/pdf/2007.06823.pdf>`_
+- Hands-on Bayesian Neural Networks - a Tutorial for Deep Learning Users `IEEE Computational Intelligence Magazine
+<https://arxiv.org/pdf/2007.06823.pdf>`_

 Training a Bayesian LeNet using TorchUncertainty models and Lightning
 ---------------------------------------------------------------------

-In this part, we train a Bayesian LeNet, based on the model and routines already implemented in TU.
+In this first part, we train a Bayesian LeNet, based on the model and routines already implemented in TU.

 1. Loading the utilities
 ~~~~~~~~~~~~~~~~~~~~~~~~

 To train a BNN using TorchUncertainty, we have to load the following modules:

-- our TUTrainer
-- the model: bayesian_lenet, which lies in the torch_uncertainty.model
-- the classification training routine from torch_uncertainty.routines
+- our TUTrainer to improve the display of our metrics
+- the model: bayesian_lenet, which lies in the torch_uncertainty.model.classification.lenet module
+- the classification training routine from torch_uncertainty.routines module
 - the Bayesian objective: the ELBOLoss, which lies in the torch_uncertainty.losses file
 - the datamodule that handles dataloaders: MNISTDataModule from torch_uncertainty.datamodules

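Editor's note: the docstring above rests on the idea of treating weights as random variables with a learned posterior rather than a single point estimate. Below is a minimal, library-agnostic sketch of that idea in plain PyTorch; the mean-field Gaussian posterior, the layer name, and all values are illustrative assumptions, not the TorchUncertainty implementation.

import torch
from torch import nn


class MeanFieldLinear(nn.Module):
    """Toy Bayesian linear layer: each weight is a Gaussian random variable."""

    def __init__(self, in_features: int, out_features: int) -> None:
        super().__init__()
        # Variational parameters of the weight posterior q(w) = N(mu, sigma^2)
        self.weight_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.weight_log_sigma = nn.Parameter(torch.full((out_features, in_features), -5.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterization trick: w = mu + sigma * eps with eps ~ N(0, 1),
        # so every forward pass uses a fresh sample of the weights
        eps = torch.randn_like(self.weight_mu)
        weight = self.weight_mu + self.weight_log_sigma.exp() * eps
        return x @ weight.t()


# Repeated calls on the same input give an ensemble of predictions whose spread reflects weight uncertainty
layer = MeanFieldLinear(8, 2)
x = torch.randn(4, 8)
samples = torch.stack([layer(x) for _ in range(16)])  # shape (16, 4, 2)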
@@ -46,39 +47,43 @@
 from torch_uncertainty import TUTrainer
 from torch_uncertainty.datamodules import MNISTDataModule
 from torch_uncertainty.losses import ELBOLoss
-from torch_uncertainty.models.classification import bayesian_lenet
+from torch_uncertainty.models.classification.lenet import bayesian_lenet
 from torch_uncertainty.routines import ClassificationRoutine

+# We also define the main hyperparameters, with just one epoch for the sake of time
+BATCH_SIZE = 512
+MAX_EPOCHS = 2
+
 # %%
 # 2. Creating the necessary variables
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #
 # In the following, we instantiate our trainer, define the root of the datasets and the logs.
 # We also create the datamodule that handles the MNIST dataset, dataloaders and transforms.
-# Please note that the datamodules can also handle OOD detection by setting the eval_ood
-# parameter to True. Finally, we create the model using the blueprint from torch_uncertainty.models.
+# Please note that the datamodules can also handle OOD detection by setting the `eval_ood`
+# parameter to True, as well as distribution shift with `eval_shift`.
+# Finally, we create the model using the blueprint from torch_uncertainty.models.

-trainer = TUTrainer(accelerator="gpu", devices=1, enable_progress_bar=False, max_epochs=1)
+trainer = TUTrainer(accelerator="gpu", devices=1, enable_progress_bar=False, max_epochs=MAX_EPOCHS)

 # datamodule
 root = Path("data")
-datamodule = MNISTDataModule(root=root, batch_size=128, eval_ood=False)
+datamodule = MNISTDataModule(root=root, batch_size=BATCH_SIZE, num_workers=8)

 # model
 model = bayesian_lenet(datamodule.num_channels, datamodule.num_classes)

 # %%
 # 3. The Loss and the Training Routine
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# Then, we just have to define the loss to be used during training. To do this,
-# we redefine the default parameters from the ELBO loss using the partial
-# function from functools. We use the hyperparameters proposed in the blitz
-# library. As we are train a classification model, we use the CrossEntropyLoss
-# as the likelihood.
-# We then define the training routine using the classification training routine
-# from torch_uncertainty.classification. We provide the model, the ELBO
+#
+# Then, we just define the loss to be used during training, which is a bit special and called
+# the evidence lower bound. We use the hyperparameters proposed in the blitz
+# library. As we are training a classification model, we use the CrossEntropyLoss
+# as the negative log likelihood. We then define the training routine using the classification
+# training routine from torch_uncertainty.classification. We provide the model, the ELBO
 # loss and the optimizer to the routine.
-# We will use the Adam optimizer with the default learning rate of 0.001.
+# We use an Adam optimizer with a learning rate of 0.02.

 loss = ELBOLoss(
     model=model,
@@ -91,10 +96,7 @@
     model=model,
     num_classes=datamodule.num_classes,
     loss=loss,
-    optim_recipe=optim.Adam(
-        model.parameters(),
-        lr=1e-3,
-    ),
+    optim_recipe=optim.Adam(model.parameters(), lr=2e-2),
     is_ensemble=True,
 )

@@ -105,25 +107,26 @@
 # Now that we have prepared all of this, we just have to gather everything in
 # the main function and to train the model using our wrapper of Lightning Trainer.
 # Specifically, it needs the routine, that includes the model as well as the
-# training/eval logic and the datamodule
+# training/eval logic and the datamodule.
 # The dataset will be downloaded automatically in the root/data folder, and the
 # logs will be saved in the root/logs folder.

 trainer.fit(model=routine, datamodule=datamodule)
 trainer.test(model=routine, datamodule=datamodule)
-
 # %%
 # 5. Testing the Model
 # ~~~~~~~~~~~~~~~~~~~~
 #
 # Now that the model is trained, let's test it on MNIST.
 # Please note that we apply a reshape to the logits to determine the dimension corresponding to the ensemble
-# and to the batch. As for TorchUncertainty 0.2.0, the ensemble dimension is merged with the batch dimension
+# and to the batch. As for TorchUncertainty 0.5.1, the ensemble dimension is merged with the batch dimension
 # in this order (num_estimator x batch, classes).
+
 import matplotlib.pyplot as plt
 import numpy as np
 import torch
 import torchvision
+from einops import rearrange


 def imshow(img) -> None:
@@ -134,32 +137,33 @@ def imshow(img) -> None:
     plt.show()


-dataiter = iter(datamodule.val_dataloader())
-images, labels = next(dataiter)
+images, labels = next(iter(datamodule.val_dataloader()))

 # print images
 imshow(torchvision.utils.make_grid(images[:4, ...]))
 print("Ground truth: ", " ".join(f"{labels[j]}" for j in range(4)))

 # Put the model in eval mode to use several samples
-model = model.eval()
-logits = model(images).reshape(16, 128, 10) # num_estimators, batch_size, num_classes
+model = routine.eval()
+logits = routine(images[:4, ...])
+print("Output logit shape (Num predictions x Batch) x Classes: ", logits.shape)
+logits = rearrange(logits, "(m b) c -> b m c", b=4) # batch_size, num_estimators, num_classes

-# We apply the softmax on the classes and average over the estimators
+# We apply the softmax on the classes then average over the estimators
 probs = torch.nn.functional.softmax(logits, dim=-1)
-avg_probs = probs.mean(dim=0)
-var_probs = probs.std(dim=0)
+avg_probs = probs.mean(dim=1)
+var_probs = probs.std(dim=1)

-_, predicted = torch.max(avg_probs, 1)
+predicted = torch.argmax(avg_probs, -1)

 print("Predicted digits: ", " ".join(f"{predicted[j]}" for j in range(4)))
 print(
     "Std. dev. of the scores over the posterior samples",
-    " ".join(f"{var_probs[j][predicted[j]]:.3}" for j in range(4)),
+    " ".join(f"{var_probs[j][predicted[j]]:.3f}" for j in range(4)),
 )
 # %%
 # Here, we show the variance of the top prediction. This is a non-standard but intuitive way to show the diversity of the predictions
-# of the ensemble. Ideally, the variance should be high when the average top prediction is incorrect.
+# of the ensemble. Ideally, the variance should be high when the prediction is incorrect.
 #
 # References
 # ----------
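Editor's note: the reshaping introduced above is the subtle part of this hunk: the routine returns logits with the ensemble and batch dimensions merged as (num_estimators x batch, classes), and einops.rearrange with the pattern "(m b) c -> b m c" recovers them before averaging. Below is a self-contained sketch of that convention with random tensors; the shapes and values are illustrative assumptions, not the tutorial's output.

import torch
from einops import rearrange

num_estimators, batch_size, num_classes = 16, 4, 10
# Stand-in for routine(images[:4, ...]): ensemble and batch merged in the first dimension
logits = torch.randn(num_estimators * batch_size, num_classes)

# Recover (batch, estimators, classes); b must match the batch size used at inference
logits = rearrange(logits, "(m b) c -> b m c", b=batch_size)

probs = torch.softmax(logits, dim=-1)    # softmax over classes
avg_probs = probs.mean(dim=1)            # average over the posterior samples
var_probs = probs.std(dim=1)             # spread over the posterior samples
predicted = torch.argmax(avg_probs, -1)  # ensemble prediction per image
print(predicted.shape, var_probs.shape)  # torch.Size([4]) torch.Size([4, 10])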

auto_tutorial_source/Bayesian_Methods/tutorial_muad_mc_drop.py

Lines changed: 1 addition & 1 deletion
@@ -212,7 +212,7 @@ def enet_weighing(dataloader, num_classes, c=1.02):
     optim_recipe={"optimizer": optimizer, "lr_scheduler": lr_updater},
 )

-trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=NB_EPOCHS, enable_progress_bar=True)
+trainer = TUTrainer(accelerator="gpu", devices=1, max_epochs=NB_EPOCHS, enable_progress_bar=False)
 # %%
 # 6. Training the model
 # ~~~~~~~~~~~~~~~~~~~~~
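Editor's note: the context line above shows the second form that optim_recipe takes across the tutorials touched by this commit: a dict bundling an optimizer with an LR scheduler, versus the bare optimizer passed in tutorial_bayesian.py. A hedged sketch of both forms follows; the placeholder model, SGD, and StepLR choices are assumptions, not the values used in the MUAD tutorial.

from torch import nn, optim

model = nn.Linear(10, 2)  # placeholder model for illustration

# Form 1 (tutorial_bayesian.py): a bare optimizer
optim_recipe = optim.Adam(model.parameters(), lr=2e-2)

# Form 2 (tutorial_muad_mc_drop.py): optimizer plus scheduler packed in a dict
optimizer = optim.SGD(model.parameters(), lr=1e-2)
lr_updater = optim.lr_scheduler.StepLR(optimizer, step_size=10)
optim_recipe = {"optimizer": optimizer, "lr_scheduler": lr_updater}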

auto_tutorial_source/Classification/tutorial_bayesian.py

Lines changed: 0 additions & 174 deletions
This file was deleted.

auto_tutorial_source/Classification/tutorial_distribution_shift.py

Lines changed: 3 additions & 1 deletion
@@ -107,7 +107,9 @@
 # We specify the maximum number of epochs, the precision and the device to be used.

 # Initialize the TUTrainer with a maximum of 10 epochs and the specified device
-trainer = TUTrainer(max_epochs=10, precision="16-mixed", accelerator="cuda", devices=1)
+trainer = TUTrainer(
+    max_epochs=10, precision="16-mixed", accelerator="cuda", devices=1, enable_progress_bar=False
+)

 # Begin training the model using the CIFAR-10 DataModule
 trainer.fit(routine, datamodule=datamodule)

auto_tutorial_source/Classification/tutorial_evidential_classification.py

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@
 from pathlib import Path

 import torch
-from torch import nn, optim
+from torch import optim

 from torch_uncertainty import TUTrainer
 from torch_uncertainty.datamodules import MNISTDataModule
