Commit ff8c211

pretrained dino backbone (#303)
* dino implementation
* dino/mae cleanup
* update docs
* fix tests
1 parent 10706f2 commit ff8c211

12 files changed, +155 -382 lines changed

docs/api/lightning_pose.models.backbones.vit_mae.ViTVisionEncoder.rst

Lines changed: 0 additions & 17 deletions
This file was deleted.

docs/api/lightning_pose.models.backbones.vit_sam.SamVisionEncoderHF.rst renamed to docs/api/lightning_pose.models.backbones.vit_sam.SamVisionEncoder.rst

Lines changed: 4 additions & 4 deletions
@@ -1,16 +1,16 @@
-SamVisionEncoderHF
-==================
+SamVisionEncoder
+================
 
 .. currentmodule:: lightning_pose.models.backbones.vit_sam
 
-.. autoclass:: SamVisionEncoderHF
+.. autoclass:: SamVisionEncoder
    :show-inheritance:
 
    .. rubric:: Methods Summary
 
    .. autosummary::
 
-      ~SamVisionEncoderHF.forward
+      ~SamVisionEncoder.forward
 
    .. rubric:: Methods Documentation

docs/modules/lightning_pose.models.backbones.rst

Lines changed: 0 additions & 3 deletions
@@ -4,9 +4,6 @@ lightning\_pose.models.backbones
 .. automodapi:: lightning_pose.models.backbones.torchvision
     :no-inheritance-diagram:
 
-.. automodapi:: lightning_pose.models.backbones.vit_mae
-    :no-inheritance-diagram:
-
 .. automodapi:: lightning_pose.models.backbones.vit_sam
     :no-inheritance-diagram:

docs/source/user_guide/config_file.rst

Lines changed: 3 additions & 0 deletions
@@ -207,6 +207,9 @@ The following parameters relate to model architecture and unsupervised losses.
 * efficientnet_b0: EfficientNet-B0 pretrained on ImageNet
 * efficientnet_b1: EfficientNet-B1 pretrained on ImageNet
 * efficientnet_b2: EfficientNet-B2 pretrained on ImageNet
+* vits_dino: Vision Transformer (Small) pretrained on ImageNet with DINO
+* vitb_dino: Vision Transformer (Base) pretrained on ImageNet with DINO
+* vitb_imagenet: Vision Transformer (Base) pretrained on ImageNet with MAE loss
 * vitb_sam: Segment Anything Model (Vision Transformer Base)
 
 Note: the file size for a single ResNet-50 network is approximately 275 MB.

lightning_pose/models/backbones/__init__.py

Lines changed: 3 additions & 1 deletion
@@ -16,6 +16,8 @@
     "efficientnet_b0",
     "efficientnet_b1",
     "efficientnet_b2",
-    "vitb_sam",
+    "vits_dino",
+    "vitb_dino",
     "vitb_imagenet",
+    "vitb_sam",
 ]
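
To illustrate how a list of backbone names like this one might be used, the sketch below checks a configured name before a model is built. It is an assumption, not code from this commit: `ALLOWED_BACKBONES` copies only the names visible in this hunk (the real list has earlier entries that are not shown here), and `validate_backbone` and `is_vit` are hypothetical helpers.

```python
# Hypothetical sketch: only the names visible in the hunk above are listed;
# the real list in lightning_pose/models/backbones/__init__.py has earlier
# entries that this diff does not show.
ALLOWED_BACKBONES = [
    "efficientnet_b0",
    "efficientnet_b1",
    "efficientnet_b2",
    "vits_dino",
    "vitb_dino",
    "vitb_imagenet",
    "vitb_sam",
]


def validate_backbone(name: str) -> str:
    """Return `name` unchanged if it is a known backbone, else raise ValueError."""
    if name not in ALLOWED_BACKBONES:
        raise ValueError(
            f"Unknown backbone {name!r}; expected one of {ALLOWED_BACKBONES}"
        )
    return name


def is_vit(name: str) -> bool:
    """True for the Vision Transformer variants (names starting with 'vit')."""
    return validate_backbone(name).startswith("vit")
```

With this sketch, `validate_backbone("vits_dino")` returns the name unchanged, while a name that is not in the list raises a `ValueError`.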

lightning_pose/models/backbones/vit_mae.py

Lines changed: 0 additions & 238 deletions
This file was deleted.

lightning_pose/models/backbones/vit_sam.py

Lines changed: 2 additions & 2 deletions
@@ -7,11 +7,11 @@
 
 # to ignore imports for sphix-autoapidoc
 __all__ = [
-    "SamVisionEncoderHF",
+    "SamVisionEncoder",
 ]
 
 
-class SamVisionEncoderHF(nn.Module):
+class SamVisionEncoder(nn.Module):
     """Wrapper around HuggingFace's SAM Vision Encoder."""
 
     def __init__(
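
Because the class is renamed from `SamVisionEncoderHF` to `SamVisionEncoder`, downstream code that refers to the old name would need updating. A minimal sketch of one way to tolerate both names is shown below. It is not part of this commit: `resolve_sam_encoder` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for the module before and after the rename, not the real classes.

```python
from types import SimpleNamespace


def resolve_sam_encoder(module):
    """Return the SAM encoder class from `module`, preferring the new name."""
    for name in ("SamVisionEncoder", "SamVisionEncoderHF"):
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    raise AttributeError(
        "module exposes neither SamVisionEncoder nor SamVisionEncoderHF"
    )


# Stand-ins for the module before and after this commit (not the real classes).
old_module = SimpleNamespace(SamVisionEncoderHF=type("SamVisionEncoderHF", (), {}))
new_module = SimpleNamespace(SamVisionEncoder=type("SamVisionEncoder", (), {}))
```

In real use, the argument would be the imported `lightning_pose.models.backbones.vit_sam` module rather than a stand-in.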
