
Commit 38276d1

docs + config cleanup (closes #128)

* fiftyone: get rid of "fast" dataset creation option (always include confidence)
* fiftyone: remove unused video plotting functionality
* pca: remove some options from configs to clean up
* fiftyone: update colab notebook
* [docs] data config and nan uniform heatmaps
* flake + isort + tests
* PR updates
* make dali matrix an array rather than a scalar
1 parent c9fde49 commit 38276d1

22 files changed: +192 −412 lines

docs/api/lightning_pose.utils.fiftyone.FiftyOneFactory.rst

Lines changed: 0 additions & 17 deletions
This file was deleted.

docs/api/lightning_pose.utils.fiftyone.FiftyOneImagePlotter.rst

Lines changed: 20 additions & 0 deletions
@@ -11,19 +11,39 @@ FiftyOneImagePlotter
    .. autosummary::

       ~FiftyOneImagePlotter.image_paths
+      ~FiftyOneImagePlotter.img_height
+      ~FiftyOneImagePlotter.img_width
+      ~FiftyOneImagePlotter.model_names
+      ~FiftyOneImagePlotter.num_keypoints

    .. rubric:: Methods Summary

    .. autosummary::

+      ~FiftyOneImagePlotter.build_single_frame_keypoints
       ~FiftyOneImagePlotter.create_dataset
+      ~FiftyOneImagePlotter.dataset_info_print
       ~FiftyOneImagePlotter.get_gt_keypoints_list
+      ~FiftyOneImagePlotter.get_keypoints_per_image
+      ~FiftyOneImagePlotter.get_model_abs_paths
+      ~FiftyOneImagePlotter.get_pred_keypoints_dict
+      ~FiftyOneImagePlotter.load_model_predictions

    .. rubric:: Attributes Documentation

    .. autoattribute:: image_paths
+   .. autoattribute:: img_height
+   .. autoattribute:: img_width
+   .. autoattribute:: model_names
+   .. autoattribute:: num_keypoints

    .. rubric:: Methods Documentation

+   .. automethod:: build_single_frame_keypoints
    .. automethod:: create_dataset
+   .. automethod:: dataset_info_print
    .. automethod:: get_gt_keypoints_list
+   .. automethod:: get_keypoints_per_image
+   .. automethod:: get_model_abs_paths
+   .. automethod:: get_pred_keypoints_dict
+   .. automethod:: load_model_predictions

docs/api/lightning_pose.utils.fiftyone.FiftyOneKeypointBase.rst

Lines changed: 0 additions & 47 deletions
This file was deleted.

docs/api/lightning_pose.utils.fiftyone.FiftyOneKeypointVideoPlotter.rst

Lines changed: 0 additions & 19 deletions
This file was deleted.

docs/api/lightning_pose.utils.fiftyone.check_unique_tags.rst

Lines changed: 0 additions & 6 deletions
This file was deleted.

docs/source/faqs.rst

Lines changed: 24 additions & 0 deletions
@@ -24,3 +24,27 @@ Note that both semi-supervised and context models will increase memory usage
 If you encounter this error, reduce batch sizes during training or inference.
 You can find the relevant parameters to adjust in :ref:`The configuration file <config_file>`
 section.
+
+.. _faq_nan_heatmaps:
+
+**Q: Why does the network produce high confidence values for keypoints even when they are occluded?**
+
+Generally, when a keypoint is briefly occluded and its location can be resolved by the network, we are fine with
+high confidence values (this will happen, for example, when using temporal context frames).
+However, there may be scenarios where the goal is to explicitly track whether a keypoint is visible or hidden using
+confidence values (e.g., quantifying whether a tongue is in or out of the mouth).
+In this case, if the confidence values are too high during occlusions, try the suggestions below.
+
+First, note that including a keypoint in the unsupervised losses - especially the PCA losses -
+will generally increase confidence values even during occlusions (by design).
+If a low confidence value is desired during occlusions, ensure the keypoint in question is not
+included in those losses.
+
+If this does not fix the issue, another option is to set the following field in the config file:
+``training.uniform_heatmaps_for_nan_keypoints: true``.
+[This field is not visible in the default config but can be added.]
+This option will force the model to output a uniform heatmap for any keypoint that does not have
+a ground truth label in the training data.
+The model will therefore not try to guess where the occluded keypoint is located.
+This approach requires a set of training frames that include both visible and occluded examples
+of the keypoint in question.
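As a concrete sketch, the field described above sits under the ``training`` section of the config file; since it is absent from the default config, the line must be added by hand (surrounding keys shown for orientation only):

```yaml
training:
  # force uniform heatmaps for keypoints without ground truth labels,
  # so the model does not guess at occluded keypoint locations
  uniform_heatmaps_for_nan_keypoints: true
```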

docs/source/user_guide/config_file.rst

Lines changed: 19 additions & 0 deletions
@@ -21,6 +21,25 @@ The config file contains several sections:
 * ``losses``: hyperparameters for unsupervised losses
 * ``eval``: paths for video inference and fiftyone app

+Data parameters
+===============
+
+* ``data.image_orig_dims.height/width``: the current version of Lightning Pose requires all training images to be the same size. We are working on an updated version without this requirement. However, if you plan to use the PCA losses (Pose PCA or multiview PCA), then all training images **must** be the same size; otherwise the PCA subspace will erroneously contain variance related to image size.
+
+* ``data.image_resize_dims.height/width``: images (and videos) will be resized to the specified height and width before being processed by the network. Supported values are {64, 128, 256, 384, 512}. The height and width need not be identical. Some points to keep in mind when selecting
+  these values: if the resized images are too small, you will lose resolution/details; if they are too large, the model takes longer to train and might not train as well.
+
+* ``data.data_dir/video_dir``: update these to reflect your local paths
+
+* ``data.num_keypoints``: the number of body parts. If using a mirrored setup, this should be the number of body parts summed across all views. If using a multiview setup, this number should indicate the number of keypoints per view (must be the same across all views).
+
+* ``data.keypoint_names``: keypoint names should reflect the actual names/order in the csv file. This field is necessary if, for example, you are running inference on a machine that does not have the training data saved on it.
+
+* ``data.columns_for_singleview_pca``: see the :ref:`Pose PCA documentation <unsup_loss_pcasv>`
+
+* ``data.mirrored_column_matches``: see the :ref:`Multiview PCA documentation <unsup_loss_pcamv>`
+
+
 Model/training parameters
 =========================
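Put together, a minimal ``data`` section following the bullet points above might look like the sketch below (all paths, dimensions, and names are illustrative placeholders, not recommended values):

```yaml
data:
  image_orig_dims:
    height: 406
    width: 396
  image_resize_dims:
    height: 256
    width: 256
  data_dir: /path/to/labeled/data
  video_dir: /path/to/videos
  num_keypoints: 3
  keypoint_names: [nose, left_ear, right_ear]  # must match csv names/order
  columns_for_singleview_pca: [0, 1, 2]
  mirrored_column_matches: null
```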

docs/source/user_guide_advanced/unsupervised_losses.rst

Lines changed: 22 additions & 20 deletions
@@ -12,9 +12,9 @@ and brief descriptions of some of the available losses.
 #. :ref:`Data requirements <unsup_data>`
 #. :ref:`The configuration file <unsup_config>`
 #. :ref:`Loss options <unsup_loss_options>`
-   * :ref:`Temporal continuity <unsup_loss_temporal>`
-   * :ref:`Pose plausibility <unsup_loss_pcasv>`
-   * :ref:`Multiview consistency <unsup_loss_pcamv>`
+   * :ref:`Temporal difference <unsup_loss_temporal>`
+   * :ref:`Pose PCA <unsup_loss_pcasv>`
+   * :ref:`Multiview PCA <unsup_loss_pcamv>`

 .. _unsup_data:

@@ -122,9 +122,18 @@ losses across multiple datasets, but we encourage users to test out several values on their own
 data for best effect. The inverse of this weight is actually used for the final weight, so smaller
 values indicate stronger penalties.

+We are particularly interested in preventing, and having the network learn from, severe violations
+of the different losses.
+Therefore, we enforce our losses only when they exceed a tolerance threshold :math:`\epsilon`,
+rendering them :math:`\epsilon`-insensitive:
+
+.. math::
+
+    \mathscr{L}(\epsilon) = \textrm{max}(0, \mathscr{L} - \epsilon).
+
 .. _unsup_loss_temporal:

-Temporal continuity
+Temporal difference
 -------------------
 This loss penalizes the difference in predictions between successive timepoints for each keypoint
 independently.
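In code, the :math:`\epsilon`-insensitive rectification above is a one-liner; a standalone sketch (not the lightning-pose implementation):

```python
def epsilon_insensitive(loss_value: float, epsilon: float) -> float:
    """Return max(0, loss - epsilon): violations smaller than the
    tolerance epsilon are ignored; only severe violations are penalized."""
    return max(0.0, loss_value - epsilon)

# a 25-pixel violation with a 20-pixel tolerance incurs a 5-pixel penalty
print(epsilon_insensitive(25.0, 20.0))  # 5.0
# a violation below the tolerance is zeroed out
print(epsilon_insensitive(10.0, 20.0))  # 0.0
```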
@@ -133,16 +142,17 @@ independently.

   temporal:
     log_weight: 5.0
-    epsilon: 20.0
     prob_threshold: 0.05
+    epsilon: 20.0
+

 * ``log_weight``: weight of the loss in the final cost function
-* ``epsilon``: in pixels; temporal differences below this threshold are not penalized, which keeps natural movements from being penalized. The value of epsilon will depend on the size of the video frames, framerate (how much does the animal move from one frame to the next), the size of the animal in the frame, etc.
 * ``prob_threshold``: predictions with a probability below this threshold are not included in the loss. This is desirable if, for example, a keypoint is occluded and the prediction has low probability.
+* ``epsilon``: in pixels; temporal differences below this threshold are not penalized, which keeps natural movements from being penalized. The value of epsilon will depend on the size of the video frames, framerate (how much does the animal move from one frame to the next), the size of the animal in the frame, etc.

 .. _unsup_loss_pcasv:

-Pose plausibility
+Pose PCA
 -----------------
 This loss penalizes deviations away from a low-dimensional subspace of plausible poses computed on
 labeled data.
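A toy NumPy version of the temporal difference loss, combining ``prob_threshold`` and ``epsilon`` as described above (an illustrative sketch; the function name and array layout are assumptions, not the library's API):

```python
import numpy as np

def temporal_loss(keypoints, confidences, epsilon=20.0, prob_threshold=0.05):
    """Penalize per-keypoint position jumps between successive frames.

    keypoints:   (T, K, 2) array of (x, y) predictions over T frames
    confidences: (T, K) array of prediction probabilities
    Jumps below `epsilon` pixels are not penalized; keypoints whose
    confidence falls below `prob_threshold` are excluded from the loss.
    """
    # Euclidean distance between successive frames, per keypoint: (T-1, K)
    diffs = np.linalg.norm(np.diff(keypoints, axis=0), axis=-1)
    # epsilon-insensitive: only penalize jumps larger than epsilon
    penalties = np.maximum(0.0, diffs - epsilon)
    # drop differences where either endpoint has low confidence
    mask = (confidences[:-1] >= prob_threshold) & (confidences[1:] >= prob_threshold)
    return float((penalties * mask).sum())
```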
@@ -186,7 +196,7 @@ If instead you want to include the ears and tailbase:
     columns_for_singleview_pca: [1, 2, 4]

 See
-`these config files <https://github.com/danbider/lightning-pose/tree/feature/docs/scripts/configs>`_
+`these config files <https://github.com/danbider/lightning-pose/tree/main/scripts/configs>`_
 for more examples.

 Below are the various hyperparameters and their descriptions.
@@ -197,19 +207,15 @@ Besides the ``log_weight`` none of the provided values need to be tested for new datasets.
   pca_singleview:
     log_weight: 5.0
     components_to_keep: 0.99
-    empirical_epsilon_percentile: 1.00
-    empirical_epsilon_multiplier: 1.0
     epsilon: null

 * ``log_weight``: weight of the loss in the final cost function
 * ``components_to_keep``: predictions should lie within the low-d subspace spanned by components that describe this fraction of variance
-* ``empirical_epsilon_percentile``: the reprojection error on labeled training data is computed to arrive at a noise ceiling; reprojection errors from the video data are not penalized if they fall below this percentile of labeled data error (replaces ``epsilon``)
-* ``empirical_epsilon_multiplier``: this allows the user to increase the epsilon relative to the empirical epsilon error; with the multiplier the effective epsilon is `eff_epsilon = percentile(error, empirical_epsilon_percentile) * empirical_epsilon_multiplier`
-* ``epsilon``: absolute error (in pixels) below which pca loss is zeroed out; if not null, this parameter takes precedence over ``empirical_epsilon_percentile``
+* ``epsilon``: if left as null, this parameter is automatically computed from the labeled data

 .. _unsup_loss_pcamv:

-Multiview consistency
+Multiview PCA
 ---------------------
 This loss penalizes deviations of predictions across all available views away from a 3-dimensional
 subspace computed on labeled data.
@@ -273,12 +279,8 @@ Besides the ``log_weight`` none of the provided values need to be tested for new datasets.
   pca_multiview:
     log_weight: 5.0
     components_to_keep: 3
-    empirical_epsilon_percentile: 1.00
-    empirical_epsilon_multiplier: 1.0
     epsilon: null

 * ``log_weight``: weight of the loss in the final cost function
-* ``components_to_keep``: predictions should lie within the 3D subspace
-* ``empirical_epsilon_percentile``: the reprojection error on labeled training data is computed to arrive at a noise ceiling; reprojection errors from the video data are not penalized if they fall below this percentile of labeled data error (replaces ``epsilon``)
-* ``empirical_epsilon_multiplier``: this allows the user to increase the epsilon relative to the empirical epsilon error; with the multiplier the effective epsilon is `eff_epsilon = percentile(error, empirical_epsilon_percentile) * empirical_epsilon_multiplier`
-* ``epsilon``: absolute error (in pixels) below which pca loss is zeroed out; if not null, this parameter takes precedence over ``empirical_epsilon_percentile``
+* ``components_to_keep``: should be set to 3 so that predictions lie within a 3D subspace
+* ``epsilon``: if left as null, this parameter is automatically computed from the labeled data
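The idea behind both PCA losses can be illustrated with a small reprojection-error sketch (NumPy SVD; this illustrates the concept only and is not lightning-pose code):

```python
import numpy as np

def pca_reprojection_error(poses, components_to_keep, epsilon=0.0):
    """Project poses onto a low-d PCA subspace fit to the same data and
    return epsilon-insensitive reprojection errors.

    poses: (N, D) array, one flattened pose per row.
    """
    mean = poses.mean(axis=0)
    centered = poses - mean
    # principal directions via SVD of the centered data
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:components_to_keep]                # (k, D) subspace basis
    reproj = centered @ basis.T @ basis + mean     # back-project onto subspace
    errors = np.linalg.norm(poses - reproj, axis=1)
    # errors below the tolerance epsilon are not penalized
    return np.maximum(0.0, errors - epsilon)
```

Poses that already lie in the subspace reproject exactly, so their error (and hence their contribution to the loss) is zero.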

lightning_pose/data/dali.py

Lines changed: 1 addition & 1 deletion
@@ -112,7 +112,7 @@ def video_pipe(
     else:
         # choose arbitrary scalar (rather than a matrix) so that downstream operations know there
         # are no geometric transforms to undo
-        matrix = -1
+        matrix = np.array([-1])
     # video pixel range is [0, 255]; transform it to [0, 1].
     # happens naturally in the torchvision transform to tensor.
     video = video / 255.0
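The one-line change above keeps the "no transform" sentinel the same type (ndarray) as a real transform matrix. A hypothetical downstream check (not the library's code) shows why a uniform type is convenient:

```python
import numpy as np

# Sentinel indicating "no geometric transform to undo". Wrapping the
# scalar in an array means every value downstream is an ndarray, so
# code can branch on shape/size instead of special-casing a raw int.
NO_TRANSFORM = np.array([-1])

def has_transform(matrix: np.ndarray) -> bool:
    """Hypothetical check: a full affine matrix (e.g. 2x3) means a
    transform must be undone; the 1-element sentinel means none."""
    return matrix.size > 1

affine = np.array([[1.0, 0.0, 10.0], [0.0, 1.0, 5.0]])
assert has_transform(affine)
assert not has_transform(NO_TRANSFORM)
```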
