# Objects with Lighting

## [Paper](https://arxiv.org/abs/2401.09126)

This repo is the code distribution for the _Objects with Lighting_ dataset.
It contains the evaluation script (`scripts/evaluate.py`) and the tools used for building the dataset.

If you find the data or code useful, please cite:
```bibtex
@inproceedings{Ummenhofer2024OWL,
  author    = {Benjamin Ummenhofer and
               Sanskar Agrawal and
               Rene Sep{\'{u}}lveda and
               Yixing Lao and
               Kai Zhang and
               Tianhang Cheng and
               Stephan R. Richter and
               Shenlong Wang and
               Germ{\'{a}}n Ros},
  title     = {Objects With Lighting: {A} Real-World Dataset for Evaluating Reconstruction
               and Rendering for Object Relighting},
  booktitle = {3DV},
  publisher = {{IEEE}},
  year      = {2024}
}
```

## Downloads

Please download the latest version of the dataset from the current release or use the links below. Extracting the archives in the repository root creates the `dataset` directory.

- [objects-with-lighting-dataset-v1_1.tgz](https://github.com/isl-org/objects-with-lighting/releases/download/v1/objects-with-lighting-dataset-v1_1.tgz)
- [objects-with-lighting-dataset-v1_2.tgz](https://github.com/isl-org/objects-with-lighting/releases/download/v1/objects-with-lighting-dataset-v1_2.tgz)

```bash
wget https://github.com/isl-org/objects-with-lighting/releases/download/v1/objects-with-lighting-dataset-v1_{1,2}.tgz
```

## Directory structure

```
├─ calibration   # Data files for calibration and the generated calibration parameters
├─ dataset       # The data meant for consumption
├─ docs          # Images and markdown files for documentation
├─ methods       # Documentation and scripts for the baseline and other state-of-the-art methods
├─ scripts       # Scripts for creating data and for evaluation
├─ utils         # Utility Python modules used by all scripts
```

## Evaluation script

The script `scripts/evaluate.py` computes the common metrics PSNR, SSIM, and LPIPS for predicted images.
The results are stored in a JSON file.
- Supported image file formats are `.png`, `.exr`, and `.npy`.
- We assume `.exr` and `.npy` files store unclipped linear images, while `.png` stores values after applying the tonemapping used for the dataset.
- For linear images the evaluation script computes the optimal exposure value minimizing the least squares error before computing the error metrics.
- The predicted images have to be stored with the same folder structure as the dataset and should be named `pr_image_xxxx.{npy,exr,png}`.

The script can be invoked as
```bash
python scripts/evaluate.py -p path/to/predictions results.json
```

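The optimal-exposure step mentioned above amounts to fitting a single scale factor between the linear prediction and the ground truth in the least-squares sense. The sketch below illustrates that fit; it is not the script's actual code, and the array names are made up:

```python
import numpy as np

def optimal_exposure_scale(pred, gt):
    """Scalar s minimizing ||s * pred - gt||^2 over all pixels."""
    num = np.sum(pred * gt)
    den = np.sum(pred * pred)
    return num / den if den > 0 else 1.0

# Illustrative data: a linear prediction that is a factor of 2 too dark.
gt = np.random.rand(8, 8, 3)
pred = 0.5 * gt
s = optimal_exposure_scale(pred, gt)  # recovers the missing factor of 2
```

The closed-form solution follows from setting the derivative of the squared error with respect to `s` to zero.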
## Dataset format

```
├─ dataset
  ├─ object_name   # The dataset is grouped into objects
    ├─ test        # Files in the test dir are meant for evaluation
      ├─ inputs    # The inputs dir contains all files that are allowed to be used by the methods
```

Each of the test directories contains the following data.

| File | Description |
| --- | --- |
| `inputs/` | This directory contains all data for reconstructing the object. For a fair evaluation only data inside this folder may be used. |
| `inputs/image_xxxx.png` | Image files with 8-bit RGB images after tonemapping. |
| `inputs/camera_xxxx.txt` | Camera parameters for the corresponding image file. |
| `inputs/mask_xxxx.png` | An approximate mask for methods that require it. |
| `inputs/exposure.txt` | The exposure value that has been used in the tonemapping. |
| `inputs/object_bounding_box.txt` | The axis-aligned bounding box of the object. This box is not a tight bounding box. |
| `env.hdr` | An equirectangular image of the environment where the input images have been taken. Provided for debugging purposes; it must not be used for reconstruction or evaluation. |
| `env_512_rotated.hdr` | The environment map downscaled to 1024x512 and rotated with the `world_to_env` transform for easier usage. Provided for debugging purposes; it must not be used for reconstruction or evaluation. |
| `world_to_env.txt` | The 4x4 world-to-camera transform that maps a point into the coordinate system of the equirectangular image `env.hdr`. |
| `gt_image_xxxx.png` | A ground truth image used in evaluation. |
| `gt_camera_xxxx.txt` | The corresponding camera parameters for a ground truth image. |
| `gt_mask_xxxx.png` | The mask used for evaluation. Valid pixels are marked with the value 255. |
| `gt_exposure_xxxx.txt` | The exposure used in the tonemapping of the corresponding ground truth image. |
| `gt_env_xxxx.hdr` | An equirectangular image of the environment where the corresponding ground truth image was taken. |
| `gt_world_to_env_xxxx.txt` | The 4x4 world-to-camera transform that maps a point into the coordinate system of the equirectangular image `gt_env_xxxx.hdr`. |
| `gt_env_512_rotated_xxxx.hdr` | The environment map downscaled to 1024x512 and rotated with the corresponding `gt_world_to_env_xxxx` transform for easier usage. |

### Tone mapping

We generate the tonemapped 8-bit images with the following function

$$ y = 255 \left(x \cdot 2^\text{exposure}\right)^\gamma $$

We use $\gamma = 1/2.2$ for all images in the dataset.
The exposure values may differ between input and test images; they can be found in the corresponding `exposure.txt` files.
The values $y$ are clipped to the range 0 to 255.

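The mapping above can be written as a small function. This sketch assumes `x` is a linear float image and that the exposure value has been read from the corresponding `exposure.txt`:

```python
import numpy as np

def tonemap(x, exposure, gamma=1.0 / 2.2):
    """Map a linear image to 8-bit values: y = 255 * (x * 2**exposure)**gamma."""
    y = 255.0 * np.power(np.clip(x * 2.0**exposure, 0.0, None), gamma)
    # Clip to the valid 8-bit range before quantizing.
    return np.clip(y, 0.0, 255.0).astype(np.uint8)

y = tonemap(np.array([0.0, 0.5, 1.0]), exposure=0.0)
```

The inner clip guards against negative inputs before the fractional power; the outer clip implements the 0 to 255 range described above.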
### `.txt` files

#### Camera parameters
The camera parameters are defined by the intrinsic matrix $K$ and the extrinsics $R, t$.
We can project a 3D point $X$ to the camera coordinate system with

$$ x = K(R X + t) $$

Note that $x$ is in homogeneous coordinates.
The camera parameters are stored in the `*camera_xxxx.txt` files in the following format.
```
k11 k12 k13
k21 k22 k23
k31 k32 k33
r11 r12 r13
r21 r22 r23
r31 r32 r33
tx ty tz
width height channels
```

The following snippet can be used to parse the file with numpy.
```python
import numpy as np

params = np.loadtxt('path/to/camera_xxxx.txt')
K, R, t = params[:3], params[3:6], params[6]
width, height, channels = params[7].astype(int)
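With these parameters, projecting a point follows the formula above. The values below are made up for illustration; in practice `K`, `R`, and `t` come from a parsed camera file:

```python
import numpy as np

# Hypothetical intrinsics and extrinsics, standing in for parsed values.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])

X = np.array([0.1, -0.2, 1.0])  # 3D point in world coordinates
x = K @ (R @ X + t)             # homogeneous image point, x = K(R X + t)
u, v = x[:2] / x[2]             # dehomogenize to image coordinates
```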
#### Object bounding box
The `object_bounding_box.txt` files describe an axis-aligned bounding box.
The format used in the text file is
```
xmin xmax ymin ymax zmin zmax
```

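The six values can be parsed with numpy and regrouped into per-axis minimum and maximum corners. The example values here are made up; with a real file, pass the path to `np.loadtxt` instead of the `StringIO` object:

```python
import io
import numpy as np

# Example contents of an object_bounding_box.txt file (values are made up).
text = "-0.2 0.3 -0.1 0.4 0.0 0.5"
xmin, xmax, ymin, ymax, zmin, zmax = np.loadtxt(io.StringIO(text))

bbox_min = np.array([xmin, ymin, zmin])  # minimum corner of the box
bbox_max = np.array([xmax, ymax, zmax])  # maximum corner of the box
```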
#### World to environment map transforms
The `*world_to_env*.txt` files describe a transformation from the world coordinate system into the coordinate system of the omnidirectional camera that captured the environment.
The text file stores a 4x4 matrix that transforms a homogeneous 3D point to the camera coordinate system.
Usually we assume that the environment is infinitely far away from the object and we are only interested in directions.
In this case only the rotational part of the 4x4 matrix, the upper-left 3x3 block, is of interest.
With $R$ as the rotation and $t$ as the translation the format of the text file is
```
r11 r12 r13 tx
r21 r22 r23 ty
r31 r32 r33 tz
0 0 0 1
```

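Under the far-away assumption, transforming a world-space direction only needs that rotational part. A minimal sketch, with the identity matrix standing in for a matrix loaded from a `*world_to_env*.txt` file:

```python
import numpy as np

# Stand-in for a 4x4 matrix loaded with np.loadtxt from a *world_to_env*.txt file.
world_to_env = np.eye(4)

R = world_to_env[:3, :3]             # rotational part (upper-left 3x3 block)
d_world = np.array([0.0, 0.0, 1.0])  # direction in world coordinates
d_env = R @ d_world                  # same direction in the environment camera frame
```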
#### Exposure
The exposure values are single scalars stored in the `*exposure*.txt` files.

## Coordinate systems

The dataset uses right-handed coordinate systems.

### Cameras
Cameras look in the positive z-direction.
The intrinsic and extrinsic camera parameters can be used to directly project a 3D point $X$ to image space coordinates.

$$ x = K(R X + t) $$

$x$ is a homogeneous point describing a position in the image.


### Images
The x-axis of images points to the right and the y-axis points down, following the memory order.
The coordinates $(x,y)$ of the top left corner are $(0,0)$.
The center of the first pixel is at $(0.5, 0.5)$.
The bottom right corner of an image with width $w$ and height $h$ is at $(w, h)$.


### Environment maps
Environment maps are stored as equirectangular images.
We use a normalized coordinate system similar to regular images.
The u-axis points to the right and the v-axis points down, following the memory order.
The coordinates $(u,v)$ of the top left corner are $(0,0)$.
The bottom right corner is at $(1, 1)$ irrespective of the size of the environment map.
This corresponds to the texture coordinate convention used by DirectX.


Directions map to the equirectangular image as shown in the image below.
The direction +Z $(0,0,1)$ maps to the upper border of the environment map and -Z $(0,0,-1)$ to the lower border.
+X $(1,0,0)$ maps to the center, and -X $(-1,0,0)$ maps to the vertically centered points on the left and right borders.
+Y $(0,1,0)$ and -Y $(0,-1,0)$ map to the uv coordinates $(0.25,0.5)$ and $(0.75,0.5)$, respectively.


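These anchor directions determine a direction-to-uv mapping. The formula below is our reading of the stated convention (derived from the anchor points above, not taken from the repository code):

```python
import numpy as np

def direction_to_uv(d):
    """Map a 3D direction to normalized equirectangular uv coordinates:
    +Z at the top (v=0), +X at the center (u=0.5), +Y at u=0.25."""
    x, y, z = d / np.linalg.norm(d)
    u = (0.5 - np.arctan2(y, x) / (2.0 * np.pi)) % 1.0  # azimuth, decreasing with +Y
    v = np.arccos(np.clip(z, -1.0, 1.0)) / np.pi        # polar angle from +Z
    return u, v
```

Checking against the text: $(1,0,0)$ gives $(0.5, 0.5)$, $(0,1,0)$ gives $(0.25, 0.5)$, $(0,-1,0)$ gives $(0.75, 0.5)$, and $(0,0,1)$ gives $v=0$.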
The following shows the left half of the environment map mapped to a sphere.