
Groundtruth Camera Rotation is not SO(3) #12


@hongsukchoi

Hi! I struggled a lot debugging the dataset's calibration. Could you review my thought process below? Thank you!

You obtained the transformation matrix using Procrustes alignment, as described in the paper:

Camera Calibration and 3D Localization. For the egocentric cameras, we obtain the intrinsic parameters of the custom lens from the factory calibration and the per-timestamp extrinsic parameters using state-of-the-art visual-inertial odometry (VIO) [76]. As the VIO algorithm only provides individual egocentric camera trajectories in an arbitrary coordinate system, we merge multiple egocentric camera trajectories together with the stationary secondary cameras into a single frame of reference by using procrustes-alignment [75] and structure-from-motion [102] (c.f., Fig. 2b).
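For context, Procrustes (Umeyama-style) alignment between two corresponding point sets returns a scale, a rotation in SO(3), and a translation such that S2 ≈ scale * R @ S1 + t. A minimal numpy sketch of the standard algorithm (not the repo's implementation; the function name and exact formulation here are mine, just for reference):

import numpy as np

def procrustes_align(S1, S2):
    ## S1, S2: (N, 3) corresponding point sets; returns scale, R, t with S2 ~= scale * R @ S1 + t
    assert(len(S1) == len(S2))
    mu1, mu2 = S1.mean(axis=0), S2.mean(axis=0)
    X1, X2 = S1 - mu1, S2 - mu2
    cov = X2.T @ X1 / len(S1)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1  ## guard against reflections
    R = U @ S @ Vt  ## this R is a proper rotation in SO(3)
    var1 = (X1 ** 2).sum() / len(S1)
    scale = np.trace(np.diag(D) @ S) / var1
    t = mu2 - scale * R @ mu1
    return scale, R, t

Note that the R returned by this step is orthonormal; the problem below is that the 4x4 matrix T stores scale*R in its rotation block.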

The Procrustes alignment code in this repo:

assert(len(S1) == len(S2))

## make T out of the scale, R and t
T = np.eye(4)
T[:3, :3] = scale*R
# T[:3, :3] = R

T[:3, 3] = t.reshape(-1)

output = {'scale': scale, 'R': R, 't': t}

However, the rotation block written into T is scale*R, which is not in SO(3). Unfortunately, colmap_from_aria_transforms.pkl contains these corrupted rotation matrices.
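A minimal check makes the problem visible (assuming the pickle stores plain 4x4 numpy arrays keyed by camera name, which is how the dataloader below indexes it; the file path is illustrative):

import pickle
import numpy as np

with open('colmap_from_aria_transforms.pkl', 'rb') as f:
    colmap_from_aria_transforms = pickle.load(f)

for name, T in colmap_from_aria_transforms.items():
    A = T[:3, :3]                      ## this block is scale * R, not R
    scale = np.cbrt(np.linalg.det(A))  ## det(scale * R) = scale**3 when R is in SO(3)
    R = A / scale                      ## dividing out the scale recovers a rotation
    print(name,
          'scale:', scale,
          'orthogonality error:', np.linalg.norm(A @ A.T - np.eye(3)),  ## large whenever scale != 1
          'after rescaling:', np.linalg.norm(R @ R.T - np.eye(3)))      ## ~0 if scale was the only issue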

As a result, the extrinsic parameters that the dataloader uses are wrong, because self.primary_transform, which comes from colmap_from_aria_transforms.pkl, is wrong.

EgoExoScene code:

        ##------transform from aria1 coordinate system to colmap
        self.anchor_ego_camera = self.cfg.CALIBRATION.ANCHOR_EGO_CAMERA
        self.primary_transform = self.colmap_from_aria_transforms[self.anchor_ego_camera]
        # self.primary_transform = np.eye(4) # Hongsuk Choi - TEMP

        ##----------------load the scene point cloud-----------------
        ## measure the time for this function
        self.scene_vertices, self.scene_ground_vertices, self.ground_plane = self.load_scene_geometry()

        ##------------------------ego--------------------------
        self.aria_human_names = [human_name for human_name in sorted(os.listdir(self.ego_dir)) if human_name not in self.cfg.INVALID_ARIAS and human_name.startswith('aria')]

        self.aria_humans = {}
        for person_idx, aria_human_name in enumerate(self.aria_human_names):
            coordinate_transform = np.dot(
                                np.linalg.inv(self.colmap_from_aria_transforms[aria_human_name]), 
                                self.primary_transform
                            ) 
            self.aria_humans[aria_human_name] = AriaHuman(
                            cfg=cfg,
                            root_dir=self.ego_dir, human_name=aria_human_name, \
                            human_id=person_idx, ground_plane=self.ground_plane, \
                            coordinate_transform=coordinate_transform)

        self.total_time = self.aria_humans[self.aria_human_names[0]].total_time
        self.time_stamp = 0 ## 0 is an invalid time stamp, we start with 1

        ##------------------------exo--------------------------
        self.exo_camera_mapping = self.get_colmap_camera_mapping()
        self.exo_camera_names = [exo_camera_name for exo_camera_name in sorted(os.listdir(self.exo_dir)) if exo_camera_name not in self.cfg.INVALID_EXOS and exo_camera_name.startswith('cam')]
        self.colmap_reconstruction = pycolmap.Reconstruction(self.colmap_dir) ## this is the bottleneck
        self.exo_cameras = {exo_camera_name: ExoCamera(cfg=cfg, root_dir=self.exo_dir, colmap_dir=self.colmap_dir, \
                            exo_camera_name=exo_camera_name, coordinate_transform=self.primary_transform, reconstruction=self.colmap_reconstruction, \
                            exo_camera_mapping=self.exo_camera_mapping) \
                            for exo_camera_name in sorted(self.exo_camera_names)}  

With the current data, the correct way to update the extrinsics is probably the following:

    ##--------------------------------------------------------
    def update(self, time_stamp):
        self.extrinsics_image, self.extrinsics = self.set_closest_calibration(time_stamp=time_stamp)
        self.raw_extrinsics = np.concatenate([self.extrinsics, [[0, 0, 0, 1]]], axis=0) ## 4 x 4

        # Hongsuk
        raw_R, raw_T = self.raw_extrinsics[:3, :3], self.raw_extrinsics[:3, 3]
        
        R = self.coordinate_transform[:3, :3]
        T = self.coordinate_transform[:3, 3]
        c = np.sqrt((R @ R.T)[0,0])
        self.mystery_scale = c
        R = R / c
        # T = T / c
        new_R = raw_R @ R
        # new_T = raw_R @ T + raw_T
        new_T = (raw_R @ T + raw_T) / c

        self.extrinsics = np.concatenate([new_R, new_T.reshape(-1, 1)], axis=1)
        self.extrinsics = np.concatenate([self.extrinsics, [[0, 0, 0, 1]]], axis=0) ## 4 x 4
        # Hongsuk

        # Wrong:
        # self.extrinsics = np.dot(self.raw_extrinsics, self.coordinate_transform)
        # Originally, this was doing:
        # new_R = raw_R @ (c * R)  # R is a valid SO(3) matrix, but c * R is not, so neither is new_R
        # new_T = raw_R @ T + raw_T

However, my modification assumes that the raw COLMAP extrinsics are already in the correct absolute scale. Since COLMAP itself usually does not recover absolute scale, I am curious how you obtained it. In my qualitative check on one of the basketball scenes, the result looked OK!
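To sanity-check the algebra independently of the repo, here is a small synthetic example (a made-up rigid extrinsic and a made-up similarity transform) showing why the naive composition leaves the scale inside the rotation block and why dividing it out, as in my modification above, restores an SO(3) matrix:

import numpy as np

def rot_z(a):
    ## simple rotation about z for the synthetic example
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

## synthetic camera extrinsic (a proper rigid transform)
raw_R, raw_T = rot_z(0.3), np.array([0.1, -0.2, 1.5])

## synthetic similarity transform, like the Procrustes output: [scale*R | t]
c, R, T = 2.5, rot_z(-0.7), np.array([0.4, 0.0, -0.3])

naive_R = raw_R @ (c * R)                       ## what np.dot(raw_extrinsics, coordinate_transform) produces
print(np.linalg.det(naive_R))                   ## c**3 = 15.625, so naive_R is not in SO(3)

new_R = raw_R @ R                               ## divide the scale out of the rotation ...
new_T = (raw_R @ T + raw_T) / c                 ## ... and, as in my modification, out of the translation
print(np.allclose(new_R @ new_R.T, np.eye(3)))  ## True: new_R is a valid rotation again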
