How to calculate the transformation matrix between NED and OpenCV coordinate systems

NED (North-East-Down) Coordinate System

  • A north-east-down (NED) system uses the Cartesian coordinates (xNorth, yEast, zDown) to represent position relative to a local origin.

  • In NED, we have the x-axis points forward, the y-axis to the right, and the z-axis downward.

                  / +x (to forward, North)
                /
              /
 (Origin O) /_ _ _ _ _ _ _ _ +y (to right, East)
            |
            |
            |
            | +z (Down)

    NED Coordinate, Right-hand Coordinate System,
    assuming your eye is behind the y-O-z plane and seeing +x forward.

OpenCV Coordinate System

  • OpenCV coordinate system uses the Cartesian coordinates as the x-axis pointing to the right, the y-axis downward, and the z-axis forward.
                  / +z (to Forward)
                /
              /
 (Origin O) /_ _ _ _ _ _ _   +x (to Right)
            |
            |
            |
            | +y (Down)

    OpenCV Coordinate, Right-hand Coordinate System,
    assuming your eye is behind the x-O-y plane and seeing +z forward. 

Notation

Assume we have the following coordinate systems:

  • wned: the world coordinate in NED (x Forward, y Right, z Down) format;
  • cned: the camera coordinate in NED (x Forward, y Right, z Down) format;
  • w: the world coordinate in OpenCV style (x Right, y Down, z Forward);
  • c: the camera coordinate in OpenCV style (x Right, y Down, z Forward);

Why We Need OpenCV-style Camera Pose

It is because we use the following pipeline to connect RGB, camera, and world:

RGB image (x,y) with x pointing to the right, y down, and image origin in the left-top corner —> camera intrinsic matirx K and inverse K1 —> camera points Pc = (Xc,Yc,Zc) —> camera extrinsic matrix E and inverse E1 —> world points Pw = (Xw,Yw,Zw).

How to get the transformation matrix from NED to OpenCV Style

  • The matrix is defined as Twwned to map the points Pwned to the points Pw, i.e.,

Pw = Twwned * Pwned

  • The matrix is also defined as Tccned to map the points Pcned to the points Pc, i.e.,

Pc = Tccned * Pcned

  • To find Twwned is to project (or to calculate the dot-product between) each axis (as a unit vector) of xwned, ywned, zwned, into the axis xw, yw, zw.

  • So we can get this matrix as:

    T = np.array([
                  [0,1,0,0],
                  [0,0,1,0],
                  [1,0,0,0],
                  [0,0,0,1]], dtype=np.float32)
  • And we have Twwned = Tccned = T.

How to map the camera pose in NED to OpenCV Style:

  • OpenCV style camera-to-world pose is calculated as:

Twc = Twwned * Twnedcned * Tcnedc

  • note: $T^{w}_{c}$ etc are in LaTex style if not shown correctly.

Apply Chain Rule

  • We want to find the pose between c and w in OpenCV style coordinates;
  • That is to say to find the cam-to-world pose Twc, which do the mapping Pw=TwcPc;
  • Using the chain rule, we have:

Twc = Twwned * Twnedcned * Tcnedc = T * "camera-to-world-pose-NED" * inv(T)

where, we assume the camera-to-wolrd pose in NED is provided by the dataset (e.g., TartanAir dataset). Please see my answer to this issue in the TartanAir Dataset repo.

Written on March 14, 2024