How to get depth at specific Main/CV Camera pixel position?

Hi, I'm wondering how to query the depth of an object at a specific pixel location from the Main/CV camera?

I've copied from the depth camera sample code and am reading from both the Main RGB camera and the Depth camera in my application. I detect some markers/objects using the RGB camera, and I'm wondering about the pixel/coordinate mapping between the depth and main cameras. For example: if I read from the RGB Main/CV camera at its max resolution of 4096x3072 and detect the corner of a 2D marker at pixel coordinate (1000, 500), how could I find the corresponding (x, y) position in the 544x480 depth image?

Ideally, this would be a function such as
MLVec2f RgbPosToDepthPos(MLVec2f rgbPixelCoords, MLVec2f rgbImageResolution);
(the resolution of the RGB image is included because the aspect ratio and resolution of your RGB camera feed affect the pixel mapping)

I wrote an app to read and save corresponding RGB and depth frames, and I noticed the depth camera has a much larger FOV than the Main/CV camera. The depth camera also has a bit of a fish-eye lens compared to the Main camera. I was going to try to find a mapping between the two cameras myself, but the fish-eye distortion of the depth camera, and the fact that the depth camera can't see things like the markings on a ruler or tape measure, make it very difficult to line up the images. So I thought I'd ask here whether such a function already exists.

Also, what's the accuracy of the depth sensor when working at a distance of about 1 meter?

Additionally, can you clarify the depth measurement?
This is an example screenshot from Blender showing a camera's frustum looking at a flat surface. As you move away from the center of the camera's FOV, the rays between the camera and the surface get longer and more diagonal. If you query the depth of a pixel at the top of the camera's field of view, is the depth returned the length of the straight line (yellow line in the image) between the camera and that point, or the orthogonal distance (green line) between the camera and the plane?


If the depth returned is the yellow depth, is there an easy way to convert that to the orthogonal (green) depth from the camera using some built-in camera intrinsics/calibrations, such that the depth would be similar to the "z" component of a camera based coordinate system?

Thanks

And building on that last question: if the depth measurement is based on that yellow line (a direct ray from the camera center to the object), and you knew the physical characteristics of the camera (FOV, focal length, etc.), could you not treat each pixel in the depth image as a ray and use depthValue * cameraRayDirectionAtPixel(x, y) to get a 3D (x, y, z) point for every pixel, i.e. a point cloud? Is there an accessible API for doing so?


The depth returned will be the yellow line, the radial distance from the depth camera to the real-world location. Regarding the second question, we do not have a helper class for this but I will put in a request.

For our platform, I subtract the principal point and divide by the focal length to get the x and y of the ray, set z to 1, then multiply the whole vector by the depth.
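A minimal, SDK-independent sketch of that recipe in C++. The `Intrinsics` struct here is a placeholder you would fill from the intrinsics the SDK reports for the depth camera (focal length and principal point, in pixels); lens distortion is ignored. One caveat worth noting: since the returned depth is radial (measured along the ray, per the answer above), I normalize the ray direction before scaling by the depth; the z component of the resulting point is then the orthogonal "green line" depth. If you instead treat the depth as a z-depth, skip the normalization.

```cpp
#include <cmath>

// Placeholder intrinsics: fx/fy = focal length in pixels,
// cx/cy = principal point in pixels. Fill these from the
// depth camera's reported intrinsics on your platform.
struct Intrinsics { float fx, fy, cx, cy; };
struct Vec3 { float x, y, z; };

// Unproject a depth-image pixel (u, v) with a *radial* depth value
// into a 3D point in the depth camera's coordinate frame.
Vec3 RadialDepthToPoint(float u, float v, float radial_depth, const Intrinsics& k) {
  // Direction of the ray through pixel (u, v), with z fixed to 1.
  float rx = (u - k.cx) / k.fx;
  float ry = (v - k.cy) / k.fy;
  float rz = 1.0f;
  // Normalize, then scale by the radial depth; the z component of the
  // result is the orthogonal depth from the camera plane.
  float len = std::sqrt(rx * rx + ry * ry + rz * rz);
  return { radial_depth * rx / len,
           radial_depth * ry / len,
           radial_depth * rz / len };
}
```

Running this over every pixel of a depth frame gives you the point cloud described in the question above.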

Off topic, but would you mind sharing a code sample showing how to save both RGB and depth frames?


@fangetafe
Unfortunately I don't have a minimal example prepared, and my project is a bit of a mess, but I can give some pointers.

I find it's easier to start from one of the sample projects that reads from the RGB camera, then add the depth camera reading to it, copying from the Depth Camera sample project (I'm assuming you're using the Android Native API).

Basically, in my class's constructor, I initialize the depth camera settings with:

      // Initialize depth-camera-related structures.
      // (depth_cam_settings_ is a member variable of type MLDepthCameraSettings,
      //  and depth_cam_data_ is a member variable of type MLDepthCameraData.)
      MLDepthCameraSettingsInit(&depth_cam_settings_);
      depth_cam_settings_.flags = MLDepthCameraFlags_DepthImage | MLDepthCameraFlags_Confidence | MLDepthCameraFlags_AmbientRawDepthImage;
      depth_cam_settings_.mode = MLDepthCameraMode_LongRange;
      MLDepthCameraDataInit(&depth_cam_data_);  // data from depth camera frames is read into here

Then edit the SetupRestrictedResources() function to include the depth camera:

  void SetupRestrictedResources() {
    ASSERT_MLRESULT(SetupCamera());
    ASSERT_MLRESULT(StartCapture());

    // connect to depth camera
    ASSERT_MLRESULT(MLDepthCameraConnect(&depth_cam_settings_, &depth_cam_context_));
  }

The tricky part is that the depth camera has a far lower frame rate than the RGB camera, and its data is read by polling rather than via a callback. So inside the RGB camera's video-feed callback, I poll the depth camera to check whether a new depth frame is ready, and assume that frame was taken at the same time as the current RGB frame.

  // The RGB camera's callback
  static void OnVideoAvailable(const MLCameraOutput *output, const MLHandle metadata_handle,
                               const MLCameraResultExtras *extra, void *data)
  {
      // ... do some stuff

      // Replace "CameraPreviewApp" with whatever your sample project's class is called.
      CameraPreviewApp* pThis = reinterpret_cast<CameraPreviewApp*>(data);

      // Poll the depth camera (zero timeout) to see if a new frame is ready.
      int timeout = 0;
      MLDepthCameraData* data_ptr = &(pThis->depth_cam_data_);
      if (MLDepthCameraGetLatestDepthData(pThis->depth_cam_context_, timeout, &data_ptr) == MLResult_Ok)
      {
          MLDepthCameraFrameBuffer* depth_buffer = pThis->depth_cam_data_.depth_image;
          float* depth_values = (float*)depth_buffer->data;  // renamed so it doesn't shadow the "data" parameter

          // do what you want with the depth values

          // Always release the depth data when done, otherwise the buffers will fill up.
          UNWRAP_MLRESULT(MLDepthCameraReleaseDepthData(pThis->depth_cam_context_, data_ptr));
      }

      // ... do some stuff
  }

I apologize for the necro thread, but I am interested in the approach you ended up taking to create your MLVec2f RgbPosToDepthPos(MLVec2f rgbPixelCoords, MLVec2f rgbImageResolution) function. How did you ultimately end up mapping the RGB pixels to the depth pixels?

Sorry @rscanlo2, I never did end up writing something to do that. If you need the depth of something detected in the RGB camera, the best workaround I can think of is to detect the same object in the raw depth image (which is essentially a monochrome infrared image), if possible (not all objects show up well in IR light or at that low resolution), and then grab the depth at the object's pixel location in the raw depth image. Probably not what you're looking for, but I don't have an easy generic pixel-to-pixel correspondence between the RGB and depth images.
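For the "grab the depth at that pixel" step, here's a minimal sketch of reading one value out of a depth frame laid out as rows of 32-bit floats. The `stride_bytes` parameter is an assumption standing in for the row pitch your frame-buffer struct reports; check the actual field names, byte layout, and stride on your device before relying on this.

```cpp
#include <cstdint>
#include <cstring>

// Read the depth value at pixel (x, y) from a frame buffer of 32-bit
// floats. buffer_data / stride_bytes stand in for the data pointer and
// row pitch (in bytes) reported by the SDK's frame-buffer struct.
float DepthAtPixel(const void* buffer_data, uint32_t stride_bytes,
                   uint32_t x, uint32_t y) {
  const uint8_t* row = static_cast<const uint8_t*>(buffer_data) + y * stride_bytes;
  float depth;
  // memcpy instead of a pointer cast to avoid strict-aliasing issues.
  std::memcpy(&depth, row + x * sizeof(float), sizeof(float));
  return depth;
}
```

Indexing via the stride rather than the width matters if the rows are padded; when stride equals width * sizeof(float), this is equivalent to `((float*)data)[y * width + x]`.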


Were you finally able to place the point cloud onto the object? Following @kbabilinski's suggestion, I could extract the point cloud and convert the radial distance to the orthogonal distance (the green line), but the transformation is wrong.
Could this have something to do with the difference between Unity's and Magic Leap's coordinate systems?

Unfortunately, I have not, @GspLi. My company was recently acquired by another company and we were all moved onto a different project, so we haven't worked on this in a while. I wish you luck, though.

Do you mind creating a new post for this question, so as not to trigger notifications for everyone in this thread?

This topic was automatically closed after 17 hours. New replies are no longer allowed.