Help with camera capture, data flow, and body alignment

Hi,

I’m working on an academic project using Magic Leap 2 and am hoping to get some advice. The goal is to capture images or videos of a person standing in front of the device, analyze these frames (e.g. using a remote AI server for face or body recognition), and display a 3D model or avatar on top of the detected person, aligned correctly in the scene.

So far I’ve looked at the developer documentation, implemented a basic image capture using MLCamera.CaptureImageAsync in Unity C#, and tested saving the images locally (e.g. with Application.persistentDataPath). I’ve also tried sending the captured images to a server for external processing.
However, the images aren’t actually being written to disk, so I can’t send them on to the server either.
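For reference, this is roughly my simplified save logic (the callback wiring around MLCamera.CaptureImageAsync is omitted; the byte array is the JPEG data handed to me in the capture callback):

```csharp
using System.IO;
using UnityEngine;

public class CaptureSaver : MonoBehaviour
{
    // Called with the JPEG bytes from the MLCamera capture callback.
    public void SaveFrame(byte[] jpegBytes)
    {
        // persistentDataPath maps to the app-specific storage directory on
        // the device, so no extra storage permission should be needed here.
        string path = Path.Combine(Application.persistentDataPath,
                                   $"capture_{System.DateTime.Now:yyyyMMdd_HHmmss}.jpg");
        File.WriteAllBytes(path, jpegBytes);
        Debug.Log($"Saved {jpegBytes.Length} bytes to {path}");
    }
}
```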

I’m having some difficulties at the moment:

Sometimes the images don’t seem to be written to disk reliably, and I’m not sure which path I should be using on the device.

I’m not sure whether to stick with MLCamera, switch to the Android Camera2 API, or use something else entirely for this kind of continuous image capture and streaming.

I’m trying to figure out how to overlay and align a 3D model in Unity to match the position of the person detected in the real world.

I would like to know if Magic Leap 2 provides built-in support for body tracking or human pose detection, or if this is something I will have to handle entirely with external tools.

If anyone has any tips, examples, or can point me to best practices for capturing and saving images, sending frames to an external server, and aligning the 3D models with the people in the scene, I would greatly appreciate it.

Thanks a lot for your help!

Overlaying a virtual avatar on a person—with AI handling detection and alignment—is beyond the scope of this forum. Because the RGB camera provides only 2-D data, you would need extra logic to derive an accurate 3-D pose. Depth estimation, triangulation, or a fusion of multiple sensors can help, but implementing this reliably is non-trivial and can lead to unstable results.

Saving individual frames from the video feed can also introduce performance issues, as each frame must be encoded (e.g., to JPEG or PNG) and then written to storage or transmitted over the network.
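As a rough sketch of that encode-and-send step (standard Unity APIs, not Magic Leap-specific; the server URL is a placeholder you would replace with your own endpoint), encoding a raw RGBA frame to JPEG and posting it could look like this:

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;
using UnityEngine.Experimental.Rendering;

public class FrameUploader : MonoBehaviour
{
    // Placeholder endpoint -- replace with your own server.
    const string uploadUrl = "http://your-server.example/upload";

    public IEnumerator EncodeAndUpload(byte[] rgbaPixels, int width, int height)
    {
        // Encoding is CPU-heavy; doing this for every frame will hurt your
        // frame rate, so throttle it or process frames at a reduced interval.
        byte[] jpeg = ImageConversion.EncodeArrayToJPG(
            rgbaPixels, GraphicsFormat.R8G8B8A8_UNorm,
            (uint)width, (uint)height, 0, 75);

        using (var request = UnityWebRequest.Put(uploadUrl, jpeg))
        {
            request.method = UnityWebRequest.kHttpVerbPOST;
            request.SetRequestHeader("Content-Type", "image/jpeg");
            yield return request.SendWebRequest();
            if (request.result != UnityWebRequest.Result.Success)
                Debug.LogWarning($"Upload failed: {request.error}");
        }
    }
}
```

For a continuous stream, WebRTC (below) avoids the per-frame encode/upload overhead entirely, which is why it is usually the better fit.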

Depending on your setup, consider the Magic Leap Unity WebRTC example, which demonstrates how to stream the camera image using the Unity WebRTC package:

https://github.com/magicleap/MagicLeap2UnityWebRTCExample