You can see the Mixed Reality Capture Example Script for Unity in this post: Questions about eye tracking and Pixel Sensors since Oct Release OS 1.10.0 - #16 by kbabilinski
As for the offset, I'm not sure about your custom implementation, but the physical RGB camera is not located at eye level, so overlaying content rendered from the Main Camera onto the camera image can produce an offset. In OpenXR, you can enable Secondary View in the OpenXR features inside your Project Settings. This renders a secondary camera at the position of the RGB camera, so the virtual content is captured without an offset.
Note that this uses additional resources, but it can be helpful when using the MR Capture methods that @cfeist mentioned.
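
To get a feel for how large the offset can be when virtual content rendered from the eye position is overlaid on the RGB camera image, here is a rough pinhole-camera sketch. It is only an illustration: the focal length and the eye-to-camera distance below are made-up placeholder values, not Magic Leap 2 specifications, and the function name is hypothetical. It assumes two parallel viewpoints that differ only by a sideways shift.

```python
def parallax_offset_px(focal_length_px: float, camera_offset_m: float, depth_m: float) -> float:
    """Approximate image shift (in pixels) of a point at depth_m when it is
    viewed from two parallel pinhole cameras separated by camera_offset_m."""
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    # Standard pinhole parallax: shift = focal length * baseline / depth
    return focal_length_px * camera_offset_m / depth_m


# Placeholder values for illustration only (NOT device specifications)
FOCAL_LENGTH_PX = 1000.0  # assumed focal length of the RGB camera, in pixels
CAMERA_OFFSET_M = 0.05    # assumed distance between eye position and RGB camera

for depth in (0.5, 1.0, 2.0, 5.0):
    shift = parallax_offset_px(FOCAL_LENGTH_PX, CAMERA_OFFSET_M, depth)
    print(f"content at {depth} m: roughly {shift:.0f} px of offset")
```

The shift shrinks as the content moves farther away, which is why the mismatch tends to be most noticeable on nearby virtual objects. Rendering from the RGB camera's position, as Secondary View does, avoids this kind of offset.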