iPhone 3D Scanner App: RGBD Object Classification Using a CNN

I have been experimenting with the iPhone 3D Scanner App. Modern iPhones include a LiDAR sensor that captures depth maps aligned with the regular camera images. I wanted to show how to take those RGB and depth frames, combine them into simple RGBD images, and use them for object classification.

The process is straightforward. I pick a few objects around the house and record a short scan of each one. Afterward, I crop the data to remove the unnecessary background, leaving only the object.
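Since the depth map tells us how far each pixel is from the camera, a simple way to crop away the background is to keep only the pixels within a depth band around the object. The sketch below assumes the scanned object is the closest thing in frame; the function name and the depth-band approach are my own illustration, not part of the app.

```python
import numpy as np

def crop_to_object(depth: np.ndarray, near: float, far: float):
    """Return the bounding box (y0, y1, x0, x1) of pixels whose depth
    lies in [near, far] meters, or None if no pixel qualifies.

    Rough background removal: the object is assumed to sit closer to
    the camera than everything else, so a depth band isolates it.
    """
    mask = (depth >= near) & (depth <= far)
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
```

The same bounding box can then be applied to the RGB frame, since the exported depth maps are aligned with the camera images.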

The “All Data” export option gives me the raw camera frames, depth maps, and confidence images. Each depth map aligns with its corresponding RGB frame, forming a four-channel representation. Three channels hold the color information, and one carries the distance to the camera. This simple combination already adds a sense of shape that pure RGB cannot provide. For small experiments, it is more than enough.
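Building the four-channel input is just an array stack. A minimal sketch, assuming the depth map has already been resampled to the RGB resolution (the export provides aligned frames) and normalizing depth to [0, 1] so it sits on a scale comparable to the color channels:

```python
import numpy as np

def make_rgbd(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Stack an HxWx3 uint8 RGB image and an HxW depth map into an
    HxWx4 float32 RGBD image.

    Assumes depth is already aligned and resized to the RGB frame.
    """
    d = depth.astype(np.float32)
    # Min-max normalize depth to [0, 1]; epsilon guards flat maps.
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    rgb_f = rgb.astype(np.float32) / 255.0
    return np.dstack([rgb_f, d])
```

Min-max normalization per frame is the simplest choice; a fixed metric range (e.g. 0 to 5 meters) would keep absolute distances comparable across scans.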

A compact CNN can then learn to classify the object based on these RGBD inputs. The goal is not to chase accuracy or build a classification benchmark. The value lies in hands-on learning. I collect the data myself, understand its noise and limitations, and train a model on data captured by my own camera. That experience teaches more than working with polished 3D objects downloaded from the internet.
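To make "compact CNN" concrete, here is one possible PyTorch sketch: the only RGBD-specific detail is the first convolution taking 4 input channels instead of 3. The layer sizes and class count are illustrative assumptions, not the exact model I trained.

```python
import torch
import torch.nn as nn

class SmallRGBDNet(nn.Module):
    """Minimal CNN for 4-channel RGBD inputs (illustrative layout)."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            # 4 input channels: R, G, B, and normalized depth.
            nn.Conv2d(4, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # works for any input resolution
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))
```

A pretrained RGB backbone could also be adapted by inflating its first convolution to 4 channels, but for a handful of home-scanned objects a small network trained from scratch is enough.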

I will link the GitHub repository with the supporting code in a later update.





Post Scriptum

The views in this article are mine and do not reflect those of my employer.
I am planning to discontinue the e-mail newsletter subscription that sends out my articles.
