Assigning realistic materials to 3D models remains a significant challenge in computer graphics. We propose MatCLIP, a novel method that extracts shape- and lighting-insensitive descriptors of Physically Based Rendering (PBR) materials to assign plausible textures to 3D objects based on images, such as the output of Latent Diffusion Models (LDMs) or photographs. Matching PBR materials to static images is challenging because the PBR representation captures the dynamic appearance of materials under varying viewing angles, shapes, and lighting conditions. By training an extended AlphaCLIP-based model on material renderings across diverse shapes and lighting conditions, and by encoding multiple viewing conditions for each PBR material, our approach produces descriptors that bridge the domain gap between PBR representations and photographs or renderings, including LDM outputs. This enables consistent material assignments without requiring explicit knowledge of material relationships between different parts of an object. MatCLIP achieves a top-1 classification accuracy of 76.6%, outperforming state-of-the-art methods such as PhotoShape and MatAtlas by over 15 percentage points on publicly available datasets. Our method can be used to construct material assignments for 3D shape datasets such as ShapeNet, 3DCoMPaT++, and Objaverse.
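The sketch below illustrates the retrieval idea in the abstract: each PBR material is represented by a descriptor aggregated from several renderings under varied shapes and lighting, and a query image is matched against these descriptors. This is a minimal sketch, not the paper's implementation: the encoder is passed in as a placeholder (MatCLIP fine-tunes an AlphaCLIP backbone, which is not reproduced here), and mean-pooling of normalized embeddings is one simple, assumed way to fuse the multiple viewing conditions.

import torch
import torch.nn.functional as F

@torch.no_grad()
def material_descriptor(encode_image, renderings):
    """Aggregate K renderings of one PBR material (varied shape/lighting)
    into a single descriptor by averaging L2-normalized embeddings.
    `encode_image` is an assumed CLIP-style encoder: (K, H, W, C) -> (K, D)."""
    emb = F.normalize(encode_image(renderings), dim=-1)  # (K, D)
    return F.normalize(emb.mean(dim=0), dim=-1)          # (D,)

@torch.no_grad()
def assign_material(encode_image, query_image, descriptors):
    """Return the index of the best-matching material for one query image,
    e.g. a masked crop of an LDM output or a photograph.
    `descriptors` is an (M, D) matrix, one row per material."""
    q = F.normalize(encode_image(query_image[None]), dim=-1)  # (1, D)
    sims = q @ descriptors.T                                  # (1, M)
    return sims.argmax(dim=-1).item()

# Usage sketch: stack one descriptor per material, then retrieve.
# descriptors = torch.stack([material_descriptor(enc, r) for r in renderings_per_material])
# best = assign_material(enc, query_image, descriptors)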
Currently, our method assigns materials based on only a single input image. While extending it to multiple views would be straightforward, an even safer and easier way to obtain a coherent result is to propagate materials from visible to invisible parts based on adjacency information and part semantics. Here, we merely highlight the problem by assigning a random, non-matching material to shape parts that are invisible in the input image.
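The paper leaves this propagation as future work; the following is a hypothetical sketch of one way it could look, assuming a part adjacency graph given as a dictionary and a breadth-first, first-reached-neighbor-wins rule. Both the graph format and the tie-breaking rule are assumptions for illustration, not part of MatCLIP.

from collections import deque

def propagate_materials(adjacency, assignments):
    """adjacency: {part_id: [neighbor part_ids]} (assumed format);
    assignments: {part_id: material_id} for visible parts only.
    Returns a complete {part_id: material_id} mapping."""
    result = dict(assignments)
    queue = deque(assignments)              # seed traversal with visible parts
    while queue:
        part = queue.popleft()
        for neighbor in adjacency.get(part, []):
            if neighbor not in result:      # first material to reach a part wins
                result[neighbor] = result[part]
                queue.append(neighbor)
    return result

A semantics-aware variant could additionally restrict propagation to neighbors with the same part label, so that, e.g., a chair's wooden legs do not inherit the seat's fabric.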
@article{birsak2025matclip,
  author  = {Birsak, Michael and Femiani, John and Zhang, Biao and Wonka, Peter},
  title   = {MatCLIP: Light- and Shape-Insensitive Assignment of PBR Material Models},
  journal = {arXiv preprint arXiv:2501.15981},
  year    = {2025},
}