TM-NET: Deep Generative Networks for Textured Meshes


Lin Gao¹          Tong Wu¹          Yu-Jie Yuan¹         Ming-Xian Lin¹


Yu-Kun Lai²       Hao (Richard) Zhang³



¹Institute of Computing Technology, Chinese Academy of Sciences


²Cardiff University       ³Simon Fraser University








Figure: Our deep generative network for textured meshes, TM-NET, can automatically synthesize multiple textures for the same input 3D shape (top) and blend between textured meshes via latent space interpolation (bottom). The chair meshes have a relatively low resolution (fewer than 4,000 vertices), but the generated textures give the appearance of topological details on the chair backs.






We introduce TM-NET, a novel deep generative model capable of generating meshes with detailed textures, as well as synthesizing plausible textures for a given shape. Inspired by the recently proposed SDM-NET, our method copes with complex geometry and structure by producing texture maps for individual parts, each represented as a deformed box, which in turn yields a natural UV map with minimal distortion. To provide a generic framework for different application scenarios, we encode geometry and texture separately and learn the probability distribution of textures conditioned on the geometry. We address the challenges of textured mesh generation by sampling textures from this conditional distribution. Textures often contain high-frequency details (e.g. wood grain), and we encode them effectively with a variational autoencoder (VAE) using dictionary-based vector quantization. We also exploit transparency in the texture as an effective way to model highly complicated topology and geometry. This work is the first to synthesize high-quality textured meshes for shapes with complex structures. Extensive experiments show that our method produces high-quality textures and avoids the inconsistency issue common to novel view synthesis methods, in which textured shapes from different views are generated separately.







TM-NET: Deep Generative Networks for Textured Meshes

[ArXiv Preprint]


















Overview of TM-NET



Figure: An overview of the key components of TM-NET. Each part is encoded using two Variational Autoencoders (VAEs): PartVAE for geometry, with EncP as the encoder and DecP as the decoder, and TextureVAE for texture, with EncT as the encoder and DecT as the decoder. To make geometry-guided texture generation possible, we adopt the conditional autoregressive generative model PixelSNAIL [Chen et al. 2018], which takes the latent vector of PartVAE as its conditional input and outputs discrete feature maps that are then decoded into texture images for the input geometry.




TextureVAE and PartVAE for Encoding a Textured Part



Figure: Our architecture for representing a textured part, which involves a PartVAE for encoding the geometry and a TextureVAE for encoding the texture.




TextureVAE for Texture Encoding



Figure: The architecture of TextureVAE. The encoder maps the input image patch onto two continuous feature maps. Dictionary-based vector quantization is then performed to make these feature maps discrete. The decoder takes the discrete feature maps as input and reconstructs the image.
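
The quantization step described in this caption can be illustrated with a minimal sketch. It assumes the common nearest-entry lookup under squared Euclidean distance; the page itself does not state the distance measure, the dictionary size, or the feature dimension, so the names `quantize` and `squared_distance` and all values in the usage note are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    feature_map: List[List[Vector]], codebook: List[Vector]
) -> Tuple[List[List[int]], List[List[Vector]]]:
    """Replace every feature vector by its nearest dictionary entry.

    Assumption: nearest-neighbour lookup under squared Euclidean distance.
    Returns the grid of dictionary indices (the discrete feature map) and
    the grid of corresponding dictionary vectors (the decoder input).
    """
    indices: List[List[int]] = []
    quantized: List[List[Vector]] = []
    for row in feature_map:
        idx_row = [
            min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
            for v in row
        ]
        indices.append(idx_row)
        quantized.append([codebook[k] for k in idx_row])
    return indices, quantized
```

For example, with a hypothetical three-entry dictionary `[(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]`, calling `quantize` on a 2×2 grid of 2D feature vectors returns a 2×2 grid of integer indices together with the matching dictionary vectors. Only the integer grid needs to be modeled by a discrete generative model.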




PixelSNAIL for Conditional Texture Generation



Figure: The architecture of the autoregressive generative model PixelSNAIL [Chen et al. 2018].








Shape Interpolation




Figure: Interpolation between two models with different textures from the test set. The first and last columns are the shapes to be interpolated. The other columns are the in-between textured shapes obtained by linear interpolation in the PartVAE and TextureVAE latent spaces.
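
Linear interpolation between two latent codes, as used for the in-between shapes in this figure, can be sketched as follows. This is an illustration only: the function names `lerp` and `interpolate_codes`, the number of steps, and the code dimensions are assumptions, and the encoding and decoding steps of PartVAE and TextureVAE are omitted.

```python
from typing import List, Sequence


def lerp(z0: Sequence[float], z1: Sequence[float], t: float) -> List[float]:
    """Blend two latent codes: t = 0 gives z0, t = 1 gives z1."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def interpolate_codes(
    z_start: Sequence[float], z_end: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` evenly spaced codes from z_start to z_end, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    return [lerp(z_start, z_end, i / (steps - 1)) for i in range(steps)]
```

Under these assumptions, the same routine would be applied once to the pair of geometry codes and once to the pair of texture codes, and each in-between pair would then be passed to the respective decoders.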




Geometry Guided Texture Generation




Figure: Automatic texture generation. The first column is the input shape without texture. The remaining four columns show the textures generated by sampling four times with the same geometry as the conditional input.
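
The reason repeated sampling with the same geometry condition yields different textures can be seen in a toy sketch of conditional autoregressive sampling. This is not PixelSNAIL: `toy_logits` is a hypothetical stand-in for the network, and the number of dictionary entries, the sequence length, and the condition vector are all assumed values. Only the sampling loop itself is the point of the example.

```python
import math
import random
from typing import Callable, List, Sequence

LogitFn = Callable[[Sequence[float], Sequence[int]], List[float]]


def sample_categorical(logits: Sequence[float], rng: random.Random) -> int:
    """Draw one index with probability proportional to softmax(logits)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def sample_indices(
    logit_fn: LogitFn, condition: Sequence[float], length: int, rng: random.Random
) -> List[int]:
    """Sample discrete indices one at a time.

    Each index depends on the fixed condition and on all indices drawn so far.
    """
    indices: List[int] = []
    for _ in range(length):
        indices.append(sample_categorical(logit_fn(condition, indices), rng))
    return indices


def toy_logits(condition: Sequence[float], previous: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a conditional network over 4 dictionary entries."""
    bias = sum(condition)
    last = previous[-1] if previous else 0
    return [bias * k - abs(k - last) for k in range(4)]


# Four draws with the same (assumed) geometry condition.
condition = [0.1, -0.2, 0.3]
rng = random.Random(0)
samples = [sample_indices(toy_logits, condition, 8, rng) for _ in range(4)]
```

Because each index is drawn at random from a distribution rather than chosen deterministically, the four sequences in `samples` generally differ even though `condition` is identical. In the setting described here, each sequence would correspond to a discrete feature map that is decoded into a texture image.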




Textured Mesh Generation




Figure: Representative results of generated shapes with textures. We first sample randomly from the latent space of SP-VAE to generate structured meshes. We then use the geometry latent vector as the condition for PixelSNAIL to generate the desired textures.











@article{gao2020tmnet,
    author = {Lin Gao and Tong Wu and Yu-Jie Yuan and Ming-Xian Lin and Yu-Kun Lai and Hao Zhang},
    title = {TM-NET: Deep Generative Networks for Textured Meshes},
    journal = {CoRR},
    year = {2020},
    url = {},
    archivePrefix = {arXiv},
}



Last updated in October 2020.