This study proposes ConditionCDVAE+, a deep generative model based on the crystal diffusion variational autoencoder (CDVAE) for the inverse design of van der Waals (vdW) heterostructures. Traditional experimental methods rely on trial and error, and existing models struggle to incorporate target property constraints. This work addresses both challenges through three contributions: (1) introducing the SE(3)-equivariant graph neural network EquiformerV2 as the encoder-decoder within the CDVAE framework to enhance generation quality; (2) designing a module that integrates Low-rank Multimodal Fusion and Generative Adversarial Networks to map properties and structures into a joint latent space; and (3) proposing, for the first time, a generative model for vdW heterostructures and validating it on a dataset constructed from Janus III-VI vdW heterostructures. Experiments demonstrate that ConditionCDVAE+ achieves the best root mean square error for crystal reconstruction, along with improved generation quality. Density functional theory calculations confirm that 99.51% of generated samples converge to energy minima, indicating superior ground-state convergence. The effectiveness of the model under conditional guidance has also been extensively validated. This framework provides an efficient solution for the target-oriented design of vdW heterostructures and holds promise for accelerating the development of novel optoelectronic devices.
© 2025. The Author(s).