Image fusion aims to combine the complementary features of different modalities to produce an informative fused image. Because of their different imaging mechanisms, information conflicts may arise between infrared and visible source images. Existing infrared and visible image fusion methods are devoted to preserving the features of the source images as much as possible; however, the handling of conflicting information is often overlooked. We therefore leverage the powerful generative priors of diffusion models and propose a dual-diffusion structure, termed DSFuse, to handle conflicting information and achieve feature fidelity during the image fusion process. Diffusion modules are introduced to guide the fusion network toward the meaningful information in the source images. First, the fusion network retains as many source features as possible in the fused image. Then, the diffusion modules reconstruct the source images from noise, conditioned on the output of the fusion network. Finally, feedback from the diffusion modules forces the fusion network to aggregate modality information and ensure fidelity; in turn, a high-quality fusion result benefits the reconstruction performed by the diffusion modules, forming a positive feedback loop. In addition, we release a new infrared/visible fusion dataset, the multiscene infrared and visible (MSIV) image dataset, to support fusion network training and evaluation. Extensive experiments demonstrate that DSFuse outperforms other state-of-the-art (SOTA) fusion methods.
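
To make the feedback structure concrete, below is a minimal sketch, not the authors' implementation, of how a fusion network and two conditional diffusion denoisers could be trained jointly: the denoisers try to recover each source modality from noise conditioned on the fused image, and their losses are back-propagated into the fusion network. All module names, loss terms, and shapes are assumptions for illustration.

```python
# Hypothetical sketch of the dual-diffusion feedback idea (not the DSFuse code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyUNet(nn.Module):
    """Placeholder conditional denoiser: predicts noise from (noisy source, fused)."""
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x_noisy, fused):
        return self.net(torch.cat([x_noisy, fused], dim=1))


class FusionNet(nn.Module):
    """Placeholder fusion network: maps (ir, vis) to a single fused image."""
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, ir, vis):
        return self.net(torch.cat([ir, vis], dim=1))


def diffusion_feedback_loss(denoiser, source, fused, alpha_bar):
    """One DDPM-style step: noise the source, predict the noise conditioned on
    the fused image, and return the denoising loss (the feedback signal)."""
    noise = torch.randn_like(source)
    x_noisy = alpha_bar.sqrt() * source + (1 - alpha_bar).sqrt() * noise
    return F.mse_loss(denoiser(x_noisy, fused), noise)


fusion, diff_ir, diff_vis = FusionNet(), TinyUNet(), TinyUNet()
opt = torch.optim.Adam(
    list(fusion.parameters())
    + list(diff_ir.parameters())
    + list(diff_vis.parameters()),
    lr=1e-4,
)

ir, vis = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)  # dummy batch
alpha_bar = torch.tensor(0.7)  # noise level at a randomly sampled timestep

fused = fusion(ir, vis)
# A simple fidelity term keeps source features in the fused image; the two
# diffusion losses force the fused image to carry enough modality information
# to reconstruct each source, closing the feedback loop described above.
loss = (
    F.l1_loss(fused, torch.maximum(ir, vis))
    + diffusion_feedback_loss(diff_ir, ir, fused, alpha_bar)
    + diffusion_feedback_loss(diff_vis, vis, fused, alpha_bar)
)
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the gradients of both reconstruction losses flow through `fused` into the fusion network, which is one plausible way to realize the feedback the abstract describes; the actual conditioning scheme, loss weighting, and timestep sampling in DSFuse may differ.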