U1 small nuclear RNA (snRNA) mutations are recurrent non-coding alterations found in various malignancies, yet identifying them has proven challenging because the U1 snRNA loci are highly repetitive. We characterized the complex interindividual diversity and genomic architecture of U1 snRNA loci using sequencing data and a pangenome reference. Our analysis uncovered copy number variations and diverse single-nucleotide variants in regions not predicted to have significant functional impact. Compared with traditional linear reference-based mutation analyses, the pangenome graph achieved the best accuracy and identified previously undetectable mutations. This underscores the utility of pangenome graph references for cancer genome research, particularly in repetitive and highly diverse genomic regions. Additionally, we developed three mutation detection methods: targeted capture sequencing, rapid quantitative polymerase chain reaction, and a machine learning approach based on splicing patterns, all of which identified U1 snRNA mutations with high precision. Our findings elucidate the structural complexity of U1 snRNA loci and establish robust methodologies for precise mutation detection in these regions.
Keywords: diagnostic methods; graph genome; medulloblastoma; pangenome reference; segmental duplications.
© 2025 The Author(s). Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.