An improvement of shotgun proteomics analysis by adding next-generation sequencing transcriptome data in orange

PLoS One. 2012;7(6):e39494. doi: 10.1371/journal.pone.0039494. Epub 2012 Jun 29.

Abstract

Background: Shotgun proteomics data analysis usually relies on database searching. Because the commonly used protein sequence databases for most species do not contain sufficient protein information, applying shotgun proteomics to protein sequence profiling remains a major challenge, especially for species whose genomes have not yet been sequenced.

Methodology/principal findings: In this paper, we present a workflow with an integrated database to partly address this problem. First, we downloaded a homologous species database. Next, we identified the transcriptome of the sample, created a protein sequence database based on the transcriptome data, and integrated it with the homologous species database. Lastly, we developed a workflow for identifying peptides simultaneously from shotgun proteomics data.
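The database integration step can be illustrated with a minimal sketch. The abstract does not say how the two databases were combined, so everything here is an assumption made for illustration: both databases are taken to be protein FASTA files, records are merged by exact sequence identity, and the function names `parse_fasta` and `integrate_databases`, the `homolog|`/`transcriptome|` header prefixes, and the example sequences are hypothetical rather than taken from the paper.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def integrate_databases(homolog: Iterable[Record],
                        transcriptome: Iterable[Record]) -> List[Record]:
    """Merge two protein databases, keeping one record per unique sequence.

    Assumption: exact sequence identity (after upper-casing and dropping a
    trailing stop symbol '*') is the de-duplication criterion. Homolog
    records are read first, so they win when a sequence occurs in both.
    """
    seen: Dict[str, str] = {}  # sequence -> tagged header, insertion-ordered
    for source, records in (("homolog", homolog),
                            ("transcriptome", transcriptome)):
        for header, seq in records:
            key = seq.upper().rstrip("*")
            if key and key not in seen:
                seen[key] = f"{source}|{header}"
    return [(header, seq) for seq, header in seen.items()]


# Toy input with made-up sequences, standing in for the two real databases.
homolog_fasta = """>H1 hypothetical homolog protein
MKTAYIAK
QRQISFVK
>H2 another homolog
MSLLTEVE
""".splitlines()

transcript_fasta = """>T1 translated contig
MKTAYIAKQRQISFVK*
>T2 novel translated contig
MAPLNQWE
""".splitlines()

merged = integrate_databases(parse_fasta(homolog_fasta),
                             parse_fasta(transcript_fasta))
for header, seq in merged:
    print(f">{header}\n{seq}")
```

In this toy input, record T1 duplicates H1 and is dropped, while T2 contributes a sequence that is absent from the homolog database. Tagging each header with its source keeps it possible to tell, after the search, which identifications came only from the transcriptome-derived entries.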

Conclusions/significance: We used datasets from orange leaf samples to demonstrate our workflow. The results showed that, for orange shotgun proteomics data analysis, the integrated database had a clear advantage over the homologous species database, with an 18.5% increase in the number of proteins identified.

MeSH terms

  • Citrus sinensis / genetics*
  • Citrus sinensis / metabolism*
  • Databases, Genetic
  • Databases, Protein*
  • Gene Expression Regulation, Plant
  • Molecular Sequence Annotation
  • Peptides / metabolism
  • Plant Leaves / genetics
  • Plant Leaves / metabolism
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Proteomics / methods*
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Transcriptome / genetics*

Substances

  • Peptides
  • Plant Proteins