What is PITDB?
PITDB is a publicly available database for sharing results from published PIT (proteomics informed by transcriptomics) experiments. The PIT approach [1] analyses a given sample by both RNA-seq and proteomic mass spectrometry, then integrates the acquired data at the sequence level to show which genomic elements are being transcribed and translated in that sample. The benefit of this approach is that any expressed polypeptide can be detected, unlike traditional proteomics, which is constrained to searching spectra against a list of known protein sequences.
What is a translated genomic element (TGE)?
We define a TGE as an amino acid chain produced from a genomic locus through the processes of transcription and translation. The typical example of a TGE is a protein encoded by a gene. However, other types of TGE have been reported in the literature, such as the products of short open reading frames (sORFs) and of so-called non-coding RNA (ncRNA).
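To make the definition concrete, the following is a minimal sketch of how candidate amino acid chains could be read off a transcript. It is not the PITDB pipeline: the function name `candidate_tges`, the ATG-to-stop definition of an open reading frame, the restriction to the three forward frames, and the `min_length` parameter are all simplifying assumptions made for illustration. In a PIT experiment, a candidate would only count as a TGE once peptide evidence from mass spectrometry supports it.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def candidate_tges(transcript, min_length=2):
    """Translate ATG-to-stop ORFs in the three forward frames (illustrative only)."""
    seq = transcript.upper().replace("U", "T")
    chains = []
    for frame in range(3):
        chain = None  # None while we are outside an open reading frame
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an ambiguous codon
            if chain is None:
                if aa == "M":          # start codon opens a new chain
                    chain = ["M"]
            elif aa == "*":            # stop codon closes the chain
                if len(chain) >= min_length:
                    chains.append("".join(chain))
                chain = None
            else:
                chain.append(aa)
    return chains
```

Lowering `min_length` lets very short chains through, which is the range where sORF products would fall; an ORF that is still open at the end of the transcript is discarded in this sketch.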
How do I browse PITDB?
You can browse the experiments and samples that make up PITDB by clicking Browse in the top right of the PITDB web interface. More usefully, you can access specific information in PITDB via four simple searches:
- Search by TGE to find PIT evidence for a specific translated genomic element (TGE) by searching for its accession number (e.g. TGE0000273) or amino acid sequence. Each TGE is supported by one or more TGE observations, comprising a transcript from RNA-seq and peptide evidence of translation of the transcript from mass spectrometry. Each TGE is also assigned a type (e.g. known variant) through comparison with reference databases. A single TGE (as defined by its unique sequence) may have been observed across multiple species.
- Search by experiment or sample (e.g. EXP000001 or SAMP000001) to view summary information for a particular experiment that has been submitted to PITDB. An experiment may contain one or more samples. The summary includes the number of TGEs identified, their distribution among the samples, and a brief description of the experiment.
- Search by protein or gene name to find TGEs corresponding to a specific protein, identified by its UniProt ID (e.g. Q52L50). This shows any TGEs corresponding to the protein and is useful when seeking experimental evidence for a theoretical protein or investigating isoforms of known proteins. For species with a genome assembly, TGEs are shown in their genomic context alongside their corresponding genes.
- Search by species to find all TGEs observed for a particular species. For non-model organisms this essentially provides a draft proteome for further analysis. However, it is important to note that the completeness of the proteome will depend on the diversity of samples analysed, e.g. whether they came from multiple tissues.
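The relationships described in the four searches can be summarised as a small data model. This is only a sketch inferred from the descriptions above, not PITDB's actual schema: every class and field name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TGEObservation:
    """One observation of a TGE: a transcript plus peptide evidence."""
    sample: str           # sample accession, e.g. "SAMP000001"
    species: str          # species the sample came from
    transcript: str       # RNA-seq transcript identifier (hypothetical field)
    peptides: List[str]   # mass spectrometry peptides supporting translation


@dataclass
class TGE:
    """A translated genomic element, defined by its unique sequence."""
    accession: str        # e.g. "TGE0000273"
    sequence: str         # amino acid sequence
    tge_type: str         # e.g. "known variant"
    observations: List[TGEObservation] = field(default_factory=list)

    def species(self) -> Set[str]:
        """Species in which this TGE has been observed."""
        return {obs.species for obs in self.observations}


@dataclass
class Experiment:
    """A submitted experiment containing one or more samples."""
    accession: str        # e.g. "EXP000001"
    description: str
    samples: List[str] = field(default_factory=list)
```

Keeping the species on the observation rather than on the TGE reflects the point that one sequence may be observed in several species.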
In all the database views, specific TGEs can be located using the search box just above the TGE table.
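The example identifiers quoted for the searches follow recognisable formats, so a script preparing queries could sort them by kind before searching. The sketch below assumes the digit counts seen in the examples (seven after TGE, six after EXP and SAMP) hold for every accession, which this page does not confirm. The UniProt pattern is the accession format published by UniProt. The function name `classify_query` and the returned labels are hypothetical.

```python
import re

# Patterns inferred from the example identifiers; digit counts are assumptions.
QUERY_PATTERNS = [
    ("tge", re.compile(r"TGE\d{7}")),
    ("experiment", re.compile(r"EXP\d{6}")),
    ("sample", re.compile(r"SAMP\d{6}")),
    # UniProt accession format as documented by UniProt.
    ("protein", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    )),
]


def classify_query(query):
    """Return a label for the kind of identifier, or "other" if none match."""
    text = query.strip().upper()
    for label, pattern in QUERY_PATTERNS:
        if pattern.fullmatch(text):
            return label
    return "other"  # e.g. a gene name, species name, or amino acid sequence
```

Anything labelled "other" is left unresolved on purpose: a short string of letters could equally be a gene name or an amino acid sequence, and the format alone cannot tell them apart.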
Downloading data from PITDB
Detailed information from a particular database view can be downloaded for further analysis by clicking the relevant Download button above the table of interest, then selecting the items you want and the format to download them in.
Submitting data to PITDB
We welcome submissions of data to PITDB in support of published work. To begin a submission, click Submission in the top right of the PITDB web interface and follow the instructions.
References
[1] Evans, V.C., et al., De novo derivation of proteomes from transcriptomes for transcript and protein identification. Nature Methods, 2012. 9(12): p. 1207-1211.