creating peptide library from uniprot database human

sponce1  2019-05-13 09:37
 

I downloaded all of the file formats (fasta, XML, ts, etc.) available from UniProt for the human proteome:

https://www.uniprot.org/uniprot/?query=proteome:UP000005640 reviewed:yes

and couldn't get them to load in the Peptide settings -> Building Library -> Input Files. I get the error:
"fasta.gz is not a valid library input file."

I've attached my Skyline file, a dataset, and the FASTA file. Please make it work!

thanks.
-Sean

 
 
sponce1 responded:  2019-05-13 09:39

The Skyline document didn't load, so I removed the input file data and resubmitted.

 
Nick Shulman responded:  2019-05-13 10:27
Sean,

I am not sure what you are trying to do.

You would need peptide search results in order to build a spectral library. Here is a list of the peptide search result formats that Skyline can handle:
https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BlibBuild

A spectral library contains a list of peptides and the MS2 spectra that they were detected in. Skyline uses the MS2 spectra to predict how the peptide is likely to fragment.

The thing that you downloaded from uniprot was a FASTA file. A FASTA file contains protein sequences.
You can add all of the proteins from a FASTA file to your Skyline document using the menu item:
File > Import > FASTA

Hope this helps. If this information is not helpful, let us know what you are trying to do.
-- Nick
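
To illustrate the point that a FASTA file holds only protein sequences (headers plus residues, no spectra), here is a minimal sketch of reading one in Python. The function name `read_fasta` and the two toy entries are made up for illustration; only the `>sp|ACCESSION|NAME description` header layout follows the UniProt convention.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

# Toy entries (hypothetical accessions and sequences, not real UniProt records).
TOY_FASTA = """>sp|X00001|TOY1_HUMAN Toy protein one
MKTAYIAKQR
QISFVKSHFS
>sp|X00002|TOY2_HUMAN Toy protein two
MSEQNNTEMT
"""

records = list(read_fasta(io.StringIO(TOY_FASTA)))
for header, seq in records:
    accession = header.split("|")[1]
    print(accession, len(seq))
```

Running the same loop over the downloaded human file (opened with `open(path)`) would show how many proteins File > Import > FASTA is about to add to the document.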
 
Brian Pratt responded:  2019-05-13 10:44
Also you'll need to uncompress that fasta file before Skyline can work with it (.gz indicates that it is compressed with gzip).

- Brian
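
One way to uncompress a `.gz` file without extra tools is Python's standard library. This is a minimal sketch; the helper name `gunzip` and the toy file are assumptions for illustration, and any gzip utility would do the same job.

```python
import gzip
import shutil
import tempfile
from pathlib import Path

def gunzip(src, dst=None):
    """Decompress a gzip file; by default 'x.fasta.gz' becomes 'x.fasta'."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_suffix("")  # drop the trailing '.gz'
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream, so large files are fine
    return dst

# Self-contained demo: write a tiny gzipped FASTA, then decompress it.
tmp_dir = Path(tempfile.mkdtemp())
gz_path = tmp_dir / "toy.fasta.gz"
with gzip.open(gz_path, "wb") as f:
    f.write(b">toy\nMKT\n")

out_path = gunzip(gz_path)
print(out_path.name)
```

For the real download, calling `gunzip` with the path to the downloaded `.fasta.gz` file would leave an uncompressed `.fasta` next to it.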
 
sponce1 responded:  2019-05-15 09:25
I'm totally confused about the inner mechanics of Skyline. The following thread helped me get very, very far: https://skyline.ms/announcements/home/support/thread.view?rowId=2254 Right now I am in the process of making the NIST library.

I have DDA data that was acquired on a Lumos. The data is from human cancer cells, which is why I was trying to use the human FASTA. I tried the FASTA import, but that started adding all the proteins (20k) in that file to my Skyline document (so is importing a FASTA just a way to expedite adding peptides/proteins to my Skyline document?).

I wanted to do full-scan MS1 filtering of the data, but I thought you need a spectral library to match the MS2 against. That's why I was trying to upload the FASTA, but I now realize it contains just protein sequences, not MS/MS data. (My understanding was that matching the MS2 data against the spectral library facilitates peak identification, and Skyline then generates the peak shape by plotting the intensities associated with the matched MS2 spectra over chromatographic time.)

However, since my document has transitions in it, does Skyline automatically filter peaks, i.e. extract the matching coordinates, based on the document's listed transitions for each peptide? Or do you need a correct spectral library along with the transitions I wish to examine (and if so, which one do I download: PeptideAtlas, SpectraST, or NIST)? I also do not know the retention times of my analytes.

Besides the NIST library, I'm also processing all the DDA data in Proteome Discoverer to make a library that way too, and I'll see which works better for me. I wanted to thank you for forcing me to really get into the details of what's going on; sometimes you can read something, but until you do it, it's not quite the same thing.

To each experiment, I added equimolar (10 pmol) heavy-labeled peptides (the 25 peptides in my Skyline document) for quantification. The transitions in my document were chosen as the best flyers by infusion of each standard peptide. I was hoping to use the heavy-labeled peptides to quantify the cognate light peptides by their MS1 intensities, just to get a rough idea (I'm aware of the pitfalls of MS1 quant). I'm not sure if Skyline produces MS2 quantification on DDA data. Then I wanted to use the DDA data (to identify what's in there and get a rough idea of quantification) and switch to targeted MRM to verify my findings. In MRM experiments, I still don't understand the purpose of a spectral library (maybe to provide prominent, specific transitions? Or to verify that your observed transition data correlate with actual, known spectra?).

Sincerely THANKS!
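
The rough quantification described above (a known 10 pmol heavy spike used to estimate the cognate light peptide) comes down to single-point internal-standard arithmetic. A minimal sketch, assuming the light and heavy forms respond equally and that integrated MS1 peak areas are already available; the function name and the example areas are made up for illustration, and this is not Skyline's API:

```python
HEAVY_SPIKE_PMOL = 10.0  # amount of each heavy peptide added, per the post

def light_amount_pmol(light_area, heavy_area, heavy_pmol=HEAVY_SPIKE_PMOL):
    """Estimate the light peptide amount from the light/heavy area ratio.

    Assumes equal ionization response for the light and heavy forms.
    """
    if heavy_area <= 0:
        raise ValueError("heavy_area must be positive")
    return (light_area / heavy_area) * heavy_pmol

# Hypothetical peak areas: light is one quarter of the heavy signal.
estimate = light_amount_pmol(light_area=2.5e6, heavy_area=1.0e7)
print(estimate)  # 2.5
```

With a light/heavy ratio of 0.25 and a 10 pmol spike, the estimate is 2.5 pmol of the light peptide.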