High error rate in Skyline daily's peak picking

graceac9  2018-06-20 16:45
 

I am doing a DIA project with oyster seed proteomic data. The protocol I am following calls for checking Skyline's peak-picking accuracy. I have gone through it a few separate times, each time picking 100 random peptides and scoring each one as 0 (incorrect peak, no peak, or incorrect peak boundaries) or 1 (correct peak and correct boundaries). The error rates for my samples were around 50%.
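
For reference, the scoring described here reduces to a simple proportion. Below is a minimal Python sketch, assuming the 0/1 scores are kept in a list; the example scores are made up and are not the attached data. The normal-approximation interval is only a rough indication of how much a 100-peptide sample can vary from one draw to the next.

```python
import math

def error_rate(scores):
    """Fraction of sampled peptides scored 0 (wrong peak, no peak, or wrong boundaries)."""
    if not scores:
        raise ValueError("no scores given")
    return scores.count(0) / len(scores)

def approx_95_interval(rate, n):
    """Normal-approximation 95% interval for a proportion (rough for small n)."""
    half_width = 1.96 * math.sqrt(rate * (1 - rate) / n)
    return max(0.0, rate - half_width), min(1.0, rate + half_width)

# Hypothetical example: 100 sampled peptides, 50 scored 0 and 50 scored 1.
scores = [0] * 50 + [1] * 50
rate = error_rate(scores)
low, high = approx_95_interval(rate, len(scores))
print(f"error rate = {rate:.2f}, approx. 95% interval = ({low:.2f}, {high:.2f})")
```

With 100 peptides per sample the interval is roughly plus or minus 10 percentage points, so a gap between about 50% and about 30% is larger than sampling noise alone would usually explain.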

Two other people in my lab have done similar projects within the last 1.5 years, and their error rates were around 30%. I have gone through the protocol and checked settings again and again, but my error rates are still very high.

Please let me know if there is any more information I can provide on this to help resolve this issue! Thank you.

Attached is my file containing my peak-picking error rates

 
 
Brendan MacLean responded:  2018-06-22 15:49

Hi Grace,
We will need at least a Skyline document with the peptides you sampled and all of the chromatograms for those. You can post that by using File > Share - Complete to get a ZIP file and post that to

http://skyline.ms/files.url

Then we may be able to learn more about why Skyline is doing so poorly on your data set.

Thanks for reporting this to the Skyline support board.

--Brendan

 
graceac9 responded:  2018-06-25 11:22

Hi Brendan,

Thank you! I just uploaded the ZIP file to the url you provided.

--Grace
 
Nick Shulman responded:  2018-06-25 14:08
Hi Grace,

I see that you have a spectral library with 12755 peptides in it. This library contains theoretical spectra (that is, the spectra have peaks for each b and y ion of the peptide, and the intensity of each peak is exactly 100).
Where did the list of peptides in the spectral library come from? Do you have a sense for what fraction of these peptides you would expect to be able to detect in one of your samples?

When you extract chromatograms from result files, Skyline tries to find the best-looking peak for each peptide. In this first pass of peak finding, Skyline only looks at 7 different characteristics (features) of the peak, including total intensity, number of co-eluting transitions, difference in retention time from prediction, whether there are any MS2 IDs at that time, dot product with the library intensities, and a couple of other characteristics that I am not sure about.
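
To illustrate why a library of theoretical spectra weakens the dot product feature, here is a small Python sketch. It uses a plain cosine similarity, which is an assumption and may differ from the exact normalization Skyline applies; the intensity values are invented for illustration. When every library intensity is 100, the score only measures how uniform the observed transition intensities are, not whether they match a real fragmentation pattern.

```python
import math

def cosine_similarity(observed, library):
    """Normalized dot product between observed and library intensities."""
    if len(observed) != len(library):
        raise ValueError("intensity lists must have the same length")
    dot = sum(o * l for o, l in zip(observed, library))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(sum(l * l for l in library))
    return dot / norm if norm else 0.0

# Theoretical library: every b/y ion has intensity exactly 100.
flat_library = [100.0, 100.0, 100.0, 100.0]

# Hypothetical observed transition areas (made-up numbers).
real_peak = [900.0, 300.0, 150.0, 50.0]   # typical uneven fragment pattern
flat_noise = [40.0, 45.0, 38.0, 42.0]     # low, roughly uniform background

print("real peak :", round(cosine_similarity(real_peak, flat_library), 3))
print("flat noise:", round(cosine_similarity(flat_noise, flat_library), 3))
```

In this made-up case the uniform background scores higher against the flat library than the real-looking peak does, which is the opposite of what the feature is meant to reward.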

If you want Skyline to look at more features, and also if you want Skyline to assign peaks a Q-value (for false-discovery rate), then you should add some decoy peptides to your document, extract chromatograms for the decoys, and then train a peak picking model.
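
As a rough picture of how decoys lead to a Q-value, here is a generic target-decoy sketch in Python. This is not Skyline's implementation (the trained peak picking model estimates these values in its own way); the function name and scores are hypothetical, and it assumes roughly one decoy per target so that the decoy count approximates the number of false targets.

```python
def target_decoy_q_values(target_scores, decoy_scores):
    """Return a q-value for each target score, in the input order."""
    # Tag each score with (score, is_decoy, index into target_scores or None).
    labeled = [(s, False, i) for i, s in enumerate(target_scores)]
    labeled += [(s, True, None) for s in decoy_scores]
    # Best score first; on ties, count the decoy first (conservative).
    labeled.sort(key=lambda item: (-item[0], not item[1]))

    # Estimated FDR if the score threshold were placed at each rank.
    fdr_at_rank = []
    targets = decoys = 0
    for _, is_decoy, _ in labeled:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdr_at_rank.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this rank or any lower-scoring rank.
    q_values = [None] * len(target_scores)
    running_min = 1.0
    for rank in range(len(labeled) - 1, -1, -1):
        running_min = min(running_min, fdr_at_rank[rank])
        _, is_decoy, index = labeled[rank]
        if not is_decoy:
            q_values[index] = running_min
    return q_values

# Hypothetical peak scores (higher is better).
targets = [9.0, 8.0, 7.0, 3.0, 2.0]
decoys = [4.0, 2.5, 1.0]
print(target_decoy_q_values(targets, decoys))
```

Targets that outscore every decoy get a q-value of 0, and the q-value rises as decoys start to appear at similar scores.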

You should take a look at the Advanced Peak Picking Models tutorial:
https://skyline.ms/wiki/home/software/Skyline/page.view?name=tutorial_peak_picking