Comparing Peak Scoring / Sample Specific Peak Scoring Models

peter r mosen  2021-06-17

Hello Skyline team, hello Skyline community,

Is there a tutorial or similar resource that covers Skyline's "Comparing Peak Scoring" function? I would like to better understand the plots (e.g., the different y-axis options) as well as their application. Tutorials 14, 15, and 18 cover peak model building in general, but I didn't find anything on the "Comparing Peak Scoring" option.

Briefly, my situation: in my targeted MRM assay I am monitoring peptides across multiple organs (9) and cell lines (5). For data analysis I am using mProphet peak scoring models. I was wondering whether it is best to train a separate peak scoring model for each organ type, or whether a mixed-organ (or mixed-cell-line) peak scoring model can be applied to the individual organs. Does anyone have experience with this? Any suggestions or hints?

Best, Peter

Nick Shulman responded:  2021-06-17
I do not know the answer to your question but if you send us your Skyline document we can take a look at it.

In Skyline you can use the menu item:
File > Share
to create a .zip file containing your Skyline document and supporting files including extracted chromatograms.

If that .zip file is less than 50MB you can attach it to this support request.
Otherwise you can upload it here:

Did you collect data for decoy peptides when you acquired your MRM data? It is always best to be able to use decoys when you are training your mProphet model. If you are using "second best peaks" you probably will not be able to trust the false discovery rate at all.

My guess would be that it is best to use one peak scoring model for all of your organ types, but I could definitely be wrong.

In general, if you trust your decoys, then whatever peak scoring model produces the best separation of targets and decoys is the best to use.
I am not very familiar with the Compare Peak Scoring window. I am not sure whether that window will be useful to you, or maybe it was only intended to be useful to the programmer who was working on implementing the mProphet peak scoring feature in Skyline.
-- Nick
peter r mosen responded:  2021-06-21
Hi Nick,
Thank you for your quick reply, and please excuse my delayed response. Due to the size of our MRM assay (> 400 peptides, 800 transitions), we were not able to include and measure decoy peptides. We now plan to use the second-best-peak option, and I have performed some (in my eyes) successful tests with it. The composite score distribution and the p/q value distributions looked good. Could you briefly expand on your statement that we "will not be able to trust the FDR at all"? I don't understand it.

Screening through older Skyline posts (e.g., "Compare mProphet models" and "How are "second best peaks" determined during mProphet modeling?"), I understood from Brendan's comments that the second-best-peak option is perhaps not the best solution (compared to real decoys) but is still a valid approach. Or can second-best-peak scoring only be applied to full-scan approaches (DIA/PRM)?

Regarding the generation of individual peak scoring models (per tissue/organ) or mixed scoring models, I will get back to you.
Thank you for your help, Best, Peter
Nick Shulman responded:  2021-06-21
The way that Skyline figures out the false discovery rate for a given cutoff score is, I believe, by looking at what fraction of the peaks with a score greater than that cutoff are decoys. When you are using second best peaks as the decoy strategy, every target (i.e. best) peak is always going to have a higher score than its corresponding decoy (i.e. second-best) peak. For this reason, in the "model scores" graph, the Targets distribution would always be shifted a little to the right relative to the second best peak distribution, even if none of the target peptides were present in the sample. This is the main reason that I don't trust the false discovery rate with second-best-peaks.
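To make this concrete, here is a small illustrative sketch (not Skyline's actual code) of the naive decoy-counting FDR estimate described above, applied to a simulated second-best-peak scenario in which no target peptide is truly present. Because the best peak always outscores the second-best peak from the same chromatogram, the estimated FDR comes out deceptively low even though the true FDR is 100%.

```python
import random

def decoy_fdr(target_scores, decoy_scores, cutoff):
    """Naive decoy-counting FDR estimate: the fraction of peaks scoring
    at or above the cutoff that are decoys (illustrative only)."""
    targets_above = sum(1 for s in target_scores if s >= cutoff)
    decoys_above = sum(1 for s in decoy_scores if s >= cutoff)
    if targets_above == 0:
        return 0.0
    return decoys_above / targets_above

# Simulate the second-best-peak pathology: every "target" is the best of
# five pure-noise candidate peaks, every "decoy" the second best, so the
# target distribution is shifted right even though nothing is real.
random.seed(0)
noise = [sorted(random.gauss(0, 1) for _ in range(5)) for _ in range(1000)]
targets = [peaks[-1] for peaks in noise]  # best peak per chromatogram
decoys = [peaks[-2] for peaks in noise]   # second-best peak

# The estimate is far below the true FDR of 1.0 (all peaks are noise).
print(decoy_fdr(targets, decoys, cutoff=1.5))
```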

The model training relies on the assumption that the feature scores for incorrect peak identifications will be randomly distributed. For this reason, if a larger-than-expected-by-chance gap is seen between targets and decoys for a particular score, it is assumed that the reason for this gap is that some of the target peaks are correct identifications. This ends up being a bit of a problem for the score "Retention Time Difference". The best peak and the second best peak cannot have the same retention time, because they cannot overlap with each other. Because of this, the target and decoy "Retention time difference" scores end up having an artificially large gap between them, and will receive an unduly large weighting when you train a model using second best peaks. You should uncheck the "Retention time difference" score if you are using "second best peaks" as your decoy strategy.
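The retention-time artifact can also be sketched in a few lines (again, purely illustrative, not Skyline's implementation). Because the best and second-best peaks cannot overlap, they never share a retention time, so the peak nearer the predicted RT always beats the other on the "Retention time difference" score, producing a gap even for pure-noise chromatograms:

```python
import random

random.seed(1)
predicted_rt = 20.0  # hypothetical predicted retention time, in minutes

# For each simulated chromatogram, draw two non-overlapping candidate
# peaks at distinct retention times. The two |RT - predicted RT| scores
# can never tie, so "best" and "second best" are artificially separated.
target_rt_diffs, decoy_rt_diffs = [], []
for _ in range(1000):
    rts = random.sample(range(10, 31), 2)  # two distinct peak RTs
    diffs = sorted(abs(rt - predicted_rt) for rt in rts)
    target_rt_diffs.append(diffs[0])  # closer peak plays the "target"
    decoy_rt_diffs.append(diffs[1])   # farther peak plays the "decoy"

mean_gap = sum(d - t for t, d in zip(target_rt_diffs, decoy_rt_diffs)) / 1000
print(f"mean artificial RT-difference gap: {mean_gap:.2f} min")
```

The positive mean gap is what model training would misread as evidence that the score discriminates real peaks from noise, which is why that score should be unchecked with second-best peaks.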
Hope this helps,
-- Nick
Brendan MacLean responded:  2021-07-28
Nick says, "I believe, by looking at what fraction of the peaks with a score greater than that cutoff are decoys."

This is not true. Peaks are assigned p values based on a normal distribution fit to the decoy distribution. Each round of unsupervised learning is computed using all second-best scoring peaks against only the highest scoring peaks above a cut-off, which reaches 0.01 by the end.
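The normal-fit approach described above can be sketched as follows (an illustrative stdlib-only approximation; the function name and simulated scores are mine, not Skyline's):

```python
import math
import random
import statistics

def normal_p_values(target_scores, decoy_scores):
    """Fit a normal distribution to the decoy scores and assign each
    target peak the p value P(decoy score >= target score) under that
    fit (a sketch of the approach described, not Skyline's code)."""
    mu = statistics.mean(decoy_scores)
    sigma = statistics.stdev(decoy_scores)
    # Survival function of the fitted normal, via the complementary
    # error function: sf(x) = 0.5 * erfc((x - mu) / (sigma * sqrt(2)))
    return [0.5 * math.erfc((s - mu) / (sigma * math.sqrt(2)))
            for s in target_scores]

# Simulated composite scores: decoys around 0, targets shifted right.
random.seed(2)
decoy_scores = [random.gauss(0.0, 1.0) for _ in range(500)]
target_scores = [random.gauss(2.5, 1.0) for _ in range(500)]
p_values = normal_p_values(target_scores, decoy_scores)
print(min(p_values), max(p_values))
```

Fitting a parametric null to the decoys gives smoother p values in the tail than simply counting decoys above each cutoff, which matters at strict thresholds like 0.01 where few decoys remain.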

Otherwise, I mostly agree with what Nick has said.