How are "second best peaks" determined during mProphet modeling?

a. schroeder  2019-02-02

Dear Skyline Team,

During mProphet modeling, "second best peaks" can be used instead of (or in parallel with) decoy peptides.
As far as I can tell, Reiter et al. mention only decoy peptides in their mProphet publication.

Is the use of second best peaks Skyline-specific?
And how are they determined during modeling?
Basically: What are second best peaks?

Thanks a lot for your great work!

Brendan MacLean responded:  2019-02-04

Hi Ayla,
They are always calculated during modeling by scoring all peaks with the current model (a bootstrap model is used in the first iteration). The "target" value for each target (i.e. peptide) is then its best-scoring peak, and the "second best" decoy value is its second-best-scoring peak. The assumption is that if the best peak is the true peak, the second-best peak cannot also be a true peak, which means it is a random occurrence.
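
To illustrate the selection described above, here is a minimal Python sketch. It is not Skyline's code: the function name, the (peptide, score) input layout, and the example scores are assumptions made for illustration, and the scores are taken as already computed by the current model.

```python
from collections import defaultdict

def split_targets_and_second_best(peaks):
    """Split candidate peaks into target and second-best (decoy) scores.

    peaks: iterable of (peptide, score) pairs, one pair per candidate peak.
    This layout is hypothetical; it only mirrors the description in the post.
    """
    by_peptide = defaultdict(list)
    for peptide, score in peaks:
        by_peptide[peptide].append(score)

    targets, second_best = {}, {}
    for peptide, scores in by_peptide.items():
        ranked = sorted(scores, reverse=True)
        targets[peptide] = ranked[0]          # best-scoring peak
        if len(ranked) > 1:                   # a peptide may have only one peak
            second_best[peptide] = ranked[1]  # second-best peak, used as decoy
    return targets, second_best

# Made-up scores for two peptides
peaks = [
    ("PEPTIDEK", 3.2),
    ("PEPTIDEK", 0.4),
    ("PEPTIDEK", -1.1),
    ("SAMPLER", 2.5),
]
targets, second_best = split_targets_and_second_best(peaks)
```

In this sketch a peptide with only one candidate peak contributes a target score but no second-best score. Since the model changes between iterations, the selection would be repeated each time the peaks are re-scored.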

The tricky thing with these peaks is their retention time, which is by definition not independent of the target peaks, since the two can't occupy the same retention time. So, if you want the second best peaks to behave like a truly random set of decoy scores that is independent of the targets, you should exclude retention time scores from the model.
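
As a rough sketch of what excluding retention time scores means for a linear scoring model: the feature names and weights below are hypothetical placeholders, not Skyline's actual score names. In Skyline itself you would make this choice in the peak scoring model settings rather than in code.

```python
# Hypothetical names for retention-time-based features; not Skyline's names.
RT_FEATURES = ("retention_time_difference", "retention_time_difference_squared")

def drop_retention_time_features(features, rt_names=RT_FEATURES):
    """Return a copy of the feature dict without retention-time-based scores."""
    return {name: value for name, value in features.items() if name not in rt_names}

def linear_score(features, weights):
    """Weighted sum over the features that remain in the model."""
    return sum(weights[name] * value for name, value in features.items())

# Made-up feature values for one candidate peak
features = {"intensity": 2.0, "coelution": 1.0, "retention_time_difference": 5.0}
# Weights cover only the features kept in the model
weights = {"intensity": 0.5, "coelution": -1.0}

kept = drop_retention_time_features(features)
score = linear_score(kept, weights)
```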

That's my thinking anyway. This feature just comes out of experimental thinking at the time the mProphet support was added to Skyline. I have never seen a paper that proves it works or that supports my assertion above. Some people have had success with it, and if you are trying to use mProphet with true targeted methods where you forgot to include decoy targets, then it may be your only option. It has been used in published papers.

Hope this helps. Sorry, I can't make any claims about the validity of models built with this feature. Use it at your own discretion.


a. schroeder responded:  2019-02-06

Hi Brendan,

Thank you very much for your detailed response!