quick question feature description

quick question feature description Tobi  2020-04-08

Hey everyone,

can you please tell me what exactly the "Library intensity dot-product" feature in the mProphet scoring is? During a recent webinar it was explained that it is the dotp between the data and the library. This raises the question of how it can be part of target-decoy scoring if there are normally no dotp values for the decoys. It is quite important to know, as this is often the attribute with the highest weight.

With best wishes,

Nick Shulman responded:  2020-04-08
When you generate decoys, Skyline remembers the original peptide sequence that the decoy was generated from.

Decoys are generated through a combination of permuting the original peptide sequence, and/or shifting by a certain mass.

The decoy peptide always has the same number of amino acids as the original peptide, so it always has the exact same set of ions (e.g. "y7+"), but the m/z of any particular ion is different between the decoy and original peptide.

When Skyline needs to know what the library intensity is of a particular (e.g. "y7+") decoy transition, Skyline looks at the spectrum of the original peptide, and uses the m/z value of the corresponding ion (i.e. "y7+") in the original peptide.
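
As a rough sketch of the lookup described above (not Skyline's actual code): if library intensities are keyed by ion label, a decoy transition can borrow the intensity of the same-named ion from the original peptide's spectrum. The dictionary contents, the function names, and the plain normalized dot product are all assumptions for illustration; Skyline's own dotp calculation may differ in detail.

```python
import math

# Hypothetical library spectrum of the original (target) peptide,
# keyed by ion label. Intensities are made-up illustration values.
target_library = {"y3+": 1200.0, "y4+": 3000.0, "y5+": 5000.0, "y7+": 800.0}


def decoy_library_intensity(ion_label, source_library):
    """Look up the decoy transition's library intensity under the same
    ion label in the source peptide's library spectrum (assumed scheme)."""
    return source_library.get(ion_label, 0.0)


def normalized_dot_product(observed, library):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(o * l for o, l in zip(observed, library))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(
        sum(l * l for l in library)
    )
    return dot / norm if norm > 0 else 0.0


decoy_ions = ["y3+", "y4+", "y5+"]
library_vector = [decoy_library_intensity(ion, target_library) for ion in decoy_ions]

# Hypothetical peak areas extracted at the decoy's (different) fragment m/z values.
observed_vector = [150.0, 90.0, 200.0]

print(library_vector)
print(normalized_dot_product(observed_vector, library_vector))
```

The point of the sketch is only that the decoy has a well-defined library vector, so a dotp can be computed for it just as for a target.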

-- Nick
Tobi responded:  2020-04-09
Dear Nick,

thank you very much for the response.

Just to make sure I understand you: Skyline generates decoy library spectra by taking the corresponding target library spectra and just shifting the peaks a little left or right to match the new m/z values, while the intensities stay the same. For mProphet, Skyline then calculates the decoy dotp as usual for the scoring.

Do you have data showing how closely those decoy library spectra reflect the same degree of quality as the target library spectra?

Thank you very much and with best regards,
Brendan MacLean responded:  2020-04-09
Hi Tobi,
The original mProphet implementation, which you can read about in the paper, used shifts in the MS/MS m/z dimension. However, it is now far more common to use some permutation of the targeted peptide sequences: 1) shuffled, 2) reversed. Both preserve the C-terminal K or R and permute the remaining amino acid residues. After the permutation, the same ions are targeted (e.g. y5, which likely has a changed but theoretically achievable m/z) and assigned the same intensity as they had in the original library spectra. So, for example:

PEPTIDER++ (478.7 m/z) -> y5+ = TIDER = 633.3 m/z with intensity 5000

reversed decoy

EDITPEPR++ (488.7 m/z) -> y5+ = TPEPR = 599.3 m/z with intensity 5000

shuffled decoy

ITPEEDPR++ (488.7 m/z) -> y5+ = EEDPR = 645.3 m/z with intensity 5000
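
The numbers above can be checked with a few lines of Python. This is only a sketch under stated assumptions: monoisotopic residue masses, unmodified peptides, a helper that reverses everything except the last residue, and a fixed +10 m/z precursor shift for the decoys. The function names are made up for illustration and are not Skyline's.

```python
# Monoisotopic residue masses (Da) for the amino acids used in the example.
RESIDUE_MASS = {
    "P": 97.05276, "E": 129.04259, "T": 101.04768,
    "I": 113.08406, "D": 115.02694, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
DECOY_PRECURSOR_SHIFT = 10.0  # m/z shift applied to decoy precursors (assumed fixed)


def precursor_mz(sequence, charge):
    """m/z of the intact, unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(sequence, n, charge=1):
    """m/z of the y-ion made of the last n residues."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER
    return (neutral + charge * PROTON) / charge


def reverse_decoy(sequence):
    """Reverse all residues except the final K or R."""
    return sequence[:-1][::-1] + sequence[-1]


target = "PEPTIDER"
reversed_seq = reverse_decoy(target)  # EDITPEPR
shuffled_seq = "ITPEEDPR"             # the shuffled example from the post

print(target, round(precursor_mz(target, 2), 1), round(y_ion_mz(target, 5), 1))
for decoy in (reversed_seq, shuffled_seq):
    shifted = precursor_mz(decoy, 2) + DECOY_PRECURSOR_SHIFT
    print(decoy, round(shifted, 1), round(y_ion_mz(decoy, 5), 1))
```

Because a permutation keeps the amino acid composition, the unshifted decoy precursor m/z equals the target's; only the fragment m/z values change with the new residue order.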

I think a good way to gain a better understanding of this is to reduce your Skyline document to just 1 library peptide and then apply the decoy generation algorithms and inspect the results very closely. You can even use the Document Grid.

In Skyline, the precursor m/z is also shifted by 10 m/z, which gives you a decent likelihood of extracting the fragment ions from a different spectrum than the source peptide in DIA. Though, not necessarily when the isolation scheme uses m/z ranges wider than 10 m/z, e.g. the original SWATH method with its 25 m/z ranges.

I will admit this makes me rethink our current decoy criteria, which require only that the sequence is different from any of the other targets. So for the above:

PPETIDER++ (488.7 m/z) -> y5+ = TIDER = 633.3 m/z

is perfectly valid, and yet it would extract exactly the same chromatogram as its source target from the original SWATH isolation scheme, with a spectrum isolated from the m/z range 474.5 to 500.5. And in fact, this decoy might have very nice coelution on y3, y4, and y5, since they all have the same m/z as the source. Maybe a reason to prefer reversed over shuffled.
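
A small check of this concern, again only as a sketch: it assumes monoisotopic masses, the +10 m/z decoy precursor shift, and the single 474.5 to 500.5 isolation window mentioned above. The helper names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids used in the example.
RESIDUE_MASS = {
    "P": 97.05276, "E": 129.04259, "T": 101.04768,
    "I": 113.08406, "D": 115.02694, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728


def precursor_mz(sequence, charge):
    """m/z of the intact, unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(sequence, n):
    """m/z of the singly charged y-ion made of the last n residues."""
    return sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER + PROTON


def shared_y_ions(target, decoy, tol=0.01):
    """y-ion numbers whose m/z agree between target and decoy within tol."""
    return [
        n for n in range(1, len(target))
        if abs(y_ion_mz(target, n) - y_ion_mz(decoy, n)) < tol
    ]


target, decoy = "PEPTIDER", "PPETIDER"
window = (474.5, 500.5)  # isolation window from the example above

target_mz = precursor_mz(target, 2)
decoy_mz = precursor_mz(decoy, 2) + 10.0  # assumed decoy precursor shift
same_window = all(window[0] <= mz < window[1] for mz in (target_mz, decoy_mz))

print(shared_y_ions(target, decoy))
print(same_window)
```

Under these assumptions both precursors land in the same window and y3, y4, and y5 coincide, so the decoy transitions would be extracted from the same spectra at the same fragment m/z as the target.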

Anyway, a little more detail. Hope it is helpful.

Tobi responded:  2020-04-10
Dear Brendan and all,

thank you for the extensive response. Might not have been a quick question after looking at it now (:

The example nicely illustrates what we described before and I was using reverse decoys anyway. I also noticed that the decoy mass shift is applied just to the precursor but not to the fragments. While the shift is usually +10, why is it -23 sometimes when reversing (perhaps based on a DIA isolation scheme)?

Also, I can generate decoys with the same sequence as an existing target peptide, so what are the decoy criteria and how are they applied? I do not see decoys that share a sequence with a target being excluded, either in the target list or in the scoring.

Will take a closer look and come back to you with more questions and ideas.

Thank you very much and all the best,