SWATH Identification

SWATH Identification Lehnert  2019-01-10

Dear Brendan,

we have done a SWATH project with a large library consisting of thousands of proteins and are now trying to analyze our DIA run (HCP-Project). Unfortunately, Skyline identifies and quantifies almost every protein present in the library, although for most peptides there is only background noise. So far I have been unable to remove those false positives, since I have not found a way to define a peak detection threshold. Is there any way to do that (or what am I doing wrong)? If not, what can I do to remove those false positives?

Thanks and best regards,


Nick Shulman responded:  2019-01-10
It sounds like you might want to add decoys to your dataset, and then train an mProphet model in order to come up with a false discovery rate.
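
To illustrate the idea behind decoys and a false discovery rate, here is a minimal Python sketch of a target-decoy q-value estimate. It is not Skyline's or mProphet's implementation: mProphet additionally learns how to combine several peak features into a single score, whereas this sketch assumes each peak already has one score where higher means better. The function name `target_decoy_q_values` and the input layout are hypothetical.

```python
from typing import List, Tuple


def target_decoy_q_values(peaks: List[Tuple[float, bool]]) -> List[float]:
    """Estimate q-values from (score, is_decoy) pairs.

    Assumes a higher score means a better peak. Returns one q-value per
    input peak, in the same order as the input.
    """
    # Rank peaks from best score to worst score.
    order = sorted(range(len(peaks)), key=lambda i: peaks[i][0], reverse=True)

    # Walk down the ranking and estimate the FDR at each score threshold
    # as (decoys accepted so far) / (targets accepted so far).
    fdrs = []
    targets = 0
    decoys = 0
    for i in order:
        if peaks[i][1]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value of a peak is the smallest FDR at which it is still
    # accepted, so take a running minimum from the worst score upward.
    q_values = [0.0] * len(peaks)
    running_min = float("inf")
    for pos in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[pos])
        q_values[order[pos]] = min(running_min, 1.0)
    return q_values


# Made-up scores: True marks a decoy peak.
example = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (2.0, True)]
q = target_decoy_q_values(example)
accepted = [s for (s, is_decoy), qv in zip(example, q) if not is_decoy and qv <= 0.01]
```

With these made-up scores, only the two targets that score above the best decoy pass a 0.01 cutoff.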

Take a look at the Advanced Peak Picking tutorial:

If that's not what you're looking for, can you be more specific about what you were hoping Skyline would see in your library to determine which peptides to not quantify?
-- Nick

Lehnert responded:  2019-01-17
Dear Nick,

thank you very much for your answer.

mProphet (with reverse decoys) works just fine with a very small library (only 100 proteins from one DDA run), producing reasonable numbers.

When I use all my DDA data (42 runs: 21 from supernatants and 21 from cell pellets) and apply mProphet, almost nothing gets identified except the two main proteins, no matter what q-value I use (from 0.001 to 0.9). That is impossible given the nature of the sample (when mProphet is not used, everything gets identified). I attached the p-value distribution graph of the sample after training.

When I use only the supernatant files (21 files; the sample is a cleaned supernatant), I again get reasonable numbers (q-value distribution also attached). However, we need to use all files.

I tried absolute and indexed retention times (since we also have samples with shorter gradients), and the results stay the same. Is there any possibility of getting reasonable results with all DDA files? What q-value cutoff is reasonable (I normally use 0.01)?

Thank you very much in advance and best regards,