mProphet discrimination between heavy and light pairs

mProphet discrimination between heavy and light pairs Fabian  2018-11-21

Dear Skyline team and users,

I have a DIA data set containing peptides with synthetic heavy peptides. In every sample the heavy peptide is well identifiable. The endogenous light variants are not well identifiable in every sample. In such cases I would like to have missing values for the light variant.
To do so, I trained an mProphet model which showed nice discrimination between forward peptides and decoys (the score distributions were also Gaussian-shaped).
I was surprised to see that Skyline kept all integrations in the light variants, although they were sometimes pretty bad.
I figured out that when I delete a heavy peptide (so the light variant has no heavy counterpart), not all peaks are integrated in the light variant. So in these cases it seemed to work, although not as discriminatively as I had hoped, but that's another question...

So it seems to me that the scoring is not done separately for H and L?

Kind regards

Brendan MacLean responded:  2018-11-21

Hi Fabian,
Yes, Skyline scores a range of time in which your target peptide is believed to be eluting. If correct, that makes all measurements on your targets valid and usable in a strict analytical chemistry sense. I don't understand your reasoning for why you would want to assign your light measurements a "missing value" status, simply because the signal you see on them during the time the peptide is known to be eluting is not well formed. I know from the statisticians working on MSstats that if you use MSstats for quantitative analysis after using this approach, then MSstats will be forced to impute a value for the signal of those missing values, and I know they will assume they are missing due to left-censoring (i.e. the signal is too low to be measured). The simplest imputation strategy, then, is to assign those values zero, but there are other more complicated methods for assigning a lower-bound value to a missing value, all in an effort to reduce bias due to the imputation method. Certainly, you can see that assigning zero will introduce bias toward lower than actual measurements since zero is as low as you can possibly go. Everything below your FDR threshold for L jumps immediately to zero. Pretty hard to imagine using that in a rigorous response curve setting for calculating LOD and LOQ.
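
A small numeric illustration of the downward bias described above. The peak areas and the cut-off are invented for the example, and the cut-off simply stands in for whatever scoring threshold would turn a measurement into a missing value:

```python
from statistics import mean

# Hypothetical light peak areas for one peptide across six samples.
true_areas = [120.0, 95.0, 40.0, 30.0, 15.0, 10.0]

# Hypothetical cut-off: anything below it fails scoring and becomes missing.
cutoff = 35.0

# Simplest imputation strategy: every missing value is replaced by zero.
zero_imputed = [a if a >= cutoff else 0.0 for a in true_areas]

mean_true = mean(true_areas)
mean_imputed = mean(zero_imputed)
bias = mean_imputed - mean_true  # negative: the estimate is pulled downward

print(f"mean of real areas:    {mean_true:.2f}")
print(f"mean after zero-fill:  {mean_imputed:.2f}")
print(f"bias from imputation:  {bias:.2f}")
```

Every value under the cut-off drops straight to zero, so the mean can only move down.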

My understanding of the more rigorous response curve setting for calculating figures of merit for an analyte is that you want the true signal on your light channel even (and especially) when you are measuring a blank sample where you know the analyte is not present, because that gives you useful information about the chemical noise your chosen transitions allow even when your analyte is present.

So, why exactly would you want to coerce your light signal to a missing value as soon as that signal becomes poorly formed? That seems to me like it has a serious problem with left-biasing your data and more for the transitions with the most interference since they will fail scoring cut-offs with higher analyte signal because of the interference.

True, we accept this limitation in unlabeled DIA when we can't truly know the time of elution for an analyte, but I can't understand why you would want to imitate it in an experiment with heavy labeled standards that clearly demarcate the analyte elution time. I guess that is why the Skyline implementation works the way it does, because it gets a lot of use in that more rigorous quantitative setting. Can you explain why you think it is wrong or inferior to assigning a score to light as if it were part of an unlabeled experiment?

Thanks for posting to the Skyline support board.


Fabian responded:  2018-11-21

Dear Brendan,

thank you for your fast response!

My reasons are the following:

Of course, having the heavy variant has the advantage that the peak boundaries are also known for the light variant. This also (in principle) allows one to tolerate signals with lower S/N and worse peak shape than without heavy variants.
Nevertheless, especially in heterogeneous and complex samples (in our case FFPE slices from different human individuals), there could be other signals in the same m/z and time frame, i.e. interference. This can really become an issue and can screw up the quantification, especially for low-abundance targets!

Quantifying only peaks that pass a certain quality filter, and having missing values in samples in which the signal was too poor, absent, or only interference, allows me to do statistical imputation. This means I can add values which are below the lowest quantified value.
In that case, I can make sure that interference does not screw up the quantification!
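
A minimal sketch of the kind of imputation described above, where missing values are filled with values no higher than the lowest quantified one. The post does not say which imputation method is used, so the uniform draw, the function name `impute_below_minimum`, and the `fraction` parameter are all assumptions made for illustration:

```python
import random


def impute_below_minimum(values, fraction=0.5, seed=0):
    """Replace None entries with random draws at or below the lowest quantified value.

    `fraction` (hypothetical parameter) sets the lower end of the draw range
    as a fraction of the minimum observed value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("nothing was quantified, so there is no minimum to impute below")
    lowest = min(observed)
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [v if v is not None else rng.uniform(fraction * lowest, lowest) for v in values]


# Hypothetical light areas; None marks samples that failed the quality filter.
areas = [850.0, None, 410.0, None, 1200.0]
filled = impute_below_minimum(areas)
print(filled)
```

Quantified values pass through unchanged; only the `None` entries are replaced.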

If you like I can send you some examples?

Just to add, personally I think it makes total sense to perform match between runs or matching of H to L, for instance in cases where the light variant did not pass the global Q-value. But I think it is much better to work with a relaxed Q-value than without any quality control. There is a point at which statistical imputation becomes superior to raw-data-based "imputation" because of interference.

Kind regards


Brendan MacLean responded:  2018-11-21

Hi Fabian,
But that same interference is in your measurements even when your L signal is high. It just becomes more apparent when your L signal is low. So, you are inflating your measurements when signal is higher and probably deflating them when the interference distorts your signal to the point that your scoring model no longer trusts your measurement.

The original mProphet paper doesn't cover what you are requesting, and I think you would really have to create 2 mProphet models to achieve it. One model would treat your light precursors as a label-free experiment, as you said you are mimicking by deleting all of the heavy precursors, and the other model would be either only heavy precursors or all of the scores allowed by having both (some of which are proposed in the original mProphet paper). You certainly can't just use a model trained on heavy only for scoring your light precursors.

Is there a paper you know of that describes this and proves it works for quantitative analysis?

I think you now have an accurate understanding of what Skyline is doing and what it allows. To achieve your desired outcome, I would probably run your analysis once with all precursors and once with only light precursors and then use R to import the reports for both analyses and assign NA to the light peaks in the full analysis where the light analysis either had a q value above your cut-off or chose a peak with RT outside the bounds of the peak chosen in the full analysis.

If you are doing statistical imputation, you should be able to achieve this in a relatively straightforward manner, and I think this would be the most valid way for Skyline to achieve what you request. Run two models and filter as I have described.

I am still not sure I think it is a good idea. I would need to see a more rigorous proof that this statistical exclusion combined with imputation produces better quantitative results. Seems counter-intuitive from what I know of more precise calibrated quantification.


Fabian responded:  2018-11-23

Dear Brendan,

I think there are actually different scenarios.

  1. same interference in the samples, scales with the actual signal.
  2. same interference in the samples, but does not scale with the actual signal - the lower the (L) signal, the higher the quantification error caused by the interference.
  3. interference signals are different between the samples.
  4. no real signal at all in the sample.

The latter three scenarios can really screw up the quantification. This is of course commonly related to low-abundance signals. Therefore, having no quantification at all for some of these cases is beneficial. The imputation allows you to ensure that the imputed values are below the trustworthy quantified signals (avoiding interference issues). Now you can run all kinds of statistical tests that have problems with missing values, and afterwards you can delete the imputed values from the report. I think that's the best way to handle such cases.
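
Scenarios 1 and 2 can be made concrete with a little arithmetic. The sketch below assumes a purely additive model (measured = true signal + interference) and uses invented numbers; neither the model nor the values come from real data:

```python
def relative_error(true_signal, interference):
    """Relative quantification error under an assumed additive interference model."""
    measured = true_signal + interference
    return (measured - true_signal) / true_signal


# Hypothetical true light signals, from abundant to barely present.
true_signals = [1000.0, 100.0, 10.0]

# Scenario 1: interference scales with the signal (here 10% of it).
scaling_errors = [relative_error(s, 0.10 * s) for s in true_signals]

# Scenario 2: interference is a fixed 5 area units regardless of the signal.
fixed_errors = [relative_error(s, 5.0) for s in true_signals]

for s, e1, e2 in zip(true_signals, scaling_errors, fixed_errors):
    print(f"signal {s:7.1f}: scenario 1 error {e1:.1%}, scenario 2 error {e2:.1%}")
```

Under this model the relative error stays constant in scenario 1, so ratios between samples are unaffected, while in scenario 2 it grows as the signal falls.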

Thank you for your solution! However, the results from using only the light signals as the training set for mProphet did not convince me in my case.
I did as follows: I imported the results for H and L. Based on the good H signals, the correct boundaries were chosen in all cases.
Then I used "edit - refine - advanced" to get rid of the light signals, trained mProphet, and re-added the light signals via "edit - refine - advanced".
The next step was to delete the heavy signals with the same approach and apply the mProphet model "trained on the H signals" to the L signals only.
I played with the Q-value to find a value I deemed useful. Afterwards I re-added the H signals.
This reduced the effort for the final manual refinement step, which was still necessary.
I am not sure if this is the optimal solution, but this was what worked best for me.

Kind regards


Brendan MacLean responded:  2018-11-23

Hi Fabian,
I still don't totally agree with you on your approach, but I expect I am not going to convince you. I do think that the execution you described at the end would not pass muster in a peer-reviewed journal. At least I hope not. You can't train a model on injected standard signal and then apply it to your light-only signal.

I am suggesting:

  1. Do your experiment as normal with both light and heavy, with decoys based on this set-up, and train the model on the imported results. Export a report.
  2. Do your experiment as if you had not injected heavy peptides at all and had only label-free data, with decoys based on this, and train the model on the imported results. Export a report.

Then use a statistical programming language like R or Python to combine these two processing approaches and set all light values to NA in report 1 where report 2 did not achieve q value < 0.01 or the chosen peak RT is not between the integration boundaries of report 1 values.
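
A minimal sketch of that combining step in plain Python. The field names (`peptide`, `replicate`, `light_area`, `start_time`, `end_time`, `q_value`, `best_rt`) are hypothetical stand-ins for whatever columns the two exported reports contain, and treating a row that is absent from report 2 as a failure is an assumption of this example:

```python
def mask_light_values(full_report, light_only_report, q_cutoff=0.01):
    """Return a copy of full_report with untrusted light areas set to None.

    Both reports are lists of dicts matched on (peptide, replicate).
    All field names are hypothetical, not actual report column names.
    """
    light_by_key = {(r["peptide"], r["replicate"]): r for r in light_only_report}
    masked = []
    for row in full_report:
        row = dict(row)  # copy so the caller's data is left untouched
        light = light_by_key.get((row["peptide"], row["replicate"]))
        # Assumption: a row missing from report 2 counts as not passing.
        passes_q = light is not None and light["q_value"] < q_cutoff
        same_peak = (
            light is not None
            and row["start_time"] <= light["best_rt"] <= row["end_time"]
        )
        if not (passes_q and same_peak):
            row["light_area"] = None  # plays the role of NA
        masked.append(row)
    return masked


# Report 1: light + heavy analysis (made-up rows).
full = [
    {"peptide": "PEPTIDEK", "replicate": "S1", "light_area": 5200.0, "start_time": 20.1, "end_time": 20.7},
    {"peptide": "PEPTIDEK", "replicate": "S2", "light_area": 310.0, "start_time": 20.0, "end_time": 20.6},
    {"peptide": "PEPTIDEK", "replicate": "S3", "light_area": 900.0, "start_time": 20.2, "end_time": 20.8},
]
# Report 2: light-only, label-free style analysis (made-up rows).
light_only = [
    {"peptide": "PEPTIDEK", "replicate": "S1", "q_value": 0.002, "best_rt": 20.4},
    {"peptide": "PEPTIDEK", "replicate": "S2", "q_value": 0.200, "best_rt": 20.3},
    {"peptide": "PEPTIDEK", "replicate": "S3", "q_value": 0.004, "best_rt": 23.9},
]

result = mask_light_values(full, light_only)
for row in result:
    print(row["replicate"], row["light_area"])
```

In the made-up data, S2 fails the q value cut-off and S3 passes it but picks a peak outside the report 1 integration boundaries, so both are masked.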

Anything else and I have a very hard time imagining the statistics are valid and as a reviewer of your paper, I would want you to prove that they are or cite a paper that has made this proof.

Hope this helps.


Brendan MacLean responded:  2018-11-23

In the 4 points you made, I think most analytical chemists would consider #2 and #3 the real problem (and #1 an extremely unlikely quirk resulting, amazingly, in interference with no impact). However, #3 is the real problem no matter what you do. Having heavy standards, you would hope to rule it out with measures like the rdotp (Ratio Dot Product) value Skyline provides. Showing high correlation in rdotp then limits your chance of #3 to another analyte with the same relative ion abundance on all measured transitions, another extremely unlikely occurrence.
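
To illustrate the idea behind a ratio dot product, here is a sketch that compares light and heavy transition areas with a plain cosine similarity. This is an assumption made for illustration: Skyline's own rdotp calculation may normalize differently, and the transition areas below are invented:

```python
import math


def cosine_similarity(light_areas, heavy_areas):
    """Normalized dot product of two transition-area vectors in the same order."""
    if len(light_areas) != len(heavy_areas):
        raise ValueError("light and heavy must list the same transitions")
    dot = sum(l * h for l, h in zip(light_areas, heavy_areas))
    norm = math.sqrt(sum(l * l for l in light_areas)) * math.sqrt(
        sum(h * h for h in heavy_areas)
    )
    return dot / norm if norm > 0 else 0.0


# Hypothetical peak areas for four matching transitions.
heavy = [1000.0, 600.0, 300.0, 100.0]
clean_light = [100.0, 60.0, 30.0, 10.0]        # same pattern, 10x lower
interfered_light = [100.0, 60.0, 330.0, 10.0]  # third transition inflated

print(f"clean light:      {cosine_similarity(clean_light, heavy):.3f}")
print(f"interfered light: {cosine_similarity(interfered_light, heavy):.3f}")
```

The similarity depends only on the relative abundances, so a light signal with the same pattern scores close to 1 at any intensity, while interference on a single transition lowers the score. An interfering analyte with the same relative abundances on all transitions would not be caught this way.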

Beyond that, I am doubtful of your claim that you think you can impute away the effect of interference and so reliably increase your dynamic range to lower your LOD or LOQ, by choosing a set of measurements on which to perform this imputation by statistical inference. Again as a reviewer, I would want you to prove that to me with your own experiments or a citation. If you have either now, I would love to hear more.

Fabian responded:  2018-12-03

Hi Brendan,

thank you for your response!

As far as I understand your approach, it would mean that you lose the advantage of the H signal for recognition entirely.
The mProphet concept assumes that the a priori detection of the forward peptides is mostly correct. Otherwise, training the model to differentiate true from false does not work convincingly. In my case, with mainly low-abundance light targets, the mProphet training on the light signals (detected without the help of the H signal) did not convince me. But that is of course a specific scenario. What worked well for me was the approach I described. With it I am quite sure that the filtering is of a high standard, but of course manual refinement was necessary, so this approach is not applicable to global analysis. All other statistics-only approaches did not work as well.
But I will keep your idea in mind and will try it the next time again.

Regarding your second answer: yes, using the rdotp is very helpful!
But that means that one uses a quality measure to decide whether one wants to quantify a peptide/signal or not.
This means missing values will occur.
To avoid them, there are two concepts available: statistical imputation and "match between runs" (or H to L signal).
If the signal is good enough, match between runs is superior to statistical imputation because it gives a quantitation of the real signal.
If the signal is crappy, statistical imputation helps keep the interference issue under control.
Quality criteria like rdotp should be used to decide which is the better approach.

Speaking of LOD or LOQ is tough in that sense. But discussing this might go too far at the moment, because I am not sure we understand each other completely.

As you said, we possibly cannot resolve the issues on theoretical considerations alone. I will try to make the experiments and analysis available to you as soon as I am able to do so.

Kind regards


Btw. the issue with the same relative ion abundances on all transitions - which can be used for quantitation - is rare but not extremely unlikely in my experience. Think of I/L swaps, a scenario in which it is very helpful to have an H signal ;)