Questions on baseline intensity and peak calling

lparsons  2019-05-07

I am using a method similar to these two papers that combines MS1-only scanning with directed feature ID.

Briefly, our process starts by running two samples as triplicate MS1-only scans. We now use OpenMS for feature detection and consensus mapping to produce a list of peaks of interest, selected by group t-tests (3 runs of sample A vs. 3 runs of sample B). These features are used to build an inclusion list so the instrument can run a directed DDA in which the features are pre-selected for MSMS. The MSMS data is then run through MaxQuant, and the MaxQuant results are fed into Skyline. In Skyline we then omit the raw MSMS data and use the raw MS1-only data in its place, so that the peak intensities from the MS1-only data are used for the features identified by MSMS.
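To make the feature selection step concrete, here is a minimal Python sketch of a 3 vs. 3 group t-test filter. It is not the OpenMS implementation; the dictionary layout, the function names `student_t` and `select_features`, and the example intensities are all made up for illustration. It assumes a pooled-variance t-test with a fixed two-sided 5% critical value for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (3 + 3 replicates, pooled variance).
T_CRIT_DF4 = 2.776


def student_t(a, b):
    """Pooled-variance two-sample t statistic for intensity lists a and b."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        # No spread at all: identical groups give 0, otherwise treat as infinite.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def select_features(features, t_crit=T_CRIT_DF4):
    """Return (m/z, RT) pairs for features whose |t| exceeds t_crit."""
    return [(f["mz"], f["rt"]) for f in features
            if abs(student_t(f["a"], f["b"])) > t_crit]


# Hypothetical consensus features: intensities from 3 runs of A and 3 runs of B.
features = [
    {"mz": 500.25, "rt": 1200.0,
     "a": [1.0e6, 1.1e6, 0.9e6], "b": [1.0e4, 2.0e4, 1.5e4]},
    {"mz": 612.80, "rt": 1850.0,
     "a": [5.0e5, 5.2e5, 4.9e5], "b": [5.1e5, 4.8e5, 5.3e5]},
]
inclusion_list = select_features(features)
```

With these made-up numbers only the first feature passes the filter and would go on the inclusion list.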

The results of this generally look pretty good. However, there are some features that come out of this with vastly different intensities from Skyline than what they had in OpenMS. In particular, there are some features where we know that sample B should not show any intensity while sample A should. On some of these samples, Skyline reports very low intensity (exported as "Total Area MS1" in Peptide Quantification report) where we expect none and yet on others Skyline reports very high intensities (up to 10^6 or more) when we expect none.

This leads me to a couple of questions:

  1. Is there a way to find out what Skyline sees for baseline intensity at a given time in a given file? I would expect this could be helpful for trying to figure out if the lower intensity peaks are just baseline AUC.
  2. Does Skyline do any kind of baseline correction for Total Area MS1? I didn't see a parameter for it though I may have missed one along the way.
  3. Is there a parameter I can change that will set the tolerances (RT, M/Z) for peak calling? I am wondering if the peaks might be called too generously here and the intensity is coming from neighboring unrelated peaks.
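For questions 1 and 3, one way to check this outside of Skyline is to take an extracted chromatogram as plain (time, intensity) lists and estimate the local baseline and the area within an explicit RT window yourself. The sketch below is only an illustration under stated assumptions: it uses a median-in-window baseline and a flat subtraction, which is not necessarily how Skyline computes background, and the function names and example values are hypothetical.

```python
from statistics import median


def local_baseline(times, intensities, t, half_window):
    """Median intensity of the points within half_window of time t.

    The median is only a reasonable baseline estimate when the peak
    occupies less than half of the window.
    """
    window = [y for x, y in zip(times, intensities) if abs(x - t) <= half_window]
    if not window:
        raise ValueError("no chromatogram points in the requested window")
    return median(window)


def area_above_baseline(times, intensities, start, end, baseline):
    """Trapezoidal area between start and end after subtracting a flat baseline."""
    pts = [(x, max(y - baseline, 0.0))
           for x, y in zip(times, intensities) if start <= x <= end]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical chromatogram: retention time in minutes, intensity in counts.
times = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8]
intens = [100.0, 110.0, 100.0, 500.0, 900.0, 500.0, 100.0, 90.0, 100.0]

base = local_baseline(times, intens, t=10.4, half_window=0.45)
area = area_above_baseline(times, intens, start=10.2, end=10.6, baseline=base)
```

Narrowing `start` and `end` plays the role of a tighter RT tolerance, so comparing the area at a few window widths gives a rough idea of how much signal comes from neighboring peaks.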

I also looked at the dot products of these features to see if that could lend some insight. When I plot median dot product vs median B intensity - for the peaks where I expect to see zero intensity from B but am getting positive "Total Area MS1" for B - I don't see a strong correlation though there is a wide distribution of median Total Area MS1 in the range of median dot products > .085 and < 1.
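For reference, the kind of dot product meant here can be sketched as a normalized dot product between observed and expected isotope peak intensities. This is a plain cosine-style score written for illustration; Skyline's own isotope dot product may be computed differently, and the expected distribution below is made up.

```python
from math import sqrt


def normalized_dot_product(observed, expected):
    """Cosine-style similarity between two equal-length intensity vectors."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    num = sum(o * e for o, e in zip(observed, expected))
    den = sqrt(sum(o * o for o in observed)) * sqrt(sum(e * e for e in expected))
    return num / den if den else 0.0


# Hypothetical expected isotope distribution (M, M+1, M+2).
expected = [0.55, 0.30, 0.15]

# Observed areas that follow the expected pattern score close to 1 ...
good = normalized_dot_product([5500.0, 3000.0, 1500.0], expected)
# ... while areas dominated by an unrelated M+1 signal score much lower.
poor = normalized_dot_product([100.0, 5000.0, 200.0], expected)
```

A low score alongside a high area is one hint that the integrated signal belongs to a neighboring, unrelated peak.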

thank you

Brendan MacLean responded:  2019-05-22

Hi Lee,
Sorry for taking so long to get to this. I will attempt to answer your questions:

  1. Not so much. We have implemented the ability to export the entire extracted chromatograms using Skyline reports. Otherwise, Skyline is limited to exporting peak statistics and not a lot about arbitrary times in its extracted chromatograms.
  2. All Area statistics in Skyline have background subtraction applied. You can read more about how the peak statistics are calculated in this tip:
  3. Skyline is not good at picking noise when a peak is truly absent unless you have an isotope labeled internal standard with a nice peak to show Skyline where the elution actually occurred. Right now Skyline is too generous about finding what it considers the most likely representation of your target among all detected peaks, regardless of how likely this is. When your peak is not present, Skyline may pick something minutes away from the expected time. Wish I could say we are close to fixing this one, but unfortunately, it has been with us for a long time. In DIA/SWATH, people use mProphet scores to set a q value cut-off and treat peaks that fail the cut-off (q value above it) as left censored (low intensity) missing.
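The q value approach from point 3 can be sketched in a few lines of Python. This assumes you already have a q value per picked peak (for example from an exported report); the field names `area` and `q_value`, the function name, and the 0.01 cut-off are assumptions for illustration, not Skyline report column names.

```python
def censor_by_q_value(peaks, q_cutoff=0.01):
    """Return peak areas, with peaks that do not pass the cut-off set to None.

    A peak passes when it has a q value and that q value is <= q_cutoff.
    None marks a left censored (low intensity) missing value.
    """
    return [
        p["area"] if p["q_value"] is not None and p["q_value"] <= q_cutoff else None
        for p in peaks
    ]


# Hypothetical picked peaks for one feature across four runs.
peaks = [
    {"area": 1.2e6, "q_value": 0.001},   # confident detection
    {"area": 9.0e5, "q_value": 0.008},   # confident detection
    {"area": 3.0e6, "q_value": 0.35},    # large area, but likely a wrong peak
    {"area": 4.0e3, "q_value": None},    # no score available
]
censored = censor_by_q_value(peaks)
```

Note that the third peak is dropped despite its large area, which is the situation described above where a high intensity is reported for a target that is not really there.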

Again, sorry for the long wait on a response, and that I don't have an easy solution for better detection of label-free missing signal. Hope this helps.


lparsons responded:  2019-05-29


Thank you for the response, that was really helpful! This potentially explains the differences between the Skyline intensities and the ones we get for the same features from OpenMS. For what it's worth, we found that the majority of peaks come in very similarly between the two methods (particularly once some simple corrections are applied across the board), but a few had particularly large differences.
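A comparison like this can be sketched as follows. The median centering of log2 ratios stands in for a "simple correction" and is an assumption, not the correction actually used; the function name, feature keys, threshold, and intensities are hypothetical.

```python
from math import log2
from statistics import median


def flag_discrepant(skyline, openms, max_log2_diff=1.0):
    """Return feature keys whose Skyline/OpenMS log2 ratio is far from the median.

    Only features with a positive intensity in both tools are compared.
    """
    shared = [k for k in skyline
              if k in openms and skyline[k] > 0 and openms[k] > 0]
    ratios = {k: log2(skyline[k] / openms[k]) for k in shared}
    if not ratios:
        return []
    center = median(ratios.values())  # global offset between the two tools
    return sorted(k for k, r in ratios.items() if abs(r - center) > max_log2_diff)


# Hypothetical intensities for the same features from the two tools.
skyline = {"f1": 2.0e6, "f2": 4.0e5, "f3": 1.0e6, "f4": 8.0e5}
openms = {"f1": 1.0e6, "f2": 2.0e5, "f3": 1.0e4, "f4": 4.2e5}

outliers = flag_discrepant(skyline, openms)
```

Here most features differ by a constant factor of about two, which the centering absorbs, and only `f3` stands out as a large disagreement.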

It's also really helpful to get developer feedback that I'm applying Skyline in a way that is outside the norms or intents for Skyline itself.