Advanced peak picking models for small molecules

Mathias Kuhring  2017-12-13 08:43
 
Are there any plans for advanced peak picking models for small molecules? The machine learning approaches (such as mProphet) rely on decoys, which seem to be available for peptides only. Will there be other algorithms, or at least more parameters (e.g. smoothing, selecting the biggest peak instead of the one closest to the expected RT, ...), to improve peak detection and integration for small molecules?
In particular with low concentrations and intensities, I often observe that a) a small noise peak closer to the expected RT is preferred over considerably bigger peaks (which are often detected and marked with dashed lines, but not selected), and b) rather jagged peaks are identified as several peaks (which often contributes to problem a). I think having some control over the peak integration parameters might help here.
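
To make problem a) concrete, here is a minimal Python sketch of the two selection rules under discussion. It is not Skyline's or TraceFinder's actual algorithm; the `Peak` type, the function names, the `rt_window` parameter, and all numbers are hypothetical and only illustrate the difference between "closest to expected RT" and "highest within a window".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Peak:
    rt: float      # apex retention time in minutes (made-up values below)
    height: float  # apex intensity


def pick_closest_rt(peaks: List[Peak], expected_rt: float) -> Peak:
    """Return the candidate whose apex is nearest to the expected RT."""
    return min(peaks, key=lambda p: abs(p.rt - expected_rt))


def pick_highest(peaks: List[Peak], expected_rt: float,
                 rt_window: float) -> Optional[Peak]:
    """Return the tallest candidate within +/- rt_window of the expected RT.

    Returns None if no candidate falls inside the window.
    """
    candidates = [p for p in peaks if abs(p.rt - expected_rt) <= rt_window]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.height)


# Hypothetical low-intensity sample: a small noise peak near the expected RT
# and a much larger, drifted peak a bit later.
peaks = [Peak(rt=5.02, height=1.2e3), Peak(rt=5.45, height=4.8e4)]
expected_rt = 5.0

closest = pick_closest_rt(peaks, expected_rt)
highest = pick_highest(peaks, expected_rt, rt_window=1.0)
print("closest to expected RT:", closest)
print("highest within window: ", highest)
```

With these made-up numbers, the nearest-RT rule returns the small peak at 5.02 min, while the highest-peak rule returns the large one at 5.45 min. Shrinking `rt_window` makes the second rule behave more like the first, which is why such a setting would need to be tuned per assay.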

Best, Mathias
 
 
Brian Pratt responded:  2017-12-15 09:37
Hi Mathias,

Peak picking is famously tricky, of course. And unfortunately I haven't yet encountered anyone with an idea of what decoys would look like for generalized small molecules.

Our guidance from the small molecule community to date has been that they believe their chromatography to be solid enough that we should put great weight on the explicit RT value when provided. So, as you have observed, we don't always go for the nearby big peak when there's a smaller one that looks good from an RT point of view.

This is an excellent conversation to be having, and we would value any specific suggestions (and example data sets) that you think could improve things.

Thanks,

Brian Pratt
 
Mathias Kuhring responded:  2018-02-21 09:40
Hey Brian,

I finally got around to putting together some examples (as screenshots and partly with raw data) to demonstrate our problems with low concentration/intensity samples and the difference in detection between Skyline and Xcalibur/TraceFinder. I attached screenshots and the corresponding raw data. Unfortunately, I don't have clearance to share the data of the second example.

Example 1 ("uridine-neg_K01" and "uridine-neg_K03") suffers from RT drifts (as generally observable in the low concentration samples of the corresponding data set). Using Thermo's TraceFinder (basically Xcalibur algorithms), I'm able to detect the correct peaks by selecting the highest peak instead of the nearest one and, in the case of K01, by additionally increasing the smoothing.

Example 2 ("uridine_CalG-04") features less drift; however, the peak is really jagged and thus only a part of it is selected by Skyline. In contrast, selecting the highest peak in TraceFinder would at least give me a proper peak height. Alternatively, increasing the smoothing even results in the whole peak being detected.
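
The effect of smoothing on a jagged peak can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain centered moving average stands in for whatever smoothing TraceFinder or Skyline actually apply, the helper names `moving_average` and `count_apexes` are hypothetical, and the intensity trace is invented rather than taken from the attached data.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Centered moving average; the window shrinks at the trace edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def count_apexes(values: List[float], min_height: float) -> int:
    """Count interior local maxima above min_height (a crude peak detector)."""
    count = 0
    for i in range(1, len(values) - 1):
        if (values[i] > min_height
                and values[i] > values[i - 1]
                and values[i] >= values[i + 1]):
            count += 1
    return count


# Invented jagged low-intensity trace: one real peak with noisy dips in it.
jagged = [0, 2, 10, 6, 14, 8, 15, 7, 11, 3, 0]
smoothed = moving_average(jagged, window=7)

print("apexes in raw trace:     ", count_apexes(jagged, min_height=1.0))
print("apexes in smoothed trace:", count_apexes(smoothed, min_height=1.0))
print("raw apex height:", max(jagged), "smoothed apex height:", round(max(smoothed), 2))
```

On this invented trace the raw signal shows four local maxima, which a simple detector would split into several peaks, while the smoothed signal shows a single apex. The smoothed apex is also lower (about 10.1 versus 15), so stronger smoothing helps detection and boundaries but can underestimate height if the height is read from the smoothed trace.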

I think having options for smoothing and for selecting the highest peak would greatly improve automated peak detection. And since higher concentration/intensity samples feature clearer chromatograms anyway, optimizing low intensity peak detection with these parameters won't affect them. Except in special cases like isoforms maybe, but in the end an option is called that for a reason ;-).

Other than that, I really prefer Skyline over Xcalibur/TraceFinder for several reasons. However, this issue in particular is holding me back from using Skyline regularly and eliminates my leverage to establish it in my lab. So I hope my examples provide proper insight into the issue and demonstrate the usefulness of these parameters :-). Please tell me if you need any more information.

Best, Mathias

Little side note: all your mails (@proteinms.net) are flagged as spam by Gmail. It complains that your mails are not authenticated (https://support.google.com/mail/answer/180707?visit_id=1-636548254540712293-22150597&p=email_auth&hl=en&rd=1).
 
Brian Pratt responded:  2018-02-22 08:45
That's very helpful, thanks. May we also have any corresponding Skyline documents to complete the picture?

Thanks also for the note about spam troubles. That's the first we've heard of it (and it's odd, because I'm pretty sure proteinms.net is actually implemented using Gmail). We'll look into it.

Thanks

Brian
 
Mathias Kuhring responded:  2018-02-28 04:40
Hey Brian,

I attached the Skyline document (saved in the same folder as the raw files).

Best, Mathias
 
Brian Pratt responded:  2018-03-01 10:04
Thanks, that's very helpful!

Brian