Issue 610: Fix MaxQuant library builder to extract original spectra from source data files

Assigned To: Guest
Opened: 2018-11-12 by Brendan MacLean
Changed: 2019-07-12 by Brendan MacLean
Resolved: 2019-01-16 by Matt Chambers
Closed: 2019-07-12 by Brendan MacLean
2018-11-12 Brendan MacLean
Title»Fix MaxQuant library builder to extract original spectra from source data files
Assigned ToGuest»
It has been pointed out many times over many years now that the spectra BiblioSpec takes from the MaxQuant Andromeda output (the msms.txt file) for its spectral libraries are actually isotope-deconvoluted spectra and not the original raw spectra. This is not a huge problem for doubly charged precursors, where doubly charged fragments are infrequent, but for triply charged and higher precursors it becomes a bigger problem.

For a number of years, we have hoped the MaxQuant team would offer us a solution that would provide the raw spectra in the msms.txt file, but this has not happened, and both MaxQuant and Skyline have become more popular, along with DIA methods that rely heavily on library spectra.

So, we need to fix the MaxQuant parser in BiblioSpec to get its spectra from the source data files and error if they cannot be found, as we do for other results parsers, such as pepXML and mzIdentML. We should leave the code in place to get spectra from msms.txt for when/if we get a better solution from MaxQuant. Though, even then, we will want to keep code for returning to source files in order to handle MaxQuant results that are either older or just don't contain raw spectra for some reason.
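The lookup described above could be sketched roughly as follows. This is only an illustration in Python, not BiblioSpec code: the function name `find_source_file`, the exception `SourceFileNotFound`, and the extension list are all hypothetical, and the real parser may search different locations and formats.

```python
from pathlib import Path

# Hypothetical list of extensions to try; the real parser's list may differ.
CANDIDATE_EXTENSIONS = (".mzML", ".mzXML", ".raw")


class SourceFileNotFound(Exception):
    """Raised when no source data file matches a raw file name from msms.txt."""


def find_source_file(raw_name, search_dirs, extensions=CANDIDATE_EXTENSIONS):
    """Return the first existing file named raw_name + extension.

    Directories are tried in the order given, and extensions in the order
    given within each directory. Raises SourceFileNotFound if nothing matches.
    """
    for directory in search_dirs:
        for ext in extensions:
            candidate = Path(directory) / (raw_name + ext)
            if candidate.is_file():
                return candidate
    searched = ", ".join(str(d) for d in search_dirs)
    raise SourceFileNotFound(
        f"No source data file found for '{raw_name}' in: {searched}"
    )
```

A caller would typically pass the directory containing msms.txt and perhaps its parent as `search_dirs`, and surface the exception message to the user so they know where to put the raw files.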

2018-11-19 Matt Chambers
You mention erroring if the source data files can't be found, but what if the source data is inaccessible? Take the 2 existing MQ unit tests: I can't find sources for either D20110201_Exp2_TBK1 or JD071913. And certainly users could have data they want to import for which they no longer have the raw data. So perhaps make the error optional, but make erroring the default?

2018-11-19 Brendan MacLean
Hmmm. Good point. Skyline doesn't have a great way of dealing with this type of extra option for a specific file type, since it now handles 20+ different search pipelines. Perhaps we could force users to rename the "msms.txt" file to indicate that it should be allowed to build without raw data? Maybe "msms-deconv.txt", to indicate that the user wants to build a library with the isotope-deconvoluted spectra?

2018-11-19 Matt Chambers
Doesn't seem like a very user-friendly option by itself. In addition to that, we could add a custom dialog for the error message BiblioSpec gives for MQ files it can't find sources for. It would tell the user where to put the raw files, but if they don't have access to them, it would offer to rename the input file and rerun the import.

2018-11-19 Brendan MacLean
Well, if we are going to go that far, we could just make it a command-line option on BlibBuild and switch to using it based on the user feedback.

2018-11-19 Matt Chambers
That's reasonable too. Something like --preferEmbeddedSpectra. It could also apply to other formats, although perhaps not in the first implementation.
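To make the proposal concrete, here is a rough Python sketch of how such a switch might drive the choice of spectrum source. It is not BlibBuild code: the parser setup and `choose_spectrum_source` are hypothetical, and the semantics are an assumption (with the flag set, the spectra embedded in msms.txt are used without looking for source files; without it, a missing source file is an error).

```python
import argparse


def build_parser():
    """Build a stand-in command-line parser with the proposed option."""
    parser = argparse.ArgumentParser(prog="blibbuild-sketch")
    parser.add_argument(
        "--preferEmbeddedSpectra",
        action="store_true",
        help="use spectra embedded in the search results instead of "
             "reading them from source data files",
    )
    return parser


def choose_spectrum_source(source_file_found, prefer_embedded):
    """Return "embedded" or "source", or raise if neither is allowed.

    Assumed policy: erroring is the default, and the flag opts out of it.
    """
    if prefer_embedded:
        return "embedded"
    if source_file_found:
        return "source"
    raise FileNotFoundError(
        "Source data file not found; add the raw file or rerun with "
        "--preferEmbeddedSpectra to use the deconvoluted spectra."
    )
```

With this shape, Skyline could run the build once without the flag, show the error in a dialog, and rerun with the flag only if the user confirms they do not have the raw files.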

2019-01-16 Matt Chambers
resolve as Fixed
Assigned»Brendan MacLean

2019-07-12 Brendan MacLean
Assigned ToBrendan MacLean»Guest