Correct spectrum selection for non-redundant library

Juan C. Rojas E.  2020-09-30

Dear support,

Attached you can find a few slides displaying my issue.

In multiple instances I have observed that the spectrum selected for the non-redundant library is suboptimal compared to other available spectra (Slides 1 and 2). For DDA files the problem is easily circumvented by working with the redundant library, but when generating libraries for DIA, MRM, and/or PRM methods it could lead to comparisons against a suboptimal representative library spectrum.

The .mzXML files (exported from PEAKS) for data acquired in resolution mode are kept in resolution mode format when exported (Slide 3). This may be the reason for the mismatch: within the mass tolerance, some spectra may match the "split" peaks better than others purely by chance, even if their absolute abundance is lower.

Is it possible to manually exchange the best representative spectrum for the non-redundant library?

If not, could a peak picking step be included in the library building procedure? Or should I just perform it with MSConvert externally?

As always, thank you for your time and support.

Nick Shulman responded:  2020-09-30

Yes, I think the easiest thing for you to do would be to use centroided (peak-picked) .mzXML files when you build your spectral library.

The .mzXML (or .mzML) files that BiblioSpec uses to build the spectral library do not have to be exactly the same files that you ran your peptide search on. So, if you have already done your peptide search using profile data, you can still keep those peptide search results, but make sure that your new centroided mzXML files are next to your peptide search results when you build your spectral library.
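
To illustrate what centroiding does to a profile spectrum, here is a toy sketch in Python. It is not the algorithm used by MSConvert, PEAKS, or any vendor software; the function names `centroid` and `collapse_run` are made up for this example, and it simply assumes that each profile peak is a run of non-zero intensities separated by zeros.

```python
def collapse_run(run_mz, run_intensity):
    """Reduce one profile peak to a single (m/z, intensity) pair.

    The m/z is the intensity-weighted mean and the intensity is the sum
    of the run (an assumed, simplistic choice for illustration).
    """
    total = sum(run_intensity)
    weighted_mz = sum(m * i for m, i in zip(run_mz, run_intensity)) / total
    return (weighted_mz, total)


def centroid(mz, intensity):
    """Collapse each contiguous run of non-zero intensities into one peak."""
    peaks = []
    run_mz, run_intensity = [], []
    for m, i in zip(mz, intensity):
        if i > 0:
            run_mz.append(m)
            run_intensity.append(i)
        elif run_intensity:
            # A zero intensity closes the current run.
            peaks.append(collapse_run(run_mz, run_intensity))
            run_mz, run_intensity = [], []
    if run_intensity:
        # Flush a run that reaches the end of the spectrum.
        peaks.append(collapse_run(run_mz, run_intensity))
    return peaks
```

With profile data, one real fragment ion is spread over several neighboring m/z points; after a step like this, each ion is a single point, so a match within the mass tolerance no longer depends on which of the "split" points happens to fall inside the window.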

By the way, the step of deciding which of the redundant spectra should be kept for the non-redundant library is performed by a tool called "BlibFilter". You can read more about BlibFilter here:
You do not actually need to know anything about how BlibFilter works, since Skyline always invokes it while building a spectral library, but you might find some useful information on that page. BlibFilter selects the best spectrum by performing dot products between all of the redundant spectra, so that it ends up choosing the one which looks most like the others. I could imagine that using profile instead of centroided data might mess things up in the way that you are describing, but I am not sure whether the issue has ever been explored by anyone.
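
The idea of "choosing the one which looks most like the others" can be sketched in Python as follows. This is only an illustration of the principle, not BlibFilter's actual code: the binning scheme, the `bin_width` value, the use of a normalized dot product, and all function names are assumptions made for this example.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def normalized_dot(a, b):
    """Normalized dot product (cosine similarity) of two binned spectra."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_representative(spectra, bin_width=1.0):
    """Return the index of the spectrum most similar, on average, to the others."""
    binned = [bin_spectrum(s, bin_width) for s in spectra]
    if len(binned) == 1:
        return 0
    best_index, best_score = 0, -1.0
    for i, a in enumerate(binned):
        others = [normalized_dot(a, b) for j, b in enumerate(binned) if j != i]
        score = sum(others) / len(others)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

Each spectrum here is a list of `(m/z, intensity)` pairs. Under a scheme like this, a spectrum whose intensity is spread over many closely spaced profile points could score differently from its centroided equivalent, which is one way the selection might be affected.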

If centroiding those mzXML files does not improve things, let us know. We might be able to give you more information about what is going wrong.
-- Nick