How Skyline Builds Spectral Libraries


Skyline builds spectral libraries using a separate program called BiblioSpec, which has two main components. BlibBuild is called to build the redundant library, which is then filtered by BlibFilter to create the non-redundant library. The BlibBuild page contains information on the various search engines that are supported, along with information about their respective file formats and the scores used with the cut-off value specified in Skyline.

BlibFilter chooses the best spectrum within a group by taking the one with the best score. If multiple spectra are tied for the best score, the one with the highest total ion current (TIC) is selected. In the past, BlibFilter chose the spectrum with the highest average dot product when compared to all other spectra in the same group, but this method occasionally produced poor results. A similar method, computing a consensus spectrum and its dot product against the related spectra, also produced inferior results because it sometimes led to high-noise spectra being chosen.
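The selection rule described above (best score first, highest TIC as the tie-breaker) can be sketched in a few lines of Python. This is only an illustration, not BlibFilter's actual code: the Spectrum class, its field names, and the select_best function are hypothetical, and grouping by peptide sequence plus charge is an assumption about what a "group" is. Because some scores are better when lower (q-values, expectation values) and others when higher (probabilities), the direction is passed in as a parameter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical stand-in for one redundant-library entry."""
    peptide: str   # modified peptide sequence
    charge: int    # precursor charge
    score: float   # search-engine score (e.g. q-value or probability)
    tic: float     # total ion current of the spectrum


def select_best(spectra, lower_is_better=True):
    """Return one spectrum per (peptide, charge) group.

    The spectrum with the best score wins; ties on score are broken
    by taking the spectrum with the highest TIC.
    """
    groups = defaultdict(list)
    for s in spectra:
        groups[(s.peptide, s.charge)].append(s)

    def rank(s):
        # Negate the score when lower is better so that max() always
        # picks the best score, then the highest TIC among ties.
        score_key = -s.score if lower_is_better else s.score
        return (score_key, s.tic)

    return {key: max(members, key=rank) for key, members in groups.items()}
```

For example, given two spectra of the same peptide and charge that share the best q-value, select_best returns the one with the larger tic value.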

Skyline with BiblioSpec supports building libraries from the following peptide spectrum matching pipeline outputs:

| Database search | Peptide ID file extension | Spectrum file extension | Score used | Notes |
|---|---|---|---|---|
| Generic SSL | .ssl | | score column | A generic format for encoding spectrum library entries. |
| ByOnic | .mzid | .MGF, .mzXML, .mzML | AbsLogProb | |
| Comet/SEQUEST/Percolator | .perc.xml, .sqt | .cms2, .ms2, .mzXML | q-value | Percolator v1.17 does not include sequence modification information therefore the .sqt file from the SEQUEST search must be present in the same directory, the directory containing the cms2/ms2 spectrum files, or the current working directory. |
| DIA-NN | .speclib | | none | No separate spectrum file. In the current implementation, no score is imported from the library, so all spectra are imported. |
| IDPicker | .idpXML | .mzXML, .mzML | FDR | The name(s) of the spectrum file(s) are given in the .idpXML file. |
| MS Amanda | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | q-value | |
| MSFragger | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | q-value | |
| MSGF+ | .mzid, .pepXML | .mzML, .mzXML, .MGF, RAW* | expectation value | |
| Mascot | .dat | | expectation value | No separate spectrum file. |
| MaxQuant Andromeda | msms.txt + evidence.txt + mqpar.xml + modifications.xml | .mzML, .mzXML, .MGF, RAW* | PEP | It is possible to use peaks embedded in the msms.txt, but external spectra files are preferred because the embedded peaks are charge deconvoluted. mqpar.xml must be located in the grandparent, parent, or same directory. A custom modifications.xml, modifications.local.xml, or modification.xml can be placed in the same directory as the search results (or specified using the -x option). |
| Morpheus | .pep.xml, .pepXML | .mzXML, .mzML | q-value | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. Spectra are looked up by index, which is calculated using (scan number - 1). |
| OMSSA | .pep.xml, .pepXML | .mzXML, .mzML | expectation value | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. |
| OpenSWATH | .tsv | | m_score column | No separate spectrum file. |
| PEAKS DB | .pep.xml, .pepXML | .mzXML, .mzML | confidence score | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. |
| PLGS MSe | final_fragment.csv | | score column | There need not be a . before 'final_fragment'. |
| PRIDE | .pride.xml | | various | No separate spectrum file. |
| PeptideProphet/iProphet | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | probability score | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. |
| PeptideShaker | .mzid | .MGF | confidence score | |
| Protein | | | confidence score | No separate spectrum file. |
| Protein Prospector | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | expectation value | |
| Proteome Discoverer | .msf, .pdResult | | q-value | No separate spectrum file. Libraries cannot be built from databases that do not contain q-values, unless a cutoff score of 0 is explicitly specified. |
| Proxl XML | .proxl.xml | .mzML, .mzXML, .MGF, RAW* | q-value | |
| Scaffold | .mzid | .MGF, .mzXML, .mzML | peptide probability | |
| Spectronaut | .csv | | none | Spectronaut Assay Library export. No separate spectrum file. |
| Spectrum Mill | .pep.xml, .pepXML | .mzXML, .mzML | expectation value | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. |
| X! Tandem | .xtan.xml | | expectation value | No separate spectrum file. |

*RAW includes vendor formats like RAW, WIFF, .D, etc.
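
Several notes in the table say that spectrum files named in a search result may be in the parent or grandparent directory. As a rough way to check in advance whether a spectrum file will be found, the Python sketch below looks in the directory of the search result file, then its parent, then its grandparent. The function name find_spectrum_file and the search order are assumptions made for illustration; BlibBuild's actual lookup may differ.

```python
from pathlib import Path


def find_spectrum_file(results_file, spectrum_name):
    """Look for spectrum_name near a search result file.

    Checks the result file's own directory, then the parent, then the
    grandparent (assumed order). Returns the first match as a Path,
    or None if the file is not found in any of the three.
    """
    start = Path(results_file).resolve().parent
    for directory in (start, start.parent, start.parent.parent):
        candidate = directory / spectrum_name
        if candidate.is_file():
            return candidate
    return None
```

A call such as find_spectrum_file("results/search1/interact.pep.xml", "run1.mzXML") (hypothetical paths) returns None when the file is missing from all three locations, which usually means the spectrum file needs to be moved or copied before building the library.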


Importing Existing Spectral Libraries

Skyline can also directly read existing spectral libraries (without using BlibBuild) including:

  • SpectraST (.sptxt) 
  • theGPM X! Hunter (.hlf)
  • Shimadzu (.mlb)
  • Golm Metabolome Database (.msp)
  • NIST (.msp)

Working with NIST files

If your library contains spectra for multiple instruments and conditions (e.g., various collision energies), it is important to use the NIST-supplied filtering tools to produce a subset of spectra appropriate to your experimental conditions. Each molecule+adduct (or peptide+charge) pair can appear in a .blib file only once, and without thoughtful filtering you will almost certainly produce a .msp file that can't be used by Skyline because it contains more than one instance of a molecule+adduct (or peptide+charge) pair.
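
Before importing a filtered .msp file, it can help to confirm that no molecule+adduct pair appears more than once. The Python sketch below is one way to do that, under stated assumptions: each entry begins with a "Name:" line, and the adduct is stored in a "Precursor_type:" field. Field names vary between .msp exports, so the adduct field is a parameter. The function name find_duplicate_keys is hypothetical and is not part of Skyline or the NIST tools. For peptide libraries, where the charge is often part of the name itself, the adduct is simply left empty and the name alone serves as the key.

```python
from collections import Counter


def find_duplicate_keys(msp_text, adduct_field="Precursor_type"):
    """Return (name, adduct) pairs that occur more than once.

    Assumes each entry starts with a 'Name:' line and that metadata
    lines have the form 'Key: value'. Peak lines contain no colon and
    are skipped.
    """
    counts = Counter()
    name = None
    adduct = ""
    wanted = adduct_field.lower()
    for line in msp_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # peak line or blank line
        key = key.strip().lower()
        value = value.strip()
        if key == "name":
            if name is not None:
                counts[(name, adduct)] += 1  # close the previous entry
            name, adduct = value, ""
        elif key == wanted and name is not None:
            adduct = value
    if name is not None:
        counts[(name, adduct)] += 1  # close the final entry
    return sorted(pair for pair, n in counts.items() if n > 1)
```

An empty list means every pair is unique under these assumptions. Any pairs that are returned still need to be reduced to a single spectrum each, for example by filtering on collision energy or instrument.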
