guest 2023-03-24 |
BiblioSpec is a suite of software tools for creating and searching MS/MS peptide spectrum libraries.
BiblioSpec 2.0 stores spectrum libraries as sqlite3 files. Sqlite3 is a lightweight, open-source database format that can be read and manipulated with any sqlite3 tool in addition to BiblioSpec. For more information about the library format, see the file formats page. The new format is a departure from version 1.0, which used its own binary format, so tools and libraries from the two versions are not compatible. There is, however, a conversion tool for turning a version 1.0 library into a sqlite3 library.
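Because a library is an ordinary sqlite3 file, it can be inspected with general-purpose tools. As a minimal sketch, Python's standard sqlite3 module can list the tables in a library without assuming anything about the BiblioSpec schema. The function name `list_tables` and the file name `demo.blib` in the usage note are placeholders, not part of BiblioSpec.

```python
import sqlite3
from contextlib import closing


def list_tables(library_path):
    """Return the names of all tables in a sqlite3 file, sorted alphabetically."""
    # sqlite_master is SQLite's built-in catalog, so no knowledge of the
    # BiblioSpec schema is assumed here.
    with closing(sqlite3.connect(library_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling `list_tables("demo.blib")` on an existing library would return its table names. Note that `sqlite3.connect` creates an empty file if the path does not exist, so check the path first.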
The BiblioSpec package contains the following programs: BlibBuild, BlibFilter, BlibSearch, BlibToMS2, and LibToSqlite3.
BiblioSpec is freely available under the BSD license. See the Download and build page for downloads and build instructions.
Several reference libraries will be available soon for download.
An overview of all file formats including a list of all the database search files that can be used to build libraries.
BlibBuild creates a library of spectra with known peptide and/or small molecule identifications. Typically, these identifications come from a database search such as SEQUEST or Mascot, sometimes followed by an evaluation step such as Percolator or PeptideProphet. BlibBuild accepts files from a variety of database search programs, as well as some other spectral library formats. File formats are identified by file extension, as listed in the table below. In many cases, the peptide identification (peptide sequence, charge state, and optional score) is in a separate file from the spectrum information. Unless noted, it is assumed that both files will be in the same directory.
Database search | Peptide ID file extension | Spectrum file extension | Score used | Notes
Generic SSL | .ssl | | score column | A generic format for encoding spectrum library entries.
ByOnic | .mzid | .MGF, .mzXML, .mzML | AbsLogProb |
Comet/SEQUEST/Percolator | .perc.xml, .sqt | .cms2, .ms2, .mzXML | q-value | Percolator v1.17 does not include sequence modification information therefore the .sqt file from the SEQUEST search must be present in the same directory, the directory containing the cms2/ms2 spectrum files, or the current working directory.
DIA-NN | .speclib | | none | No separate spectrum file. In the current implementation, no score is imported from the library, so all spectra are imported.
IDPicker | .idpXML | .mzXML, .mzML | FDR | The name(s) of the spectrum file(s) are given in the .idpXML file.
MS Amanda | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | q-value |
MSFragger | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | q-value |
MSGF+ | .mzid, .pepXML | .mzML, .mzXML, .MGF, RAW* | expectation value |
Mascot | .dat | | expectation value | No separate spectrum file.
MaxQuant Andromeda | msms.txt + evidence.txt + mqpar.xml + modifications.xml | .mzML, .mzXML, .MGF, RAW* | PEP | It is possible to use peaks embedded in the msms.txt, but external spectra files are preferred because the embedded peaks are charge deconvoluted. mqpar.xml must be located in the grandparent, parent, or same directory. A custom modifications.xml, modifications.local.xml, or modification.xml can be placed in the same directory as the search results (or specified using the -x option).
Morpheus | .pep.xml, .pepXML | .mzXML, .mzML | q-value | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. Spectra are looked up by index, which is calculated using (scan number - 1).
OMSSA | .pep.xml, .pepXML | .mzXML, .mzML | expectation value | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
OpenSWATH | .tsv | | m_score column | No separate spectrum file.
PEAKS DB | .pep.xml, .pepXML | .mzXML, .mzML | confidence score | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
PLGS MSe | final_fragment.csv | | score column | There need not be a . before 'final_fragment'.
PRIDE | .pride.xml | | various | No separate spectrum file.
PeptideProphet/iProphet | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | probability score | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
PeptideShaker | .mzid | .MGF | confidence score |
Protein Pilot | .group.xml | | confidence score | No separate spectrum file.
Protein Prospector | .pep.xml, .pepXML | .mzML, .mzXML, .MGF, RAW* | expectation value |
Proteome Discoverer | .msf, .pdResult | | q-value | No separate spectrum file. Libraries cannot be built from databases that do not contain q-values, unless a cutoff score of 0 is explicitly specified.
Proxl XML | .proxl.xml | .mzML, .mzXML, .MGF, RAW* | q-value |
Scaffold | .mzid | .MGF, .mzXML, .mzML | peptide probability |
Spectronaut | .csv | | none | Spectronaut Assay Library export. No separate spectrum file.
Spectrum Mill | .pep.xml, .pepXML | .mzXML, .mzML | expectation value | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
X! Tandem | .xtan.xml | | expectation value | No separate spectrum file.
*RAW includes vendor formats like RAW, WIFF, .D, etc.
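Since BlibBuild recognizes input formats by file extension, a small lookup can illustrate how a file name maps back to rows of the table. The sketch below copies a subset of the rows by hand; it is not BlibBuild's own detection code. The name `candidate_searches` and the case-insensitive matching are assumptions made for illustration. One extension, such as .mzid, can correspond to several search programs.

```python
# Hand-copied subset of the table: peptide ID file suffix -> search programs.
# Illustrative only; BlibBuild's real detection logic may differ.
ID_FILE_SUFFIXES = {
    ".ssl": ["Generic SSL"],
    ".dat": ["Mascot"],
    ".perc.xml": ["Comet/SEQUEST/Percolator"],
    ".sqt": ["Comet/SEQUEST/Percolator"],
    ".idpxml": ["IDPicker"],
    ".xtan.xml": ["X! Tandem"],
    ".group.xml": ["Protein Pilot"],
    ".mzid": ["ByOnic", "MSGF+", "PeptideShaker", "Scaffold"],
}


def candidate_searches(filename):
    """Return the search programs whose ID-file suffix matches filename."""
    lower = filename.lower()  # assumption: compare suffixes case-insensitively
    # Check longer suffixes first so the most specific suffix wins.
    for suffix in sorted(ID_FILE_SUFFIXES, key=len, reverse=True):
        if lower.endswith(suffix):
            return ID_FILE_SUFFIXES[suffix]
    return []
```

An empty list means the file name matches none of the suffixes included in this subset, not that BlibBuild would reject it.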
BlibBuild [options] <peptide id file>[+] <library name>
<peptide id file>
– A file containing peptide spectrum matches to be included in the library. The associated spectrum files should be in the same directory as the peptide id file but should not be given on the command line. See the above table for recognized formats. Multiple files may be listed together.
<library name>
– The name of the library being created. An existing library may be overwritten or added to.
The output is a spectrum library in sqlite3 format.
BlibFilter creates a library from an existing one such that the new library has only one spectrum for each peptide ion. The representative spectrum is chosen by taking the dot product of all pairs of spectra for a peptide and selecting the one with the highest average score.
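The selection rule can be sketched as follows. The code assumes each spectrum has already been converted to an equal-length intensity vector (for example by binning peaks on m/z) and uses a normalized dot product. The binning, the normalization, the handling of a single spectrum, and the function names are illustrative assumptions, not details taken from BlibFilter.

```python
import math
from itertools import combinations


def normalized_dot(a, b):
    """Dot product of two equal-length intensity vectors, scaled to [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def pick_representative(spectra):
    """Return (index, average score) of the spectrum that scores highest,
    on average, against all other spectra of the same peptide ion."""
    n = len(spectra)
    if n == 0:
        raise ValueError("need at least one spectrum")
    if n == 1:
        return 0, None  # no pairs to compare; the score is undefined here
    totals = [0.0] * n
    # Score every pair once and credit the score to both members.
    for i, j in combinations(range(n), 2):
        score = normalized_dot(spectra[i], spectra[j])
        totals[i] += score
        totals[j] += score
    averages = [t / (n - 1) for t in totals]
    best = max(range(n), key=lambda i: averages[i])
    return best, averages[best]
```

For three vectors `[1, 0, 0]`, `[1, 1, 0]`, and `[0, 1, 0]`, the middle one is selected because it overlaps with both of the others.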
BlibFilter [options] <redundant-library> <filtered-library>
<redundant-library>
– A library file with multiple spectra for all or some peptide ions.
<filtered-library>
– The name to be given to the resulting library.
The output is a library of spectra for the same peptides as the redundant library, but with only one spectrum per peptide ion.
-m [ --memory-cache ] <size>
– SQLite memory cache size in megabytes. Default 250M.
-n [ --min-peaks ] <num>
– Only include spectra with at least this many peaks. Default 20.
-s [ --min-score ] <score>
– Best spectrum must have at least this average score to be included. Default 0.
-p [ --parameter-file ] <file>
– File containing search parameters. Command line values override file values.
-v [ --verbosity ] <level>
– Control the level of output to stderr. (silent, error, status, warn, debug, detail, all) Default status.
-h [ --help ]
– Print help message.
BlibSearch searches a spectrum library for matches to query spectra.
BlibSearch [options] <spectrum filename> <library filename>[+]
<spectrum filename>
– A file containing spectra to search. File formats accepted are .ms2, .cms2, .mzXML, .mzML, .MGF, and .wiff (Windows only).
<library filename>
– The library to be searched for matches to the query. Libraries may be filtered (the output of BlibFilter) or redundant (the output of BlibBuild). More than one library can be listed on the command line.
Results are printed to a report file (tab-delimited text). The file may be named with the --report-file option; by default it is named after the spectrum file with the extension replaced with .report. A separate report file is written for any decoy spectra searched. An optional sqlite .psm file may also be produced.
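Two of the options listed below shape which data is compared: --mz-window limits a query to library spectra whose precursor m/z lies within +/- the window, and --topPeaksForSearch keeps only the most intense peaks. A minimal sketch of both steps, using the documented defaults (3 and 100), might look like this. The data layout (tuples of `(id, precursor_mz)` and `(mz, intensity)`), the inclusive window boundary, and the function names are assumptions for illustration, not BlibSearch internals.

```python
def top_peaks(peaks, n=100):
    """Keep the n most intense (mz, intensity) peaks, returned in m/z order."""
    strongest = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:n]
    return sorted(strongest)


def candidates_in_window(query_mz, library, window=3.0):
    """Return library entries (id, precursor_mz) within +/- window of query_mz."""
    return [entry for entry in library if abs(entry[1] - query_mz) <= window]
```

For example, with a query precursor at 501.0 and the default window, library entries at 500.0 and 502.5 are candidates while an entry at 505.0 is skipped.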
-c [ --clear-precursor ] <true|false>
– Remove the peaks in an X m/z window around the precursor from the query and library spectra. Default true.
--topPeaksForSearch <num>
– Use this many of the highest intensity peaks. Default 100.
-w [ --mz-window ] <size>
– Compare query to library spectra with precursor m/z +/- size. Default 3.
-L [ --low-charge ] <charge>
– Search only spectra with charge no less than this. Default 1.
-H [ --high-charge ] <charge>
– Search only spectra with charge no higher than this. Default 5.
-m [ --report-matches ] <num>
– Return this number of the best matches for each query. Use -1 to report all. Default 5.
--psm-result-file <name>
– Return results in a .psm file of the given name. Default no .psm file.
-R [ --report-file ] <name>
– Return results in a report file of the given name. Default is the spectrum file name with the extension replaced by .report.
--preserve-order
– Search spectra in the order they appear in the file. Default is to search in order of precursor m/z.
-p [ --parameter-file ] <name>
– File containing search parameters. Command line values override file values.
-v [ --verbosity ] <level>
– Control the level of output to stderr. (silent, error, status, warn, debug, detail, all) Default status.
-h [ --help ]
– Print help message.
BlibToMS2 writes an MS2 file that contains all spectra in a library.
BlibToMS2 [options] <library>
<library>
– A spectrum library file, filtered or redundant.
The spectra are printed to a file named <library>.ms2 in the MS2 format. The scan number is replaced with the library ID number. Two 'D' lines contain the peptide sequence with and without modifications.
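To make the output layout concrete, here is a rough sketch of how one library entry could be rendered as an MS2 record: an S line carrying the library ID in place of the scan number, a Z line with the charge and singly protonated mass, two D lines for the sequence, and one line per peak. The D-line labels (`seq`, `modified seq`), the four-decimal precursor formatting, and the function name are assumptions for illustration. Only the peak precisions (2 and 1 digits) follow the documented defaults, and the exact text written by BlibToMS2 may differ.

```python
PROTON_MASS = 1.00727646688  # mass of a proton in Da


def format_ms2_entry(lib_id, precursor_mz, charge, sequence, modified_sequence,
                     peaks, mz_precision=2, intensity_precision=1):
    """Render one library spectrum as an MS2-style text record."""
    # The Z line holds the singly protonated mass (M+H)+.
    mh = precursor_mz * charge - (charge - 1) * PROTON_MASS
    lines = [
        # The library ID stands in for the first and last scan number.
        f"S\t{lib_id}\t{lib_id}\t{precursor_mz:.4f}",
        f"Z\t{charge}\t{mh:.4f}",
        f"D\tseq\t{sequence}",                    # label is an assumption
        f"D\tmodified seq\t{modified_sequence}",  # label is an assumption
    ]
    for mz, intensity in peaks:
        lines.append(f"{mz:.{mz_precision}f} {intensity:.{intensity_precision}f}")
    return "\n".join(lines) + "\n"
```

Passing a larger `mz_precision` or `intensity_precision` mirrors the effect of the precision options on the peak lines.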
-f [ --file-name ] <ms2 file>
– Use this name for the output MS2 file rather than the default name, <library>.ms2.
-m [ --mz-precision ] <num>
– Write the peak m/z values with this many digits of precision. Default 2.
-i [ --intensity-precision ] <num>
– Write the peak intensity values with this many digits of precision. Default 1.
-p [ --parameter-file ] <file>
– Specify parameters in a separate file. Command line values override the file.
-v [ --verbose ] <silent|error|status|warn>
– Set the verbosity level of the output to stderr. The default level is status.
-h [ --help ]
– Print the help message.
LibToSqlite3 converts a BiblioSpec 1.0 library to a 2.0 library in sqlite3 format.
LibToSqlite3 <old version lib> <new lib name>
<old version lib>
– A BiblioSpec 1.0 library file.
<new lib name>
– The name to be given to the converted library.
The output is a spectrum library in sqlite3 format.