BlibBuild: /home/software/BiblioSpec

BlibBuild

Description

Creates a library of spectra with known peptide and/or small molecule identifications. Typically, these identifications come from a database search program such as SEQUEST or Mascot, sometimes followed by an evaluation step such as Percolator or PeptideProphet. BlibBuild accepts files from a variety of database search programs, as well as some other spectral library formats. File formats are identified by file extension; the recognized extensions are listed in the table below. In many cases, the peptide identifications (peptide sequence, charge state, and optional score) are in a separate file from the spectrum information. Unless noted, both files are assumed to be in the same directory.

Database search	Peptide ID file extension	Spectrum file extension (*RAW includes vendor formats such as RAW, WIFF, and .D)	Score used	Notes
Generic SSL	.ssl		score column	A generic format for encoding spectrum library entries.
ByOnic	.mzid	.MGF, .mzXML, .mzML	AbsLogProb
Comet/SEQUEST/Percolator	.perc.xml, .sqt	.cms2, .ms2, .mzXML	q-value	Percolator v1.17 does not include sequence modification information, so the .sqt file from the SEQUEST search must be present in the same directory, in the directory containing the .cms2/.ms2 spectrum files, or in the current working directory.
DIA-NN	.speclib and .tsv or .parquet		Global.Q.Value	No separate spectrum file, but results for individual runs are read from a TSV or Parquet file in the same directory as the speclib.
IDPicker	.idpXML	.mzXML, .mzML	FDR	The name(s) of the spectrum file(s) are given in the .idpXML file.
MS Amanda	.pep.xml, .pepXML	.mzML, .mzXML, .MGF, RAW*	q-value
MSFragger	.pep.xml, .pepXML	.mzML, .mzXML, .MGF, RAW*	q-value
MSGF+	.mzid, .pepXML	.mzML, .mzXML, .MGF, RAW*	expectation value
Mascot	.dat		expectation value	No separate spectrum file.
MaxQuant Andromeda	msms.txt + evidence.txt + mqpar.xml + modifications.xml	.mzML, .mzXML, .MGF, RAW*	PEP	It is possible to use peaks embedded in the msms.txt, but external spectra files are preferred because the embedded peaks are charge deconvoluted. `mqpar.xml` must be located in the grandparent, parent, or same directory. A custom `modifications.xml`, `modifications.local.xml`, or `modification.xml` can be placed in the same directory as the search results (or specified using the `-x` option).
Morpheus	.pep.xml, .pepXML	.mzXML, .mzML	q-value	The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory. Spectra are looked up by index, which is calculated using (scan number - 1).
OMSSA	.pep.xml, .pepXML	.mzXML, .mzML	expectation value	The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
OpenSWATH	.tsv		m_score column	No separate spectrum file.
PEAKS DB	.pep.xml, .pepXML	.mzXML, .mzML	confidence score	The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
PLGS MS^e	final_fragment.csv		score column	The file name need not have a '.' before 'final_fragment'.
PRIDE	.pride.xml		various	No separate spectrum file.
PeptideProphet/iProphet	.pep.xml, .pepXML	.mzML, .mzXML, .MGF, RAW*	probability score	The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
PeptideShaker	.mzid	.MGF	confidence score
Protein Pilot	.group.xml		confidence score	No separate spectrum file.
Protein Prospector	.pep.xml, .pepXML	.mzML, .mzXML, .MGF, RAW*	expectation value
Proteome Discoverer	.msf, .pdResult		q-value	No separate spectrum file. Libraries cannot be built from databases that do not contain q-values, unless a cutoff score of 0 is explicitly specified.
Proxl XML	.proxl.xml	.mzML, .mzXML, .MGF, RAW*	q-value
Scaffold	.mzid	.MGF, .mzXML, .mzML	peptide probability
Spectronaut	.csv		none	Spectronaut Assay Library export. No separate spectrum file.
Spectrum Mill	.pep.xml, .pepXML	.mzXML, .mzML	expectation value	The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
X! Tandem	.xtan.xml		expectation value	No separate spectrum file.
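
Because BlibBuild selects a reader by file extension, it can help to check file names before starting a build. The following is a minimal sketch, not BlibBuild's own logic: the `RECOGNIZED_SUFFIXES` tuple is a hand-copied subset of the peptide ID extensions in the table above, the helper name `recognized_suffix` is hypothetical, and the case-insensitive comparison is an assumption.

```python
from typing import Optional

# Hand-copied subset of the peptide ID file extensions from the table above.
# Stored in lower case because the comparison below lower-cases the file name.
RECOGNIZED_SUFFIXES = (
    ".perc.xml", ".pep.xml", ".pepxml", ".pride.xml", ".proxl.xml",
    ".group.xml", ".xtan.xml", ".idpxml", ".pdresult", ".speclib",
    ".mzid", ".ssl", ".sqt", ".dat", ".msf",
)


def recognized_suffix(file_name: str) -> Optional[str]:
    """Return the listed suffix that file_name ends with, or None."""
    lowered = file_name.lower()  # assumption: extensions match case-insensitively
    for suffix in RECOGNIZED_SUFFIXES:
        if lowered.endswith(suffix):
            return suffix
    return None


print(recognized_suffix("run1.perc.xml"))  # .perc.xml
print(recognized_suffix("notes.docx"))     # None
```

A file that this check rejects may still be accepted by BlibBuild, since the tuple covers only part of the table.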

Usage

BlibBuild [options] <peptide id file>[+] <library name>

Input

<peptide id file> – A file containing peptide spectrum matches to be included in the library. The associated spectrum files should be in the same directory as the peptide id file but should not be given on the command line. See the above table for recognized formats. Multiple files may be listed together.
<library name> – The name of the library being created. An existing library may be overwritten or added to.
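
As an illustration of the usage line, the sketch below assembles a BlibBuild argument list in Python. The helper name `blib_build_command` and the file names are hypothetical; only the argument order (options, then one or more peptide ID files, then the library name) comes from the usage above.

```python
from typing import List, Sequence


def blib_build_command(id_files: Sequence[str], library: str,
                       options: Sequence[str] = ()) -> List[str]:
    """Return argv for: BlibBuild [options] <peptide id file>[+] <library name>."""
    if not id_files:
        raise ValueError("at least one peptide ID file is required")
    return ["BlibBuild", *options, *id_files, library]


# Hypothetical inputs: two pepXML result files combined into one library.
cmd = blib_build_command(["run1.pep.xml", "run2.pep.xml"], "combined.blib")
print(" ".join(cmd))  # BlibBuild run1.pep.xml run2.pep.xml combined.blib

# To actually run the build (assumes BlibBuild is installed and on the PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

The spectrum files are deliberately absent from the argument list; per the description above, BlibBuild locates them itself.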

Output

A spectrum library in SQLite3 format.
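
Since the library is an SQLite database, it can be opened with any SQLite client. The sketch below uses Python's standard `sqlite3` module to list the tables in a library file. It queries only SQLite's built-in `sqlite_master` catalog and makes no assumption about the .blib schema. The function name `list_tables` and the path in the usage comment are hypothetical.

```python
import os
import sqlite3
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the names of all tables in the SQLite file at db_path."""
    # sqlite3.connect would silently create a missing file, so check first.
    if not os.path.isfile(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example call with a hypothetical library path:
# print(list_tables("combined.blib"))
```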

Options

-o Overwrite existing library. Default append.
-S <filename> Read from file as though it were stdin.
-s Read result file names from stdin, e.g. `ls *sqt | BlibBuild -s new.blib`.
-u Ignore peptides except those with the unmodified sequences from stdin.
-U Ignore peptides except those with the modified sequences from stdin.
-H Use more than one decimal place when describing mass modifications.
-C <file size> Minimum file size required to use caching for .dat files. Specify units as B, K, M, or G. Default 800M.
-c <cutoff> Score threshold (0-1) for PSMs to be included in library. Higher threshold is more exclusive.
-v <level> Level of output to stderr (silent, error, status, warn). Default status.
-L Write status and warning messages to log file.
-m <size> SQLite memory cache size in megabytes. Default 250M.
-l <level> ZLib compression level (0-?). Default 3.
-i <library_id> LSID library ID. Default uses file name.
-a <authority> LSID authority. Default proteome.gs.washington.edu.
-x <filename> Specify the path of XML modifications file for parsing MaxQuant files.
-p <filename> Specify the path of XML parameters file for parsing MaxQuant files.
-P <float> Specify pusher interval for Waters final_fragment.csv files.
-d [<filename>] Document the .blib format by writing SQLite commands to a file, or stdout if no filename is given.
-E Prefer reading peaks from embedded spectra (currently only affects MaxQuant msms.txt).
-A Output messages noting ambiguously matched spectra (spectra matched to multiple peptides).
-K Keep ambiguously matched spectra.
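
The `-s` option reads result file names from stdin, which is convenient when a script collects the files. The following is a minimal sketch that prepares such a call from Python together with a `-c` cutoff. It assumes BlibBuild accepts one file name per line on stdin (as `ls` produces when piped); the helper name `stdin_build_request` and the file names are hypothetical.

```python
from typing import List, Sequence, Tuple


def stdin_build_request(id_files: Sequence[str], library: str,
                        cutoff: float) -> Tuple[List[str], str]:
    """Return (argv, stdin_text) for a build that reads file names via -s."""
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError("cutoff must be between 0 and 1")
    argv = ["BlibBuild", "-s", "-c", str(cutoff), library]
    # Assumption: one result file name per line, newline-terminated.
    stdin_text = "".join(f"{name}\n" for name in id_files)
    return argv, stdin_text


argv, stdin_text = stdin_build_request(["a.sqt", "b.sqt"], "new.blib", 0.95)
print(argv)  # ['BlibBuild', '-s', '-c', '0.95', 'new.blib']

# To actually run the build (assumes BlibBuild is installed and on the PATH):
# import subprocess
# subprocess.run(argv, input=stdin_text, text=True, check=True)
```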
