Table of Contents

BiblioSpec input and output file formats
   Example .ssl file
   Example .ms2 file
   Example .ssl file for small molecules

BiblioSpec input and output file formats

BiblioSpec makes use of several file formats for input and output. Below are descriptions of these along with links to additional information.

Database search result files

In most cases, libraries are built from database search result files. Supported formats are listed on the BlibBuild page.

BlibBuild .ssl file

For peptide or small molecule identifications that do not come from one of the supported database searches, BiblioSpec supports a generic tab-delimited text file format referred to as ssl (spectrum sequence list). Here is a small example file. An ssl file must have a header line containing the following column names (the score-type, score, and retention-time columns are optional):

file       scan    charge  sequence        score-type      score   retention-time

Additional columns may be included for small molecule use (the sequence column should be omitted for small molecule libraries; here is a small example file):

adduct     chemicalformula moleculename    inchikey        otherkeys

Each of the following lines contains information for one spectrum. The first column contains a full or relative path to a file containing the spectrum. The second column has an id for that spectrum, typically a scan number or index number. The third column is the charge state of the spectrum. The fourth column contains the peptide sequence, with any modifications given as a mass shift (the difference between the modified and unmodified residue) following the modified residue. For example, in ELAEDGC[+57.0]SGVEVR from the example .ssl file, the cysteine carries a mass shift of +57.0.

For peptides with N-terminal modifications, the mass shift should follow the first residue.
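
The bracketed mass-shift notation can be taken apart with a few lines of Python. This is only a sketch: the function name parse_modified_sequence is illustrative and not part of BiblioSpec, and it assumes every shift is written as a signed or unsigned decimal number in square brackets directly after a single uppercase residue letter, as in the example .ssl file.

```python
import re

# One uppercase residue letter, optionally followed by a bracketed mass shift
# such as [+57.0] or [-17.0]. This pattern is an assumption based on the
# example .ssl file, not a published grammar.
_TOKEN = re.compile(r"([A-Z])(?:\[([+-]?\d+(?:\.\d+)?)\])?")


def parse_modified_sequence(modified):
    """Return (plain sequence, {1-based residue position: mass shift})."""
    residues = []
    shifts = {}
    index = 0
    while index < len(modified):
        match = _TOKEN.match(modified, index)
        if match is None:
            raise ValueError(
                f"unexpected text at offset {index}: {modified[index:]!r}"
            )
        residues.append(match.group(1))
        if match.group(2) is not None:
            # The shift belongs to the residue that was just appended.
            shifts[len(residues)] = float(match.group(2))
        index = match.end()
    return "".join(residues), shifts


plain, shifts = parse_modified_sequence("ELAEDGC[+57.0]SGVEVR")
print(plain, shifts)
```

Keying the shifts by 1-based residue position keeps the plain sequence usable on its own while still recording where each modification sits.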

The score-type column can be any of the following:

and the score column is a floating point value representing the spectrum's score of that type. The retention time column can be used to specify retention times in minutes; otherwise the values from the spectrum file will be used. Scores fall into three categories: probability that identification is correct, probability that identification is incorrect, or not a probability score. This information can be found in the ScoreTypes table.
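
As a rough illustration of the ssl layout described in this section, the following Python sketch writes the four columns used in the example .ssl file with the standard csv module. The function name write_ssl and the record dictionaries are illustrative and not part of BiblioSpec, and the optional score-type, score, and retention-time columns are deliberately left out.

```python
import csv
import io

# Column names taken from the example .ssl file; optional columns are omitted.
SSL_COLUMNS = ["file", "scan", "charge", "sequence"]


def write_ssl(records, handle):
    """Write a header line and one tab-delimited line per spectrum."""
    writer = csv.DictWriter(
        handle, fieldnames=SSL_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Demonstration with two rows copied from the example .ssl file.
buffer = io.StringIO()
write_ssl(
    [
        {"file": "demo.ms2", "scan": 1806, "charge": 2, "sequence": "LAESITIEQGK"},
        {"file": "demo.ms2", "scan": 2572, "charge": 2, "sequence": "ELAEDGC[+57.0]SGVEVR"},
    ],
    buffer,
)
print(buffer.getvalue(), end="")
```

To produce a file on disk, pass a handle opened with open(path, "w", newline="") instead of the in-memory buffer.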

Library files

BiblioSpec library files are in the sqlite3 format, usually with a ".blib" filename extension. Each library is a small database that you can search and manipulate with standard SQL commands using, for example, the sqlite3 command line tools or SQLite Expert Personal.

BiblioSpec does not require that you know any SQL, but should you be interested in using these files outside of the BiblioSpec context, the sqlite3 commands for building an empty library file with fully annotated tables are available here.

The libraries consist of these tables: LibInfo, Modifications, RefSpectra, RefSpectraPeaks, RefSpectraPeakAnnotations, SpectrumSourceFiles, ScoreTypes, and IonMobilityTypes.
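
Because a library is an ordinary SQLite database, its tables can be inspected from Python's standard sqlite3 module. The sketch below is one possible way to do that; the helper names list_tables and column_names are illustrative, and no column names are assumed because the table layouts are not described on this page.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in the database, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def column_names(connection, table):
    """Return the column names of one table, in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    # Table names cannot be bound as parameters, so the name is quoted here.
    quoted = table.replace('"', '""')
    rows = connection.execute(f'PRAGMA table_info("{quoted}")').fetchall()
    return [row[1] for row in rows]


# Typical use (the path is a placeholder for your own library file):
#     connection = sqlite3.connect("demo.blib")
#     for table in list_tables(connection):
#         print(table, column_names(connection, table))
#     connection.close()
```

Discovering the columns at run time, rather than hard-coding them, keeps the sketch independent of any particular library version.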

Library as text

BlibToMs2 allows you to view the spectra in your library in the .ms2 text format. This format is recognized by ProteoWizard's msconvert and can be converted into other formats such as .mzXML.

In an .ms2 file there are four types of lines. Lines beginning with 'H' are header lines and contain information about how the data was collected as well as comments. They appear at the beginning of the file. Lines beginning with 'S' are followed by the scan number and the precursor m/z. Lines beginning with 'Z' give the charge state followed by the mass of the ion at that charge state. Lines beginning with 'D' contain information relevant to the preceding charge state. BlibToMs2's output will include D-lines with the sequence and modified sequence. The file is arranged with these S, Z and D lines for one spectrum followed by a peak list: a pair of values giving each peak's m/z and intensity. Here is an example file.
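
The line types described above map naturally onto a small reader. The following Python sketch is one way to do it, not BlibToMs2's own code: the function name parse_ms2 and the dictionary keys are illustrative. It assumes whitespace-separated fields, treats the first number on an S line as the scan and the last as the precursor m/z (the example file's S lines carry two scan fields), and takes the last field of a D line as its value.

```python
def parse_ms2(lines):
    """Parse .ms2 text into (headers, spectra).

    headers maps each H-line key to its value; spectra is a list of dicts
    with the keys scan, precursor_mz, charges, details and peaks.
    """
    headers = {}
    spectra = []
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        tag = fields[0]
        if tag == "H":
            headers[fields[1]] = " ".join(fields[2:])
        elif tag == "S":
            spectra.append(
                {
                    "scan": int(fields[1]),
                    "precursor_mz": float(fields[-1]),
                    "charges": [],
                    "details": {},
                    "peaks": [],
                }
            )
        elif tag == "Z":
            # (charge state, mass of the ion at that charge state)
            spectra[-1]["charges"].append((int(fields[1]), float(fields[2])))
        elif tag == "D":
            # Keys may contain a space ("modified seq"); the value is last.
            spectra[-1]["details"][" ".join(fields[1:-1])] = fields[-1]
        else:
            # Anything else is read as a peak: m/z followed by intensity.
            spectra[-1]["peaks"].append((float(fields[0]), float(fields[1])))
    return headers, spectra
```

Called as parse_ms2(open("demo.ms2")), with the file name standing in for your own BlibToMs2 output, it returns the header dictionary and one entry per spectrum.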

Report files

BlibSearch writes results to a tab-delimited text file referred to as the report file. The header (lines beginning with '#') contains details of the search parameters. Next is a line naming each of the fields. Each subsequent row summarizes one query-library match. The fields are as follows:

  • Query The identifier for the query spectrum.
  • LibId The number of the library with the match. The header lines will list all libraries being searched and assign each a number referenced in this column.
  • LibSpec The identifier for the library spectrum.
  • rank The rank of the match for this query spectrum. By default, ranks 1-5 are printed. In case of a tie (two matches with the same score) both matches will be given the same rank.
  • dotp The score given to this match (a dot product). Ranges from 0 (poor match) to 1 (two identical spectra).
  • query-mz The precursor m/z of the query spectrum.
  • query-z The charge of the query spectrum. If there was more than one in the spectrum file, they will be listed separated by commas.
  • lib-mz The precursor m/z of the library spectrum.
  • lib-z The charge of the library spectrum.
  • copies The number of spectra in the redundant library for this same sequence and charge state.
  • candidates The number of library spectra the query was compared to.
  • sequence The peptide sequence of the library spectrum.
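
A report file laid out as described above can be read with Python's csv module once the '#' header lines are set aside. This is a minimal sketch under that description only: the function name read_report is illustrative, and all values are returned as strings so that fields such as query-z, which may hold a comma-separated list, are left untouched.

```python
import csv


def read_report(lines):
    """Split report text into ('#' header lines, list of row dicts)."""
    comments = []
    table_lines = []
    for line in lines:
        if line.startswith("#"):
            comments.append(line.rstrip("\n"))
        elif line.strip():
            table_lines.append(line)
    # The first remaining line names the fields; the rest are matches.
    reader = csv.DictReader(table_lines, delimiter="\t")
    return comments, list(reader)


def best_matches(rows):
    """Keep only rank-1 rows; ties share a rank, so several may remain."""
    return [row for row in rows if row["rank"] == "1"]
```

Each row is a dictionary keyed by the field names on the naming line, so a score can be read as float(row["dotp"]).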

Parameter files

All BiblioSpec tools (with the exception of BlibBuild and LibToSqlite3) will accept a parameter file in which additional options can be specified. See each tool's documentation page for the specific options allowed. The file should contain one option per line with the full option name and value separated by an equals sign (=). Here is an example parameter file.
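
The one-option-per-line layout is simple enough to read or check with a short script. The Python sketch below assumes only what is stated above, a name and a value separated by an equals sign. The function name parse_parameters is illustrative, the option names in the usage note are placeholders rather than real BiblioSpec options, and whether the tools accept comment lines is not covered here.

```python
def parse_parameters(lines):
    """Return {option name: value} from 'name=value' lines."""
    options = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the first '=' only, so values may contain '=' themselves.
        name, separator, value = line.partition("=")
        if not separator or not name.strip():
            raise ValueError(f"line {number}: expected name=value, got {line!r}")
        options[name.strip()] = value.strip()
    return options
```

For example, parse_parameters(["some-option=5", "another-option=value"]) returns {"some-option": "5", "another-option": "value"}; values stay as strings because the expected type depends on the option.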

Example .ssl file

file   scan    charge  sequence
demo.ms2        8       3       VGAGAPVYLAAVLEYLAAEVLELAGNAAR
demo.ms2        1806    2       LAESITIEQGK
demo.ms2        2572    2       ELAEDGC[+57.0]SGVEVR
demo.ms2        3088    2       TTAGAVEATSEITEGK
demo.ms2        3266    2       DC[+57.0]EEVGADSNEGGEEEGEEC[+57.0]
demo.ms2        9734    3       IWELEFPEEAADFQQQPVNAQ[-17.0]PQN
demo.ms2        20919   3       VHINIVVIGHVDSGK
../elsewhere/spec.mzXML 00497   2       LKEPAQNTADNAK
../elsewhere/spec.mzXML 00680   2       ALEGPGPGEDAAHSENNPPR
../elsewhere/spec.mzXML 00965   2       FFSHEAEQK
../elsewhere/spec.mzXML 01114   2       C[+57.0]GPSQPLK
../elsewhere/spec.mzXML 01382   2       AVHVQVTDAEAGK

Example .ms2 file

H      CreationDate    Mon Apr 12 15:12:14 2010
H       Extractor       BlibToMs2
H       Library /home/me/research/search/demo.blib
S       1       1       636.34
Z       2       1253.36
D       seq     FKNGFQTGSASK
D       modified seq    FKNGFQTGSASK
187.40  12.5
193.10  19.5
194.30  13.7
198.30  29.8
199.10  12.2
208.30  23.1
208.90  11.4
210.30  11.8
213.00  3.3
214.50  4.3
216.10  32.8
219.10  11.2
221.00  14.3
222.10  64.0
225.10  16.6
226.00  31.6
228.30  7.2
229.10  8.5
230.50  58.2
231.20  236.1
232.20  75.8
233.60  2.4
234.20  51.4
235.10  5.6
236.30  30.2
239.70  14.4
241.30  34.8
242.30  14.2
244.30  9.0
S       2       2       745.3
Z       2       1471.7
D       seq     NFLETVELQVGLK
D       modified seq    NFLETVELQVGLK
1224.60 7.9
1228.70 468.9
1230.40 658.5
1231.50 144.2
1240.00 11.7
1242.70 45.9
1243.80 16.8
1253.80 17.2
1255.00 7.9
1255.80 14.4
1259.70 15.5
1273.10 5.9
1275.90 10.5
1277.10 7.8
1283.30 4.7
1296.50 19.2
1299.50 13.0
1307.40 6.1
1308.40 21.3
1313.00 1.7
1313.80 5.5
1315.40 3.6
1316.80 22.3
1323.90 1.5
1325.50 40.5
1326.30 75.9
S       3       3       732.1
Z       2       1444.7
D       seq     NEVSAMPTLLLFK
D       modified seq    NEVSAMPTLLLFK
209.00  62.5
210.30  12.8
216.00  87.0
220.10  58.0
224.90  4.9
226.10  418.2
227.00  68.3
227.90  46.7
229.20  13.3
231.10  12.7
238.10  209.1
239.20  15.0
244.10  953.8
245.20  90.0
245.90  20.4
252.30  8.8
255.30  38.8
260.20  9.4
262.10  35.0
270.00  10.9
275.80  21.8
277.40  6.3
279.10  12.7
280.20  49.8

Example .ssl file for small molecules

Note that the fields are tab-separated, and the otherkeys field is itself a tab-separated list.

file       scan    charge  adduct  inchikey        chemicalformula moleculename    otherkeys
dexcaf_051017.mzML      01369   -1      [M-H]   ZXPLRDFHBYIQOX-BTBVOZEKSA-N     C24H44O21N0     Glc04Reduced
dexcaf_051017.mzML      01639   -1      [M-H]   NBVGBCYERZIRIP-JAMOUWTMSA-N     C30H54O26N0     Glc05Reduced
dexcaf_051017.mzML      01855   -1      [M-H]   PNHJKLJIDNHXFR-ZGJYWSOBSA-N     C36H64O31N0     Glc06Reduced
dexcaf_051017.mzML      02029   -1      [M-H]   NVKJDLBVRSXYRE-BMFDHOHESA-N     C42H74O36N0     Glc07Reduced
dexcaf_051017.mzML      02179   -1      [M-H]   YMRGEPQWJZHXFF-MGQBKJSVSA-N     C48H84O41N0     Glc08Reduced
dexcaf_051017.mzML      01079   -1      [M-H]   RYYVLZVUVIJVGH-UHFFFAOYSA-N     C8H10N4O2       Caffeine        "InChI:1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3    HMDB:01847      CAS:58-08-2     SMILES:Cn1cnc2n(C)c(=O)n(C)c(=O)c12"
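
Since the otherkeys field contains tabs of its own, a plain split on tabs would break the Caffeine row apart. The Python sketch below relies on the double quotes around that field, as shown in the example, so that the csv module keeps it as a single column before it is split into key:value pairs. The function name read_small_molecule_ssl is illustrative and not part of BiblioSpec.

```python
import csv


def read_small_molecule_ssl(lines):
    """Read small-molecule ssl lines into dicts, expanding otherkeys."""
    reader = csv.DictReader(lines, delimiter="\t", quotechar='"')
    records = []
    for row in reader:
        other = {}
        # Rows without an otherkeys column come back as None from DictReader.
        for item in (row.get("otherkeys") or "").split("\t"):
            # Split on the first ':' only; InChI values contain '/' and digits
            # but the key itself never contains a colon in the example.
            key, separator, value = item.partition(":")
            if separator:
                other[key] = value
        row["otherkeys"] = other
        records.append(row)
    return records
```

Scan ids such as 01079 are kept as strings, which preserves the leading zero seen in the example.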