MGF files with product ion charges are ignored during spectral library build

jtsorren  2022-12-01 18:39
 

Hi,

I am using skyline-daily in molecule mode, with the built-in feature for building a spectral library from an .ssl file and MGF files.

MGF files can now explicitly denote the product ion charge for measured peaks in individual scans. When I build the spectral library and add the molecules with transitions to the target space, why are the product ions all assumed to have a 1+ charge?

Thank you

 
 
Brian Pratt responded:  2022-12-02 15:02

That's certainly a new variant of MGF - of which there are many, of course. What software produced this?

I am not sure just how much work it will be to support this, I will investigate that. Thanks for providing the example data.

Best regards,

Brian Pratt

 
jtsorren responded:  2022-12-03 17:22

Thank you for the response.

One tool we use is Pyteomics (https://pyteomics.readthedocs.io/en/latest/index.html), both to inspect the MGF files and to assign charges to product ions.
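
For readers who have not seen this MGF variant, here is a minimal standard-library sketch of reading per-peak charges. It assumes each peak line is "m/z intensity [charge]", with the optional third column written like "2+", "1-" or "2"; that layout is an assumption for illustration, not something confirmed in this thread. The function names are hypothetical, and a dedicated reader such as Pyteomics is likely the more robust choice for real files.

```python
import re


def parse_peak_line(line):
    """Parse one MGF peak line into (mz, intensity, charge).

    Assumed layout: whitespace-separated m/z, intensity, and an optional
    third column such as "2+", "1-" or "2". Charge is None when absent
    or unparseable.
    """
    fields = line.split()
    mz = float(fields[0])
    intensity = float(fields[1]) if len(fields) > 1 else 0.0
    charge = None
    if len(fields) > 2:
        match = re.fullmatch(r"([+-]?)(\d+)([+-]?)", fields[2])
        if match:
            sign = -1 if "-" in (match.group(1), match.group(3)) else 1
            charge = sign * int(match.group(2))
    return mz, intensity, charge


def read_mgf_peaks(text):
    """Yield one list of (mz, intensity, charge) per BEGIN IONS block."""
    peaks = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            peaks = []
        elif line == "END IONS":
            if peaks is not None:
                yield peaks
            peaks = None
        elif peaks is not None and line and line[0].isdigit():
            # Header lines (TITLE=, PEPMASS=, CHARGE=) start with a letter.
            peaks.append(parse_peak_line(line))


# Made-up example spectrum, not taken from the files in this thread.
example = """BEGIN IONS
TITLE=scan 1
PEPMASS=500.25
CHARGE=2+
150.10 1200.0 1+
250.13 800.0 2+
300.20 50.0
END IONS
"""
spectra = list(read_mgf_peaks(example))
```

Each spectrum comes back as a list whose position in the list is the peak's index within that scan, which is convenient if the charges later need to be matched up with peaks by index.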

Do you have a way where I can properly annotate product ions with the correct charge programmatically? Right now I am working with ~600-1000 transitions in the spectral library...

Additionally, will improper charge annotation of transitions affect the usage of the spectral library when importing DIA results?

Thanks

 
Brian Pratt responded:  2022-12-05 10:21

> Do you have a way where I can properly annotate product ions with the correct charge programmatically? Right now I am working with ~600-1000 transitions in the spectral library...

If you're up for some SQLite programming, you could look at populating the RefSpectraPeakAnnotations table in the .blib file that gets produced. That's where we store fragment charge information for the (very few) import formats that provide it.

> Additionally, will improper charge annotation of transitions affect the usage of the spectral library when importing DIA results?

No, but for now at least you'll have to live with the fragment charge information being ignored. So you'd lose the sense of the actual mass of the fragments, and only know the m/z. But that's always been true with the MGF format. While this does seem like a useful extension of the format I doubt you'll find a lot of MGF readers that recognize it (yet?).
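
To illustrate why the charge matters for recovering mass from m/z, here is a small sketch. It assumes positive-mode ions that carry one proton per charge; other adducts would need a different `adduct_mass`. The function name and the example m/z value are made up for illustration.

```python
PROTON_MASS = 1.007276466  # Da


def neutral_mass(mz, charge, adduct_mass=PROTON_MASS):
    """Neutral mass of an ion observed at `mz` with `charge` adducts.

    Assumes positive mode and one adduct (a proton by default) per charge.
    """
    if charge < 1:
        raise ValueError("this sketch only handles positive charges")
    return charge * (mz - adduct_mass)


# The same observed m/z implies very different masses at 1+ and 2+.
mass_if_1plus = neutral_mass(250.13, 1)
mass_if_2plus = neutral_mass(250.13, 2)
```

With the charge ignored, both readings of the peak at m/z 250.13 are indistinguishable, which is the loss of information described above.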

I'll put this (i.e. getting that information into .blib's RefSpectraPeakAnnotations table) into our development queue.

Best regards,

Brian Pratt

 
jtsorren responded:  2022-12-06 09:18

Thank you for the response. I have investigated the .blib database file and see that I can fill in the RefSpectraPeakAnnotations table. I will try that.

Thank you for putting this in the queue.

 
jtsorren responded:  2022-12-06 11:34

One minor question about filling out the RefSpectraPeakAnnotations table: which columns are required, and is there somewhere I can find information on each?

id, RefSpectraID, peakIndex, name, formula, inchiKey, otherKeys, charge, adduct, comment, mzTheoretical, mzObserved

Thanks

 
Brian Pratt responded:  2022-12-06 11:44

This should be useful:

https://raw.githubusercontent.com/ProteoWizard/pwiz/master/pwiz_tools/BiblioSpec/tests/reference/tables.check

You'll need id, RefSpectraID, peakIndex, charge, mzTheoretical, mzObserved at a minimum.
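
As a rough sketch of what filling in those minimum columns could look like with Python's built-in sqlite3 module: the column names below come from this thread, but the column types in the stand-in schema are assumptions, so check them against tables.check. The example lets SQLite assign `id`, which assumes it is an auto-incrementing primary key. It runs against an in-memory database; for real use you would connect to a copy of your .blib, where the table already exists.

```python
import sqlite3

# Stand-in schema for the demo only. Column names are from the thread;
# the types and constraints are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS RefSpectraPeakAnnotations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    RefSpectraID INTEGER NOT NULL,
    peakIndex INTEGER NOT NULL,
    name TEXT, formula TEXT, inchiKey TEXT, otherKeys TEXT,
    charge INTEGER, adduct TEXT, comment TEXT,
    mzTheoretical REAL NOT NULL,
    mzObserved REAL NOT NULL
)
"""


def add_annotations(conn, rows):
    """Insert (RefSpectraID, peakIndex, charge, mzTheoretical, mzObserved) rows."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO RefSpectraPeakAnnotations "
            "(RefSpectraID, peakIndex, charge, mzTheoretical, mzObserved) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")  # replace with the path to a COPY of the .blib
conn.execute(SCHEMA)
# Made-up values: two annotated peaks for the spectrum with RefSpectraID 6.
add_annotations(conn, [(6, 0, 1, 150.10, 150.10), (6, 1, 2, 250.13, 250.13)])
count = conn.execute(
    "SELECT COUNT(*) FROM RefSpectraPeakAnnotations"
).fetchone()[0]
```

Passing Python ints for the integer columns and floats for the m/z columns keeps the stored values in the types the stand-in schema declares.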

Don't hesitate to ask if you have other questions.

Brian

 
jtsorren responded:  2022-12-06 12:11

I have filled in the id, RefSpectraID, peakIndex, charge, mzTheoretical, mzObserved information for the RefSpectraPeakAnnotations table but when I reload the spectral library to my skyline-daily session I am met with this error (attached).

Failed loading library 'automation_test'.
Specified cast is not valid.

I have taken the peakIndex from the SpecIDinFile column in the RefSpectra table (I am not sure whether that is the proper index into the m/z and intensity list for the RefSpectra).

Thanks

 
Brian Pratt responded:  2022-12-06 17:34

Perhaps I can see your modified.blind file?

 
Brian Pratt responded:  2022-12-07 05:45

.blib file, that is (autocorrect got in the way)

 
jtsorren responded:  2022-12-07 08:14

Thanks for looking into this. Here is the modified.blib

 
Brian Pratt responded:  2022-12-07 10:59

Thanks. It looks like a misunderstanding of the "peakIndex" field. That's an index into the fragment [m/z, intensity] pairs for the RefSpectra noted in the RefSpectraId field. So instead of those all being 5 for RefSpectraId=6, I'd expect to see them increasing monotonically from 0 (assuming you're annotating every fragment).

 
Brian Pratt responded:  2022-12-07 11:15

... and then starting at 0 again when you move on to RefSpectraId=23
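
The indexing described above can be sketched as follows. This assumes every fragment is annotated and that your peak list for each spectrum is in the same order as the [m/z, intensity] pairs stored in the library. The function name and input layout are hypothetical, and mzTheoretical is simply set to the observed m/z as a placeholder.

```python
def build_annotation_rows(spectra):
    """Build (RefSpectraID, peakIndex, charge, mzTheoretical, mzObserved) rows.

    `spectra` maps a RefSpectraID to its fragment peaks, each given as
    (mz_observed, charge). peakIndex counts from 0 within each spectrum.
    """
    rows = []
    for ref_spectra_id, peaks in spectra.items():
        for peak_index, (mz_observed, charge) in enumerate(peaks):
            # Placeholder assumption: theoretical m/z equals observed m/z.
            rows.append(
                (ref_spectra_id, peak_index, charge, mz_observed, mz_observed)
            )
    return rows


# Made-up peaks for the two RefSpectraId values mentioned above.
rows = build_annotation_rows({6: [(150.10, 1), (250.13, 2)], 23: [(99.05, 1)]})
```

Here the rows for spectrum 6 get peakIndex 0 and 1, and the row for spectrum 23 starts again at 0.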

 
jtsorren responded:  2022-12-07 11:46

Ah I see. It looks like that worked. I will investigate my library and make sure it all checks out.

Thank you for all of the help! I will ask any other questions as they come.