Generating GC-MS "transition list" from .msp

brynnsundberg  2022-12-14 05:54
 

Hello Skyline team!

I have looked at the user instructions to view GC-MS data in Skyline, but I'm stuck without an initial transition list. The dream is to import my NIST .msp file and to be able to generate "transitions" from the 1710 small molecule spectra it contains to find them in my GC-MS results (using the strategy of selecting DIA in the MS2 transition settings and using a fake precursor mass). Unfortunately, the library explorer doesn't recognize the spectra in my .msp file, maybe because it's missing precursor mass and charge information. Is there a way to make this work? Do I have to learn how to code so I can pull the information from the .msp file and create the transition list I need? Any help you can offer will be greatly appreciated!

Cheers,
Brynn

 
 
Brian Pratt responded:  2022-12-14 10:40

Hi Brynn,

That's a novel way to format an MSP file, but novel is pretty much normal for MSP - it's a very loosely defined "standard". What software produced that file?

The lack of any precursor information is a problem, certainly. It would be cool if Skyline could at least look up a mass from the CAS number but we haven't implemented that (yet?). I wonder if that's available in the file referenced as "C:\Users\decql\OneDrive - KIKIRPA\GC metingen voor Steven\KIKQuadrupole.msp"?

The other issue is the novel arrangement of the fragment information, grouping the m/z,intensity pairs with parentheses. That's not a big deal to accommodate, though.
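To illustrate what handling the parenthesized pairs might involve, here is a minimal Python sketch. It assumes each pair looks like "(41 120)" or "(41,120)", with whitespace or a comma as the separator; the exact layout written by any given exporter may differ, and `parse_peak_line` is a hypothetical helper name, not part of Skyline.

```python
import re

# Assumed layout: "(mz intensity)" or "(mz,intensity)", several pairs per line.
PAIR_RE = re.compile(r"\(\s*([0-9.]+)[\s,]+([0-9.]+)\s*\)")

def parse_peak_line(line):
    """Return a list of (mz, intensity) tuples found in one line of peak data."""
    return [(float(mz), float(inten)) for mz, inten in PAIR_RE.findall(line)]

# Made-up example values, only to show the shape of the result.
peaks = parse_peak_line("( 41  120) ( 43, 999) (57.1,45)")
print(peaks)
```

Lines without any parenthesized number pairs (such as a "NUM PEAKS" header) simply yield an empty list.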

We're used to creative formatting in MSP files. But omitting precursor information is a problem. Let's see if we can find a workaround for that.

Thanks for using the Skyline support board,

Brian Pratt

 
Brian Pratt responded:  2022-12-14 10:44

On closer inspection, those CASNO values aren't actually CAS numbers. A very curious MSP file indeed!

Best,

Brian

 
brynnsundberg responded:  2022-12-14 11:10

Hi Brian,

This file was exported from NIST's AMDIS software, and it's a combination of internal references (where that OneDrive file path came from) with a NIST library. My understanding from the GC-MS tutorial (http://data.proteo.cloud/appendix5.pdf) is that the correct precursor mass isn't necessary to extract the target molecule from the data, and if that is the case, a series of placeholder numbers could be used?

Thanks for the speedy reply!
Brynn

 
Brian Pratt responded:  2022-12-14 11:35

You don't have to provide the mass if you give some other hint like a chemical formula, but some kind of meaningful association between the fragments and the parent molecule is necessary. A random mass value wouldn't give meaningful results.

I wonder if there are options with AMDIS that might produce a more fully defined data set?

In the meantime, I'll make sure that the () m/z,intensity pair formatting is supported moving forward.

 
brynnsundberg responded:  2022-12-15 00:55

Quite often we rely on detecting unknown species that are nevertheless associated with a material origin (from measurements of many reference samples--e.g., "Copal marker 4 - Manila copal - ion trap" or "Alkyd unverified 8"). The association with a real parent ion makes general sense but isn't always an option for complex, non-traditional samples.

Still, it looks like some of the CASNO and FORM fields have actual CAS numbers and molecular formulas, so I think a fully-defined data set is possible with AMDIS, just not our data!

 
Brian Pratt responded:  2022-12-15 09:54

So you want to match fragments to identify unknown parents? Skyline is designed for targeted use, but can be a kind of primitive search tool with a big enough library - you just toss out the transitions that don't yield good hits after chromatogram extraction. But you do need to tell Skyline what parent ions it's looking for, even if you don't expect most of them to actually be there. It does seem like you have that parent information, but for some reason only the human readable names are being passed into the .msp files. Hopefully that's fixable on your end.

Also, what kind of mass spec data are we talking about here? DIA? DDA?

We'll figure this out!

Best,

Brian

 
Brian Pratt responded:  2022-12-15 10:24

e.g., "Copal marker 4 - Manila copal - ion trap"

That "ion trap" bit reminds me, the spectra you export to the library need to be appropriate for the mass spec data you're producing. Presumably AMDIS supports this kind of filtering.

Brian
 
brynnsundberg responded:  2022-12-16 04:07

For a bit more context, AMDIS is quite good for identification and serves pretty well for looking at individual samples, but there isn't any way to compare data between samples. I've used Skyline in previous work for the sort of targeted LC-MS/MS it was intended for, and I really like the combination of tiled chromatograms, replicate comparisons, and group comparisons, which is why I'm trying to get Skyline to work for my GC-MS data, even though it isn't a perfect fit.

I'm going to rearrange your questions in the hope that I will make a little more sense that way!


...what kind of mass spec data are we talking about here? DIA? DDA?

This is standard GC-MS data--separation followed by electron ionization at 70 eV (which also completely fragments the molecule) followed by a single quad. So there is only one mass filter. The absence of the parent ion means there are a lot of parallels with DIA, which is how the tutorial I sent earlier manages to process GC-MS data with Skyline.

So you want to match fragments to identify unknown parents? Skyline is designed for targeted use, but can be a kind of primitive search tool with a big enough library - you just toss out the transitions that don't yield good hits after chromatogram extraction.

Since this fragmentation is from electron impact ionization, the fragment m/z and intensity are very reproducible--"tossing out" fragments isn't an acceptable option and correlation scores should be quite high. We also rely a lot on retention index.
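As a rough illustration of why reproducible spectra should score highly, one common way to compare a library spectrum with an observed one is a normalized dot product (cosine similarity). The sketch below is generic and is not a description of how Skyline computes its scores; `cosine_similarity` and the example intensities are hypothetical.

```python
import math

def cosine_similarity(library, observed):
    """Compare two spectra given as dicts mapping m/z -> intensity.

    Returns a value between 0.0 (no shared fragments) and 1.0
    (identical relative intensities).
    """
    mzs = set(library) | set(observed)
    dot = sum(library.get(mz, 0.0) * observed.get(mz, 0.0) for mz in mzs)
    norm_lib = math.sqrt(sum(v * v for v in library.values()))
    norm_obs = math.sqrt(sum(v * v for v in observed.values()))
    if norm_lib == 0 or norm_obs == 0:
        return 0.0
    return dot / (norm_lib * norm_obs)

# Made-up intensities: the observed spectrum is the library spectrum at half scale.
library = {41: 120.0, 43: 999.0, 57: 45.0}
observed = {mz: 0.5 * inten for mz, inten in library.items()}
print(cosine_similarity(library, observed))
```

Because the score depends only on relative intensities, a spectrum measured at a different overall signal level still scores close to 1 as long as the fragmentation pattern is preserved.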

But you do need to tell Skyline what parent ions it's looking for, even if you don't expect most of them to actually be there. It does seem like you have that parent information, but for some reason only the human readable names are being passed into the .msp files.

We don't have the parent ion information, because there is no parent ion! There is a parent molecule, which is what leaves the GC, but it is usually completely fragmented during ionization. The way the tutorial made this work in Skyline was by choosing an arbitrary value to be the parent ion.

The parent molecule information can be determined by comparison with a reference library from standards or deduced from its fragments (easier said than done, of course). In our case, we don't have standards for the molecules we want to detect, but we do have lots of reference materials that allow us to associate the spectrum with the sample type. This is why some of the parent information is only human-readable. Someone analyzed as many Manila copal resin references as they could and looked for common components that could be used as markers when analyzing a sample of an unknown varnish, even though the identity of the exact parent molecule is unknown. Not foolproof, of course, but still useful!

 
Brian Pratt responded:  2022-12-16 09:28

Thanks for the further details. So ideally when reading this kind of .msp that has names but no mass hints, to satisfy Skyline we'd just assign some dummy parent mass - say, 99.
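Until that is supported directly, the dummy-mass idea could be approximated outside Skyline with a short script. The Python sketch below makes several assumptions: each record starts with a "NAME:" line, peaks are written as parenthesized pairs, the placeholder precursor m/z is 99, and the column headers are modeled on Skyline's small-molecule transition list and should be checked against the current documentation. The function names, the "Library" list name, and the peak values are hypothetical.

```python
import csv
import io
import re

PAIR_RE = re.compile(r"\(\s*([0-9.]+)[\s,]+([0-9.]+)\s*\)")
DUMMY_PRECURSOR_MZ = 99.0  # placeholder from the thread, not a real mass

HEADER = ["Molecule List Name", "Precursor Name", "Precursor m/z",
          "Precursor Charge", "Product m/z", "Product Charge"]

def msp_to_rows(msp_text, top_n=5):
    """Return one row per fragment, keeping the top_n most intense peaks per record."""
    records, name, peaks = [], None, []
    for line in msp_text.splitlines():
        line = line.strip()
        if line.upper().startswith("NAME:"):
            if name is not None:
                records.append((name, peaks))
            name, peaks = line[5:].strip(), []
        elif name is not None:
            peaks.extend((float(m), float(i)) for m, i in PAIR_RE.findall(line))
    if name is not None:
        records.append((name, peaks))

    rows = []
    for name, peaks in records:
        strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
        for mz, _intensity in strongest:
            rows.append(["Library", name, DUMMY_PRECURSOR_MZ, 1, mz, 1])
    return rows

def rows_to_csv(rows):
    """Serialize the rows, with a header line, as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up peak values under a name quoted earlier in the thread.
SAMPLE_MSP = """NAME: Copal marker 4 - Manila copal - ion trap
NUM PEAKS: 3
( 41  120) ( 43  999) ( 57  45)
"""

rows = msp_to_rows(SAMPLE_MSP, top_n=2)
print(rows_to_csv(rows))
```

Keeping only the most intense fragments per record is a design choice to limit the list size for a library of about 1710 spectra; `top_n` can be raised if more fragments are needed for confident matching.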

Can you provide some mass spec data and a Skyline document with your desired settings (probably mimicking that tutorial), to go along with the previously provided .msp? I'll see what I can do to get that all working together.

Brian