Error importing GNPS spectrum library

Error importing GNPS spectrum library gioele_vi  2025-03-14 07:35
 

Hello,

I would like to import this GNPS library (https://gnps-external.ucsd.edu/gnpslibrary) into Skyline, but I get the following error:

D:\Databases\ALL_GNPS_NO_PROPOGATED.msp (line 1195456): No peaks found for peptide Phakellistatin 13.
Skyline (64-bit) 24.1.0.199 (6a0775ef83)
System.IO.IOException: D:\Databases\ALL_GNPS_NO_PROPOGATED.msp (line 1195456): No peaks found for peptide Phakellistatin 13.
at pwiz.Skyline.Model.Lib.NistLibraryBase.ThrowIOException(Int64 lineNum, String message) in C:\proj\skyline_24_1\pwiz_tools\Skyline\Model\Lib\NistLibSpec.cs:line 1598
at pwiz.Skyline.Model.Lib.NistLibraryBase.CreateCache(ILoadMonitor loader, IProgressStatus status, Int32 percent, String& warning) in C:\proj\skyline_24_1\pwiz_tools\Skyline\Model\Lib\NistLibSpec.cs:line 1060
at pwiz.Skyline.Model.Lib.NistLibraryBase.Load(ILoadMonitor loader, IProgressStatus status, Boolean cached) in C:\proj\skyline_24_1\pwiz_tools\Skyline\Model\Lib\NistLibSpec.cs:line 661

Do you know where this problem could be coming from?

Many thanks and best regards,

Gioele

 
 
Nick Shulman responded:  2025-03-14 08:47
Skyline is expecting that every molecule in that library is going to have a line which starts with:
Num Peaks:
indicating how many m/z intensity pairs are in the spectrum.

It looks like there are a lot of molecules in that .msp file which do not have a line like that.
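
To see how many entries are affected, something like the following Python sketch could list them. It is not part of Skyline; the function name find_entries_without_peaks is made up for illustration, and it assumes that every entry starts with a "Name:" line and that the file can be read as UTF-8 (undecodable bytes are replaced).

```python
def find_entries_without_peaks(msp_path):
    """Return (line_number, name) for each entry that has no 'Num Peaks:' line.

    Assumption: every entry starts with a 'Name:' line. The line number
    reported is that of the 'Name:' line, which may differ from the line
    number shown in Skyline's error message.
    """
    missing = []
    current = None  # [start_line, name, has_num_peaks] for the entry being read
    with open(msp_path, "r", encoding="utf-8", errors="replace") as f:
        for line_num, line in enumerate(f, start=1):
            lower = line.strip().lower()
            if lower.startswith("name:"):
                # A new entry starts; record the previous one if it lacked peaks.
                if current and not current[2]:
                    missing.append((current[0], current[1]))
                current = [line_num, line.split(":", 1)[1].strip(), False]
            elif lower.startswith("num peaks:") and current:
                current[2] = True
    # The last entry has no following 'Name:' line, so check it here.
    if current and not current[2]:
        missing.append((current[0], current[1]))
    return missing


# Example usage (the path is the one from the error message above):
# for line_num, name in find_entries_without_peaks(r"D:\Databases\ALL_GNPS_NO_PROPOGATED.msp"):
#     print(line_num, name)
```

Running the same check on a cleaned-up copy of the file should return an empty list.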

I will ask my coworkers whether Skyline can fix this. It might be that the right thing for Skyline to do is to not treat this as an error and just skip molecules that do not have any spectrum information.
-- Nick
 
gioele_vi responded:  2025-03-17 06:09
Hi Nick,

Thanks a lot for your answer.
I hope your colleagues will follow your suggestion ;)
In the meantime, is there anything we can do to overcome the problem?

Best regards,

Gioele
 
Brian Pratt responded:  2025-03-17 12:33
Hi Gioele,

Thanks for reporting this. Obviously Skyline needs to handle this more gracefully, but for now here is some Python code that ChatGPT thinks should remove the troublesome sections (I haven't run it, but it looks right):

import re

# "Name:" starts each entry; "Num Peaks:" precedes its peak list (case-insensitive)
NAME_RE = re.compile(r"^name\s*:", re.IGNORECASE)
NUM_PEAKS_RE = re.compile(r"^num peaks\s*:", re.IGNORECASE)

def filter_sections(input_file, output_file):
    """Copy input_file to output_file, dropping entries that have no "Num Peaks:" line."""
    with open(input_file, "r", encoding="utf-8") as infile, open(output_file, "w", encoding="utf-8") as outfile:
        section = []
        keep_section = False

        for line in infile:
            is_blank = not line.strip()
            # A blank line or a new "Name:" line signals the end of a section
            if is_blank or NAME_RE.match(line):
                if keep_section and section:
                    outfile.writelines(section)
                    outfile.write("\n") # Preserve section separation
                section = []
                keep_section = False
                if is_blank:
                    continue

            section.append(line)

            if NUM_PEAKS_RE.match(line):
                keep_section = True # Mark section to be kept

        # Write the last section if necessary
        if keep_section and section:
            outfile.writelines(section)
            outfile.write("\n")


# Example usage
filter_sections("input.msp", "output.msp")
 
gioele_vi responded:  2025-03-31 01:43
Thanks Brian, I will test the code and let you know if it works.