Skyline can't load the interact-*.pep.xml and *_uncalibrated.mgf files from FragPipe 19.1 coupled with MSFragger 3.7
fcyu  2023-01-18 18:23
 

Loading works for the interact-*.pep.xml and *_uncalibrated.mgf files from FragPipe 18.0 coupled with MSFragger 3.5, but not for the files from the latest MSFragger. Could you please help test this and tell me whether Skyline or MSFragger needs to be adjusted? All files have been uploaded as attachments.

Thanks,

Fengchao

 
 
fcyu responded:  2023-01-18 18:30

The zip file has been uploaded to https://skyline.ms/project/home/support/file sharing/begin.view? with name dev_skyline.zip and description rowId=57930.

Thanks,

Fengchao

 
Nick Shulman responded:  2023-01-18 20:49
Thank you for uploading the .zip file.

I tried building a library from the "interact-5ngHeLaosmoothCE20-52lowguessSRIG450easy4_30t_C2_01_3451.pep.xml" file in the "191_failed" folder.
I got the attached error message.

That error message is telling you that BiblioSpec was looking for a file whose name starts with "5ngHeLaosmoothCE20-52lowguessSRIG450easy4_30t_C2_01_3451_uncalibrated" and whose name ends with one of a number of different possible extensions including "_uncalibrated.mgf".
 
I see that you already have a file there called "5ngHeLaosmoothCE20-52lowguessSRIG450easy4_30t_C2_01_3451_uncalibrated.mgf", which is almost what BiblioSpec is looking for. BiblioSpec is expecting to find a file whose name ends in "_uncalibrated_uncalibrated.mgf".
If you rename (or copy) that file to:
5ngHeLaosmoothCE20-52lowguessSRIG450easy4_30t_C2_01_3451_uncalibrated_uncalibrated.mgf
then things will work.

I am not enough of an expert on BiblioSpec to know whether there's a bug here. I could imagine that BiblioSpec should not try to append "_uncalibrated" to the end of a filename which already ends in "_uncalibrated", but I am not sure.
-- Nick
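
The rename-or-copy workaround described above can be scripted. This is only a minimal sketch: the helper name add_double_suffix_copy is hypothetical, and it assumes the file follows the X_uncalibrated.mgf naming shown in this thread.

```python
import shutil
from pathlib import Path

SUFFIX = "_uncalibrated"

def add_double_suffix_copy(mgf_path: Path) -> Path:
    """Copy X_uncalibrated.mgf to X_uncalibrated_uncalibrated.mgf.

    Hypothetical helper for the workaround only; it leaves the original
    file in place and returns the path of the copy.
    """
    if mgf_path.suffix.lower() != ".mgf" or not mgf_path.stem.endswith(SUFFIX):
        raise ValueError(f"not an {SUFFIX}.mgf file: {mgf_path}")
    target = mgf_path.with_name(mgf_path.stem + SUFFIX + ".mgf")
    if not target.exists():
        # copy2 also preserves timestamps, which keeps the copy easy to trace
        shutil.copy2(mgf_path, target)
    return target
```

Copying rather than renaming keeps the original name available for any other tool that expects it.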
 
fcyu responded:  2023-01-19 09:18

Hi Nick,

Thank you very much for your prompt response.

It reminds me that Skyline appends _uncalibrated to the end of the base name because we had

<msms_run_summary base_name="I:\data\Bruker\5ngHeLaosmoothCE20-52lowguessSRIG450easy4_30t_C2_01_3451" raw_data_type="d" raw_data="d">

generated by MSFragger before version 3.6. Starting from 3.6, we changed it to

<msms_run_summary base_name="I:\data\Bruker\5ngHeLaosmoothCE20-52lowguessSRIG450easy4_30t_C2_01_3451_uncalibrated" raw_data_type="mzML" raw_data="mzML">

because we want PTMProphet to find the right _uncalibrated.mzML file. I think something needs to be changed in Skyline.

It would be great if you could modify Skyline for us. Or, if you can point me to the location of that part of the code, I can make the changes and send a pull request.

Thanks,

Fengchao
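
To illustrate why the base_name change produces the doubled suffix, here is a minimal sketch. It is not BiblioSpec's code: the lookup rule naive_mgf_name and the run name sample_01 are made up for illustration, and only the msms_run_summary element and its base_name attribute come from the snippets above.

```python
import xml.etree.ElementTree as ET
from io import BytesIO
from pathlib import PureWindowsPath

# Two made-up runs: the first in the pre-3.6 style, the second in the 3.6+ style.
SAMPLE = rb"""<msms_pipeline_analysis>
  <msms_run_summary base_name="I:\data\Bruker\sample_01" raw_data_type="d" raw_data="d"/>
  <msms_run_summary base_name="I:\data\Bruker\sample_01_uncalibrated" raw_data_type="mzML" raw_data="mzML"/>
</msms_pipeline_analysis>"""

def run_base_names(pepxml_source):
    """Return the file-name part of every msms_run_summary base_name."""
    names = []
    for _event, elem in ET.iterparse(pepxml_source, events=("start",)):
        # Compare the local tag name so an XML namespace does not matter.
        if elem.tag.rsplit("}", 1)[-1] == "msms_run_summary":
            names.append(PureWindowsPath(elem.get("base_name", "")).name)
    return names

def naive_mgf_name(base_name):
    """Assumed lookup rule: always append the suffix to the base name."""
    return base_name + "_uncalibrated.mgf"

for name in run_base_names(BytesIO(SAMPLE)):
    print(name, "->", naive_mgf_name(name))
```

With the second style of base_name, the assumed rule asks for sample_01_uncalibrated_uncalibrated.mgf, which matches the file name in Nick's error message.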

 
Matt Chambers responded:  2023-01-24 07:09

Is there actually an _uncalibrated.mzML file in MSFragger 3.6+? Because if so that should be found. It's only the MGF file that would have the double _uncalibrated_uncalibrated basename (and MGF is only supported for MSFragger output, so we don't just check for plain .mgf).

 
fcyu responded:  2023-01-24 07:53

Hi Matt,

Yes, there is an _uncalibrated.mzML generated for .d and .raw in MSFragger 3.6+. For the _uncalibrated.mgf, MSFragger 3.7+ has an option to generate it.

Best,

Fengchao

 
Matt Chambers responded:  2023-01-24 13:34

Well, the mzML is missing from the dev_skyline.zip file you sent. Did you delete it to make the zip smaller? Because if it's there it should be used by BlibBuild and maybe you were getting a different error than Nick.

 
fcyu responded:  2023-01-24 14:40

No, I didn't include the _uncalibrated.mzML file. I uploaded a new zip file named dev_skyline_2.zip with the _uncalibrated.mzML file.

Attached is the error I got.

Thanks,

Fengchao

 
Matt Chambers responded:  2023-01-24 14:45

Ah, forgot about that check. So it is currently limited to *calibrated.mgf. I'll have to fix it to support *calibrated.mzML.

 
fcyu responded:  2023-01-24 14:53

Hi Matt,

Thank you very much for the prompt reply. Looking forward to the fix.

BTW, could you please test if the _uncalibrated.mzML file can be correctly parsed by Skyline? I tested it using msconvert and seeMS.

Also, may I ask what the loading priority is for the files _uncalibrated.mgf, _uncalibrated.mzML, _calibrated.mgf, and _calibrated.mzML? I think the _calibrated.* files should have the lowest priority because the peaks are deisotoped and deneutrallossed, which is bad for spectral library building.

Best,

Fengchao

 
Matt Chambers responded:  2023-02-07 12:53

Hi Fengchao,

The mzML should be supported now. If you or one of your users wants to test before the next Skyline-daily here's a snapshot build: https://teamcity.labkey.org/guestAuth/repository/download/bt209/2271869:id/SkylineTester.zip

 
fcyu responded:  2023-02-07 16:55

Hi Matt,

Thank you very much for the fix.

I have tested the snapshot build and found the following issues:

  1. It does not load the original .d to extract Chromatograms (*_uncalibrated.mzML/mgf does not have MS1), which is required to get the XIC, intensity, etc.
  2. It loads *_calibrated.mzML when there is one. The *_calibrated.mzML/mgf should not be used because the peaks are deisotoped and deneutrallossed, which is not good for spectral library building.
  3. It no longer supports *_uncalibrated.mgf, which is OK. We can remove the "write uncalibrated mgf" option in FragPipe if we decide to switch to the uncalibrated.mzML file.

Could you please

  1. Use the original mzML file for both spectral library building and chromatogram extraction if the original mzML (without _uncalibrated) is available.
  2. If the original mzML is not available, load the _uncalibrated.mzML for spectral library building, and then load the .d for chromatogram extraction, which is similar to how the old Skyline version handled the mgf file.
  3. Remove support for the _calibrated.mzML/mgf files.

Thanks,

Fengchao

 
Matt Chambers responded:  2023-02-08 06:40

Issues:

  1. Oops. Yeah, that would be a side effect of adding the _uncalibrated suffix to the base_name attribute. It'll be looking for that on the .d as well. I only tested BlibBuild so I missed that, thanks for taking it all the way!

  2. So if the user has deleted the _uncalibrated files for some reason, you prefer it to fail outright instead of using the less suitable calibrated files?

  3. Ah, my change didn't remove support for uncalibrated.mgf: that's a casualty of the _uncalibrated suffix on the base_name attribute (it's looking for *_uncalibrated_uncalibrated.mgf) that I forgot to address.

  4. I'm not sure how this would work, at least for spectral library building. How would the scan numbers/indices line up between the original .d/mzML and the uncalibrated.mzML?

  5. Yes, this is how it was supposed to work. It'll need another fix to take away that _uncalibrated suffix from the base_name.

  6. OK, if you confirm #2 above.

 
fcyu responded:  2023-02-08 09:55

Hi Matt,

Thank you very much for the prompt reply.

So if the user has deleted the _uncalibrated files for some reason, you prefer it to fail outright instead of using the less suitable calibrated files?

Correct. We prefer it to fail instead of using the decharged one to build the spectral library.

I'm not sure how this would work, at least for spectral library building. How would the scan numbers/indices line up between the original .d/mzML and the uncalibrated.mzML?

MSFragger gets the "original" scan number by parsing the id (not the index) from the (un)calibrated.mzML file. So, the scan numbers still match.

Best,

Fengchao
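
As a rough illustration of reading the scan number from the spectrum id rather than from the index, here is a minimal sketch. The function name is hypothetical, and it assumes the id is written in the Thermo-style form "controllerType=0 controllerNumber=1 scan=N"; the exact string MSFragger writes is not shown in this thread.

```python
import re

# Match a "scan=N" token anywhere in a space-separated spectrum id.
_SCAN_RE = re.compile(r"(?:^|\s)scan=(\d+)(?:\s|$)")

def scan_number_from_id(spectrum_id: str) -> int:
    """Return N from an id such as 'controllerType=0 controllerNumber=1 scan=N'."""
    match = _SCAN_RE.search(spectrum_id)
    if match is None:
        raise ValueError(f"no scan= token in spectrum id: {spectrum_id!r}")
    return int(match.group(1))

# The position of a spectrum in the file (its index) plays no part here,
# so dropping spectra from the file does not shift the scan numbers.
print(scan_number_from_id("controllerType=0 controllerNumber=1 scan=1234"))
```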

 
Matt Chambers responded:  2023-02-08 12:27

For #4, you want it to look for and use basename.mzML before basename_uncalibrated.mzML, for timsTOF data?

Ah, I see you are writing the _uncalibrated.mzML as if it came from a Thermo RAW. Why is that? There's no reasonable way to map between those "scan numbers" and the original Bruker coordinates (frame and scan or frame range and scan range, depending on how combining has been done).

 
fcyu responded:  2023-02-08 12:54

For #4, you want it to look for and use basename.mzML before basename_uncalibrated.mzML, for timsTOF data?

For timsTOF data, look for basename_uncalibrated.mzML first, then basename_uncalibrated.mgf (for backward compatibility), and then basename.mzML. If none of them exists, return an error.

In most cases, there will be no basename.mzML but only basename.d. MSFragger will always generate basename_uncalibrated.mzML if the input is basename.d, but it will not generate basename_uncalibrated.mzML if the input is basename.mzML. So, if basename_uncalibrated.mzML exists, Skyline should use it because the search input was likely the .d. If there is no basename_uncalibrated.mzML, the search input was basename.mzML, so Skyline should use that one.

For non-timsTOF data, basename.mzML first, and then basename_uncalibrated.mzML, and then basename_uncalibrated.mgf.
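
The lookup order proposed here can be sketched as follows. This is an illustration only, not Skyline or BlibBuild code: the function name and the is_timstof flag are hypothetical, and how the caller decides that a run is timsTOF data is left open.

```python
from pathlib import Path

SUFFIX = "_uncalibrated"

def find_spectrum_file(directory: Path, base_name: str, is_timstof: bool) -> Path:
    """Return the first existing spectrum file in the proposed priority order."""
    # MSFragger 3.6+ writes base_name with the suffix already attached,
    # so strip it once to recover the plain run name.
    stem = base_name[:-len(SUFFIX)] if base_name.endswith(SUFFIX) else base_name
    if is_timstof:
        candidates = [stem + SUFFIX + ".mzML", stem + SUFFIX + ".mgf", stem + ".mzML"]
    else:
        candidates = [stem + ".mzML", stem + SUFFIX + ".mzML", stem + SUFFIX + ".mgf"]
    for name in candidates:
        path = directory / name
        if path.is_file():
            return path
    # _calibrated.* files are never candidates, so a folder that only
    # contains those fails here instead of silently falling back.
    raise FileNotFoundError(f"none of {candidates} found in {directory}")
```

Stripping the suffix before building the candidate list also avoids the doubled _uncalibrated_uncalibrated name discussed earlier in the thread.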

Ah, I see you are writing the _uncalibrated.mzML as if it came from a Thermo RAW. Why is that?

I just want to make sure that most tools can read the uncalibrated.mzML file, since Thermo is the most common format.

There's no reasonable way to map between those "scan numbers" and the original Bruker coordinates (frame and scan or frame range and scan range, depending on how combining has been done).

MSFragger has its own way of assigning 1-D "scan numbers" to scans, and those scan numbers are consistent between the pepXML and uncalibrated.mzML files. I guess Skyline does not use the scan number to match the scans between pepXML and .d, right?

Best,

Fengchao

 
Matt Chambers responded:  2023-02-09 07:10

When the blib file is created it records that the basename_uncalibrated.mzML file is the source. Since mzML is one of the data formats that Skyline expects it might find MS1s in, Skyline looks at those blib source files first before searching the directories for other formats for the basename. So in this case, Skyline proposes to import results from basename_uncalibrated.mzML instead of finding basename.d. This can be fixed by checking "Exclude spectrum sources" on the import results page, but it's a very likely source of user error. We didn't have this problem with basename_uncalibrated.mgf because MGF can't have MS1, so Skyline doesn't bother to consider importing results from it. I'll have to confer with Brendan and Nick about where they want to fix that. It's conceivable to add a special case inside Skyline to ignore basename_uncalibrated.* files, but there's always a chance that a user has a Sciex or Thermo file ending in "_uncalibrated"!

 
fcyu responded:  2023-02-09 07:17

It's conceivable to add a special case inside Skyline to ignore basename_uncalibrated.* files, but there's always a chance that a user has a Sciex or Thermo file ending in "_uncalibrated"!

Agreed. What about adding another layer of checking: whether the _uncalibrated.mzML has MS1 scans?

Best,

Fengchao
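
One way such an MS1 check could look is sketched below. It is not Skyline's implementation: the function name is hypothetical, and it assumes each spectrum declares its level with the PSI-MS "ms level" cvParam (accession MS:1000511), as is usual in mzML.

```python
import xml.etree.ElementTree as ET

MS_LEVEL_ACCESSION = "MS:1000511"  # PSI-MS controlled vocabulary term "ms level"

def has_ms1_scans(mzml_source) -> bool:
    """Return True as soon as one spectrum with ms level 1 is seen."""
    for _event, elem in ET.iterparse(mzml_source, events=("end",)):
        # Compare local tag names so the mzML namespace does not matter.
        local = elem.tag.rsplit("}", 1)[-1]
        if local == "cvParam" and elem.get("accession") == MS_LEVEL_ACCESSION:
            if elem.get("value") == "1":
                return True
        elif local == "spectrum":
            elem.clear()  # drop peak data already parsed to keep memory low
    return False
```

A file for which this returns False could then be skipped as a chromatogram source, regardless of whether its name ends in "_uncalibrated".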