Some questions for diaPASEF data analysis about library and settings

lourh

2021-12-06 01:32

Dear Skyline team,

I'm using Skyline to search diaPASEF data with a tsv assay file, and I have some questions about the process. I'd appreciate it if you could help me with them.
The DIA data were acquired on a timsTOF Pro 2 with 4 diaPASEF windows in each frame, with the MS2 series repeated twice after one MS1 frame. The assay library was built from ddaPASEF data with FragPipe and filtered with OpenSwathAssayGenerator.

My questions are as follows

1. When is the best time to use "Integrate All"? Previously I exported the search results directly after reintegrating with mProphet scoring, and the quantification looked weird. But when I clicked "Integrate All" before exporting (still after reintegration), the exported results looked good. Is it fine to use this function as the last step, or can it be set at any point?
2. Should I apply some filters before exporting? Currently I use Refine -> Advanced and set these values after reintegration and before exporting:
- Min peptides per protein: 1
- Min transitions per precursor: 3
- Remove peptides missing library match
- Q value cutoff 0.01
- Detected in 1 replicate
- Should I also restrict "Min peak found ratio" and "Max peptide peak rank", or others? I'm using a benchmark dataset, so better quantification without losing identifications would be ideal.
3. Exporting is a little slow in my case. I think I chose fairly ordinary columns (peptide sequence, RT, precursor charge, Total Area, q value) with no fragment-level data (the .skyr file is attached). Do you have any idea why?
4. It seems Skyline does not rely on protein information in the searching and refinement steps. Is this true? Does this mean I would get the same results if no FASTA file is assigned and the library contains no proteins? (The proteins in the tsv file are dropped if I build the library from the tsv file with Peptide settings -> Library -> Build.)
5. About iRT and IM in a blib built from a tsv file: I found two ways to import a tsv assay file, one via import -> add assay file, the other via Peptide settings -> Library -> Build.
- The first cannot recognize the ion mobility column but imports iRT into the library correctly (an IrtLibrary table is generated in the blib).
- The second recognizes ion mobility, but the proteins are dropped and the iRT values are transformed to a very small scale (seemingly -1 to 3), whether I set the iRT standard peptides to Biognosys 11 or to None. In this case the RT difference score for mProphet cannot be checked.
- Is there a suitable solution for importing a tsv library? Or does the following make sense: first build the blib with import -> add assay file; then open the blib and add ion mobility values to the "RefSpectra" and "RetentionTimes" tables for each precursor; and also create the "IonMobilityTypes" table to store drift time, reduced IM, or compensation, as in a library built from Peptide settings.
6. For ion mobility settings, is it enough to check "Use spectral library IM when present" and set the resolving power to 30 as the tutorial shows? Or should I use a fixed window of 0.06 (maybe this means 1/K0), as OpenSwath prefers?
7. What is the best practice for aggregating precursor quantities to the protein level? Is a top-3 average enough, or do you have other suggestions?

The Skyline version is 21.1.0.278

Sorry for so many questions. Thank you very much for your time.

Best regards,
Ronghui

PepQuant.skyr

Nick Shulman responded:	2021-12-06 07:22
1. In current versions of Skyline, the "Integrate All" setting really only affects the colors of the dots next to the Transitions in the Targets tree. If "Integrate All" is turned on, the dots next to the transitions will nearly always be green, unless the chromatogram peak really is completely flat. If "Integrate All" is turned off, the dots will be red if the apex of the peak for a particular transition is not close to the apexes of the other peptide transition peaks. "Integrate All" should be turned on if you are doing quantification, and you care about the values of the peak areas. "Integrate All" should be turned off if you are developing an SRM method and want to know which transitions have misshapen peaks. Nowadays, it is very unlikely that "Integrate All" will have any impact on the numbers in your exported report, so it does not matter whether you have it on or off, but you should have it on in your case. (It used to be that when "Integrate All" was off, many of your transition peak areas would be NULL instead of their measured value, and that caused a big problem for people who forgot to turn it on when doing quantification, so we fixed it so that it only affects the colored dots in the Targets tree.)

2. I don't know the answer to this one.

3. Can you send us your Skyline document? In Skyline you can use the menu item: File > Share to create a .zip file containing your Skyline document and supporting files including extracted chromatograms. If that .zip file is less than 50MB you can attach it to this support request. Otherwise, you can upload it here: https://skyline.ms/files.url
I am not sure which columns in your report might be causing the report export to be slow. If I had to guess, I would say maybe if you remove the "Protein Abundance" column the report might go faster. It is not supposed to be slow for Skyline to calculate any of the columns in your report. If you send me your Skyline document I will try to improve the performance of Skyline on exporting that report.

4. I am not sure I understand your question. The way that peptides are grouped into proteins does not affect the peptide chromatograms or peak areas.

5. That's a good question. I do not know the answer. I was going to say, if you have a .tsv file, you get it into the document with "File > Import > Assay Library". You would use "Settings > Peptide Settings > Library > Build" if you have peptide search results. Here is a page which lists the types of peptide search results that you can build a library from: https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BlibBuild
However, now that I look at that page, I see that it says you can build a spectral library from OpenSwath tsv, so now I don't know whether you should build a spectral library versus import an assay library.

6. I do not know the answer to this, but maybe someone else does.

7. The "Protein Abundance" column that you have in the report is the best way to sum the peak areas across all of the transitions under the protein. Those peak areas are normalized according to the Normalization Method that you have specified at "Settings > Peptide Settings > Quantification". If that Normalization Method is "None", then the Protein Abundance will usually be the same as the sum of all of the precursor's Total Area values. Sometimes the Protein Abundance value will be null if any of the transitions have missing or truncated peaks, because Skyline decides that that replicate's Protein Abundance value cannot be compared to the other replicates.

Please do send us your Skyline document. It is important that we figure out why your report export is slow, since that sounds like something that will be easy for us to fix.
-- Nick
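
The rule described in answer 7 can be illustrated with a minimal sketch. This is not Skyline's implementation: it assumes the Normalization Method is "None", and it applies the missing/truncated rule strictly, whereas the answer only says the value is "sometimes" null. The `Transition` type and `protein_abundance` function are hypothetical names made up for this example.

```python
from typing import NamedTuple, Optional, Sequence


class Transition(NamedTuple):
    """Hypothetical record for one transition peak in one replicate."""
    area: Optional[float]      # None when no peak was integrated
    truncated: bool = False    # True when the peak is cut off at an edge


def protein_abundance(transitions: Sequence[Transition]) -> Optional[float]:
    """Sum all transition areas under a protein (no normalization).

    Returns None when any peak is missing or truncated, mirroring the
    idea that such a replicate cannot be compared to the others.
    """
    if not transitions:
        return None
    if any(t.area is None or t.truncated for t in transitions):
        return None
    return sum(t.area for t in transitions)
```

With complete peaks the result is simply the sum of the areas; a single missing or truncated peak makes the whole value null for that replicate.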

lourh responded:	2021-12-06 23:08
Dear Nick,

Thanks for your reply. The description of "Integrate All" is really detailed and helpful. I'll remember to turn it on when configuring settings next time.

As for the export speed in question 3, it is indeed faster if I remove "Protein Abundance". This document contains ~300k precursors and generates about 10M rows on export. With "Protein Abundance" selected, the export takes about half a day, compared with only a few minutes without it. Maybe in this case selecting the precursors used to calculate each protein's quantity takes a lot of time? The whole document takes about 800G, so I haven't uploaded it yet, but I can try if you want to have a look. Please let me know if it would be useful.

Question 4 was asking whether proteins affect any step of the search or the statistical control in mProphet, because a library built from a tsv file with "Peptide settings -> Library -> Build" contains no proteins. I thought they wouldn't but wasn't sure, so thanks for the information.

For 5, I think the OpenSwath tsv in BlibBuild refers to search results from OSW and is used for visualization in Skyline (maybe?). But it did work with a tsv assay file, so I tried it and got a blib with ion mobility annotated (though it treats IM as drift time, which has to be changed manually to reduced IM).

Thanks also for the information about protein quantity in question 7. I have removed "Protein Abundance" now, so I aggregate precursor quantities to the protein level myself after export. Normalization does affect quantification accuracy to some extent (in my case, with benchmark data), so I may need to try the different methods.

Thanks again for your enthusiastic help.

Best,
Ronghui
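
For reference, aggregating exported precursor quantities to the protein level with a top-3 average (the option raised in question 7) could look roughly like the sketch below. It is only an illustration under assumptions: the row layout (protein, replicate, precursor, total area) and the function name `top_n_protein_quantities` are hypothetical, and the protein assignment has to come from somewhere other than a blib built without proteins.

```python
from collections import defaultdict
from statistics import mean


def top_n_protein_quantities(rows, n=3):
    """Average the n largest precursor areas per (protein, replicate).

    rows: iterable of (protein, replicate, precursor, total_area) tuples,
    e.g. parsed from an exported report. Missing or non-positive areas
    are skipped, so a protein with no usable precursor is absent from
    the result.
    """
    grouped = defaultdict(list)
    for protein, replicate, _precursor, area in rows:
        if area is not None and area > 0:
            grouped[(protein, replicate)].append(area)
    return {
        key: mean(sorted(areas, reverse=True)[:n])
        for key, areas in grouped.items()
    }
```

Proteins with fewer than n usable precursors are averaged over whatever is available, which is one of several possible conventions and worth checking against the benchmark.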

lourh responded:	2021-12-13 05:01
Thanks to Nick for the help.

I have a new question about precursors with the same (or similar) m/z in the library. I noticed that some peptides with the same sequence and modifications, but different modification positions, are dropped when I add them to the document, because of the match tolerance in the settings. Can I reduce this restriction to zero, to allow peptides with the same MS1 behavior but different MS2 information, for peptidoform detection? Do you think that is feasible in Skyline? We are going to test this for some software.

Also, does anyone have any suggestions about question 2 in the initial request? I would appreciate that.

Best,
Ronghui

Nick Shulman responded:	2021-12-13 08:14
I do not understand what you are saying about mz match tolerance affecting whether Skyline gives you all of the positional isomers that you were expecting. Those two things are not supposed to affect each other.

There are two match tolerances in Skyline:

"Ion match tolerance" at "Settings > Transition Settings > Library". When Skyline is looking at a spectrum in a spectral library, Skyline needs to figure out which theoretical transition (e.g. "y7++") corresponds to a particular m/z on the spectrum. The Library Ion Match Tolerance controls how much difference is allowed between the predicted and observed m/z's when interpreting spectral library spectra.

There is also the "Method match tolerance m/z". This setting affects whether Skyline thinks that a particular SRM chromatogram or MS2 spectrum in a raw file matches the m/z of a precursor in your Skyline document.

If you would like, you can send us your Skyline document and it might help us understand your question. In Skyline you can use the menu item: File > Share to create a .zip file containing your Skyline document and supporting files including spectral libraries and extracted chromatograms. If that .zip file is less than 50MB you can attach it to this support request. Otherwise you can upload it here: https://skyline.ms/files.url
-- Nick
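
To make the idea of an ion match tolerance concrete, here is a small sketch of matching predicted fragment m/z values to observed library peaks within a tolerance. It is not Skyline's code: `match_ions` is a hypothetical helper, the nearest-peak rule is an assumption, and the m/z values in the usage note are invented.

```python
def match_ions(theoretical, observed_mz, tolerance):
    """Assign each theoretical ion to its nearest observed peak.

    theoretical: dict mapping an ion label (e.g. "y7++") to predicted m/z.
    observed_mz: list of peak m/z values from a library spectrum.
    tolerance:   maximum allowed |predicted - observed| in m/z units.
    Returns a dict of label -> matched observed m/z; unmatched ions are
    left out.
    """
    matches = {}
    if not observed_mz:
        return matches
    for label, predicted in theoretical.items():
        closest = min(observed_mz, key=lambda mz: abs(mz - predicted))
        if abs(closest - predicted) <= tolerance:
            matches[label] = closest
    return matches
```

For example, `match_ions({"y7++": 400.20, "b3+": 300.10}, [400.25, 310.0], 0.1)` keeps only the "y7++" match, because no observed peak lies within 0.1 of the predicted "b3+" m/z. Positional isomers share a precursor m/z and differ only in some fragment m/z values, so a tolerance of this kind acts on the fragments rather than on which precursors are kept.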