Optimal analysis workflow for 500 sample DIA experiment?

becky carlyle  2021-01-19

Hi everyone,

Not necessarily a specific Skyline request here, but a hope that members of the Skyline community can help guide our workflows. We have 500 biofluid samples from a neurology clinic that are currently undergoing DIA. We are using an in-house generated fractionated library.

The Core running these samples uses Scaffold DIA, but it isn't capable of handling large sample sets. I also dislike the "black box" aspect and would prefer an open-source solution. I was wondering if this community could offer thoughts on an optimal pipeline for analyzing these data - preferably something with the potential for parallelization (ideally we'd use AWS to spin up some clusters to do this work - we have plenty of experience doing this with other 'omics pipelines). We are a biomarkers group and have moderate experience with MQ, Skyline, and X!Tandem, but we are definitely not experts in this particular field. We're also completely overrun with working from home and childcare, so it's very difficult to find the time to thoroughly research the huge number of developments in this field over the past couple of years. I hope the community can help point us in the right direction!

Thank you in advance!
Becky Carlyle (Buck course attendee from a few years ago)

Tobi responded:  2021-01-19

Hi Becky,

DIA-NN might be an option for you; it's suitable for large datasets and quite intuitive and fast to learn.

For Orbitrap data I use 1 missed cleavage, 1 variable oxidized Met, 15-20 ppm mass accuracy, robust LC (high precision), and RT- and signal-dependent normalization.
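As a rough sketch, settings like those above could be passed to DIA-NN's command-line tool along the following lines. The flag names are from my reading of DIA-NN's documentation, so verify them against `diann --help` for your version; all file names here are placeholders:

```shell
# Placeholder file names; substitute your own raw files and spectral library.
diann --f sample01.raw --f sample02.raw \
      --lib fractionated_library.tsv \
      --out report.tsv \
      --qvalue 0.01 \
      --missed-cleavages 1 \
      --var-mods 1 --var-mod UniMod:35,15.994915,M \
      --mass-acc 20 --mass-acc-ms1 20 \
      --matrices \
      --threads 16
```

`--matrices` asks DIA-NN to write gene- and protein-group-level quantity matrices alongside the main report, which is handy for downstream statistics across a 500-sample cohort.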

The output can be a gene × run matrix with MaxLFQ quantities (gg_matrix.tsv).
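For downstream work, a matrix like gg_matrix.tsv can be read with standard tools. A minimal sketch, assuming a tab-separated file whose first column holds gene names and whose remaining columns are one run each (the exact layout may vary by DIA-NN version, so check your own output's header):

```python
import csv
import io

def parse_gg_matrix(text):
    """Parse a gene-level quantity matrix (tab-separated) into
    {gene: {run: quantity}}. Empty cells become None."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    runs = header[1:]  # first column holds gene names
    matrix = {}
    for row in reader:
        gene, values = row[0], row[1:]
        matrix[gene] = {
            run: (float(v) if v else None)
            for run, v in zip(runs, values)
        }
    return matrix

# Toy example (not real data): two genes across two runs,
# with one missing value for GFAP in run02.
example = (
    "Genes\trun01.raw\trun02.raw\n"
    "GFAP\t12345.6\t\n"
    "ALB\t99.9\t100.1\n"
)
parsed = parse_gg_matrix(example)
```

From here the nested dict can be handed to whatever statistics stack you already use for other 'omics data.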