Attached to this page you will find a thorough study of how Skyline scales when importing large-scale DIA data with parallel file import. The study covers various file types on two machines, a standard Intel i7 computer with 16 GB of RAM and a Dell PowerEdge with 48 logical processors and 196 GB of RAM, using either multiple threads or multiple processes.

General findings include:

  • Multi-process import can scale past multiple threads in the same process (which we think is related to garbage collection)
  • Only multi-process import can take advantage of the true potential of a NUMA system with 24+ logical cores
  • The difference is much less pronounced on an i7 and may not be worth the effort of going multi-process
  • Many formats pay only a percentage penalty in import time on a spinning disk versus an SSD
  • The mz5 format, otherwise the fastest format to import, has serious problems scaling with parallel file import on a spinning drive

At the time of this writing, only the Skyline command-line interface (run through SkylineRunner or SkylineCmd) can take advantage of multi-process import, by using the --import-process-count argument.
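
As a rough illustration of the argument mentioned above, the following Python sketch assembles a SkylineCmd command line that requests multi-process import. Only --import-process-count comes from this page. The other arguments (--in, --import-all, --save) are assumed from the Skyline command-line documentation and should be checked against your installed version, and the executable path, document name, and data folder are hypothetical placeholders.

```python
import subprocess
from typing import List


def build_import_command(skyline_cmd: str, document: str,
                         data_dir: str, process_count: int) -> List[str]:
    """Build an argument list for a multi-process import.

    --import-process-count is the argument named on this page.
    --in, --import-all and --save are assumptions taken from the
    Skyline command-line documentation; verify them locally.
    """
    if process_count < 1:
        raise ValueError("process_count must be at least 1")
    return [
        skyline_cmd,
        f"--in={document}",                         # assumed: document to open
        f"--import-all={data_dir}",                 # assumed: folder of result files
        f"--import-process-count={process_count}",  # from this page
        "--save",                                   # assumed: save after import
    ]


def run_import(command: List[str]) -> int:
    """Run the command and return its exit code (requires Skyline installed)."""
    return subprocess.run(command).returncode


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_import_command("SkylineCmd.exe", "study.sky", "raw_files", 12)
    print(" ".join(cmd))
    # run_import(cmd)  # uncomment on a machine with Skyline installed
```

The process count is worth tuning per machine: based on the findings above, a large NUMA server is where multiple processes are expected to pay off most, while on an i7 the gain may be small.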

  Attached Files  
 Processes v Threads Performance Comparison.xlsx
