Processes v Threads Performance Comparison.xlsx
Attached to this page is a thorough study of how Skyline scales when importing large-scale DIA data with parallel file import. It covers various file types on two machines, a standard Intel i7 computer with 16 GB of RAM and a Dell PowerEdge with 48 logical processors and 196 GB of RAM, using either multiple threads or multiple processes.
General findings include:
- Multi-process import can scale past multiple threads in the same process (which we think is related to garbage collection)
- Only multi-process import can take advantage of the true potential of a NUMA system with 24+ logical cores
- The difference is much less pronounced on an i7 and may not be worth the effort of going multi-process
- Many formats pay only a percentage-level increase in import time on a spinning disk versus an SSD
- The mz5 format, otherwise the fastest format to import, has serious problems scaling with parallel file import on a spinning drive
At the time of this writing, only the Skyline command-line interface (SkylineRunner or SkylineCmd) can take advantage of multi-process import, through the --import-process-count argument.
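
As a rough illustration of how a script might request multi-process import, the sketch below assembles a SkylineCmd argument list in Python. Only --import-process-count comes from this page. The --in, --import-all and --save arguments, the executable name, and the file paths are assumptions made for the example and should be checked against the Skyline command-line documentation before use.

```python
import subprocess
from typing import List


def build_import_command(skyline_cmd: str, document: str,
                         data_dir: str, process_count: int) -> List[str]:
    """Build an argument list for a multi-process import (illustrative only).

    --import-process-count is the argument named on this page.
    --in, --import-all and --save are assumed here; verify them
    against the Skyline command-line documentation.
    """
    if process_count < 1:
        raise ValueError("process_count must be at least 1")
    return [
        skyline_cmd,
        f"--in={document}",                          # assumed: document to open
        f"--import-all={data_dir}",                  # assumed: folder of result files
        f"--import-process-count={process_count}",   # from this page
        "--save",                                    # assumed: save the document afterwards
    ]


def run_import(command: List[str]) -> int:
    """Run the command and return its exit code."""
    return subprocess.run(command).returncode
```

A call such as build_import_command("SkylineCmd.exe", "study.sky", "raw_data", 8) returns the list to pass to run_import. Given the findings above, a process count near the number of logical cores is more likely to pay off on a large NUMA server than on an i7 desktop, so it is worth keeping the count configurable.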