It was actually added to protect against some cases we saw in which BiblioSpec mistakenly interpreted every spectrum as coming from a different file. For instance, pepXML frequently has the format:
<spectrum_query spectrum="example.00214.00214.1"...
Here, "example" is the basename of the searched file (e.g. example.mzXML).
The trailing numbers are interpreted as start-scan.end-scan.charge-state. If BiblioSpec encounters this format where it expects to find only the basename or filename of the source file, it will end up thinking every spectrum match comes from a different file. I think that was in fact the case for the original pepXML implementation, where the source of the spectra might have been DTA files, one per spectrum.
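To make the failure mode concrete, here is a minimal Python sketch. It is not BiblioSpec's actual code; the regular expression and the helper name split_spectrum_id are assumptions made only for illustration. It shows how treating the whole spectrum attribute as a file name yields one "file" per spectrum, while splitting off start-scan.end-scan.charge-state recovers a single basename.

```python
import re

# Hypothetical pattern for "<basename>.<start scan>.<end scan>.<charge>".
# This is an illustration, not the parsing BiblioSpec actually performs.
SPECTRUM_ID = re.compile(
    r"^(?P<basename>.+)\.(?P<start>\d+)\.(?P<end>\d+)\.(?P<charge>\d+)$"
)

def split_spectrum_id(spectrum):
    """Split a pepXML-style spectrum attribute into its four parts.

    Returns (basename, start_scan, end_scan, charge). If the value does not
    match the pattern, it is returned unchanged with None for the numbers.
    """
    m = SPECTRUM_ID.match(spectrum)
    if m is None:
        return spectrum, None, None, None
    return (
        m.group("basename"),
        int(m.group("start")),
        int(m.group("end")),
        int(m.group("charge")),
    )

ids = ["example.00214.00214.1", "example.00215.00215.2"]

# Naive reading: every distinct attribute value counts as its own source file.
naive_files = set(ids)

# After splitting: both spectra map back to the same basename.
parsed_files = {split_spectrum_id(s)[0] for s in ids}
```

With these two spectra, the naive reading sees two source files, whereas the split reading sees only "example".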
In the case where we saw it, this had undesirable consequences that took a while to notice, and then more time to trace back to the cause. So we implemented what we thought was a reasonable upper limit to help us flag the issue in the future.
It should be relatively easy to raise that limit to 2000, which will hopefully give us at least another 5 years before we see either issue again.
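As a rough illustration of what such a guard does, here is a Python sketch under stated assumptions: the function name check_source_file_count and the error message are hypothetical, the default of 2000 is simply the raised value proposed above, and none of this is taken from the BiblioSpec source.

```python
def check_source_file_count(source_files, max_files=2000):
    """Return the number of distinct source files, or raise if it is suspicious.

    max_files is a sanity threshold (hypothetical default of 2000): a count
    above it more likely means spectrum IDs were mistaken for file names than
    that the library truly has that many source files.
    """
    distinct = set(source_files)
    if len(distinct) > max_files:
        raise ValueError(
            f"{len(distinct)} distinct source files exceeds the limit of "
            f"{max_files}; spectrum IDs may have been read as file names"
        )
    return len(distinct)
```

A genuinely very large library trips the same check as the parsing mistake, which is why the threshold has to be raised from time to time.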
Thanks for posting to the Skyline support board. Good luck with your very large library.
--Brendan