September 25, 2014
From time to time we are asked how to import large text (ASCII) files quickly. Normally speed is non-issue for ASCII import as it is blazing fast. That kind of question typically comes from person who wants to import hundreds of megabytes of data.
ASCII importer is optimized for adding new data to the existing database, so the most efficient operation is adding current quote (the newest one).
If you want to import huge amount of data in ASCII format in most efficient manner you need to make sure that the file you are importing is sorted
- in ascending symbol order (so “AAPL” before “INTC”), and within symbol
- in ascending chronological order (so oldest records first, newest records last)
In SQL query talk it would be “ORDER BY Symbol ASC, Date ASC”.
Doing so ensures that no sorting is required during import and symbol shuffling is reduced to minimum, so in-memory cache is used most efficiently.
If your file is not ordered or ordered in reverse then it takes long to import because AB must shuffle data. In worst case scenario (newest records first), every data insert involves sorting which makes it a killer. The difference can be hours vs seconds on properly sorted file.