HKGuns wrote in post #19174486
Both actually, given what the doctor said above. My machine is pretty robust but apparently CO performs well as compared to LR.
The i9 is pretty much a stripped-down Xeon. The extra L3 cache and additional cores help prevent cache thrashing and reduce the bus chatter and latency from cores to cache to RAM. And with each core running at a much higher clock speed, it helps speed up the export. In a nutshell, exporting is pretty much a single-threaded operation per image: each core in the system gets assigned one image and runs it to completion before getting another. There's some work stealing and offloading of math functions to OpenCL/CUDA, but there is much less multi-threading within an export operation than in other functions.
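To put rough numbers on the "one image per core, run to completion" idea, here's a toy scheduling model in Python. It's only a sketch of the dispatch pattern described above, not how any of these apps actually schedule work: the function name `simulate_export`, the cost unit (billions of cycles per image), and the assumption that clock speed is the only thing that matters per image are all made up for illustration. It ignores cache effects, disk I/O and GPU offload entirely.

```python
import heapq


def simulate_export(image_costs, cores, clock_ghz):
    """Estimate wall-clock seconds to export a batch of images.

    image_costs: work per image in billions of cycles (hypothetical unit).
    cores:       number of cores, each handling one image at a time.
    clock_ghz:   per-core clock speed; assumed to be the only speed factor.
    """
    # Min-heap holding the time at which each core next becomes free.
    free_at = [0.0] * cores
    heapq.heapify(free_at)

    for cost in image_costs:
        # The next image goes to whichever core frees up first and
        # occupies that core until the image is completely done.
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost / clock_ghz)

    # The batch is finished when the last core goes idle.
    return max(free_at)


if __name__ == "__main__":
    batch = [8.0] * 8  # eight identical images, made-up cost
    print(simulate_export(batch, cores=4, clock_ghz=4.0))
    print(simulate_export(batch, cores=8, clock_ghz=2.0))
    print(simulate_export(batch, cores=8, clock_ghz=4.0))
```

In this model, 4 cores at 4 GHz and 8 cores at 2 GHz finish the batch in the same time, and only the machine with both more cores and a higher clock comes out ahead. That's the sense in which per-core speed matters as much as core count for exports.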
Single-threaded work benefits more from high core speed, whereas multi-threaded functions don't always. Cache latency and size can affect multi-threaded code a lot, as too much data or poor data block sizing in code can cause a cache eviction when a core switch occurs. L1 has the lowest latency, L2 the next and so on, with RAM reads being the slowest. Each time data gets evicted, it moves to the next level out, and reading it back in takes longer. If the data is completely evicted from the proc, it has to be read back in from RAM, which is an eternity to a processor (and the core "stalls" waiting for the data to come in).
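A quick back-of-the-envelope way to see why evictions hurt is to work out an average access time across the cache levels. The Python below is a simplified sketch: the function name `average_access_ns` is hypothetical, the latencies and hit rates are round illustrative numbers rather than measurements of any real CPU, and it assumes an access simply costs the latency of whichever level it finally hits in.

```python
def average_access_ns(levels, ram_ns):
    """Average cost of one memory access, in nanoseconds.

    levels: list of (hit_rate, latency_ns) pairs ordered L1, L2, L3, ...
            hit_rate is the fraction of accesses reaching that level
            which are served by it.
    ram_ns: latency when every cache level misses.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_rate, latency_ns in levels:
        total += reach * hit_rate * latency_ns
        reach *= 1.0 - hit_rate  # the misses fall through to the next level
    # Whatever missed every cache level has to come from RAM.
    return total + reach * ram_ns


if __name__ == "__main__":
    # Illustrative numbers only (assumptions, not benchmarks).
    tidy = [(0.95, 1.0), (0.80, 4.0), (0.80, 12.0)]    # data fits in cache
    sloppy = [(0.70, 1.0), (0.50, 4.0), (0.50, 12.0)]  # lots of evictions
    print(average_access_ns(tidy, ram_ns=80.0))
    print(average_access_ns(sloppy, ram_ns=80.0))
```

Even with made-up numbers, the pattern holds: once the hit rates drop, the RAM term dominates the average, which is the "eternity" the core spends stalled.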
On top of all that is the efficiency and quality of the code and how it handles blocks of data. LR is sloppy. Give it big files and watch it choke. C1's code design is much more efficient (when they don't have a stupid bug that takes them a month to fix). But LR has better DAM organization. C1 does have a neat album/group/project method, but it isn't optimized for the hobbyist cataloging their life's work in it. Phocus seems to be the fastest, with the "tightest" code, but doesn't really organize at all.