Question

New Computer Performance Advice

  • 19 November 2022
  • 3 replies
  • 51 views

I currently have a Dell computer with a 10-core i9-10900 processor and 64GB RAM.

I occasionally get a 1-2 month project where I run NSC models with many CAD parts, tracing 1e7 rays and running several variations of the Zemax file in parallel. These runs often take 2-3 hours. Sometimes I do sequential Hammer optimizations that take 17 hours. More often I run models that take 30 min each, and with numerous iterations that becomes time-consuming.

I was looking at the Dell 7865 series with the AMD Ryzen Threadripper processor. I am trying to decide between the following options. I am willing to spend the money on the most expensive configuration if necessary.

Also, I may sell this in 1 year to get a Mac Pro w/ 64-128 cores because I need the neural processors for other applications. It is probably easier to resell a computer that costs < $10k.

I am leaning towards option 1.1.  

  1. CPU: 32 vs 64 cores
    1. 32 cores, 3.7x speed increase ($9,100, 64GB RAM, Single A4000)
    2. 64 cores, 5.0x speed increase ($12,500, 64GB RAM, Single A4000)
  2. RAM: 64 vs 128GB (+$1,200)
  3. Video Card (NVIDIA, 16GB): Single A4000 vs Dual A4000 (+$750)



Hi Don,

Load your file and do a System Check. Look at the memory usage part:

Glass Catalog Memory Usage   :         0.66 Mb
Coating Material Memory Usage:         0.11 Mb
Coating Catalog Memory Usage :         1.52 Mb
Estimated Total Memory Usage :         9.55 Mb   <-- this line
Available Memory             :     19687.04 Mb

When you start adding CAD parts, the ‘Estimated Total Memory Usage’ will increase. Depending on your CAD parts, this can get very large. I’ve seen files with estimated memory usage of over 1 GB, just to hold the data.

Now when OpticStudio (OS) traces rays in multiple threads, each thread receives a copy of the entire lens data, so it has to copy this memory for each thread. If your file is 1 GB and you have 64 cores, you could end up needing 64 GB just for those copies. This represents the overhead of the multithreading.
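As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes memory scales as one full copy of the lens data per thread, which is the upper bound described above; the function name and the `parallel_instances` parameter (for several copies of the program running at once) are hypothetical and are not part of OpticStudio.

```python
def estimated_trace_memory_mb(file_memory_mb: float,
                              threads: int,
                              parallel_instances: int = 1) -> float:
    """Rough upper bound on RAM needed for the per-thread copies of the lens data.

    file_memory_mb:     the 'Estimated Total Memory Usage' figure from System Check
    threads:            number of ray-tracing threads (usually the core count)
    parallel_instances: hypothetical parameter for several files run at once
    """
    return file_memory_mb * threads * parallel_instances


# The example from the text: a 1 GB file traced on 64 cores.
needed_mb = estimated_trace_memory_mb(1024, 64)
print(f"{needed_mb / 1024:.1f} GB")  # prints "64.0 GB"
```

Treat the result as a planning number only. The real footprint depends on the file and on what else the machine is doing.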

OpticStudio uses ‘large’ threads like this whereas other applications use micro threading. Your browser, for example, will download the HTML of a page and parse the text in one thread, and download graphics in independent threads. These are low-memory, ‘small’ threads that scale differently, and many threading benchmarks use small threading tasks which aren’t a good guide to how OS will perform.

I’d recommend a good graphics card but don’t spend a lot on this. OS does not work the graphics card the way games do. If there’s good/better/best, a good one is all you need.

Then get as much memory as you can for the reason above. Also look for a large cache memory.

Then, last, look at the number of cores. Once you are at 32 cores, it’s likely that some threads will return their data before the next one can be launched. It’s best to experiment with the number of cores to use. OS will default to ALL, but keep your eye on Task Manager: the optimum number may be smaller than this.

Another thing to think about is how many rays will be traced per thread. If there are too few the thread will return quickly and the overhead will be large compared to the computation time. Divide your Analysis rays by the number of cores, and make sure each thread gets several million rays. OS does a lot of this scaling for you, but it’s best to keep your own eyes on it as well.
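A quick way to do that division is sketched below in Python. It uses the 1e7 rays mentioned in the question and a few core counts from this thread. The even split across threads is a simplifying assumption, since OS does its own scheduling, and the function name is hypothetical.

```python
def rays_per_thread(total_rays: int, threads: int) -> float:
    """Average number of rays each thread traces, assuming an even split."""
    return total_rays / threads


total_rays = 10_000_000  # 1e7 analysis rays, as in the question

# Compare the current 10-core machine with the 32- and 64-core options.
for threads in (10, 32, 64):
    print(f"{threads:>2} threads: {rays_per_thread(total_rays, threads):>12,.0f} rays per thread")
```

If the per-thread figure falls well below the several million rays suggested above, it may be worth raising the ray count or trying fewer cores and comparing run times.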

Lastly, relax about 17-hour optimizations. That’s really not a long time when optimizing really complex systems.

  • Mark

Hi Mark

Thanks again for your insight! I am using 300 MB per file, and I am running 3 files simultaneously. On a 32-core machine that would put me at about 64GB of RAM. So I will go with the 32-core machine, with the single-processor video card and 128GB (2x64 so I can upgrade later).

How is your Mac Studio running Zemax? I see Parallels supports the full number of cores now. Any big differences from what you would expect from a PC with an equivalent number of cores?


I don’t run OS on my Mac. I wasn’t aware that Parallels supported it yet, so I'll check it out.
