jdiazsaezwd 0 Posted August 7, 2018 Hi, any thoughts on the new Threadrippers for Houdini? 32 cores @ 3.0/4.2 GHz or 16 cores @ 3.5/4.4 GHz? Which one is better for heavy FX work in Houdini?
Mandrake0 57 Posted August 9, 2018 I would go with the 32-core Threadripper. You get nearly double the performance. It won't help in every function, but for rendering and big simulations it can help. Also, the license cost is lower when you use Houdini Engine. If you are an Indie user, it's a NO BRAINER.
jdiazsaezwd 0 Posted August 13, 2018 Don't simulations benefit from higher single-core speed? Wouldn't it be better to have fewer cores and higher clock speeds? That's what I have been reading in some forums for FX, but I'm still quite new to Houdini.
malexander 361 Posted August 13, 2018 The 24/32 core parts are odd beasts. They're made of four 8-core modules (the 24-core part has 2 cores disabled per module). Unlike the server Epyc processor, only 2 of those modules have access to main memory (each has access to half of it via dual memory controllers), and the other two modules must hop through one of the mem-attached modules to get at main memory. So they'll scale well for low-bandwidth, high-compute workloads (rendering), but start to suffer in cases where memory bandwidth is important (massive sims). The 16-core part has 2 modules and each has access to half the memory. AMD has designed the scheduling such that the modules connected to memory are populated with threads first, then the mem-isolated modules. I'm curious how that works for SMT (fill 16 threads on the mem modules, then 16 threads on the other modules, then populate the 33rd+ thread on the loaded cores; or load up the mem modules to 32 threads first). I'd be more tempted to go for the 16-core version (2950X) myself. Thread efficiency starts to drop off at high core counts as well, so you really don't want to be losing even more performance in your top 16 cores. The memory bandwidth issue would also make me think twice about launching multiple processes using a lower thread count. Pretty in-depth analysis here: https://www.anandtech.com/show/13124/the-amd-threadripper-2990wx-and-2950x-review
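A rough back-of-the-envelope sketch of the bandwidth point in the post above, in Python. It assumes both chips run quad-channel DDR4-2933 (the memory speed is an illustrative assumption, not something stated in the thread), and the function names are made up for this example. It only computes theoretical peak figures and ignores the extra hop that the two memory-less modules on the 32-core part have to take, so real numbers for that chip would likely look worse.

```python
# Theoretical peak DRAM bandwidth, split evenly across cores.
# Assumptions (not from the thread): 4 memory channels, DDR4-2933,
# 64-bit (8-byte) wide channels. Sustained bandwidth is lower in practice.

def peak_bandwidth_gbs(channels: int, mega_transfers: int,
                       bytes_per_transfer: int = 8) -> float:
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers * 1e6 * bytes_per_transfer / 1e9


def per_core_gbs(total_gbs: float, cores: int) -> float:
    """Even share of the total bandwidth per core."""
    return total_gbs / cores


if __name__ == "__main__":
    total = peak_bandwidth_gbs(channels=4, mega_transfers=2933)
    for name, cores in (("2950X", 16), ("2990WX", 32)):
        share = per_core_gbs(total, cores)
        print(f"{name}: {cores} cores, {share:.2f} GB/s per core at peak")
```

The same four channels feed twice as many cores on the 32-core part, so each core's share of peak bandwidth is halved before any of the module-hopping overhead is counted.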
jdiazsaezwd 0 Posted August 13, 2018 Thank you for all the in-depth info, Mark. I didn't know any of this about the memory setup. I will go with an overclocked 2950X. Thanks!
malexander 361 Posted August 13, 2018 I'm not saying the 2990WX is bad, just that you should temper your expectations of a 32x speedup in rendering.
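To put a number on why a 32x speedup is unrealistic, here is a minimal sketch using Amdahl's law. That is a textbook model brought in for illustration, not something the posts above name, and the 95% parallel fraction is a hypothetical figure rather than a measurement of any Houdini workload.

```python
# Amdahl's law: the ideal speedup on n cores when only a fraction p
# of the work can run in parallel. The value of p used below is
# hypothetical; measure your own workload to get a real figure.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    p = 0.95  # hypothetical: 95% of the job parallelizes perfectly
    for cores in (16, 32):
        print(f"{cores} cores: {amdahl_speedup(p, cores):.1f}x")
```

With that assumed 95% figure, 16 cores come out at roughly 9.1x and 32 cores at roughly 12.5x, so doubling the core count buys far less than double the speed even before memory bandwidth enters the picture.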
Mandrake0 57 Posted August 15, 2018 It makes me wonder how much data throughput affects the calculation cycle when you run a simulation. I have the feeling the most challenging part will be the correct setup for so many cores. I just read today that nearly all benchmark tests could be off, because there is a bug in the NVIDIA driver that causes a massive slowdown with 64 threads and up, especially in games. https://translate.google.com/translate?sl=de&tl=en&js=y&prev=_t&hl=de&ie=UTF-8&u=https%3A%2F%2Fwww.golem.de%2Fnews%2F32-kern-cpu-threadripper-2990wx-laeuft-mit-radeons-besser-1808-136016.html&edit-text=&act=url https://translate.google.com/translate?hl=de&sl=de&tl=en&u=https%3A%2F%2Fwww.forum-3dcenter.org%2Fvbulletin%2Fshowpost.php%3Fp%3D11769097%26postcount%3D60 The 2990WX is surely a good CPU, but the biggest challenge is the software and the setup. There is also an older article that shows how much the tile size can affect rendering in Blender Cycles: https://www.blenderguru.com/articles/4-easy-ways-to-speed-up-cycles (under "3. Change the Tile Size").
goldleaf 93 Posted October 3, 2018 Some relevant information in this article on the memory bandwidth of the 12/16-core vs. 24/32-core Threadrippers: https://www.pcworld.com/article/3298859/components-processors/how-memory-bandwidth-is-killing-amds-32-core-threadripper-performance.amp.html
AaronAb 7 Posted October 10, 2018 The Infinity Fabric has lower latency than a socket-to-socket link. That said, it's still very sensitive to memory timings and speeds. Unlike Intel processors, Ryzen processors must have the timings dialed in, and they respond better to higher memory frequencies (the less memory you have, the faster it needs to be). The 32-core Threadripper can only be run in NUMA mode, which Windows is absolutely terrible at handling. Slow, improperly timed memory + bad NUMA + bad scheduling = performance regressions with a 32-core processor in Windows. Slow, improperly timed memory will also hurt your performance in Linux, but at least you will have superior NUMA handling and scheduling.
Zetha 0 Posted November 6, 2018 Would that mean the RAM speed difference between DDR4-2400 and DDR4-3200 has a big impact?
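For the theoretical side of that question, here is a small Python sketch comparing the two speeds on paper. It assumes quad-channel memory with 8-byte-wide channels, and the CAS latencies (CL15 for DDR4-2400, CL16 for DDR4-3200) are hypothetical example kits, not anything from this thread. On these Zen-based chips the Infinity Fabric clock is, as far as I know, tied to the memory clock, which is one reason the memory clock is printed as well. None of this predicts actual Houdini performance; it only shows the size of the gap in the raw numbers.

```python
# Paper comparison of DDR4-2400 and DDR4-3200.
# Assumptions: 4 channels, 8 bytes per transfer, hypothetical CAS latencies.

def memory_clock_mhz(mega_transfers: int) -> float:
    """DDR moves two transfers per clock, so the clock is half the MT/s rate."""
    return mega_transfers / 2


def peak_bandwidth_gbs(mega_transfers: int, channels: int = 4,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return mega_transfers * 1e6 * channels * bytes_per_transfer / 1e9


def cas_latency_ns(cas_cycles: int, mega_transfers: int) -> float:
    """Time for the CAS delay alone, in nanoseconds."""
    return cas_cycles / memory_clock_mhz(mega_transfers) * 1000


if __name__ == "__main__":
    kits = {2400: 15, 3200: 16}  # MT/s -> hypothetical CAS latency
    for rate, cl in kits.items():
        print(f"DDR4-{rate}: clock {memory_clock_mhz(rate):.0f} MHz, "
              f"peak {peak_bandwidth_gbs(rate):.1f} GB/s, "
              f"CL{cl} = {cas_latency_ns(cl, rate):.1f} ns")
    ratio = peak_bandwidth_gbs(3200) / peak_bandwidth_gbs(2400)
    print(f"Peak bandwidth ratio: {ratio:.2f}x")
```

On paper DDR4-3200 offers a third more peak bandwidth than DDR4-2400; how much of that shows up in a given sim depends on how memory-bound the workload is.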