
jdiazsaezwd


I would go with the 32-core Threadripper. You get nearly double the performance. It won't help in every function, but for rendering and big simulations it can.

Also, the license cost is lower when you use Houdini Engine. If you are on Indie, it's a NO BRAINER.

 


On 9/8/2018 at 9:47 PM, Mandrake0 said:

I would go with the 32-core Threadripper. You get nearly double the performance. It won't help in every function, but for rendering and big simulations it can.

Also, the license cost is lower when you use Houdini Engine. If you are on Indie, it's a NO BRAINER.

 

Don't simulations benefit from higher single-core speed? Wouldn't it be better to have fewer cores and higher clock speeds? That's what I have been reading in some FX forums, but I'm fairly new to Houdini.


The 24/32 core parts are odd beasts. They're made of four 8-core modules (the 24-core part has 2 cores disabled per module). Unlike the server Epyc processor, only 2 of those modules have access to main memory (each has access to half of it via dual memory controllers), and the other two modules must hop through one of the mem-attached modules to get at main memory. So they'll scale well for low-bandwidth, high-compute workloads (rendering), but start to suffer in cases where memory bandwidth is important (massive sims). The 16-core part has 2 modules, and each has access to half the memory.
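On Linux, a layout like this can be checked by reading the kernel's NUMA information from sysfs. Here is a minimal Python sketch, assuming the standard /sys/devices/system/node layout (the function names are made up for illustration). On a part built as described, the modules without memory controllers would presumably show up as nodes with no local memory, though that isn't verified here:

```python
from pathlib import Path
import re


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_nodes(root="/sys/devices/system/node"):
    """Return {node_id: {"cpus": [...], "mem_kb": int}} from a sysfs-style tree."""
    nodes = {}
    for node_dir in Path(root).glob("node[0-9]*"):
        node_id = int(node_dir.name[len("node"):])
        cpus = parse_cpulist((node_dir / "cpulist").read_text())
        mem_kb = 0
        meminfo = node_dir / "meminfo"
        if meminfo.exists():
            # Lines look like: "Node 0 MemTotal:       16326488 kB"
            match = re.search(r"MemTotal:\s+(\d+)\s*kB", meminfo.read_text())
            if match:
                mem_kb = int(match.group(1))
        nodes[node_id] = {"cpus": cpus, "mem_kb": mem_kb}
    return nodes
```

Calling read_numa_nodes() with no arguments on a Linux box and printing the result shows which logical CPUs belong to which node and how much memory is local to each one.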

AMD's designed the scheduling such that the modules connected to memory are populated with threads first, then the mem-isolated modules. I'm curious how that works for SMT (fill 16 threads on the mem modules, then 16 threads on the other modules, then populate the 33rd+ thread on the loaded cores; or load up the mem-modules to 32 threads first).
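To make the two possibilities concrete, here is a toy Python sketch of both fill orders on a 32-core, 64-thread part. It only counts where threads would land under each hypothesis. The function names, and the assumption that SMT siblings fill the mem-attached modules first in the first policy, are mine and not a description of what AMD's or the OS scheduler actually does:

```python
MEM_CORES = 16       # cores on the two memory-attached modules
ISOLATED_CORES = 16  # cores on the two modules without local memory


def place_cores_first(n_threads):
    """Hypothesis A: one thread per core everywhere, then SMT siblings.

    Supports up to 64 threads; anything beyond that is ignored.
    """
    remaining = n_threads
    mem = min(remaining, MEM_CORES)
    remaining -= mem
    iso = min(remaining, ISOLATED_CORES)
    remaining -= iso
    # 33rd thread onward doubles up on already loaded cores (assumed mem first).
    mem_smt = min(remaining, MEM_CORES)
    remaining -= mem_smt
    iso_smt = min(remaining, ISOLATED_CORES)
    return {"mem": mem + mem_smt, "isolated": iso + iso_smt}


def place_mem_modules_first(n_threads):
    """Hypothesis B: load the mem-attached modules to 32 threads first."""
    mem = min(n_threads, 2 * MEM_CORES)
    iso = min(n_threads - mem, 2 * ISOLATED_CORES)
    return {"mem": mem, "isolated": iso}


if __name__ == "__main__":
    for n in (16, 24, 32, 40, 64):
        print(n, place_cores_first(n), place_mem_modules_first(n))
```

The two policies agree at 16 and 64 threads and differ in between, which is exactly the range where the answer would matter for a job that doesn't use every thread.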

I'd be more tempted to go for the 16 core version (2950X) myself. Thread efficiency starts to drop off at high core counts as well, so you really don't want to be losing even more performance in your top 16 cores. The memory bandwidth issue would also make me think twice about launching multiple processes using a lower thread count too.
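One common way to reason about the efficiency drop-off is Amdahl's law. A rough Python sketch follows; note that this is a textbook model swapped in for illustration, not a measurement of Houdini, and the 95% parallel fraction in the example is a made-up number:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup over one core when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def efficiency(parallel_fraction, cores):
    """Speedup per core; 1.0 would mean perfect scaling."""
    return amdahl_speedup(parallel_fraction, cores) / cores


if __name__ == "__main__":
    p = 0.95  # hypothetical: 95% of the runtime scales with cores
    for cores in (8, 16, 32):
        print(cores, round(amdahl_speedup(p, cores), 2), round(efficiency(p, cores), 2))
```

The model ignores memory bandwidth entirely, so on the 24/32 core parts the real curve for bandwidth-heavy sims could be worse than what this predicts.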

Pretty in-depth analysis here: https://www.anandtech.com/show/13124/the-amd-threadripper-2990wx-and-2950x-review


2 hours ago, malexander said:

The 24/32 core parts are odd beasts. They're made of four 8-core modules (the 24-core part has 2 cores disabled per module). Unlike the server Epyc processor, only 2 of those modules have access to main memory (each has access to half of it via dual memory controllers), and the other two modules must hop through one of the mem-attached modules to get at main memory. So they'll scale well for low-bandwidth, high-compute workloads (rendering), but start to suffer in cases where memory bandwidth is important (massive sims). The 16-core part has 2 modules, and each has access to half the memory.

AMD's designed the scheduling such that the modules connected to memory are populated with threads first, then the mem-isolated modules. I'm curious how that works for SMT (fill 16 threads on the mem modules, then 16 threads on the other modules, then populate the 33rd+ thread on the loaded cores; or load up the mem-modules to 32 threads first).

I'd be more tempted to go for the 16 core version (2950X) myself. Thread efficiency starts to drop off at high core counts as well, so you really don't want to be losing even more performance in your top 16 cores. The memory bandwidth issue would also make me think twice about launching multiple processes using a lower thread count too.

Pretty in-depth analysis here: https://www.anandtech.com/show/13124/the-amd-threadripper-2990wx-and-2950x-review

Thank you for all the in-depth info, Mark. I didn't know any of this about memory management. I will go with an overclocked 2950X. Thanks :)


It makes me wonder how much the data throughput affects the calculation cycle when you run a simulation. I have the feeling the most challenging part will be the correct setup for so many cores.

I just read today that nearly all benchmark tests could be wrong, because there is a bug in the NVIDIA driver that slows things down massively when you have 64 threads or more, especially in games.

https://translate.google.com/translate?sl=de&tl=en&js=y&prev=_t&hl=de&ie=UTF-8&u=https%3A%2F%2Fwww.golem.de%2Fnews%2F32-kern-cpu-threadripper-2990wx-laeuft-mit-radeons-besser-1808-136016.html&edit-text=&act=url

https://translate.google.com/translate?hl=de&sl=de&tl=en&u=https%3A%2F%2Fwww.forum-3dcenter.org%2Fvbulletin%2Fshowpost.php%3Fp%3D11769097%26postcount%3D60

 

The 2990WX is surely a good CPU, but the biggest challenge is the software and the setup. There is also an older article that shows how much the tile size can affect rendering in Blender Cycles:

https://www.blenderguru.com/articles/4-easy-ways-to-speed-up-cycles (under: 3. Change the Tile Size)

 

 

 


  • 1 month later...

The Infinity Fabric has lower latency than socket-to-socket links. That said, it's still very sensitive to memory timings and speeds. Unlike Intel processors, the Ryzen processors must have the timings dialed in, and they respond better to higher memory frequencies (the less memory you have, the faster it needs to be). The 32-core Threadripper can only be run in NUMA mode, which Windows is absolutely terrible at handling. Slow, improperly timed memory + bad NUMA + bad scheduling = performance regressions with a 32-core processor in Windows. Slow, improperly timed memory will also hurt your performance in Linux, but at least you will have superior NUMA handling and scheduling.
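For a rough sense of how memory speed translates into bandwidth, the theoretical peak can be worked out from the transfer rate, the 64-bit (8-byte) channel width, and the channel count. Here is a small Python sketch, assuming Threadripper's quad-channel configuration. DDR4-2400 and DDR4-3200 are only example speeds, and the function names are made up for illustration:

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels=4, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for DDR memory.

    transfer_rate_mts: rated speed in MT/s (e.g. 3200 for DDR4-3200)
    channels: populated memory channels (4 assumed for Threadripper)
    bus_bytes: bytes per transfer per channel (64-bit bus = 8 bytes)
    """
    return transfer_rate_mts * 1e6 * bus_bytes * channels / 1e9


def per_core_bandwidth_gbs(transfer_rate_mts, cores, channels=4):
    """Peak bandwidth divided evenly across cores, as a crude upper bound."""
    return peak_bandwidth_gbs(transfer_rate_mts, channels) / cores


if __name__ == "__main__":
    for speed in (2400, 3200):
        total = peak_bandwidth_gbs(speed)
        print(speed, round(total, 1), round(per_core_bandwidth_gbs(speed, 32), 2))
```

This only covers peak bandwidth. It says nothing about timings or latency, which matter as well, and real sustained throughput will be lower than these numbers.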

 

 

 


  • 4 weeks later...
On 11/10/2018 at 1:09 AM, AaronAb said:

The Infinity Fabric has lower latency than socket-to-socket links. That said, it's still very sensitive to memory timings and speeds. Unlike Intel processors, the Ryzen processors must have the timings dialed in, and they respond better to higher memory frequencies (the less memory you have, the faster it needs to be). The 32-core Threadripper can only be run in NUMA mode, which Windows is absolutely terrible at handling. Slow, improperly timed memory + bad NUMA + bad scheduling = performance regressions with a 32-core processor in Windows. Slow, improperly timed memory will also hurt your performance in Linux, but at least you will have superior NUMA handling and scheduling.

 

 

Would that mean the difference in RAM speed between DDR4-2400 and DDR4-3200 would be significant?
