
Multithreading & GPGPU


malexander


There's also a more concise article at ArsTechnica, for those with limited time on their hands.

The first version will apparently have its own memory pool, and not have access to main memory, which makes it a lot like the Cell's SPUs in my eyes. Also, I'm curious to see what sort of data alignment restrictions they have for the vectors - a 16-element vector of 32-bit floats is 512 bits (64 bytes), and hopefully that's not the required alignment (alignment restrictions are probably the single most difficult hurdle when adapting algorithms to vector units). I'm also curious how they plan to keep that super-wide vector unit busy, or whether it falls on the developers to do so.
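As a small sketch of why the alignment part is a headache: if a hypothetical 16-wide unit did require 64-byte (512-bit) aligned buffers, every allocation feeding it would have to be over-aligned and padded. The 64-byte figure here is just my assumption for illustration, nothing confirmed about the actual hardware.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

int main()
{
    const std::size_t kAlign = 64;     // assumed requirement: 512 bits = 64 bytes
    const std::size_t kCount = 1024;   // floats to process; 1024*4 is a multiple of 64

    // Over-aligned allocation: a plain malloc() or new float[] only guarantees
    // alignof(max_align_t), which would not satisfy a 64-byte requirement.
    float *data = static_cast<float *>(std::aligned_alloc(kAlign, kCount * sizeof(float)));
    if (!data)
        return 1;

    std::printf("buffer %p aligned to 64 bytes: %s\n",
                static_cast<void *>(data),
                (reinterpret_cast<std::uintptr_t>(data) % kAlign == 0) ? "yes" : "no");

    std::free(data);
    return 0;
}
```

Existing algorithms rarely get to choose where their input buffers live, which is why retrofitting this kind of constraint is usually the painful part.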

And finally, memory bandwidth to & from the unit is the last major detail I'd like to know about. I was hoping for something a little better than PCI-Express speeds, but perhaps that will evolve as well.


  • 2 months later...

The Nvision 2008 conference had some interesting sessions on applications of their GPUs to Image Processing, among other things. I wouldn't be surprised if a lot of those same algorithms are used in the new Photoshop CS4.

With CPUs moving into truly multicore territory, and GPUs becoming more formidable, it becomes an interesting problem to balance the work between them -- especially when there is such a large range between entry-level and high-end products on both sides. The fact that you can pair a cheap CPU with a powerful GPU, and vice versa, makes it important that the application choose the proper unit for a job. Ideally, you have the CPU doing something while the GPU works on something else, though that isn't always possible. Optimization just keeps getting more fun by the day :)
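A toy sketch of that balancing decision is below. The device names, threshold, and capability numbers are all made up for illustration; a real scheduler would benchmark both devices and account for transfer costs rather than use a fixed cutoff.

```cpp
#include <cstddef>
#include <cstdio>

enum class Device { CPU, GPU };

struct Capabilities {
    int    cpuCores;    // e.g. 2 for an entry-level CPU, 16 for a high-end one
    double gpuGflops;   // rough measure of GPU throughput
};

// Hypothetical heuristic: small jobs stay on the CPU (transfer cost dominates);
// large data-parallel jobs go to the GPU if one worth using is present.
Device chooseDevice(std::size_t elements, const Capabilities &caps)
{
    const std::size_t kGpuWorthwhile = 100000;   // assumed break-even point
    if (caps.gpuGflops > 100.0 && elements >= kGpuWorthwhile)
        return Device::GPU;
    return Device::CPU;
}

int main()
{
    Capabilities caps{4, 500.0};   // cheap quad-core paired with a strong GPU
    std::printf("10k points -> %s\n", chooseDevice(10000,   caps) == Device::GPU ? "GPU" : "CPU");
    std::printf("1M points  -> %s\n", chooseDevice(1000000, caps) == Device::GPU ? "GPU" : "CPU");
    return 0;
}
```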

[Edit: fixed link]



Hopefully we will someday see all of this multicore/GPU work in Houdini's architecture, so that contexts such as SOPs or POPs can run completely on all the available cores...

I know this is a big effort and nothing trivial, but the thing I really miss in Houdini is speed. I think this should be the next step, the thing that says "OK, this is Houdini 10."

There are a lot of features, more than in any other 3D application on the market I think, but you can't use the full power of your computer, only in some areas or with some selected tools.

In my opinion the big step for XSI is not ICE itself, it is getting a multicore architecture; this puts them one step ahead of the competition for the future.

So this is the reason why I think the next big step for Houdini should be a complete multicore infrastructure, and I really think that mixing the CPU with GPUs is not the future; the future is to have a very stable multicore foundation in the tool.

In my opinion GPUs will disappear in the future and everything will move into the CPU...

http://archive.gamespy.com/legacy/interviews/sweeney.shtm

cheers


So this is the reason why I think the next big step for Houdini should be a complete multicore infrastructure, and I really think that mixing the CPU with GPUs is not the future; the future is to have a very stable multicore foundation in the tool.

I tend to agree. I see multithreading as the pervasive technology used across an application, while GPGPU might be used to speed up a few specific areas. For applications that fit the GPU shoe, it can be used to great effect (30x or more). For applications that don't, it ends up being a real hassle with a mediocre performance result. GPU processing also works well within a threaded environment because it offloads work from the CPU. For example, cooking a SOP tree with a GPU-accelerated SOP in the mix would essentially free up the CPU thread assigned to that SOP, and with Intel returning to SMT processing (aka "Hyperthreading"), it would be a good fit, as other threads could use the freed-up CPU resource. However, I believe the effort required to port all SOPs to the GPU would just be too great.
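A minimal sketch of that scheduling idea, in plain C++ rather than HDK code (gpuCook() and cpuCook() are hypothetical stand-ins, not real Houdini calls): the GPU-accelerated node is launched asynchronously, so the thread that would otherwise sit waiting on it keeps cooking other nodes.

```cpp
#include <chrono>
#include <cstdio>
#include <future>
#include <thread>

void gpuCook(int node)   // pretend this enqueues GPU work and waits for the result
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    std::printf("node %d cooked on GPU\n", node);
}

void cpuCook(int node)   // ordinary CPU-side cook
{
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    std::printf("node %d cooked on CPU\n", node);
}

int main()
{
    // Kick off the GPU-accelerated node and immediately return to CPU work.
    std::future<void> gpuNode = std::async(std::launch::async, gpuCook, 0);

    for (int node = 1; node <= 4; ++node)
        cpuCook(node);          // this thread stays busy with the rest of the tree

    gpuNode.wait();             // join before using the GPU node's output
    return 0;
}
```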

I know this is a great effort and nothing trivial, but really the thing that I miss in Houdini is speed, I think this should be the next step

We are keenly aware of this, don't worry :)



Thanks for the reply, Mark, good to know. Just thinking of being able to use all the cores in more areas of Houdini is so sweet. Keep up the good work.


Now that PhysX runs on top of CUDA, maybe someone would like to do a PhysX DOP solver? :) (along the lines of the ODE DOP solver...)

I'd guess that SESI might be waiting for a manufacturer-agnostic solution (OpenCL?)

eetu.


  • 11 months later...

It seems that Nvidia's newest GPU, the GT300 (codenamed Fermi), was designed with improvements almost exclusively targeted towards GPGPU. No word on performance yet, though the new cards are supposedly shipping in late Q4 2009 (November?).

They've also recently released OpenCL drivers and dev kits for anyone who's interested.

And you can view Nvidia's webcasts of their 2009 GPU conference, if you're interested (though they are quite long, ~1 hour each).

Together with the recent release of the ATI 5870, and the news that AMD/ATI has OpenCL drivers for both x86 CPUs and their GPUs, this could bring about some very interesting new applications.
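As a rough illustration of why that's interesting, here's a minimal OpenCL host-side sketch (error handling mostly omitted, and the build setup is an assumption, e.g. linking against -lOpenCL) that just enumerates platforms and devices. With drivers like AMD's, the same code path can discover both the x86 CPU and the GPU.

```cpp
#include <CL/cl.h>
#include <cstdio>

int main()
{
    cl_uint numPlatforms = 0;
    clGetPlatformIDs(0, nullptr, &numPlatforms);
    if (numPlatforms == 0)
        return 0;

    cl_platform_id platforms[8];
    clGetPlatformIDs(numPlatforms < 8 ? numPlatforms : 8, platforms, nullptr);

    for (cl_uint p = 0; p < numPlatforms && p < 8; ++p)
    {
        char name[256] = {0};
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(name), name, nullptr);
        std::printf("platform: %s\n", name);

        // CL_DEVICE_TYPE_ALL picks up both CPU and GPU devices if the
        // vendor's driver exposes them.
        cl_device_id devices[8];
        cl_uint numDevices = 0;
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8, devices, &numDevices) == CL_SUCCESS)
        {
            for (cl_uint d = 0; d < numDevices; ++d)
            {
                char devName[256] = {0};
                clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(devName), devName, nullptr);
                std::printf("  device: %s\n", devName);
            }
        }
    }
    return 0;
}
```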


  • 2 months later...

The graphics chip has been officially pushed back, but the Larrabee project continues... It didn't look good for the initial version after an underwhelming demo of Quake Wars (showcasing raytracing) and the departure of Pat Gelsinger, who championed the project.

I thought it was a strange pairing, the complex x86 instruction set and a massively parallel processor. x86 was supposed to allow code to run on either the CPU or the GPU, but since the 16-wide vector unit in the Larrabee cores required its own new instructions, and the cores lacked SSE, I don't see how this would work in practice. My money's still on Fermi, which appears to have a very well thought out architecture :)


Mmmm, this GPU/multithreading world is changing so fast.

I hope that at least OpenCL (or something similar, though I don't know of anything else like it) will become the standard for multithreaded development, so that multithreaded development becomes easier and, even more importantly, more cross-platform.

