
Houdini and Infiniband


rzh0013


Hello all, 

I was wondering if anyone has any experience working with Infiniband and Houdini. I have recently purchased two Voltaire HCA700ex2-q Infiniband cards (40Gb/s) and have them connected via a Mellanox 40Gb-capable QSFP+ fiber cable. I've already upgraded the firmware on both cards to the newest available version, both cards can see each other, and I am able to set up shared drives and transfer files between them in Windows 7 Professional x64. (I will be trying this next in CentOS 7.) My question is: how do I get the two computers to see each other in HQueue so that I can use my faster Infiniband connection rather than my much slower gigabit Ethernet connection, which is getting fully saturated during distributed fluid sims? Do I need to edit the client information in some way to use the IP from the Infiniband card (the cards are capable of 10Gb/s IPoIB)?

 

Thanks,

 

Ryan


Hey Ryan,

I would move your installation to Windows 10, because it has SMB3 rather than the SMB2 on Windows 7; transfers are much faster and I am sure you want FASTER!!! Anyway, isn't 40Gb a bit overkill unless you have PCIe NVMe SSDs? (see link to PCIe NVMe). If you don't have NVMe SSDs and you still want to go fast, try installing a RAM disk (link).
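
If you do move to Windows 10, a quick way to confirm which SMB dialect a share is actually negotiating is PowerShell, run while a copy to the share is open (standard cmdlet on Windows 8 / Server 2012 and later; just a sanity check):

PS C:\> Get-SmbConnection | Select-Object ServerName, ShareName, Dialect

The Dialect column should read 3.x once both ends are on an SMB3-capable OS.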

 

To get HQueue to see your other machine, you have to edit your hosts file with the IP of the 40Gb card rather than the 1Gb card. So if the IP of the 40Gb connection on the render machine is 10.10.10.2, you would insert the entry shown below into your workstation's hosts file, and vice versa in the render machine's hosts file.

 

workstation hosts file

10.10.10.2        myrendermachine

 

render machine hosts file

10.10.10.1       myworkstationmachine
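
Once both hosts files are in place, a quick sanity check from the workstation is:

ping myrendermachine

and make sure the address that comes back is the 10.10.10.x one on the Infiniband card, not the gigabit NIC's address. Do the same in the other direction from the render machine.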

When the hosts files are done, you should edit the HQueue client and server .ini files. On the client, edit your hqnode.ini file to something like below.

 

server = myrendermachine

port   = 5000

sharedNetwork.mount = \\Nas_or_sharedstoragehost\shared_drive

 

On the server, you have to edit hqserver.ini to point it at the required computer name/network shares...
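
As a rough guide only (the exact key names depend on your HQueue version, so double-check them against the comments in the hqserver.ini that ships with your install), the shared-storage part ends up looking something like:

hqserver.sharedNetwork.host = Nas_or_sharedstoragehost
hqserver.sharedNetwork.path.windows = //Nas_or_sharedstoragehost/shared_drive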

 

I hope this helps in getting HQueue working with the 40Gb connection.

 

 

Let us know how it goes...

 

Albin Ho

 

 

 

Here are some links on getting the best out of your connection (the same rules apply to 10Gb connections):

http://www.cinevate.com/blog/confessions-of-a-10-gbe-network-newbie-part-1-basics/

 


Thanks Albin, 

I actually did figure out the hosts thing a little while back and got it up and running. I meant to update the post but haven't had time, as I'm currently setting up a similar system at work to handle our Octane render nodes. The reason I need the 40Gb pipes is to handle the data going between the GPUs.

 

At work we're getting some PCIe NVMe drives in soon, and I plan on upgrading my home workstations as well sometime in the next month or two. I've been keeping an eye out for the Seagate Nytro-branded drive that's supposedly their response to the HP and Dell PCIe 3.0 x16 drives that are capable of 10GB/s.

 

As far as RAM disks go though, I haven't had a lot of success getting a good transfer rate in Win7, but that could be some sort of weird bottleneck. In fact I'm seeing better transfer and read speeds between my two SSDs, which strikes me as odd. I think the RAM disk was topping out at 200MB/s on writes, whereas the SSDs were hitting 700MB/s sustained.

 

I'll update this thread with my findings from time to time. I'd love to chat with you about your experiences with this if you have the time. 

 

Cheers!

 

Ryan


Hi Ryan,

As I said before, Windows 7 only supports SMB2 whereas Windows 8/10 supports SMB3, which translates to a bottleneck for Windows 7 on 10Gb+ connections.

Also, with a 40Gb pipe you should be reaching the limits of the SSDs themselves: most likely 500MB/s+ reads and 300-450MB/s writes with Samsung 850 EVOs; results with other SSDs may vary.

You didn't give much detail about the hardware in the systems, and that makes a big difference in determining the bottlenecks: CPU, RAM, graphics cards (how many?), motherboard, etc.

You may find that Intel NICs have much better performance than other brands, simply because of better drivers/utilities (even fake/generic Intel NICs, see link).

Also, some further reading...

http://www.cinevate.com/blog/confessions-of-a-10-gbe-newbie-part-6-breaking-the-10gb-data-barrier/

 

HTH

 

Albin Ho ;-)


Hello Albin,

 

I'll have to see about getting some OEM licenses for Windows 10 then to check out SMB3. I'm wondering if this could also be enabled in CentOS 7. 

The plan is to get several PCIe NVMe drives in RAID 0 and have them be the drive that the project files get copied to when a job is submitted. I'm currently looking at the Intel 750, but am open to suggestions.

On my test machines I have on either end: 

Machine 1: Intel 4930K, 64GB Corsair RAM (can't remember the speed), single OCZ 480GB SSD, GTX 770 Classified, Quadro K5000, MSI GD45-A (model might not be right, but it's a GD45)

Machine 2: dual-socket HP Z600 (each socket has a 4-core Xeon), 24GB RAM, a pair of 240GB SSDs in RAID 0 (one is a Seagate, the other a Kingston), Quadro K5000, whatever the standard HP motherboard is.

Both machines also have a Mellanox ConnectX-2 QDR 40Gb/s Infiniband card (also capable of 10Gb/s IPoIB), and both cards have been flashed with the latest available firmware. They are directly linked using a Mellanox 15m QSFP+ fiber optic cable.

Thanks for the links to the series; there's a lot of good information there for future projects as well.

 

Cheers!

Ryan


Hi Ryan,

Yes, CentOS 7 does support SMB3. You will need to install the Mellanox Linux drivers, and if there are options in the driver, enable the following (see the example after the list):

  • jumbo frames
  • receive side scaling
  • maximum number of processors
  • maximum number of queues
  • maximum send & receive buffers
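
As a rough sketch of what that tuning can look like on CentOS 7 (the interface name ib0 and the exact values are assumptions, and whether ethtool accepts them depends on the Mellanox driver):

# put the IPoIB interface into connected mode and raise the MTU (ib0 assumed)
echo connected > /sys/class/net/ib0/mode
ip link set ib0 mtu 65520

# queue and ring-buffer tuning, where the driver supports it
ethtool -L ib0 combined 8
ethtool -G ib0 rx 8192 tx 8192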

Use the nttcp tool to check transfer speeds (link, Linux version).
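
If nttcp isn't handy, iperf3 does the same kind of raw-throughput test; something along these lines, using the addresses from the hosts-file example above (-P 4 runs four parallel streams):

on the render machine:  iperf3 -s
on the workstation:     iperf3 -c 10.10.10.2 -P 4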

You may find that transfers of small files (i.e. rendered files) are much slower than transfers of multi-gigabyte files.

Also, until you have NVMe SSDs you are unlikely to ever saturate that 40Gb/s connection... The Intel 750 is good, but the Samsung 950 Pro NVMe is most likely much faster...

 

HTH

 

Albin Ho...

 


  • 1 year later...

Hello Albin,

Wow, I can't believe it's been almost 2 years since we had this discussion. Well, some things have changed: I now have a networked NVMe RAID 0 array of two Samsung 960s, and my secondary machine has been upgraded to a quad-socket Dell R820, which is where the NVMe drives are housed. Testing with CrystalDiskMark, my remote connection from my main PC is getting 2802MB/s reads and 3044MB/s writes. Locally on the R820 I'm getting 5244MB/s and 3785MB/s respectively. I should note that this is on Windows 10 for the desktop and Server 2012 R2 for the R820. So while the speeds are great, I would like to get to the point where it's as if the disks were local. To that end, I've seen some information about NVMe over Fabrics (NVMe-oF), so I will be looking into it. Have you had any experience with this?
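
From what I've read so far, the initiator side over RDMA on Linux with nvme-cli looks roughly like this (the address, port and subsystem name are just placeholders, and the target end has to be something NVMe-oF capable, which Server 2012 R2 is not):

modprobe nvme-rdma
nvme discover -t rdma -a 10.10.10.2 -s 4420
nvme connect -t rdma -a 10.10.10.2 -s 4420 -n nqn.2018-01.example:nvme-target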

Cheers!

Ryan

