
HQueue ... how to?


Macha


I am trying to build a small renderfarm and to test it I use 2 machines (main=windows 7 and slave=xp). I have trouble getting it to work with HQueue. I am on support but they don't seem to be able to help me, so I'll ask some basic questions here:

My setup:

I installed Houdini and HQserver on both machines. The HQserver service is running on both machines. Python 2.6 is installed on both machines.

I added my (windows 7) machine to the clients list.

I want the shared network drive to be on my w7 machine.

Now:

I think my xp machine needs to know about where the shared drive is? So, I edit the hqserver.sharedNetwork.path.windows variable in the hqserver.ini file?

Is that right or do I misunderstand how this is supposed to work?
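
Before editing the file, it can help to see what the setting currently holds. The snippet below is only a sketch: it scans ini-style text for keys that start with `hqserver.sharedNetwork.path.` and makes no assumption about which section they live in. The sample content, including the host name `w7machine`, is invented for illustration and is not taken from a real hqserver.ini.

```python
def shared_network_paths(ini_text):
    """Return {suffix: value} for every hqserver.sharedNetwork.path.* line."""
    prefix = "hqserver.sharedNetwork.path."
    paths = {}
    for raw_line in ini_text.splitlines():
        line = raw_line.strip()
        # Comments, section headers and unrelated keys do not start with the prefix.
        if not line.startswith(prefix) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        paths[key.strip()[len(prefix):]] = value.strip()
    return paths


# Hypothetical excerpt; the share name is made up.
SAMPLE = r"""
# hqserver.ini (invented excerpt)
hqserver.sharedNetwork.path.windows = \\w7machine\hq
"""

print(shared_network_paths(SAMPLE))
```

To inspect a real file, read it with `open(path).read()` and pass the text to `shared_network_paths`.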


Hey Marc,

At first glance, you should run hqserver on only one machine in your network (W7 in your case, where the shared folder is). The server is the central place where the render queue database is stored.

Python 2.6 needs to be installed before you install anything else.

Removing HQserver completely and starting with a clean install works best for me at the moment. Don't upgrade; I had some weird problems afterwards.

To start with, stick with the default shared drive settings. During my second installation I needed to map the shared network drive manually, but that's OK.

On Windows, in the HQ ROP, instead of using HQROOT/houdini_distros/hfs.HQCLIENTARCH (I removed the dollar signs, since the editor was cutting off this post), try pointing to your local Houdini folder. Your farm is exclusively Windows, so the paths should be the same on every machine. Personally I try to avoid the spaces in "Program Files" and install all sidefx stuff in C:\sidefx. You can of course point to the shared network drive instead, e.g. h:\houdini_distros\hfs.windows-x86_64, but this is meant to be slower due to the additional network traffic. So try with local paths first and move on if that works.

I recommend starting with version 11.0.504. Earlier versions had a nasty rendering progress bug which slowed down each bucket a lot! Unfortunately Amazon Cloud still sticks with 11.0.426, so expect higher charges :P

Debugging suggestions:

At least at the start, remember to disable all firewalls, just to check that things work.

Before rendering, check whether you can access the HQ monitor from a browser.

Locally you should be able to access it at: localhost:5000

From a client machine use either machineIP:5000 or hostName:5000

If that doesn't work there is a network problem (probably a firewall)
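
The same reachability check can be scripted so it is easy to repeat from every client. This is a minimal sketch using only the Python standard library: it tests whether a TCP connection to port 5000 can be opened, which is a weaker check than loading the monitor page in a browser. The function name is made up for this example, and it is written for a current Python 3 rather than the Python 2.6 that HQueue itself uses.

```python
import socket


def hq_monitor_reachable(host, port=5000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

On a client, call it with the server's IP or host name, e.g. `hq_monitor_reachable("192.168.1.13")`. A `False` result with the server running points at the network or a firewall rather than at HQueue.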

The client process is not a service at the moment, so after a computer restart it needs to be started manually.

I think that is all I can think of at the moment. It took me quite some time to set it up on my small farm, but I can confirm it works.

If you have any additional questions, PM me or just post in this thread and I'll try to help.

Cheers,

Kuba


Hey Kubabuk, thanks, I did more or less what you suggested and it kinda works. Well, it seems I was able to render from my own machine (when added as a client; you can see it as job 63 in the attached image), but now neither it nor the real client (the XP machine) does anything except "waiting for machine".

D*** this is so frustrating. Truly Beta. Grrrrrrr! ℑ**q℣wrt⅍.|ゑtzz/*

post-4013-128747365032_thumb.jpg

Edited by Macha

Ah I see, I forgot about the following. Since it is Windows, you have to add the client manually and copy the client startup and configuration scripts from the server machine (W7). Follow the instructions in the Windows section:

http://www.sidefx.com/docs/hqueue11.0/help/gettingstarted/addclients/manual.html

You can also double-check where you are outputting IFDs with the HQueue ROP. That should be a network path, otherwise only W7 sees it.

Also check the "Assign to" parameter on the Advanced tab of the HQueue ROP. It should be the default, "Any client".
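
A rough way to sanity-check an IFD output path before submitting is sketched below. It only recognises UNC-style paths (those starting with two slashes or two backslashes); a mapped drive letter such as h:\ also points at the network but cannot be told apart from a local drive by looking at the string, so treat a `False` result as "check by hand". The function name is invented for this example.

```python
def looks_like_unc_path(path):
    """Return True if the path starts with a double slash or double backslash."""
    # Treat backslashes and forward slashes alike, as Windows does.
    normalized = path.replace("\\", "/")
    return normalized.startswith("//")


print(looks_like_unc_path("//192.168.1.13/x/Houduni/testHoudini/renderme.hip"))
print(looks_like_unc_path("C:/sidefx/ifds/frame.ifd"))
```

The first path is the HIP location that appears later in this thread; the second is a made-up local path.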

Cheers,

Kuba


OK, now I've got 3 clients, including myself, and they send heartbeats. However, only one of them works. The others fail.

Any ideas why that could be?

Job 26 says:

Name:  	Render -> HIP: //192.168.1.13/x/Houduni/testHoudini/renderme.hip ROP: /out/mantra1
Id: 	26
Status: 	failed
Overall Progress: 	0%
ETA: 	0s
Properties: 	More...
Description: 	None
Tries left: 	0
Priority: 	5
Min Hosts: 	1
Max Hosts: 	1
Tags: 	
Queue time: 	October 19, 2010 05:57:18 PM
Runnable time: 	
Command start time: 	
Command end time: 	
Start time: 	October 19, 2010 05:57:19 PM
End time: 	October 19, 2010 05:57:21 PM
Time to complete: 	2s
Time in queue: 	0s
Progress: 	Unknown
Requirements: 	
hostname 	any 	aladdin
Environment: 	
HQCOMMANDS	{"hythonCommandsLinux": "export HOUDINI_PYTHON_VERSION=2.6 && export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && cd $HFS && source ./houdini_setup && hython -u", "pythonCommandsMacosx": "export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && $HFS/python/bin/python2.6-64", "pythonCommandsLinux": "export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && $HFS/python/bin/python2.6", "pythonCommandsWindows": "(set HFS=C:\\PROGRA~1\\SIDEEF~1\\HOUDIN~1.504) && !HFS!\\python26\\python2.6.exe", "mantraCommandsLinux": "export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && cd $HFS && source ./houdini_setup && $HFS/python/bin/python2.6 $HFS/houdini/scripts/hqueue/hq_mantra.py", "mantraCommandsMacosx": "export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && cd $HFS && source ./houdini_setup && $HFS/python/bin/python2.6-64 $HFS/houdini/scripts/hqueue/hq_mantra.py", "hythonCommandsMacosx": "export HOUDINI_PYTHON_VERSION=2.6 && export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && cd $HFS && source ./houdini_setup && hython -u", "hythonCommandsWindows": "(set HOUDINI_PYTHON_VERSION=2.6) && (set HFS=C:\\PROGRA~1\\SIDEEF~1\\HOUDIN~1.504) && (set PATH=C:\\PROGRA~1\\SIDEEF~1\\HOUDIN~1.504\\bin;!PATH!) && !HFS!\\bin\\hython -u", "mantraCommandsWindows": "(set HFS=C:\\PROGRA~1\\SIDEEF~1\\HOUDIN~1.504) && !HFS!\\python26\\python2.6.exe !HFS!\\houdini\\scripts\\hqueue\\hq_mantra.py"}
Log: 	Less...
October 19, 2010 05:57:21 PM 	job succeeded, but setting status to failed since child job(s) failed
October 19, 2010 05:57:21 PM 	setting status to failed
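
The HQCOMMANDS value above is a JSON object with one command line per task type and platform, so it can be picked apart with a few lines of Python to see exactly what a Windows client is asked to run. The sketch below copies two of the nine entries unchanged; the helper name `commands_for` is made up for this example. Note that the Linux entry carries the same Windows-style HFS path as the Windows one.

```python
import json

# Two of the nine entries from the HQCOMMANDS value above, copied unchanged.
HQCOMMANDS_EXCERPT = r'''{
  "pythonCommandsWindows": "(set HFS=C:\\PROGRA~1\\SIDEEF~1\\HOUDIN~1.504) && !HFS!\\python26\\python2.6.exe",
  "pythonCommandsLinux": "export HFS=\"C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504\" && $HFS/python/bin/python2.6"
}'''


def commands_for(hqcommands_json, platform):
    """Return the entries whose key ends with the given platform name."""
    commands = json.loads(hqcommands_json)
    return {key: value for key, value in commands.items() if key.endswith(platform)}


for key, command in commands_for(HQCOMMANDS_EXCERPT, "Windows").items():
    print(key, "->", command)
```

Running the printed command by hand in a console on a failing client is one way to see the same error outside HQueue.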

while its childjob 27 complains about:

Warning: 192.168.1.15 has keys which expire shortly
  License: 139xxxx3 [Houdini-Master] in 14 days
  License: 46cxxxx1 [Render] in 14 days
No licenses could be found to run this application.
	Please check for a valid license server host

post-4013-128747893788_thumb.jpg

post-4013-128747905526_thumb.jpg

Edited by Macha

How many mantra licenses does hkey show? There need to be enough for each rendering core you want to use. I don't know how it is at the moment, but a few years ago Houdini (Escape, for example) came with only two mantra tokens, so any other machine used for network rendering complained about a missing license. This might be it, although I'm not 100% sure. Regarding licenses it's better to speak directly to SESI.

Are you rendering IFDs or rendering with hython? That determines whether a mantra or an hbatch license is taken.

Edited by kubabuk

Marc, do you have another Python version (2.5 or earlier) on any of the machines? If so, remove all builds and install 2.6 again. I recall I had similar problems at the start: everything seemed to work, but clients never picked up the submitted jobs. In my case Python 2.5.4 was the primary Python version recognized by the OS, which led to many silent problems during the HQserver installation.
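
To find out which Python the OS treats as primary, run a small check with whatever `python` resolves to on each machine. The sketch below compares only the major and minor version against 2.6; the function name is invented for this example, and the code avoids newer syntax so it should also run on old interpreters.

```python
import sys


def is_expected_python(version_info=None, expected=(2, 6)):
    """Return True if the major.minor version matches the expected pair."""
    if version_info is None:
        # Default to the interpreter that is running this script.
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(expected)


print(sys.version)
print(is_expected_python())
```

If the second line prints `False` on a client, the interpreter found first on the PATH is not 2.6, which matches the situation described above.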

Also (this was a while ago, when 11.0.469 was the production build, so things may have changed since), SESI recommended splitting the installation into two steps: install Houdini and do all of the licensing first, then restart the Houdini installer and proceed with the HQServer installation (uncheck everything else).


Did you try rendering directly from the UI with mantra -H from the W7 machine? (I guess this is the one where you run HMaster and HQserver and have the shared network folder.) Check whether you get any error messages in the console. This will show whether the mantra licenses and the network connection are properly set up on each node.

If that works, the problem is on the HQserver side and I recommend wiping out HQserver and the clients and installing from scratch.

Did you modify hqserver.ini? Can you share it? (Although I don't think the problem lies there.)


I'm not on the machine now, but I can tell you that I rendered from an hqrender node inside Houdini. On the Advanced tab I assigned the job to my W7 machine, and on the mantra options tab I assigned the IFD job to W7 and the other clients (to preempt any licence restrictions: I only have one master licence but 30 render licences).

It's the other clients that fail, and they don't fail completely; they seem to fail on the render licences. It appears they can't fetch them or whatever (it says "job started" and they send heartbeats, but a child job fails).

Python 2.6.4 is installed on w7 and client.

Edited by Macha

So from your description it looks like you are instructing hqueue to use all machines to generate IFDs and only W7 to render images. It should be the other way round, or just leave the defaults and hbatch should sort it out itself. Once the first IFD is generated, the other machines (XPs) pick up the renders. I also submitted an RFE so that a multi-core server machine (which generates IFDs) could launch mantra as soon as a new IFD file is created. Right now W7 generates all IFDs first and moves to the rendering phase afterwards, which is a bit of a waste if you are generating many fast-rendering sequences.


OK, I got one client working. I am now trying to add more via remote desktop. I want to try the

"Do you want to mount the HQueue shared drive? (yes or no)"

option, but I can't enter the password, or it doesn't understand it, or I'm doing something wrong. What's the correct way to go about this? Do I understand correctly that the client machine can then load the Houdini distribution from my shared folder?

Edited by Macha

This is on the client machine. I want it to read the distro from the shared drive.

But what is it actually asking me for? My localhost name, its name? 'localhost' or localhost, what's the difference? Which machine's password? Every combination has failed so far.

post-4013-128755166815_thumb.jpg


If you are trying to install clients from a remote desktop, it will not work afaik, since SSH support on Windows is not available. http://www.sidefx.com/docs/hqueue11.0/help/installation/networkrequirements.html

"Do you want to mount the HQueue shared drive? (yes or no)"

I would use default settings and if the shared drive is not already mounted, do it manually to minimize frustration.

As a rule of thumb I try to stick with the defaults; it's easier to debug at the moment.


I'm making some slow progress. I am now able to remotely install Houdini and the license server, get the machines started and registered as clients, and they render as well.

The next step would be to figure out this shared drive thing... that still drives me mad.


How can I change $HQROOT? I see "Macha" did it, but how? Also, my jobs always fail with an error at the "Generate IFDs" stage, although I have repeatedly checked that the path is accessible. Is it possible to generate the IFDs first and then hand them to HQueue?


How can I change $HQROOT? I see "Macha" did it, how?

I typed in the path manually like so

C:/PROGRA~1/SIDEEF~1/HOUDIN~1.504

---

My clients become unresponsive after a while. Does anybody know why that happens and how I can revive them without running the python script again? See attached image.

Actually, on some machines I can't re-run the scripts because they complain that they "could not stop existing client", even though none appears to be running.

post-4013-128817151716_thumb.jpg

Edited by Macha

Actually I am beginning to wonder if anybody has this working on Windows. I find HQueue an absolute mess that does nothing but give me headaches and pain at every step I drag it along. SESI support, albeit helpful, also appears to be clueless about it most of the time.

What other solutions do people use to build Houdini renderfarms?

Edited by Macha

Actually I am beginning to wonder if anybody has this working on windows. I find HQueue an absolute mess that does nothing but give me headaches and pain at every step I drag it along. SESI support does also appear to be clueless about it most of the time.

What other solutions do people use to build Houdini renderfarms?

we are considering this product. i've been told it's a 'houdini rendering warhorse' . LOL.

deadline by primefocus

http://software.primefocusworld.com/software/products/deadline/overview/

houdini deadline guide

http://software.primefocusworld.com/software/support/deadline/houdini.php

