HQueue Problems


Hey guys,

I am having a heck of a time getting my farm set up for my school finals. I have three machines that I need to run sliced fluid sims on. Right now I am just trying to get the main workstation to complete an HQueue job.

The HQueue client and server are both installed on this machine on the C: drive. Both services run fine under a separate admin account I created called HQueue. I used the shelf tool to create a sliced sim (a pyro sim in this case), as recommended in the HQueue documentation.

The shared folder, which contains a Houdini install, is on another disk (F:) in the workstation. That folder is shared as hq to all other machines and is mounted on all of them as H:. I have no problems accessing it from any machine.

The server's .ini file has been set up with the server IP of the PC it is running on (the workstation), and these lines have been set:

 

hqserver.sharedNetwork.path.windows = \\KYLE-PC\hq
hqserver.sharedNetwork.mount.windows = H:

 

 

 

Everything else in there is vanilla.
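
To make the relationship between those two settings concrete, here is a minimal sketch of how a path on the H: mount corresponds to a path on the \\KYLE-PC\hq share. The helper name `mount_to_unc` is hypothetical and is not part of HQueue; how HQueue itself translates between the two is not shown here, so this only illustrates the intended mapping.

```python
import ntpath  # Windows path rules, usable on any platform


def mount_to_unc(path, mount="H:", unc=r"\\KYLE-PC\hq"):
    """Rewrite a path on the mounted drive as the equivalent UNC path.

    Hypothetical helper for illustration only; the defaults are the two
    values from the server .ini lines above.
    """
    normalized = path.replace("/", "\\")
    drive, rest = ntpath.splitdrive(normalized)
    if drive.upper() != mount.upper():
        # Not on the shared mount, so there is nothing to translate.
        return normalized
    return unc.rstrip("\\") + rest


# The project file lives under the shared folder, so both spellings
# should point at the same file on disk.
print(mount_to_unc("H:/projects/untitled.hip"))
```

If the drive-letter form fails where the UNC form works (or the other way round), that narrows the problem down to the mapping rather than to the share itself.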

 

 

 

The problem seems to be in writing the slice files to the mounted H: drive. I get this error when I submit the attached Houdini file:

 

    hqlib.callFunctionWithHQParms(hqlib.simulateSlice)
  File "\\KYLE-PC\hq\houdini_distros\hfs.windows-x86_64\houdini\scripts\hqueue\hqlib.py", line 1864, in callFunctionWithHQParms
    return function(**kwargs)
  File "\\KYLE-PC\hq\houdini_distros\hfs.windows-x86_64\houdini\scripts\hqueue\hqlib.py", line 1532, in simulateSlice
    _renderRop(rop)
  File "\\KYLE-PC\hq\houdini_distros\hfs.windows-x86_64\houdini\scripts\hqueue\hqlib.py", line 1869, in _renderRop
    rop.render(*args, **kwargs)
  File "//KYLE-PC/hq/houdini_distros/hfs.windows-x86_64/houdini/python2.7libs\hou.py", line 32411, in render
    return _hou.RopNode_render(*args, **kwargs)
hou.OperationFailed: The attempted operation failed.
Error:       Failed to save output to file "H:/projects/geo/untitled.loadslices.1.bgeo.sc".
Error:       Failed to save output to file "H:/projects/geo/untitled.loadslices.2.bgeo.sc".

 

 

 

I am really not sure why this is happening as I think I have all the relevant permissions.
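
One way to test the permissions assumption directly is to try creating a file in the output folder from the same account that runs the HQueue client service. On Windows, mapped drive letters belong to a user's logon session, so H: may not exist for the HQueue service account even though it works when logged in interactively. The sketch below is only a check and not a fix; `can_write` is a hypothetical helper, and the two candidate paths are taken from the error message and the share name above.

```python
import os
import tempfile


def can_write(directory):
    """Return True if a temporary file can be created in `directory`."""
    if not os.path.isdir(directory):
        # Covers both a missing folder and a drive letter that is not
        # mapped for the account running this script.
        return False
    try:
        # The file is deleted again as soon as the block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Compare the mapped-drive form with the UNC form of the same folder.
    for candidate in (r"H:\projects\geo", r"\\KYLE-PC\hq\projects\geo"):
        print(candidate, can_write(candidate))
```

To be meaningful, this has to be run as the HQueue account (for example from a command prompt started as that user), not as the regular desktop user.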

 

Any suggestions peeps?

 

-Kyle




Here is the diagnostics output too:

Diagnostic Information for Job 75:
==================================
Job Name:                      Simulate -> HIP: untitled.hip ROP: save_slices (Slice 0)
Submitted By:                  Kyle
Job ID:                        75
Parent Job ID(s):              73, 76
Number of Clients Assigned:    1
Job Status:                    failed
Report Generated On:           December 12, 2015 01:52:08 AM

Job Properties:
===============
Description:                None
Tries Left:                 0
Priority:                   5
Minimum Number of Hosts:    1
Maximum Number of Hosts:    1
Tags:                       single
Queue Time:                 December 12, 2015 01:15:04 AM
Runnable Time:              December 12, 2015 01:46:19 AM
Command Start Time:         December 12, 2015 01:50:04 AM
Command End Time:           
Start Time:                 December 12, 2015 01:50:04 AM
End Time:                   December 12, 2015 01:50:18 AM
Time to Complete:           13s
Time in Queue:              35m 00s

Job Environment Variables:
==========================
HQCOMMANDS:
{
    "hythonCommandsLinux": "export HOUDINI_PYTHON_VERSION=2.7 && export HFS=\"$HQROOT/houdini_distros/hfs.$HQCLIENTARCH\" && cd $HFS && source ./houdini_setup && hython -u",
    "pythonCommandsMacosx": "export HFS=\"$HQROOT/houdini_distros/hfs.$HQCLIENTARCH\" && $HFS/Frameworks/Python.framework/Versions/2.7/bin/python",
    "pythonCommandsLinux": "export HFS=\"$HQROOT/houdini_distros/hfs.$HQCLIENTARCH\" && $HFS/python/bin/python2.7",
    "pythonCommandsWindows": "(set HFS=!HQROOT!\\houdini_distros\\hfs.!HQCLIENTARCH!) && \"!HFS!\\python27\\python2.7.exe\"",
    "mantraCommandsLinux": "export HFS=\"$HQROOT/houdini_distros/hfs.$HQCLIENTARCH\" && cd $HFS && source ./houdini_setup && $HFS/python/bin/python2.7 $HFS/houdini/scripts/hqueue/hq_mantra.py",
    "mantraCommandsMacosx": "export HFS=\"$HQROOT/houdini_distros/hfs.$HQCLIENTARCH\" && cd $HFS && source ./houdini_setup && $HFS/Frameworks/Python.framework/Versions/2.7/bin/python $HFS/houdini/scripts/hqueue/hq_mantra.py",
    "hythonCommandsMacosx": "export HOUDINI_PYTHON_VERSION=2.7 && export HFS=\"$HQROOT/houdini_distros/hfs.$HQCLIENTARCH\" && cd $HFS && source ./houdini_setup && hython -u",
    "hythonCommandsWindows": "(set HOUDINI_PYTHON_VERSION=2.7) && (set HFS=!HQROOT!\\houdini_distros\\hfs.!HQCLIENTARCH!) && (set PATH=!HQROOT!\\houdini_distros\\hfs.!HQCLIENTARCH!\\bin;!PATH!) && \"!HFS!\\bin\\hython\" -u",
    "mantraCommandsWindows": "(set HFS=!HQROOT!\\houdini_distros\\hfs.!HQCLIENTARCH!) && \"!HFS!\\python27\\python2.7.exe\" \"!HFS!\\houdini\\scripts\\hqueue\\hq_mantra.py\""
}

HQPARMS:
{
    "controls_node": "/obj/pyro_sim/DISTRIBUTE_pyro_CONTROLS",
    "dirs_to_create": [
        "$HIP/geo"
    ],
    "tracker_port": 54534,
    "hip_file": "$HQROOT/projects/untitled.hip",
    "output_driver": "/obj/distribute_pyro/save_slices",
    "enable_perf_mon": 0,
    "slice_divs": [
        1,
        1,
        1
    ],
    "tracker_host": "KYLE-PC",
    "slice_num": 0,
    "slice_type": "volume"
}

HQHOSTS:
KYLE-PC

Job Conditions and Requirements:
================================
hostname any KYLE-PC

Executed Client Job Commands:
=============================
Windows Command:
(set HOUDINI_PYTHON_VERSION=2.7) && (set HFS=!HQROOT!\houdini_distros\hfs.!HQCLIENTARCH!) && (set PATH=!HQROOT!\houdini_distros\hfs.!HQCLIENTARCH!\bin;!PATH!) && "!HFS!\bin\hython" -u "!HFS!\houdini\scripts\hqueue\hq_sim_slice.py"

Client Machine Specification (KYLE-PC):
=======================================
DNS Name:            KYLE-PC
Client ID:           1
Operating System:    windows
Architecture:        x86_64
Number of CPUs:      24
CPU Speed:           4000.0
Memory:              25156780

Client Machine Configuration File Contents (KYLE-PC):
=====================================================
[main]
server = KYLE-PC
port = 5000
sharedNetwork.mount = \\KYLE-PC\hq

[job_environment]


HQueue Server Configuration File Contents:
==========================================
#
# hqserver - Pylons configuration
#
# The %(here)s variable will be replaced with the parent directory of this file
#
[DEFAULT]
email_to = you@yourdomain.com
smtp_server = localhost
error_email_from = paste@localhost

[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 5000

[app:main]

# The shared network.
hqserver.sharedNetwork.host = KYLE-PC
hqserver.sharedNetwork.path.linux = %(here)s/shared
hqserver.sharedNetwork.path.windows = \\KYLE-PC\hq
hqserver.sharedNetwork.path.macosx = %(here)s/HQShared
hqserver.sharedNetwork.mount.linux = /mnt/hq
hqserver.sharedNetwork.mount.windows = H:
hqserver.sharedNetwork.mount.macosx = /Volumes/HQShared

# Server port number.
hqserver.port = 5000

# Where to save job output
job_logs_dir = %(here)s/job_logs

# Specify the database for SQLAlchemy to use
sqlalchemy.default.url = sqlite:///%(here)s/db/hqserver.db

# This is required if using mysql
sqlalchemy.default.pool_recycle = 3600

# This will force a thread to reuse connections.
sqlalchemy.default.strategy = threadlocal

#########################################################################
# Uncomment these configuration values if you are using a MySQL database.
#########################################################################
# The maximum number of database connections available in the
# connection pool.  If you see "QueuePool limit of size" messages
# in the errors.log, then you should increase the value of pool_size.
# This is typically done for farms with a large number of client machines.
#sqlalchemy.default.pool_size = 30
#sqlalchemy.default.max_overflow = 20

# Where to publish myself in avahi
# hqnode will use this to connect
publish_url = http://hostname.domain.com:5000

# How many minutes before a client is considered inactive
hqserver.activeTimeout = 3

# How many days before jobs are deleted
hqserver.expireJobsDays = 10

# The maximum number of jobs (under the same root parent job) that can fail on
# a single client before a condition is dynamically added to that root parent
# job (and recursively all its children) that excludes the client from ever
# running this job/these jobs again. This value should be a postive integer
# greater than zero. To disable this feature, set this value to zero.
hqserver.maxFailsAllowed = 5

# The priority that the 'upgrade' job gets.
hqserver.upgradePriority = 100

use = egg:hqserver
full_stack = True
cache_dir = %(here)s/data
beaker.session.key = hqserver
beaker.session.secret = somesecret
app_instance_uuid = {fa64a6d1-ae3f-43c1-8141-9c29fdd9d418}

# Logging Setup
[loggers]
keys = root

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
# Change to "level = DEBUG" to see debug messages in the log.
level = INFO
handlers = console

# This handler backs up the log when it reaches 10Mb
# and keeps at most 5 backup copies.
[handler_console]
class = handlers.RotatingFileHandler
args = ("hqserver.log", "a", 10485760, 5)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %B %d, %Y %H:%M:%S

Job Status Log:
===============
December 12, 2015 01:15:04 AM: Assigned to KYLE-PC (master)
December 12, 2015 01:15:10 AM: setting status to running
December 12, 2015 01:15:23 AM: setting status to failed
December 12, 2015 01:18:28 AM: Rescheduling...
December 12, 2015 01:18:28 AM: setting status to runnable
December 12, 2015 01:18:28 AM: Assigned to KYLE-PC (master)
December 12, 2015 01:18:35 AM: setting status to running
December 12, 2015 01:18:47 AM: setting status to failed
December 12, 2015 01:23:18 AM: setting status to runnable
December 12, 2015 01:23:19 AM: Assigned to KYLE-PC (master)
December 12, 2015 01:23:20 AM: setting status to running
December 12, 2015 01:23:33 AM: setting status to failed
December 12, 2015 01:29:44 AM: setting status to runnable
December 12, 2015 01:29:44 AM: Assigned to KYLE-PC (master)
December 12, 2015 01:29:44 AM: setting status to running
December 12, 2015 01:29:57 AM: setting status to failed
December 12, 2015 01:34:17 AM: setting status to runnable
December 12, 2015 01:34:17 AM: Assigned to KYLE-PC (master)
December 12, 2015 01:38:17 AM: setting status to abandoned
December 12, 2015 01:46:19 AM: setting status to runnable
December 12, 2015 01:50:04 AM: Assigned to KYLE-PC (master)
December 12, 2015 01:50:04 AM: setting status to running
December 12, 2015 01:50:18 AM: setting status to failed
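
One detail in the dump above is that the client configuration lists sharedNetwork.mount as \\KYLE-PC\hq, while the server's hqserver.sharedNetwork.mount.windows is H:. Whether the client setting is supposed to hold the drive letter or the UNC path is an assumption that needs checking against the HQueue documentation; the sketch below only makes the difference visible. It uses Python 3's `configparser` on excerpts copied from the two files.

```python
import configparser

# Excerpts copied from the diagnostics above.
CLIENT_INI = r"""
[main]
server = KYLE-PC
port = 5000
sharedNetwork.mount = \\KYLE-PC\hq
"""

SERVER_INI = r"""
[app:main]
hqserver.sharedNetwork.path.windows = \\KYLE-PC\hq
hqserver.sharedNetwork.mount.windows = H:
"""


def read_ini(text):
    """Parse ini text, keeping key case and leaving % sequences alone."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # the default would lower-case the keys
    parser.read_string(text)
    return parser


client = read_ini(CLIENT_INI)
server = read_ini(SERVER_INI)

client_mount = client.get("main", "sharedNetwork.mount")
server_mount = server.get("app:main", "hqserver.sharedNetwork.mount.windows")
server_path = server.get("app:main", "hqserver.sharedNetwork.path.windows")

mounts_match = client_mount == server_mount
print("client mount:", client_mount)
print("server mount:", server_mount)
print("server path: ", server_path)
print("client mount equals server mount:", mounts_match)
```

Disabling interpolation matters when reading the full server file, because it contains sequences such as %(here)s that `configparser` would otherwise try to expand.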

 

 

UPDATE: I just restarted the system to see if it would help, and instead of the usual write error I received this:

0x00000000577CDE78 (0x000000000000002B 0x000000AD63AEF840 0x000000AD453FEEB0 0x0000000000000000), ?thread_sleep_v3@internal@tbb@@YAXAEBVinterval_t@tick_count@2@@Z() + 0x8C8 bytes(s)
0x00000000577CDD2B (0x000000AD45381F90 0x000000AD45381F90 0x000000AD453FEEB0 0x0000000000000000), ?thread_sleep_v3@internal@tbb@@YAXAEBVinterval_t@tick_count@2@@Z() + 0x77B bytes(s)
0x00007FFF29E43FEF (0x00007FFF29EE1DB0 0x0000000000000000 0x0000000000000000 0x0000000000000000), _beginthreadex() + 0x107 bytes(s)
0x00007FFF29E44196 (0x00007FFF29E44094 0x000000AD453FEEB0 0x0000000000000000 0x0000000000000000), _endthreadex() + 0x192 bytes(s)
0x00007FFF36582D92 (0x00007FFF36582D70 0x0000000000000000 0x0000000000000000 0x0000000000000000), BaseThreadInitThunk() + 0x22 bytes(s)
0x00007FFF36C29F64 (0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000), RtlUserThreadStart() + 0x34 bytes(s)

 

After resubmission, it went back to the usual error mentioned above.
 

untitled.hip

Edited by yamanash