
Mantra Memory Usage



Hi

So I wrote a geometry procedural for mantra recently and am at the point where I'm evaluating its memory usage. Some facts about the procedural that might be relevant: it uses a divide-and-conquer mechanism and breaks up the initial large procedural (in terms of its bounding box) into smaller child procedurals. Child procedurals are either further split into more procedurals, or geometry is generated from them, depending on some metric (such as the size of the bounding box of the child procedural).

I render it with micropolygon rendering, DSMs, no raytracing. So mantra should be able to free/destroy any geometry and procedurals that it has rendered and that do not overlap with any tiles that are still to be rendered.

I set mantra's verbosity to 4 and these are the memory statistics it outputs:

Render Time: 16:13.78u 17.01s 3:30.42r
Memory:  874.94 MB of 988.89 MB arena size. VM Size: 3.87 GB
    page rclm : 1264884   flts: 0
    # swaps   : 0
    blocks in : 248        out: 0
    switch ctx: 520043    ictx: 173131
Peak Geometry Objects: 937

I also added a Python tile callback to print out the 'tile:memory' property (which, as I understand it, is the total amount of memory mantra is using when the tile is finished rendering). The value it prints is in line with what mantra outputs at the end; it varies between 800MB and about 1063MB.

Now, the VM size that mantra prints out is quite a lot larger than the arena size it reports. But the virtual memory and the physical memory that a process uses are not the same thing (I am on Linux, so there are a lot of shared library and paging considerations in the reported virtual memory size). Apparently KDE's System Activity app and pmap -d report values closer to the actual physical memory used by the process. In my case this close-to-physical value peaks at 3260MB.

So, finally, my question(s): why does mantra report only 1063MB when the physical memory usage of the mantra process is 3260MB? What does mantra take into account when it reports memory usage? Does it include the memory used by custom-written procedurals as well? Am I missing something with respect to Linux's memory management?


I render it with micropolygon rendering, DSMs, no raytracing. So mantra should be able to free/destroy any geometry and procedurals that it has rendered and that do not overlap with any tiles that are still to be rendered.

From what I've been trying with the regular delay load archive, I've found mantra to be very reluctant to actually throw any procedurals out. In fact it feels like it just doesn't do it. Have you seen anything being thrown away?


From what I've been trying with the regular delay load archive, I've found mantra to be very reluctant to actually throw any procedurals out. In fact it feels like it just doesn't do it. Have you seen anything being thrown away?

From my periodic experiments with trying to force Mantra to free up memory, this only happens in single-threaded mode. Once you turn on multi-threading, nothing gets thrown away... of course, after all the obvious conditions are met (non-overlapping bboxes, no ray tracing, etc.). I reported it as a bug, but unfortunately I haven't met with understanding from the other side :). Not sure why, as it really does seem to behave unexpectedly.


From what I've been trying with the regular delay load archive, I've found mantra to be very reluctant to actually throw any procedurals out. In fact it feels like it just doesn't do it. Have you seen anything being thrown away?

Well, I use some UT_Counter instances. I increment them when a procedural is instantiated (in its constructor) and decrement them when a procedural is destroyed (in its destructor). When mantra exits, these counters print out their peak values and total increments. Generally the peak values are a lot smaller (8-10 times) than the total increments, suggesting that mantra is indeed destroying procedural instances that are no longer needed during the render. I also have a counter that increments when I create a geometry object. I cannot decrement this counter, since I have (as far as I am aware) no way of knowing when mantra destroys a geometry object. With verbose output mantra does print out "Peak Geometry Objects: ###". In my case this peak number is always substantially less (also about 8-10 times) than the total number of created geometry objects from my counter. This is with multi-threaded rendering; I haven't tested single-threaded rendering lately, I should give that a go...

So in short, it does look like mantra destroys procedurals and geometry that it no longer needs, or at least, it believes that it does destroy it :)


My suspicion is that we're running into situations where glibc's allocator won't trim back its previously sbrk()'ed memory. One hint of this is here. So that might be why the "arena" size is a lot smaller than the reported VM Size.

Anyhow, in case you're interested, the technical details for how those stats are calculated in H11 on Linux are:

Current Memory Usage: Sum of the mallinfo::uordblks and mallinfo::hblkhd members returned by mallinfo().

Arena Size: Sum of the mallinfo::arena and mallinfo::hblkhd members returned by mallinfo().

VM Size: Sum of the VmData entries in /proc/<pid>/status. This should fairly accurately match the value reported by top. It is the total virtual memory size of the process's data segment (a.k.a. the heap).


Thanks Edward. Yeah, it appears that the mallinfo integers are indeed 32-bit, so overflow is definitely a problem. Also, mallinfo doesn't support multiple arenas. So I think what we are seeing is just the first arena's stats. Using malloc_stats() I get the following output in a multi-threaded (12 threads) render:

Render Time: 17:43.48u 18.63s 3:53.14r
Memory:  891.37 MB of 1.03 GB arena size. VM Size: 4.06 GB
    page rclm : 1327186   flts: 0
    # swaps   : 0
    blocks in : 0          out: 0
    switch ctx: 597968    ictx: 207289
Peak Geometry Objects: 1013
Arena 0:
system bytes     =  840822784
in use bytes     =  638010304
Arena 1:
system bytes     =  272400384
in use bytes     =   78886336
Arena 2:
system bytes     =  285310976
in use bytes     =   88904928
Arena 3:
system bytes     =  256933888
in use bytes     =   46039824
Arena 4:
system bytes     =  245821440
in use bytes     =   56426128
Arena 5:
system bytes     =  241201152
in use bytes     =   65579664
Arena 6:
system bytes     =  263737344
in use bytes     =   54255760
Arena 7:
system bytes     =  236199936
in use bytes     =   45582736
Arena 8:
system bytes     =  268300288
in use bytes     =   70336544
Arena 9:
system bytes     =  269545472
in use bytes     =   66910624
Arena 10:
system bytes     =  257363968
in use bytes     =   67900320
Arena 11:
system bytes     =  254779392
in use bytes     =   77342320
Total (incl. mmap):
system bytes     = 3891318784
in use bytes     = 1555077248
max mmap regions =       6125
max mmap bytes   =  443310080

and the following for a single threaded render:

Render Time: 15:22.27u 3.04s 15:27.82r
Memory:  633.89 MB of 2.63 GB arena size. VM Size: 2.67 GB
    page rclm : 936574    flts: 0
    # swaps   : 0
    blocks in : 0          out: 0
    switch ctx: 150       ictx: 109348
Peak Geometry Objects: 792
Arena 0:
system bytes     = 2748416000
in use bytes     =  611138560
Total (incl. mmap):
system bytes     = 2828361728
in use bytes     =  691084288
max mmap regions =       3972
max mmap bytes   =  300474368

So we seem to have an arena for each thread. In the single-threaded case the values from mantra/mallinfo appear to be relatively consistent with what we are seeing from malloc_stats(). I would have expected the arena size to overflow though, since it's more than 2GB... The big difference between the 'in use bytes' and 'system bytes' values is a concern though. Then again, I'm not exactly sure what 'system bytes' refers to; I'm guessing it might be memory reserved for the process, where 'in use bytes' is what is actually malloc'ed/new'ed. I suppose that could be related to sbrk()'ed memory not being properly freed, as Edward suggested.

I googled a bit for this 32-bit mallinfo issue. Some people are saying that mallinfo is completely outdated and no longer supported: it won't be fixed for 64-bit and shouldn't be used by new programs. See for instance https://bugzilla.redhat.com/show_bug.cgi?id=173813

After running all of this I came across this: http://udrepper.livejournal.com/20948.html. If you scroll down to the "Information about malloc" section, it discusses mallinfo's problems and its replacement, malloc_info, which outputs malloc stats as XML. I'll look into this a little later.


I submitted a high-priority bug on this a while ago, but if it were easy to fix I'm sure SESI would have fixed it already; it's been a major problem for a long time.

One option, at least on Linux, is to hack a post-"ray_raytrace" command into the IFD that calls command-line (unix) tools to query the memory and print it before the process exits. At least then you have a snapshot of the RAM in use before Mantra quits, but that isn't really the answer either, just slightly more accurate than Mantra's own reporting.

I find I spend a lot of time with "top" running sorted by memory usage when I'm analyzing Mantra's memory usage.

If it's any consolation, Nuke's memory reporting is equally inaccurate :(

Cheers,

Peter B


If you're interested in peak memory, then just look at VM Size. That's why it's there.

Hi Edward,

But that's not actually the case in reality. We see VM reports of 33GB on machines with 8GB of RAM that didn't go into swap.

When I monitor the "top" report live while one of those frames renders, the machine's free RAM doesn't go near that. "top" does report the same VM as what Mantra reports, but it's meaningless when trying to find problem renders...

Cheers,

Peter B

