thefreeman Posted September 28, 2011

Hi

So I wrote a geometry procedural for mantra recently and am at the point where I'm evaluating its memory usage. Some facts about the procedural that might be relevant: it uses a divide-and-conquer mechanism and breaks up the initial large procedural (in terms of its bounding box) into smaller child procedurals. Child procedurals are either further split into more procedurals, or geometry is generated from them, depending on some metric (such as the size of the bounding box of the child procedural).

I render it with micropolygon rendering, DSMs, no raytracing. So mantra should be able to free/destroy any geometry and procedurals that it has rendered and that is not overlapping with any tiles that are still to be rendered.

I set mantra's verbosity to 4 and these are the memory statistics that it outputs:

Render Time: 16:13.78u 17.01s 3:30.42r
Memory: 874.94 MB of 988.89 MB arena size. VM Size: 3.87 GB
page rclm : 1264884 flts: 0
# swaps : 0
blocks in : 248 out: 0
switch ctx: 520043 ictx: 173131
Peak Geometry Objects: 937

I also added a Python tile callback to print out the 'tile:memory' property (which, as I understand it, is the total amount of memory mantra is using when the tile is finished rendering). The value printed by this is in line with what mantra outputs at the end; it varies between 800 MB and about 1063 MB.

Now, the VM size that mantra prints out is quite a lot larger than the arena size that mantra reports. But the virtual memory and the physical memory that a process uses are not the same (I am on Linux, so there are a lot of shared library and paging considerations for the reported virtual memory size). Apparently KDE's System Activity app and pmap -d report values closer to the actual physical memory used by the process. In my case this close-to-physical value peaks at 3260 MB.

So, finally, my question(s): why, when the physical memory usage of the mantra process is 3260 MB, does mantra report only 1063 MB? What does mantra take into account when it reports memory usage? Does it take into account all the memory used by custom-written procedurals as well? Am I missing something with respect to Linux's memory management?
eetu Posted September 28, 2011

Quoting thefreeman: "I render it with micropolygon rendering, DSMs, no raytracing. So mantra should be able to free/destroy any geometry and procedurals that it has rendered and that is not overlapping with any tiles that are still to be rendered."

From what I've been trying with the regular delay load archive, I've found mantra to be very reluctant to actually throw any procedurals out. In fact it feels like it just doesn't do it. Have you seen anything being thrown away?
symek Posted September 28, 2011

Quoting eetu: "From what I've been trying with the regular delay load archive, I've found mantra to be very reluctant to actually throw any procedurals out. In fact it feels like it just doesn't do it. Have you seen anything being thrown away?"

From my periodic attempts to force Mantra to free up memory, this only happens in single-threaded mode. Once you turn on multi-threading, nothing gets thrown away, even with all the obvious conditions met (non-overlapping bboxes, no ray tracing, etc.). I reported it as a bug, but unfortunately I haven't met with understanding from the other side. Not sure why, as it really seems to behave unexpectedly.
thefreeman Posted September 28, 2011

Quoting eetu: "From what I've been trying with the regular delay load archive, I've found mantra to be very reluctant to actually throw any procedurals out. In fact it feels like it just doesn't do it. Have you seen anything being thrown away?"

Well, I use some UT_Counter instances. I increment them when a procedural is instantiated (in its constructor) and decrement them when a procedural is destroyed (in its destructor). When mantra exits, these counters print out the peak values and total increments. Generally the peak values are a lot smaller (8-10 times) than the total increments, suggesting that mantra is indeed destroying procedural instances that are no longer needed during the render.

I also have a counter that increments when I create a geometry object. I cannot decrement this counter since I have (as far as I am aware) no way of knowing when mantra destroys the geometry object. With verbose output mantra does print out "Peak Geometry Objects: ###". In my case this peak geometry objects number is always substantially less (also about 8-10 times) than the total number of created geometry objects I get from my counter.

This is with multi-threaded rendering. I haven't tested it with single-threaded rendering lately; I should give that a go...

So in short, it does look like mantra destroys procedurals and geometry that it no longer needs, or at least, it believes that it does.
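The counting scheme described in this post can be sketched roughly as follows. This is a plain Python illustration of a peak/total counter, not the HDK's UT_Counter class; the name `PeakCounter` and the little simulation loop are made up for the example.

```python
class PeakCounter:
    """Tracks live instances, the peak number alive at once, and the total created.

    Hypothetical stand-in for the counters described in the post; it is not
    the HDK's UT_Counter.
    """

    def __init__(self, name):
        self.name = name
        self.live = 0
        self.peak = 0
        self.total = 0

    def increment(self):
        # Called where an instance is constructed.
        self.live += 1
        self.total += 1
        self.peak = max(self.peak, self.live)

    def decrement(self):
        # Called where an instance is destroyed.
        self.live -= 1

    def report(self):
        return f"{self.name}: peak={self.peak} total={self.total}"


# Simulate a renderer that creates 10 procedurals but never keeps
# more than 2 of them alive at the same time.
counter = PeakCounter("procedurals")
for _ in range(10):
    if counter.live == 2:
        counter.decrement()
    counter.increment()

print(counter.report())
```

A peak well below the total is what you would expect if instances are destroyed during the run. Note that this sketch is not thread-safe; a multi-threaded render would need a lock or an atomic counter around the updates.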
edward Posted September 29, 2011

My suspicion is that we're running into situations where glibc's allocator won't trim back its previously sbrk()'ed memory. One hint of this is here. So that might be why the "arena" size is a lot smaller than the reported VM Size.

Anyhow, in case you're interested, the technical details for how those stats are calculated in H11 on Linux are:

Current Memory Usage: Sum of the mallinfo::uordblks and mallinfo::hblkhd members returned by mallinfo().
Arena Size: Sum of the mallinfo::arena and mallinfo::hblkhd members returned by mallinfo().
VM Size: Sum of the VmData entries in /proc/<pid>/status. This should be fairly accurate in matching the value returned by top. This is the total virtual memory size of the process's data segment (aka heap).
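For anyone who wants to look at the same /proc figures from outside mantra, here is a minimal Python sketch that pulls a few Vm* fields out of /proc/<pid>/status. The function names are hypothetical and the sample text uses made-up numbers; VmRSS (resident size) and VmHWM (peak resident size) are included alongside VmData because they are closer to physical memory use.

```python
def parse_status_kb(text, keys=("VmSize", "VmData", "VmRSS", "VmHWM")):
    """Return {field: kilobytes} for the requested Vm* lines of a status dump."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys:
            # Lines look like "VmData:\t 4057938 kB"; the first token is the number.
            values[name] = int(rest.split()[0])
    return values


def read_status_kb(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status_kb(f.read())


# Made-up sample, only to show the line format being parsed.
sample = (
    "Name:\texample\n"
    "VmSize:\t 4200000 kB\n"
    "VmHWM:\t 3300000 kB\n"
    "VmRSS:\t 3100000 kB\n"
    "VmData:\t 4000000 kB\n"
)
parsed = parse_status_kb(sample)
print(parsed)
```

On a Linux box, `read_status_kb(pid)` with the render's process id gives the live values, so VmData can be compared against VmRSS while a frame is rendering.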
edward Posted September 29, 2011

PS. Aren't those mallinfo() values 32-bit integers? We might be getting a 2 GB overflow here.
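To make the overflow concern concrete, here is a small Python sketch of what happens to a byte count once it is stored in a 32-bit field. The helper names are hypothetical, and how mantra actually reads the fields (signed or unsigned) is not stated in this thread, so both interpretations are shown.

```python
def as_int32(n):
    """Interpret the low 32 bits of n as a signed two's-complement integer."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def as_uint32(n):
    """Interpret the low 32 bits of n as an unsigned integer."""
    return n & 0xFFFFFFFF


GIB = 2 ** 30

# 3 GiB fits in an unsigned 32-bit value but goes negative when signed.
print(as_int32(3 * GIB), as_uint32(3 * GIB))

# 5 GiB does not fit either way: only the low 32 bits (1 GiB) survive.
print(as_int32(5 * GIB), as_uint32(5 * GIB))
```

So a value between 2 GiB and 4 GiB can still come out right if the reader casts the field to unsigned, while anything at or above 4 GiB is lost regardless.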
thefreeman Posted September 29, 2011

Thanks Edward. Yeah, it appears that the mallinfo integers are indeed 32-bit, so overflow is definitely a problem. Also, mallinfo doesn't support multiple arenas, so I think what we are seeing is just the first arena's stats.

Using malloc_stats() I get the following output in a multi-threaded (12 threads) render:

Render Time: 17:43.48u 18.63s 3:53.14r
Memory: 891.37 MB of 1.03 GB arena size. VM Size: 4.06 GB
page rclm : 1327186 flts: 0
# swaps : 0
blocks in : 0 out: 0
switch ctx: 597968 ictx: 207289
Peak Geometry Objects: 1013
Arena 0: system bytes = 840822784 in use bytes = 638010304
Arena 1: system bytes = 272400384 in use bytes = 78886336
Arena 2: system bytes = 285310976 in use bytes = 88904928
Arena 3: system bytes = 256933888 in use bytes = 46039824
Arena 4: system bytes = 245821440 in use bytes = 56426128
Arena 5: system bytes = 241201152 in use bytes = 65579664
Arena 6: system bytes = 263737344 in use bytes = 54255760
Arena 7: system bytes = 236199936 in use bytes = 45582736
Arena 8: system bytes = 268300288 in use bytes = 70336544
Arena 9: system bytes = 269545472 in use bytes = 66910624
Arena 10: system bytes = 257363968 in use bytes = 67900320
Arena 11: system bytes = 254779392 in use bytes = 77342320
Total (incl. mmap): system bytes = 3891318784 in use bytes = 1555077248
max mmap regions = 6125
max mmap bytes = 443310080

and the following for a single-threaded render:

Render Time: 15:22.27u 3.04s 15:27.82r
Memory: 633.89 MB of 2.63 GB arena size. VM Size: 2.67 GB
page rclm : 936574 flts: 0
# swaps : 0
blocks in : 0 out: 0
switch ctx: 150 ictx: 109348
Peak Geometry Objects: 792
Arena 0: system bytes = 2748416000 in use bytes = 611138560
Total (incl. mmap): system bytes = 2828361728 in use bytes = 691084288
max mmap regions = 3972
max mmap bytes = 300474368

So we seem to have an arena for each thread.

In the single-threaded case the values from mantra/mallinfo appear to be relatively consistent with what we are seeing from malloc_stats(). I would have expected the arena size to overflow though, since it's more than 2 GB...

The big difference between the in use bytes and the system bytes values is a concern though. Then again, I'm not exactly sure what 'system bytes' refers to. I'm guessing it might be memory reserved for the process, whereas 'in use bytes' is what is actually malloc'ed/new'ed. I suppose that could be related to sbrk()'ed memory not being properly freed, as Edward suggested.

I googled a bit for this 32-bit mallinfo issue, and some people are saying that mallinfo is completely outdated and not supported anymore; it won't be fixed for 64-bit and shouldn't be used by new programs. See for instance https://bugzilla.redhat.com/show_bug.cgi?id=173813

After running all of this I came across this: http://udrepper.livejournal.com/20948.html

If you scroll down to the "Information about malloc" section, it discusses mallinfo's problems and its replacement, malloc_info, which outputs malloc stats as XML. I'll look into this a little bit later.
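As a rough way to work with malloc_stats() output like the above, here is a Python sketch that sums the per-arena figures and compares them with the "Total (incl. mmap)" line. The function name and the returned keys are hypothetical; the input is the single-threaded output quoted in this post, and reading the gap between the total and the arena sum as mmap'ed memory is my interpretation of the "incl. mmap" label.

```python
import re

# Whitespace-tolerant patterns, so flattened and multi-line output both match.
ARENA_RE = re.compile(
    r"Arena (\d+):\s*system bytes\s*=\s*(\d+)\s*in use bytes\s*=\s*(\d+)")
TOTAL_RE = re.compile(
    r"Total \(incl\. mmap\):\s*system bytes\s*=\s*(\d+)\s*in use bytes\s*=\s*(\d+)")


def summarize(text):
    """Sum the per-arena figures and relate them to the reported totals."""
    arenas = [(int(s), int(u)) for _, s, u in ARENA_RE.findall(text)]
    arena_system = sum(s for s, _ in arenas)
    arena_in_use = sum(u for _, u in arenas)
    total = TOTAL_RE.search(text)
    total_system, total_in_use = int(total.group(1)), int(total.group(2))
    return {
        "arenas": len(arenas),
        "arena_system": arena_system,
        "arena_in_use": arena_in_use,
        # Assumed to be memory obtained via mmap rather than from an arena.
        "mmap_bytes": total_system - arena_system,
        # Share of the memory held from the system that is not currently in use.
        "idle_fraction": 1 - total_in_use / total_system,
    }


single_threaded = """Arena 0:
system bytes     = 2748416000
in use bytes     =  611138560
Total (incl. mmap):
system bytes     = 2828361728
in use bytes     =  691084288
"""

summary = summarize(single_threaded)
print(summary)
```

For the single-threaded run this puts roughly three quarters of the memory held from the system outside "in use", which is the gap the post is worried about.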
pbowmar Posted September 29, 2011

I submitted a high priority bug on this a while ago, but if it was easy to fix I'm sure SESI would have fixed it already; it's been a major problem for a long time.

One option, at least on Linux, is to hack into the IFD a post-"ray_raytrace" command that calls command-line (unix) commands to query the memory and print that, before the process exits. At least you have a snapshot of the RAM in use before Mantra quits, but that also isn't really the answer, just slightly more accurate than Mantra's own reporting.

I find I spend a lot of time with "top" running sorted by memory usage when I'm analyzing Mantra's memory usage.

If it's any consolation, Nuke's memory reporting is equally inaccurate.

Cheers,
Peter B
edward Posted October 1, 2011

If you're interested in peak memory, then just look at VM Size. That's why it's there.
pbowmar Posted October 1, 2011

Quoting edward: "If you're interested in peak memory, then just look at VM Size. That's why it's there."

Hi Edward,

But that's not actually the case in reality. We see VM reports of 33 GB on machines with 8 GB of RAM that didn't go into swap. When I monitor the "top" report live while one of those frames renders, the machine's free RAM doesn't go near that. "top" does report the same VM as what Mantra reports, but it's meaningless when trying to find problem renders...

Cheers,
Peter B
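One way to get a peak figure that tracks physical rather than virtual memory is to launch the render from a small wrapper and ask the OS for the child's peak resident set size afterwards. The sketch below uses Python's standard `resource` module (Unix only); the function name is hypothetical, and a trivial Python child process stands in for the actual render command, which is not shown in this thread.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(argv):
    """Run argv to completion and return (exit_code, peak_rss_of_children).

    ru_maxrss is in kilobytes on Linux (bytes on macOS). RUSAGE_CHILDREN
    covers all children this process has waited for, so use one wrapper
    process per render to keep the number meaningful.
    """
    exit_code = subprocess.call(argv)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return exit_code, peak


# Stand-in child process; replace with the real render command line.
code, peak = run_and_report_peak_rss(
    [sys.executable, "-c", "data = bytearray(50 * 1024 * 1024)"])
print(f"exit code {code}, peak RSS of child: {peak}")
```

Unlike VM Size, this number cannot exceed the machine's physical RAM, which makes it more useful for spotting renders that actually run a machine out of memory.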