image Jeff Moser's recent post on becoming a grandmaster developer was very inspiring. I'm now consciously forcing myself to work on different, harder, problems on a daily basis. Today, I wanted to tackle a nasty memory leak on FranchiseBlast that had been bugging us for months. Actually, it only made the ASP.NET application restart twice (which has no impact because our session state is kept in the state service), but we've been postponing this bug for months :)

The bug was simple to reproduce. Simply refresh a particular page a hundred or so times and observe the worker process' memory usage continuously increase. dotTrace was not very useful today, because most of the memory was consumed by Strings.

I landed on the If broken it is, fix it you should blog. Reading the blog post and noticing how Tess knows much more than I do about debugging .NET applications, I figured I'd improve my .NET debugging skills by playing with WinDbg. My guess is that many of you haven't played with WinDbg either and that's why I thought I'd write this post. With no previous experience with the tool, I managed to learn enough to be able to solve my bug within a couple hours (learning time included). However, this is a situation where I am grateful to have had the chance to play with the low-level details (assembly language / computer architecture) during my undergrad software engineering studies.

WinDbg Installation

The article says simply run "!dumpheap -min 85000 -stat" to figure out which objects are using over 85k memory. Vaguely remembering doing something similar in VS.NET, I tried the command window, to no avail. After some googling, I understood we wanted to launch WinDbg with the SOS extension (Son of Strike.. not Summer of Sam :)). I managed to get everything running (installed WinDbg, attached to w3wp.exe, ran ".loadby sos mscorwks") and later found setup instructions on the same blog, which would have made my life easier. I still haven't remembered (nor have I looked) how to load SOS inside VS.NET instead of the external tool, but I'm pretty sure it can be done.

Playing around

I played around with various strategies found on the same blog, such as figuring out how much I was caching.  I didn't think caching was my problem, but wanted to play around with the tool.

Attempting to run an interesting command found on the aforementioned blog post, I ran into a syntax error:

.foreach (runtime {!dumpheap -mt 0x01b29dfc -short}){.echo ******************;!dumpobj poi(${runtime}+0x3c);.echo cache size

Of course, this was a blog formatting issue, as the real command was:

.foreach (runtime {!dumpheap -mt 0x01b29dfc -short}){.echo ******************;!dumpobj poi(${runtime}+0x3c);.echo cache size:;!objsize poi(${runtime}+0xc) }

It didn't teach me much :) One command that appeared to be very interesting to me was !dumpaspnetcache. This appeared to be exactly what I was looking for. However, the command is only available in SOS for .NET 1.0/1.1 and I'm trying to debug a .NET 2.0 application. I did lose some time unloading the current SOS and loading the one in the clr10 subfolder (which offers !dumpaspnetcache), but it complained that the command didn't work for .NET 2.0.

The main lesson learned here is that !dumpaspnetcache does not work on .NET 2.0. There is a workaround listed in the comments of the blog post, but I didn't try it out.

Solving a problem

  1. I identified that most of my memory was consumed by strings using !dumpheap -stat.
  2. I ran !dumpheap -min 85000 -stat to a view only the largest strings. I observed only three strings (out of approximately 400,000) and that, in total, they accounted for less than 10% of the memory used by my strings.
  3. I ran !dumpheap -mt 790fd8c4 -min 85000 to view the memory locations of these strings (where the MT 790fd8c4 was in the results of step 2).
  4. I ran !dumpobj 0c6800c0 to display the contents of one of the strings (where the address 0c6800c0 was in the result of step 3.)
  5. I saw the string was a simple ASP.NET AJAX script built by the framework. Repeating for the other objects, I didn't find anything interest. In addition, these were only 3 of my 400,000 strings.
  6. I ran !dumpheap -min 10000 -max 85000 -stat to get a feeling of how many objects were of a particular size. I repeated the exercise for various ranges and ended up discovering that 90% of my strings were under 100 bytes (50% of the memory usage).
  7. I ran !dumpobj 17386ce8 on one of the random 100 byte strings and discovered a country name. Interesting. This means I have a country name in here that is not getting disposed. I only have country names in my address editor, which uses a SubSonic DropDownList.
  8. I ran !gcroot 17386ce8 to look at who was referencing this string and confirmed that it wasn't getting collected by the garbage collector (see below).
  9. I looked at the handles and noticed that I had a user control which was listed as an event listener. Therefore, I discovered that we were registering the UserControl to listen to an event but never broke the link when we were done with the UserControl.
  10. Looking at the code, I saw we were adding the event listener in the Page_Load but never removing it. I now simply remove the event handler during the Page_PreRender event, when listening to events it is no longer necessary in my scenario.
  11. I repeatedly loaded the page and saw memory go up but back down again after the garbage collector does its job. Problem solved! (knock on wood)

Here's the result of my !gcroot command.

DOMAIN(0E68E998):HANDLE(Pinned):ed712f0:Root:0a646bd0(System.Object[])->
17cd3dc4(System.EventHandler`1[[LBBoLib.BO.Utility.GenericEventArgs`1[[System.Web.UI.Page, System.Web]], LBBoLib]])->
161a6630(System.Object[])->
173c8e18(System.EventHandler`1[[LBBoLib.BO.Utility.GenericEventArgs`1[[System.Web.UI.Page, System.Web]], LBBoLib]])->
1736abe4(ASP.usercontrols_content_searchpanel_ascx)->
1626d460(ASP.users_aspx)->
1736e234(ASP.usercontrols_content_users_usereditor_ascx)->
1736ff4c(ASP.usercontrols_shared_user_accountinformation_ascx)->
1737d558(ASP.usercontrols_shared_user_addressselectionpanel_ascx)->
1737d974(ASP.usercontrols_shared_user_addresseditor_ascx)->
1737e044(SubSonic.Controls.DropDown2)->
173866dc(System.Web.UI.WebControls.ListItemCollection)->
173866ec(System.Collections.ArrayList)->
173894b8(System.Object[])->
17386d0c(System.Web.UI.WebControls.ListItem)->
17386ce8(System.String)

Conclusion

In total, I only spent a couple hours fiddling around with WinDbg but I am very happy to have a new tool in my tool belt (and to have solved my bug). I strongly recommend reading the various tutorials on Tess's blog. I also found a WinDbg / SOS cheat sheet, which you might find interesting.

kick it on DotNetKicks.com