IIS 5 & IIS 6
I'm at a loss
Last post Dec 22, 2007 06:10 PM by steve schofield
Dec 14, 2007 05:10 PM|MichaelMo|LINK
I'm having trouble diagnosing a problem and I would appreciate any help I can get. The server is an IIS 6 server on Win 2003. It hosts only one website, but that site carries high traffic: it is used for many online university functions such as student enrollment
and scheduling, faculty grade entry, and several other web apps, all included in this one site, so it has been hit hard over the past month and a half between scheduling and end-of-semester activity. Practically everything is written in classic ASP and
depends highly on session state between the web apps as well as COM objects and database access to an iSeries and SQL server. The session state factor has kept me from carving the applications up into other worker processes or adding more processes to the
Most of the time it runs great, but randomly the whole site will just hang, seemingly during busy times (300+ connections). It's not CPU or memory usage... those are normal to low during the problematic times as best I can tell. It's almost like we're hitting
some sort of obscure limit somewhere, but all my online searching hasn't turned up anything useful so far.
I've taken a few IISState dumps, trying to capture as near to the point of failure as I can, and have run them through the IIS Debug Diagnostic tool, which set me on a couple of trails, but they seem to be dead ends. Lately I've just been using the IIS Debug Diagnostic
tool to monitor for leaks, but nothing is really coming up there either. I've been using Sysinternals ProcExplorer and ProcMonitor to observe what's going on too.
I'm not the writer of the apps... they are written by another department.
The site is currently set up on a Cisco content switch... it was previously on Microsoft's Network Load Balancing, and the change hasn't made any difference. The second server is standing by. We put the secondary server under the full load and it seemed to have the same problem
among other problems that shouldn't be related.
My best solution to the problem has been to have both servers share the load, but I know that's not fixing the problem and it's not even guaranteed to work.
I appreciate any direction or suggestions you may have.
Dec 16, 2007 11:14 PM|steve schofield|LINK
300 connections shouldn't be that many. IIS can handle a lot of load. Of course, this depends on the size of your server. ASP and COM+ app problems can be tricky to troubleshoot; unless you wrote the app or have access to the code, I'm not sure there is
much you can do.
Hopefully this can help provide some starting point.
1) Can you identify any errors in the event log when it hangs?
2) Any errors reported by users? Does it seem to be database related?
3) Have you tried to put the application on a test machine and reproduce the error using a stress test tool?
4) Have you looked in the logs to determine what pages were executing during the failure?
If you are running into an issue like this that you can't reproduce, you might want to think about engaging PSS (Microsoft Product Support Services). Since you are a college, they might have some form of discount on support.
Windows Server MVP - IIS
Dec 18, 2007 12:51 PM|MichaelMo|LINK
Thank you for the reply...
Yesterday (12/17) was eventful in this area. We were up to 700 connections with faculty trying to submit grade information by a 10am deadline, so the process kept hanging on us about every 30 minutes on average until we made it past the deadline
time. This made it nice that I was able to watch the server activity as it happened, yet I'm not much closer to solving the problem.
I agree... even one server should be able to handle the load without too much trouble. Processor, disk, and memory don't seem to be the problem at all.
I'll try to answer your questions:
1. Yes, there are some 8000ffff|Catastrophic_failure messages in the logs that seem to relate to the problem we're having. We also found several ASP_0147|500_Server_Error entries when looking through yesterday's logs. When researching these online we found
some forums that mentioned this occurring when a driver becomes corrupted in the worker process memory. Our SQL Server drivers are up to date, but we found we're a little behind on our iSeries drivers so we plan to update those soon.
2. What we typically hear from users is just that the site hangs, or occasionally we hear about a user who sees the catastrophic error message. The catastrophic errors and the ASP_0147 errors I mentioned in answering the last question do seem to relate to
database connections. I had looked into these some back when this was a small problem towards the beginning of the year, but kept running into dead ends and had forgotten about it. Since it's come back up, I'm looking at it again.
3. We did take the second of the two identical servers and put it through stress testing. We were able to get the server working pretty hard, with more database connections than we've seen, yet we didn't experience any problems with it. We didn't run it
over extended periods of time though... perhaps that's something that could be tried.
4. I haven't been able to identify a common thread of applications that the error occurs on more than others. I've looked at some of the simpler applications that it has failed on and everything seems to be ok.
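A check like the one in item 4 can be scripted rather than done by eye. The following is only a minimal Python sketch, assuming the site logs in W3C extended format with the cs-uri-stem and cs-uri-query fields enabled, and assuming the ASP error text (such as 8000ffff or ASP_0147) shows up in the cs-uri-query field as in the entries quoted above. The function name and the sample log lines are hypothetical, made up for illustration.

```python
from collections import Counter

# Error markers taken from the log entries mentioned in the post.
ERROR_MARKERS = ("8000ffff", "ASP_0147")

def tally_errors_by_page(lines, markers=ERROR_MARKERS):
    """Count log entries per (page, marker) pair.

    Assumes W3C extended log format: a '#Fields:' directive names the
    columns, and each data line is whitespace-separated.
    """
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives and lines before any header
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed lines instead of guessing
        row = dict(zip(fields, values))
        query = row.get("cs-uri-query", "").lower()
        for marker in markers:
            if marker.lower() in query:
                counts[(row.get("cs-uri-stem", "-"), marker)] += 1
    return counts

# Hypothetical sample lines, not taken from the real server.
sample = [
    "#Fields: date time cs-method cs-uri-stem cs-uri-query sc-status",
    "2007-12-17 09:41:02 GET /grades/entry.asp |23|8000ffff|Catastrophic_failure 500",
    "2007-12-17 09:41:05 GET /enroll/list.asp - 200",
    "2007-12-17 09:42:10 POST /grades/entry.asp |0|ASP_0147|500_Server_Error 500",
    "2007-12-17 09:43:10 GET /grades/entry.asp |23|8000ffff|Catastrophic_failure 500",
]

counts = tally_errors_by_page(sample)
for (page, marker), n in counts.most_common():
    print(page, marker, n)
```

If one page or one group of pages dominates the tally, that would be a candidate for the common thread; an even spread would point more toward something shared, such as a driver or a COM object.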
One other thing I thought I would mention is that when I do analysis of the IIS dumps with the Debug Diagnostic Tool I almost always get a warning message about the COM+ STA ThreadPool being used up. This is something I've been researching on my own because
I don't have a good grasp on how this would affect the server. What confuses me about it is that when I take normal operation dumps, I get the same warning message so it appears to be normal, yet if there is something that could be improved upon in the code
then I can pass the word along to the programmers. From what I've seen it looks like they're closing the COM objects when they are done, but I can't remember if they set them to 'Nothing'.
Calling Microsoft has been in the back of my mind as a last resort. Since they cost as much as they do I would hope it would be worthwhile. I've been given the approval to do what it takes to fix it. It's just my own stubbornness that is preventing me
from doing so.
Thank you much,
Dec 22, 2007 06:10 PM|steve schofield|LINK
I would try to find the common thread, such as a database used by the application or a COM+ package. When stress testing, I would try to replay the production logs on a test / isolated machine and see if the errors happen. Hope that helps.
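As a starting point for replaying production logs, the request list can be pulled out of the log files in their original order. This is a rough Python sketch under several assumptions: W3C extended log format with cs-method, cs-uri-stem and cs-uri-query fields; ASP error details appended to cs-uri-query after a "|" character (so they are stripped before replay); and GET requests only, since the logs do not contain POST bodies. The function name, the sample lines and the test host name are hypothetical.

```python
def build_replay_list(lines, base_url):
    """Return the GET URLs from a W3C extended log, in logged order.

    base_url is the test machine to replay against (hypothetical here).
    """
    fields = []
    urls = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed lines
        row = dict(zip(fields, values))
        if row.get("cs-method") != "GET":
            continue  # POST bodies are not in the log, so skip them
        # Drop any ASP error text that follows the first '|' in the query.
        query = row.get("cs-uri-query", "-").split("|", 1)[0]
        url = base_url.rstrip("/") + row.get("cs-uri-stem", "/")
        if query and query != "-":
            url += "?" + query
        urls.append(url)
    return urls

# Hypothetical sample lines, not taken from the real server.
sample = [
    "#Fields: date time cs-method cs-uri-stem cs-uri-query sc-status",
    "2007-12-17 09:41:02 GET /enroll/list.asp term=2008SP 200",
    "2007-12-17 09:41:05 POST /grades/entry.asp - 200",
    "2007-12-17 09:41:09 GET /grades/entry.asp |23|8000ffff|Catastrophic_failure 500",
]

replay = build_replay_list(sample, "http://test-server.example")
for url in replay:
    print(url)
```

The resulting list could be fed to whatever stress test tool is already in use. Since the site depends heavily on session state, a plain URL replay will not reproduce logged-in sessions, so this only approximates the production request mix.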