Scalability problem using web requests initiating asynch web service requests

4 replies

Last post Feb 15, 2013 01:04 PM by Parashuram

  • Scalability problem using web requests initiating asynch web service requests

    Jan 09, 2013 11:53 AM | Toby999

    We are experiencing some problems in production (IIS 7.5 in integrated managed pipeline mode, .NET 4.5 application) when too many of our incoming IIS requests trigger asynchronous web service requests. There is a quite nice MSDN blog post here by Thomas Marquardt that gives a very good picture of the potential problems that can arise. I believe our problem is due to the low default value of ServicePointManager.DefaultConnectionLimit, and we are about to deploy this fix (it requires a code change, as described in the blog).

    However, I have quite a few remaining questions from this blog post. I posted them yesterday as comments on the blog (quite miserably, since there was a cap on the comment length...), but since I am guessing it may take some time before Thomas responds, I am reposting the questions here as well in the hope that someone in the community knows some of the answers. You may need to read the blog post first to be in tune with what I write below, but I can promise it is good reading. If someone knows one or several of the answers, please just refer to the question number in your reply (e.g. 1.c).

    Here are my comments from the blog post, this time with proper WYSIWYG markup:

    Thanks [Thomas Marquardt] for some great explanations and for taking the time to answer all these questions. It really clarifies things. In our case, we have a problem in production with a .NET 4.5 / IIS 7.5 web server running in integrated [managed pipeline] mode when we get a high load of requests that triggers a large number of outgoing asynchronous web service requests (stack traces reveal HttpWebRequest usage beneath the hood of our HttpClient usage). Based on my reading here, it seems the solution is to set ServicePointManager.DefaultConnectionLimit programmatically in production code as you suggest (since we have autoConfig set to true). However, before I try this in our production environment (initiating a redeploy), I have some remaining categorised questions regarding what I have read (sub-labelled (a), (b), etc. to make answering easier):
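
    For reference, this is roughly the kind of change we have in mind (a minimal sketch only, assuming a Global.asax Application_Start hook; the multiplier is just an illustration, not a tested value):

        // Global.asax.cs - illustrative sketch only; the final limit would need load testing.
        using System;
        using System.Net;
        using System.Web;

        public class Global : HttpApplication
        {
            protected void Application_Start(object sender, EventArgs e)
            {
                // Raise the per-endpoint connection limit for outgoing
                // HttpWebRequest/HttpClient calls. With autoConfig="true" the
                // effective limit is 12 * processor count, which our outbound
                // web service traffic appears to exhaust under load.
                ServicePointManager.DefaultConnectionLimit =
                    12 * Environment.ProcessorCount * 8;
            }
        }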

    1. appRequestQueueLimit config:
      (a) Do I understand it correctly that the httpRuntime setting appRequestQueueLimit is only applicable to IIS 6, and to IIS 7 when running in classic mode? I.e. it sets the limit for the old application-scoped queue (with "miserable performance", as you write)?
    2. (questions a-f below) 
      When our servers get overloaded and don't scale (CPU only at 20%), we get a 503 error code in response to the original requests, and the asynchronous web service requests end up with quite a few exceptions: "SocketException: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full". A problem with this kind of error message is of course that you cannot be sure which queue may be full (if it is a queue at all).

      (a) So how many queues exist on the server in the (total) request pipeline? From your description, in our scope, there is both a (1) process-wide native queue and a (2) CLR ThreadPool that can naturally be regarded as a queue. But you also mention the HTTP.sys kernel queue.
      (b) Is the HTTP.sys kernel queue the same queue as the (1) process-wide queue?
      (c) If not, could you perhaps explain briefly what the purpose of this HTTP.sys queue is?
      (d) Are there any other lower-level queues that can become problematic? (e.g. the network card driver)

      However, in our case, if the problem is actually due to connectionManagement/maxconnection, I guess the web service error message is quite logical. If the web service requests cannot get access to a (keep-alive) connection, the number of requests waiting to execute in the ThreadPool should of course increase until the maximum number of threads is hit. A poor solution would be to increase the number of maxThreads in the ThreadPool... (I sketch a small diagnostic for checking the connection limit at the end of this post.)
      (e) Is there a performance counter showing the current number of threads in the ThreadPool?

      Also, there seems to be a performance counter called [Web Service.Current Connections] that should be pegged at a maximum value (12 * core count) if this is the problem.
      (f) Is this assumption correct?
       
    3. From your reply of 8 Dec 2011, I understand that "Requests Queued" measures both the number of requests that are waiting and the number of requests executing in the ThreadPool, plus those requests that possibly (unlikely, if correctly configured) hit the native queue. It seems to me, then, that it is quite difficult to distinguish the scenario where there actually are requests in the native queue.
      (a) Would that be possible in .NET 4.5 / IIS 7 integrated mode? (If so, how? :)
    4. You mentioned in the comments of 22 April 2010 and 28 October 2010 that if you run out of (ephemeral) ports it can be good to decrease the TcpTimedWaitDelay registry setting (as well as adjusting MaxUserPort). For us I don't think this is a problem, since we use TCP keep-alive (persistent) connections for the web service requests.
      (a) However, in order to find out whether this is a problem, would it not be enough to check the performance counter [TCPv4.Connection Established] and see if this value is close to the maximum number of available ports (which should be quite high on Windows 2003/2008 by default)?

      Furthermore, I think decreasing TcpTimedWaitDelay is a good option, although one perhaps needs to consider other types of TCP-reliant, high-latency I/O requests running on the server (e.g. web services and even database calls) when lowering this number.
      (b) Do you agree?

     I can of course follow up in this thread with answers to my questions from our internal investigation.
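
     Regarding the diagnostic sketch referred to under question 2: something like the following is what I have in mind for checking whether we are pegged at the per-endpoint connection limit (a rough sketch only; the URL is a placeholder for our real backend endpoint):

        // Diagnostic sketch only - e.g. written to a log or a debug page.
        using System;
        using System.Net;

        static class ConnectionDiagnostics
        {
            public static string Snapshot()
            {
                // Placeholder URI standing in for our real web service endpoint.
                ServicePoint sp = ServicePointManager.FindServicePoint(
                    new Uri("http://backend.example.com/service"));

                // CurrentConnections sitting at ConnectionLimit under load would
                // support the connectionManagement/maxconnection theory.
                return string.Format("Connections to {0}: {1}/{2}",
                    sp.Address.Host, sp.CurrentConnections, sp.ConnectionLimit);
            }
        }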

  • Re: Scalability problem using web requests initiating asynch web service requests

    Jan 12, 2013 11:38 AM | Chen Yu - MSFT

    Hi,

    As Thomas has already replied to your question, I will post his reply in this thread below.

    Tobias, in response to the post you gave on the IIS forum:

    1) Yes, appRequestQueueLimit only applies to IIS 6 (also 7 when running in classic mode).

    2a) IIS 7 and later have the queues that you mention.

    2b) The HTTP.sys kernel queue is not the same as the ASP.NET process-wide queue.

    2c) The HTTP.sys kernel queue is essentially a completion port on which user-mode (IIS) receives requests from kernel-mode (HTTP.sys).  It has a queue limit, and when that is exceeded you will receive a 503 status code.  The HTTPErr log will also indicate that this happened by logging a 503 status and QueueFull.

    2d) I do not know the details of how HttpClient or HttpWebRequest are implemented.  You need to ensure that you are closing/disposing all System.Net objects properly.  You likely need to increase connectionManagement/maxconnection in the config file or increase it programmatically via ServicePointManager.DefaultConnectionLimit.  You may also need to modify the default registry values for TcpTimedWaitDelay and MaxUserPort if your connections are sitting in the TIME_WAIT state or you do not have enough ports available.  Be careful with these registry values; you need to know what you're doing, and why you're doing it.  Perhaps the System.Net folks have a forum?

    2e) "Process(w3wp)\Thread Count" and the ".NET CLR LocksAndThreads" performance counters will help a little, but ultimately you will need to resort to the debugger (windbg) and the sos.dll debugger extension.  It has a !ThreadPool command that will tell you how many threads are active in the pool and what the maximum limits are.

    2f) "Web Service\Current Connections" is the number of connections to IIS.  This has nothing to do with your outbound System.Net connections.

    3) ASP.NET v4.5 has a performance counter in the "ASP.NET" category specifically for the native queue.  This is new to v4.5.

    4a) Perhaps, but I'm not familiar with the "TCPv4\Connection Established" performance counter.

    4b) Yes, I would be careful about changing TcpTimedWaitDelay and/or MaxUserPort.  You need to know what you're doing, and why you're doing it.

    Thanks,

    Thomas


    Best Regards,

    Please mark the replies as answers if they help or unmark if not.



  • Re: Scalability problem using web requests initiating asynch web service requests

    Jan 14, 2013 04:11 AM | Toby999

    Yes, thanks. I saw that last week. I tried to post a follow-up comment on his blog, but it seems he has now turned off further comments. Quite understandable if Thomas is not working with these things anymore. I have some follow-up questions at the bottom of this post, though.

    Thomas's answers did, I think, address most of my questions. However, we were also able to establish last week that setting ServicePointManager.DefaultConnectionLimit to int.MaxValue really does not make a difference, since this is already the new default value in .NET 4.5. So essentially, it is not our problem. We will most likely open a Microsoft support case on this issue later this week. But as input to that, I will write a summary of the problem + possible solutions as found on the Internet. I might as well post it here before I mark anything as an answer. I'll follow up with another dedicated post for this.

    Follow-up information/questions on previous questions/answers:

    2(e) information:
    I found the performance counter for the HTTP.sys queue: HTTP Service Request Queues\CurrentQueueSize.

    2 further questions regarding the number of queues in the request pipeline:
    So it seems that 3 queues have been identified so far: (1) the HTTP.sys queue, (2) the "native" queue, and (3) the .NET ThreadPool. I've looked a bit at the decompiled source code of WebClient, which we are using (we switched from HttpClient since it was too slow), but it is a bit too much source code to dig into. I am guessing, though, that at the .NET Framework level (above the CLR ThreadPool queue) there are no other queues that can cause the problem.
       f) Is it a wrong assumption that there are no queues between the .NET ThreadPool and the async WebClient usage? (This may be the wrong forum for that question, I know.)

    Likewise, at a lower level, I guess there may be other queues in kernel mode, e.g. the network adapter buffer. However, since this is the kernel execution level, it should be more likely that the user-mode queues become the bottlenecks, not the kernel-level ones. That said, in the performance counter category [Network Interface] there is a counter called "Packets Received Discarded" that may rise if the network card (driver) buffer is full (according to the description of the performance counter).

    2(f) information:
    For anyone interested: we found the performance counters for the outgoing connections in the [.NET CLR Networking 4.0.0.0] category. There are several useful HttpWebRequest counters there.
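
    In case it helps anyone else, this is the kind of throwaway snippet we use to list the instances and counters in those categories (the category names are the ones mentioned in this thread; which instances exist depends on the OS and .NET version on the box):

        // Lists all instances/counters of a performance counter category so the
        // exact instance names (per app pool / per process) can be discovered.
        using System;
        using System.Diagnostics;

        static class CounterExplorer
        {
            public static void Dump(string categoryName)
            {
                var category = new PerformanceCounterCategory(categoryName);
                foreach (string instance in category.GetInstanceNames())
                {
                    foreach (PerformanceCounter counter in category.GetCounters(instance))
                    {
                        // Note: the first NextValue() sample of a rate counter reads as 0.
                        Console.WriteLine("{0}\\{1} ({2}) = {3}",
                            categoryName, counter.CounterName, instance, counter.NextValue());
                    }
                }
            }
        }

        // Usage:
        //   CounterExplorer.Dump("HTTP Service Request Queues");
        //   CounterExplorer.Dump(".NET CLR Networking 4.0.0.0");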

    2(e) Thread Pool thread counter question:
    I don't understand why the "Process(w3wp)\Thread Count" and the ".NET CLR LocksAndThreads" counters only help "a little", as Thomas states. Why are these not accurate enough?
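
    For what it is worth, one workaround I am considering (my own idea, not something Thomas suggested) is to compute the busy ThreadPool threads in-process, which should be closer to what windbg's !ThreadPool reports than the raw process thread count. A rough sketch:

        // Rough in-process approximation of what !ThreadPool shows in windbg.
        // Process(w3wp)\Thread Count includes GC, finalizer and other non-pool
        // threads, which is presumably why it only helps "a little" here.
        using System.Threading;

        static class ThreadPoolSnapshot
        {
            public static string Take()
            {
                int maxWorker, maxIo, availWorker, availIo;
                ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
                ThreadPool.GetAvailableThreads(out availWorker, out availIo);

                // Busy threads = configured maximum minus currently available.
                return string.Format(
                    "Worker threads busy: {0}/{1}, IOCP threads busy: {2}/{3}",
                    maxWorker - availWorker, maxWorker,
                    maxIo - availIo, maxIo);
            }
        }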

  • Re: Scalability problem using web requests initiating asynch web service requests

    Jan 15, 2013 03:29 AM | Toby999

    I have posted a follow-up summary of the problems and potential solution candidates on the stacktrace forum, since it may be a bit more active with possible responses. But we'll see. I may post on the ASP.NET forum as well if no one has any input there.

  • Re: Scalability problem using web requests initiating asynch web service requests

    Feb 15, 2013 01:04 PM | Parashuram

     

    Your question falls into the paid support category, which requires a more in-depth level of support. Please visit the link below to see the various paid support options that are available to better meet your needs. http://support.microsoft.com/default.aspx?id=fh;en-us;offerprophone