IIS 7 and Above
504 Gateway Timeout with RequestAcquireState
Last post Dec 09, 2019 02:08 PM by email2saga
Nov 12, 2019 07:45 PM|email2saga|LINK
This is the application flow. During peak traffic we get a lot of 504 Gateway Timeout errors at the client end.
Client->API Gateway->GSLB->AWS Instances with WCF Service (IIS 8.5, .Net 4.5)->Web Server
I tried debugging the AWS instances and saw a lot of requests hanging in the worker process. I tried recycling the worker processes, but the requests just queue up again.
CPU and memory usage stay below 30% all the time.
I updated the settings below, with no luck:
1. DefaultConnectionLimit = 200
2. serviceThrottling: maxConcurrentCalls=32, maxConcurrentSessions=200, maxConcurrentInstances=232
Each AWS instance has 2 CPUs and 8 GB RAM.
Appreciate all your help !!!
Nov 12, 2019 07:46 PM|email2saga|LINK
I set executionTimeout to 1 second, and now all the queued requests move into the "ExecuteRequestHandler" state and clear. This is really helping, but I don't understand why. Can someone please explain?
Nov 13, 2019 03:40 AM|Yuk Ding|LINK
It sounds like some requests never complete, which causes the whole IIS worker process to get stuck. Have you tried collecting a dump file and analyzing it with the Debug Diagnostic Tool?
Nov 13, 2019 09:08 AM|email2saga|LINK
Yes, some requests are not cleared and have been hanging there for hours.
In our business use case we call endpoints whose IP address may be invalid, which is an expected case. Although we have a 75-second timeout, these requests do not close within 75 seconds no matter what, and they cause calls to the good endpoints to fail at the client end with 504 Gateway Timeout.
After setting <httpRuntime executionTimeout="1"/>, all the requests started closing in line with the 75-second timeout.
Why would this setting help? Can you please share some ideas?
Nov 14, 2019 02:09 AM|Yuk Ding|LINK
The first thing to do is collect and analyze a dump file. We need to check which method the requests are stuck in. You can find the thread IDs of the requests that have not cleared and check their call stacks. If a thread is hung in managed code, you may have to ask your developers to check the code.
However, if the application is stuck in a Microsoft API or in native code, it is recommended to open a ticket with Microsoft support. It would probably be a compatibility issue.
This is the best way to figure out the root cause.
If the reply is helpful, it is appreciated if you could mark it as answer.
Nov 14, 2019 08:04 AM|email2saga|LINK
OK, sure, let me analyze the dump and get back to you.
In the meantime, can you please explain what executionTimeout=1 does and how it helps in this situation?
Nov 15, 2019 02:50 AM|Yuk Ding|LINK
When you set executionTimeout=1, IIS allows only 1 second to handle a single request. If you look at the dump file, you will see that most of the time one thread handles one request. It seems that within 1 second the thread has not yet reached the code that causes your IIS hang, which means the connection is closed before the worker process thread freezes.
Looking forward to hearing from you.
Nov 15, 2019 12:28 PM|email2saga|LINK
Hello Yuk Ding, thanks for the reply.
But I see something different: my successful WCF REST calls take 2 to 3 seconds on average and work properly. If requests were closed within 1 second as executionTimeout suggests, I would not see any successes, so I am trying to understand how this works.
For now we have modified these things, and the system is working fine today.
Nov 15, 2019 05:36 PM|email2saga|LINK
During peak hours we started seeing the timeouts.
The big unanswered question is why CPU does not go beyond 15% while all my requests are queued in the worker process.
If I can clear all the requests stuck in RequestAcquireState, I think this issue will be resolved. Looking forward to your response.
The default values, when I print them, are:
processmodel.MaxWorkerThreads=100 and processmodel.MaxIOThreads=100
Nov 15, 2019 09:46 PM|Rovastar|LINK
Nov 16, 2019 10:24 AM|email2saga|LINK
Thanks for the reply, Rovastar!
You are right. We call an external API directly by IP address. Close to 2% of our transactions have a wrong IP address, and I have no way to skip them, so each of these calls holds a worker process thread for 75 seconds, which is the proxy timeout. This is an expected scenario for our business case.
I have set the WCF throttle based on the server CPU count (4):
<serviceThrottling maxConcurrentCalls="64" maxConcurrentSessions="400" maxConcurrentInstances="464"/>
As you mentioned, I will check the dump during the hang state and confirm.
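For reference, both sets of throttle values quoted in this thread match what WCF in .NET 4.0 and later derives by default from the processor count: 16 concurrent calls and 100 concurrent sessions per processor, with concurrent instances as the sum of the two. A minimal sketch of that arithmetic follows; the function name is made up for illustration and is not a WCF API.

```python
def default_wcf_throttle(processor_count):
    """Model of the processor-based WCF throttle defaults (.NET 4.0 and later)."""
    max_calls = 16 * processor_count        # maxConcurrentCalls
    max_sessions = 100 * processor_count    # maxConcurrentSessions
    max_instances = max_calls + max_sessions  # maxConcurrentInstances
    return max_calls, max_sessions, max_instances


print(default_wcf_throttle(2))  # (32, 200, 232)
print(default_wcf_throttle(4))  # (64, 400, 464)
```

If that reading is right, writing these numbers into the config explicitly would not change behaviour on a machine with the same processor count, which may be why the throttle change made no visible difference.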
Nov 18, 2019 12:09 PM|email2saga|LINK
I ran PerfMon during non-peak hours, when I still see 15% of transactions timing out. Here are the values:
Calls - 352
Calls per second - 267.180
Instances created per second - 266.164
Percentage of Max Concurrent Calls - 1.563
All other counters are 0.
W3SVC_W3WP counters (I have 3 worker processes on each server):
Maximum thread count - 768 (256+256+256)
Active Requests - 20
Active Thread Count - 0
With these values I don't see any issue with the code or the throttle settings.
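One way to relate the low CPU to the queued requests is Little's law: the average number of requests in flight equals the arrival rate multiplied by the time each one stays in the system. The sketch below plugs in figures quoted in this thread (267.18 calls per second, about 2% of calls going to unreachable addresses, a 75-second timeout). It assumes the 2% share applies at that call rate and that the rate is per server; neither is confirmed by the posts.

```python
def requests_in_flight(arrival_rate_per_s, time_in_system_s):
    """Little's law: average number in the system = arrival rate * time in system."""
    return arrival_rate_per_s * time_in_system_s


calls_per_s = 267.18   # "Calls per second" counter quoted above
bad_fraction = 0.02    # share of calls to unreachable addresses (assumed to apply here)
hold_s = 75.0          # timeout each such call waits for

blocked = requests_in_flight(calls_per_s * bad_fraction, hold_s)
print(round(blocked, 2))  # 400.77
print(blocked > 64)       # True: more than maxConcurrentCalls="64"
```

Under these assumptions, roughly 400 calls would be waiting on the network at any moment. A request waiting on a remote endpoint uses almost no CPU, so a worker process can be full of stuck requests while CPU stays low.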
Nov 20, 2019 01:24 PM|email2saga|LINK
I am still facing this issue; any help is greatly appreciated.
Dec 09, 2019 02:08 PM|email2saga|LINK
Issue resolved !!!
This was an issue with the end device (running a Jetty server), which restricts the number of TCP connections it serves.
The device allows only 3 connections at a time. Since I make connections via ServicePointManager, which by default keeps a TCP connection open for ~100 seconds, the 4th and subsequent requests were queued at the device.
I set keepalive=false, so the connection is closed after the response and the device quickly becomes available for subsequent connections.
Ufffffffffff... finally :)
Thanks everyone !!!
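The effect described in the resolution can be sketched with a small queueing model. This is an illustration only, not the poster's code. It assumes a device with 3 connection slots, a 2-second response time (the thread mentions 2 to 3 seconds for successful calls), and that each request arrives on its own connection, so an idle kept-alive connection cannot be reused by the next caller. The function and variable names are hypothetical.

```python
import heapq


def accept_times(arrivals, limit, service_s, idle_hold_s):
    """Return when each request is accepted by a device with `limit` connection slots.

    Each accepted connection occupies its slot for the response time plus the
    time the client keeps the connection open afterwards.
    """
    free_at = [0.0] * limit  # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    accepted = []
    for t in arrivals:
        slot_free = heapq.heappop(free_at)  # earliest available slot
        start = max(t, slot_free)           # wait if no slot is free yet
        accepted.append(start)
        heapq.heappush(free_at, start + service_s + idle_hold_s)
    return accepted


arrivals = [0.0, 0.0, 0.0, 0.0]  # four requests arriving together

keepalive_on = accept_times(arrivals, limit=3, service_s=2.0, idle_hold_s=100.0)
keepalive_off = accept_times(arrivals, limit=3, service_s=2.0, idle_hold_s=0.0)

print(keepalive_on[3])   # 102.0 seconds before the 4th request gets a slot
print(keepalive_off[3])  # 2.0 seconds
```

In this model the 4th request waits 102 seconds with keep-alive on, which is longer than the 75-second timeout mentioned earlier in the thread, and only 2 seconds with keep-alive off.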