Thread: Recover from unexpected error crash in native module

Last post 10-05-2009 10:01 PM by DF_Frederic. 5 replies.


  • 10-02-2009, 5:46 AM

    Recover from unexpected error crash in native module

    I have no experience writing native modules in C++, or with the language itself, so I will try to explain as clearly as I can what I want to achieve.

     

    We have a native module developed by an external programmer. Let's say that, for some reason, this module crashes on a null pointer because of a programming error. For example, a structure was not instantiated and a string-compare function like _strnicmp tried to access the data inside. In that case the application crashes, and it is not possible to catch the error since no exception was thrown (from what I understand).

     

    Of course the real answer is to fix the bug, but until the missed error is found it would be good to have a way to catch it so that it doesn't bring down the whole server. What I saw is that after a few repeated module crashes, IIS decides that the application pool is not stable and stops it. I would prefer to recover from the crash inside the module, leave the page content that will be sent to the customer unaltered, and log the error somewhere (the Event Viewer or another appropriate log). If the module fails, the customer should still see the unaltered content, since in that scenario the original page content matters more than the modified page.

     

    Is that possible for this type of error? Our module for the Apache server doesn't seem to have this issue, since Apache just kills that module and the website continues to work, sending the original page.

     

    Any information on the subject, documents to read about IIS, or approaches for recovering gracefully from a bug like the one mentioned above would be greatly appreciated.

  • 10-02-2009, 12:08 PM In reply to

    • anilr
    • Top 10 Contributor
    • Joined on 05-23-2006, 10:13 PM
    • Redmond, WA
    • Posts 2,343

    Re: Recover from unexpected error crash in native module

    What you suggest is not possible - a process cannot recover from a crash within the process - the only possible thing is to kill the process, restart a new one and hope for the best - what IIS does.

    Anil Ruia
    Senior Software Design Engineer
    IIS Core Server
  • 10-02-2009, 8:30 PM In reply to

    Re: Recover from unexpected error crash in native module

    Thanks again Anil for your answer.

    I guess it's hard to ask a question clearly when you don't know the subject. After a lot of searching, I may have found what I was looking for last night. I'm not sure yet whether it's a common approach since, as mentioned above, I don't program in C/C++.

    I found in the project options (C/C++, Code Generation) that I can set "Enable C++ Exceptions" to "Yes With SEH Exceptions (/EHa)". When I activate this, I can catch the programming errors that were causing a hardware exception (null pointer access in the C string compare, divide by zero, etc.), which lets me continue or stop processing based on the current state of the page. I think this is better than an unhandled crash.
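
    A minimal sketch of guarding one risky access, for illustration only. It uses the MSVC __try/__except keywords directly rather than /EHa with catch (...), and its filter accepts only access violations so that anything else keeps propagating. PageInfo and try_read_length are hypothetical names, not part of the module or of IIS. On other compilers the sketch falls back to a plain null check so that it still builds.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_MSC_VER)
#include <windows.h>
#endif

// Hypothetical stand-in for a structure the module expects to be populated.
struct PageInfo {
    std::size_t length;
};

// Reads page->length into *out. Returns false instead of crashing when the
// pointer is null. The function holds no C++ objects with destructors, which
// MSVC requires of a function that contains __try.
bool try_read_length(const PageInfo* page, std::size_t* out) {
    assert(out != nullptr);
#if defined(_MSC_VER)
    __try {
        *out = page->length;  // faults with an access violation if page is null
        return true;
    } __except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION
                    ? EXCEPTION_EXECUTE_HANDLER
                    : EXCEPTION_CONTINUE_SEARCH) {
        return false;  // only access violations are handled here
    }
#else
    // Portable fallback (assumption: no SEH available): an explicit check.
    if (page == nullptr) {
        return false;
    }
    *out = page->length;
    return true;
#endif
}
```

    A caller would treat a false return as "skip this step and leave the page alone". Note that this only contains the fault; any state the failing code had already modified is not repaired.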

    The program was written in C style inside the class functions, so no exception handling was used. The only reason I looked into this is my experience in C#/Java and having read about C++ exception handling a long time ago.

    Now it seems this could be the solution I was looking for, but I need to read more on the subject. I don't know if there is any issue with using this in IIS7 (I guess not, but I should confirm since I don't know it well). SEH doesn't seem to be anything new (I saw an article dating back to 1997), but I have never had to use it, so I should make sure I know what I'm doing before relying on it.

    The next step will be to find a way to notify the user that something is really wrong with the module and that he's losing analytics data while his website is still running. I will check the IIS7 API to see if I can write to the Event Viewer, but I think that if it happens too often it will flood it, so it seems like the wrong place to log that message.
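
    One way to keep a repeating error from flooding a log is to cap the number of entries written per time window and count the rest. The sketch below is an assumption about how that could look, not an IIS API: LogThrottle, should_log and take_suppressed are hypothetical names, and the class is not thread-safe, so a real module would need a lock around it.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: allows at most max_per_window log entries per window
// and counts the entries that were dropped.
class LogThrottle {
public:
    using Clock = std::chrono::steady_clock;

    LogThrottle(int max_per_window, Clock::duration window)
        : max_per_window_(max_per_window), window_(window) {
        assert(max_per_window > 0);
    }

    // Returns true if the caller should write a log entry at time `now`.
    bool should_log(Clock::time_point now) {
        if (!started_ || now - window_start_ >= window_) {
            // First call, or the previous window has expired: start a new one.
            started_ = true;
            window_start_ = now;
            written_ = 0;
        }
        if (written_ < max_per_window_) {
            ++written_;
            return true;
        }
        ++suppressed_;
        return false;
    }

    // Number of entries dropped since the last call; resets the counter.
    long take_suppressed() {
        long n = suppressed_;
        suppressed_ = 0;
        return n;
    }

private:
    int max_per_window_;
    Clock::duration window_;
    Clock::time_point window_start_{};
    bool started_ = false;
    int written_ = 0;
    long suppressed_ = 0;
};
```

    The module would call should_log(LogThrottle::Clock::now()) before each write, and could add the value of take_suppressed() to the next entry it does write, so the log still shows how often the error occurred.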

    I cannot ask the original developer, since he was a C developer who didn't know the Win32 platform at all. For now I can only rely on common sense, since nobody on the team develops for the Win32 platform. I guess it's about time I learned more about it. I always wanted to do that someday, so I guess I have no choice now.

    Anil, do you think there is anything I should be careful about with the approach mentioned above? Thanks in advance for any information on the subject.
  • 10-02-2009, 8:56 PM In reply to

    • anilr
    • Top 10 Contributor
    • Joined on 05-23-2006, 10:13 PM
    • Redmond, WA
    • Posts 2,343

    Re: Recover from unexpected error crash in native module

    Using SEH to catch exceptions means that you know exactly what/where the exception is happening and know what to do to recover from the exception.  This can definitely not be done by IIS for arbitrary code throwing an exception - think of deadlocks/leaks etc - and I would argue that the time to do it correctly even within your code would be better spent finding and fixing the source of the exceptions.

    Another drawback of catching exceptions is that you prevent normal debugging mechanisms (JIT debugger or "problems and solutions" in control panel) from kicking in.

    Anil Ruia
    Senior Software Design Engineer
    IIS Core Server
  • 10-04-2009, 12:39 AM In reply to

    • lextm
    • Top 10 Contributor
    • Joined on 10-22-2008, 4:18 AM
    • Shanghai, PRC
    • Posts 1,392

    Re: Recover from unexpected error crash in native module

    To troubleshoot the crash and locate which part of the code base needs to be fixed, you can use Microsoft Debug Diagnostics (http://www.microsoft.com/DOWNLOADS/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en). We have a KB article on it: http://support.microsoft.com/kb/919789

    Lex Li
    Support Engineer at Microsoft
    ---------------------------
    This posting is provided "AS IS" with no warranties, and confers no rights.
  • 10-05-2009, 10:01 PM In reply to

    Re: Recover from unexpected error crash in native module

    Thanks again Anil for the information provided.

     

    For the current beta stage of the module, I will have to choose the lesser of two evils. Either:

    - The module crashes because of a programming error, causing loss of analytics data. The crash causes the application pool to fail, making the website(s) unavailable. The customer is not happy that the website(s) in the app pool are down.

    or

    - The module crashes because of a programming error but tries to recover in specific phases of the processing. Analytics data is lost but the website is still up and running. If done properly, hopefully there is no memory leak. The customer is a little annoyed to lose analytics data, but he's aware that it's a private beta, so he can understand that some errors may be left.

     

    In this private beta, the customer will not be debugging the application himself, nor will we have direct access to the machine, so we will have to rely on debug logs. So I'm not too worried about the debugging features not being available. Locally we can remove this code for testing. Still, when I was testing by remote debugging the w3wp.exe process, the debugger was dying on that specific line rather than stopping on it, so I couldn't do anything about it. Maybe there is a C++ debugging option I don't know about yet, or the way the code was written was wrong; I cannot tell yet.

     

    For now, option 2 seems the more appropriate even though I don't like it, since you're right: code should work properly from the get-go, and I'm not arguing against that. But I have to make that decision from a business point of view, based on the time and resources available.

     

    So basically what I will have to do is go through every phase of the page processing, check where memory is allocated, and catch possible unexpected errors. I will have to split the checks per phase to help the debugging process, and when it fails in phase N, log more details about it and try to recover the memory/original page content if possible. What I'm not sure about is how to warn the customer that something very bad happened. The Event Log seemed fine, but if the bug is hit on every request it will flood it, which is not good.
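
    The per-phase idea could be sketched roughly as follows. Every name here (Phase, PhaseResult, process_page) is hypothetical and nothing in it is an IIS API. Each phase works on a copy of the page, so a failure in phase N can be reported by name while the untouched original is what gets sent. As written, catch (...) handles C++ exceptions; under MSVC with /EHa it would also see structured exceptions such as access violations.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical phase: a name for the log plus a step that edits the page.
struct Phase {
    std::string name;
    std::function<void(std::string&)> run;
};

// Content to send, and the name of the phase that failed (empty if none).
struct PhaseResult {
    std::string content;
    std::string failed_phase;
};

PhaseResult process_page(const std::string& original,
                         const std::vector<Phase>& phases) {
    std::string working = original;  // phases only ever touch this copy
    for (const Phase& phase : phases) {
        assert(phase.run && "every phase needs a step");
        try {
            phase.run(working);
        } catch (...) {
            // Discard partial changes and report which phase failed.
            return PhaseResult{original, phase.name};
        }
    }
    return PhaseResult{std::move(working), std::string()};
}
```

    Because the working copy is a std::string, it is released automatically on the failure path. Memory that the C-style code allocates by hand inside a phase would still need its own cleanup.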

    I really appreciate the time you took to answer this question. I will mark this thread as answered soon, since I don't think there is much left to discuss. If there is anything else I should be careful about with SEH, I will be more than happy to hear about it. I don't know if it would have made a difference if the application had been written in C++/ATL. For now it's a one-class application with C code inside the methods, using pointers everywhere. I don't know if this is/was common for ISAPI/module programming even in C++ (using pointers). Maybe I'm too judgmental, since the programmer had no experience with Win32 programming, but I still have to evaluate it for possible future development of this module or a newer one.

    @Lextm:

    I will check the link you provided and see if it can help at a later stage. It seems to relate to IIS5/6, but I guess it must still work under IIS7 to debug native code.

     
