Thread: invalid character was found in text content

Last post 09-29-2009 10:40 AM by zumsel. 18 replies.

  • 04-27-2006, 8:56 AM

    invalid character was found in text content

    I am working on a solution for archiving windows event logs using logparser 2.2.  I decided to use XML as the file format and wrote a simple script to dump the security log to an XML file without issue.  The problem comes when I try to use logparser to read the resulting XML file back in for display in a datagrid.  It kicked out an error as follows:

    Error: Error loading document "C:\Data\Scripts\logparser\Test.xml": An invalid character was found in text content.

    It appears to be caused by the following message in a 680 Event:

      Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon account: userid Source Workstation: ITSNT139� Error Code: 0x0

    I suspect the character at the end of ITSNT139 is the problem.  I tried using URLESCAPE on the message and strings fields, but this just causes problems with other records when I try to read the XML file back in.  (It stops in the middle of processing one of the records, where it should be doing a URLUNESCAPE on a strings field.  It doesn't kick out any errors, but it doesn't process the remaining event records either, and it displays only the records processed up to that point.)

    One obvious question you may be thinking: why is that character appended onto the computer name in the 680 event?  I'm dealing with thousands of computers in the domain, so I may not be able to stop that from happening anyway.  This is outside the scope of logparser, but if you have any ideas on this, I'd be willing to look into it further.

    All that said, is using XML as a file format for archived windows event logs a bad idea?  Is there another way to get around this? (Perhaps by configuring it to ignore bad characters in an XML file?)

    Any advice would be appreciated, so thanks in advance.
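
One possible workaround for the "ignore bad characters" question is to clean the field text before it reaches the XML file, or to clean the file before reading it back. The sketch below is a minimal Python illustration, not a Log Parser feature: it replaces every character outside the XML 1.0 `Char` range with a placeholder. The function names and the sample value (a workstation name with a trailing control character) are assumptions made for illustration.

```python
import re

# Matches any character NOT allowed by the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF are legal.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, hex code point) for each character XML 1.0 forbids."""
    return [(m.start(), hex(ord(m.group()))) for m in _INVALID_XML_CHARS.finditer(text)]


def replace_invalid_xml_chars(text, replacement="?"):
    """Replace every character XML 1.0 forbids with a visible placeholder."""
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # Hypothetical sample: a workstation name followed by a stray control byte.
    sample = "Source Workstation: ITSNT139\x03 Error Code: 0x0"
    print(find_invalid_xml_chars(sample))
    print(replace_invalid_xml_chars(sample))
```

Replacing rather than deleting keeps a visible trace that the original event contained something unexpected, which may matter for a security archive.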

  • 04-27-2006, 10:45 AM In reply to

    RE: invalid character was found in text content

    If you can't control the special characters going into LP, I would think that URLESCAPE is likely your only hope for fixing the problem.

    So, when you use URLESCAPE, does the XML contain an escaped sequence in the record that you quoted above? What happens if you pass 0 or -1 as the codepage argument to the URLESCAPE function?

  • 04-27-2006, 1:59 PM In reply to

    RE: invalid character was found in text content

    Here is a description of what I've tried.  (Note: Where the domain or a userid was listed I replaced with DOMAIN, user1id, and user2id.)
     
    I saved an evt file to play with and used something similar to the following to generate the XML from the same EVT file:

    "C:\Program Files\Log Parser 2.2\logparser.exe" file:ConvertToXML.sql -i:EVT -o:XML

    ConvertToXML.SQL
    ----------------
    SELECT timegenerated, timewritten, computername, eventid, eventtypename,
           eventcategory, eventcategoryname, sourcename, SID, Resolve_SID(SID) AS UserID,
           URLESCAPE(message,0) AS NewMessage, URLESCAPE(strings,0) AS NewStrings, data
    INTO c:\data\scripts\logparser\test0.xml
    FROM badlogeventsec.evt
    WHERE (eventid = 680)


    I tried using URLESCAPE with no codepage parameter and with -1 and 0.  When the resulting XML is opened in notepad, and the relevant fields copied in, this is what I see for one of the records.

    Without specifying the codepage
      <NewMessage>
      Logon%20attempt%20by:%20MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%20Logon%20account:%20user1id%20Source%20Workstation:%20ath-cha-163d%20Error%20Code:%200x0%20
      </NewMessage>
      <NewStrings>
      MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%7cuser1id%7cath-cha-163d%7c0x0
      </NewStrings>

    Using 0
      <NewMessage>
      Logon%20attempt%20by:%20MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%20Logon%20account:%20user1id%20Source%20Workstation:%20ath-cha-163d%20Error%20Code:%200x0%20
      </NewMessage>
      <NewStrings>
      MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%7cuser1id%7cath-cha-163d%7c0x0
      </NewStrings>


    Using -1
      <NewMessage>
      Logon%20attempt%20by:%20MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%20Logon%20account:%20user1id%20Source%20Workstation:%20ath-cha-163d%20Error%20Code:%200x0%20
      </NewMessage>
      <NewStrings>
      MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%7cuser1id%7cath-cha-163d%7c0x0
      </NewStrings>


    I then ran something like the following:

    "C:\Program Files\Log Parser 2.2\logparser.exe" file:DetailingUserLoginsXML0.sql -i:XML -o:Datagrid

    DetailingUserLoginsXML0.SQL
    ---------------------------
    SELECT
        TimeGenerated,EventID,UserID,URLUNESCAPE(NewMessage,0) AS Message,URLUNESCAPE(NewStrings,0) AS strings
    FROM Test0.xml
    WHERE EventID=680

     

    When I ran without specifying the codepage parameter, or with -1, it stopped after processing 965 elements.  (This ends up being on the record listed above.)
    When I ran using 0 for the codepage, it stopped after 5693 elements, on a different record.  (XML snippet for this record below.)

      <NewMessage>
      Logon%20attempt%20by:%20MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%20Logon%20account:%20user2id%20Source%20Workstation:%20UM1PRIM%20Error%20Code:%200x0%20
      </NewMessage>
      <NewStrings>
      MICROSOFT_AUTHENTICATION_PACKAGE_V1_0%7cuser2id%7cUM1PRIM%7c0x0
      </NewStrings>


    The result in either case is similar:
    In the datagrid, I click All rows and scroll to the bottom.  On the last record displayed in the grid, Message is displayed as I would expect and strings is NULL.  Those last lines are pasted below.

    codepage No value and -1
    2006-04-26 15:00:09 680 DOMAIN\user1id Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon account: user1id Source Workstation: ath-cha-163d Error Code: 0x0  NULL

    codepage 0
    2006-04-26 15:18:31 680 DOMAIN\user2id Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon account: user2id Source Workstation: UM1PRIM Error Code: 0x0  NULL
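
Because the reader stops without reporting an error, it may help to run the generated file through an independent XML parser, which reports the line and column of the first well-formedness problem it meets. This is a minimal sketch using Python's standard library rather than Log Parser; the helper name `first_xml_error` and the sample strings are assumptions for illustration. If the file turns out to be well-formed, the cause lies somewhere other than the XML itself.

```python
import xml.etree.ElementTree as ET


def first_xml_error(xml_text):
    """Return (line, column, message) for the first parse error, or None if well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        line, column = exc.position  # position reported by the underlying expat parser
        return line, column, str(exc)
    return None


if __name__ == "__main__":
    # Hypothetical records: one clean, one with a control character in the text.
    good = "<ROOT><ROW><Workstation>UM1PRIM</Workstation></ROW></ROOT>"
    bad = "<ROOT><ROW><Workstation>ITSNT139\x03</Workstation></ROW></ROOT>"
    print(first_xml_error(good))
    print(first_xml_error(bad))
```

For a file on disk, the same idea applies with `ET.parse(path)` in place of `ET.fromstring(xml_text)`.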

  • 04-27-2006, 2:35 PM In reply to

    RE: invalid character was found in text content

    I'm very sorry but I honestly don't have much of an idea what might be going on. :/

    Gabriele could probably answer, but he hasn't been on the forums in a while (although he has said he hopes to return shortly)...

  • 04-27-2006, 2:44 PM In reply to

    RE: invalid character was found in text content

    No problem.  Thanks for trying.

     

  • 05-03-2006, 10:42 AM In reply to

    RE: invalid character was found in text content

    Metzlerd, do you mind posting the XML file that Log Parser is unable to parse?

    Also, have you tried using -oCodepage:-1 when creating the XML file?

  • 05-03-2006, 3:57 PM In reply to

    RE: invalid character was found in text content

    First, I did try using -oCodepage:-1, and it didn't work.

    I'm attaching an XML file that should illustrate the problem.  The workstation field was parsed out of the strings field on 680 events generated from a particular computer in the security log.  The last character in that field appears to change.

    That computer is running third-party software that does RADIUS authentication for dial-up users.  This would be consistent with some kind of bug where the computer sends an extra character as part of the computer name during the authentication attempt.  That character, I would assume, is just whatever value happens to be in memory at that location at the time (some random value), and may not always translate into a valid character.

    While I could try to get the computer repaired to resolve these particular instances, it would still bother me if I knew that somebody could do something to generate events that effectively prevent security reports from being generated properly using that XML as input.

    I'll be interested to hear your response.

    Thanks again

  • 05-04-2006, 11:05 AM In reply to

    RE: invalid character was found in text content

    Hey, hold on a sec - what Log Parser version is this? 2.2.9 or 2.2.10?

    (execute "logparser.exe" with no args and look at the first line dumped out)

    There was a bug some time ago in the XML output format that would cause it to append extra chars to fields...

  • 05-04-2006, 11:10 AM In reply to

    RE: invalid character was found in text content

    It's 2.2.9.

    That doesn't sound like it would be added during XML creation because the character, from what I can tell, is in the original event log.

  • 05-04-2006, 11:12 AM In reply to

    RE: invalid character was found in text content

    There actually was a bug in the XML output format exactly as you describe.

    2.2.10 was released months ago, and I'm 99% sure it will solve your problem. Download it from www.microsoft.com and see if it does.

  • 05-04-2006, 11:23 AM In reply to

    RE: invalid character was found in text content

    Downloaded and installed 2.2.10.

    Reran and still have the same problem.

     

  • 05-04-2006, 11:51 AM In reply to

    RE: invalid character was found in text content

    Ok, let's try another thing.

    Can you re-generate the _same_ XML in the previous example using URLESCAPE on that field this time? The second parameter doesn't matter.

  • 05-04-2006, 12:01 PM In reply to

    RE: invalid character was found in text content

    Here is the xml with URLESCAPE on the Workstation field.

     

  • 05-04-2006, 12:11 PM In reply to

    RE: invalid character was found in text content

    By the way, you would know this better than I, but if URLESCAPE serves the traditional function of a URL-escape function, it deals only with a few specific characters.  That is, it translates only characters such as &, space, <, >, and others that might be lost when a URL is transferred over the internet.

    I think the problem is that it doesn't necessarily work for a random value in memory where a character should be represented.
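
For comparison, here is what one conventional percent-encoding routine does with such a value. The sketch uses Python's `urllib.parse`, not Log Parser, so it says nothing definite about how URLESCAPE behaves; whether URLESCAPE treats control characters the same way is not established in this thread. The helper names and the trailing `\x03` are made-up stand-ins for the stray character in the workstation name.

```python
from urllib.parse import quote, unquote


def escape_field(value):
    """Percent-encode everything except letters, digits and '_.-~'."""
    return quote(value, safe="")


def unescape_field(value):
    """Reverse escape_field."""
    return unquote(value)


if __name__ == "__main__":
    raw = "ITSNT139\x03"             # hypothetical name with a stray control byte
    escaped = escape_field(raw)
    print(escaped)                   # contains only printable ASCII
    print(unescape_field(escaped) == raw)
    print(escape_field("a b&<>"))    # the "traditional" characters are encoded too
```

In this routine the control character is encoded like any other byte outside the safe set, so the escaped text is safe to store in XML. The original character comes back after unescaping, which means any consumer that writes the unescaped value into XML again would hit the same error.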
