I raised the question below to google groups and received an answer (also below). I'm new to SQL, Logparser, scripting with autoit, and IIS so please excuse my ignorance. I'm having trouble with syntax despite having looked at the documentation. I need to figure our how to write the two queries suggested in the answer to my google groups question pasted further below.
One thing I'm trying to figure out is whether there is an abbreviated way to refer to all fields in a WHERE clause. I want to remove all lines in the log file that contain "botname". Its easy enough to say:
logparser "SELECT* INTO A1.log FROM A.log WHERE cs(User-Agent) NOT LIKE '%botname%' " -i:IISW3C -o:IIS
But how do I tell logparser to query against not just the cs(User-Agent) field, but all fields? Or do I even need to be searching in the other fields for this particular task?
Here, in a question posted to google, are more details on what I am trying to do (the query I will ultimately need will be more complicated than the above example, but it would be nice to get it working in the simplified form first):
I just started reading about logparser and I would like to use it to
rid IIS logs of entries containing bot names listed in a text file.
I'll be using autoit and Logparser COM.
I was wondering what the best way to go about this would be. From what
I have read in the logparser documentation, it seems that the
following would not be possible:
1. Logparser query cannot be setup such that: it is passed a file (say
a list of bot names) and told to filter the inputfiles of all lines
containing any of the terms in said file.
2. Logparser cannot produce output upon running a query on a set of
multiple files such that the output is organized into files that
correspond to the source filenames. (So if files A, B, C are to be
filtered, the output cannot automatically be placed in files A1, B1,
C1 such that the contents of A1 = filtered contents of A, and
similarly for B1 and C1).
Given the above assumptions, I have concluded that If I want to filter
a large set of files and if the filtering involves a large set of
filter terms, the best way to utilize logparser would be to run a
query multiple times in nested loops (inner loop traverses the list of
bots, and outer loop traverses the list of files. [In effect: For each
file A, [for each filter term 'bot' run a query that retreives all
lines NOT containing "bot", placing the output in A1. Repeat query on
file A for all bots names] Repeat for all files].
I'd like to confirm whether or not my assumptions 1 and 2 above are
mistaken, and if my suggested way of completing this task is over
complicating things. Also any further tips or tool /scripting
recommendations are most welcome. Thanks.
ANSWER:
I think your approach is a bit complicated
You overlook the fact that LogParser allows you to make a SQL query
against an aggregation of log sources while retaining knowledge of
which log source a particular log entry came from. This means that
what you want to do can be done with two commands which pass over the
logged data only twice.
The first command is a LogParser query over the aggregate of the log
sources with a SQL WHERE clause constructed from the bot names in the
text file. This simultaneously walks both your nested loops with a
single SQL query. This allows File:A,File:B,File:C --(SQL-based-
Filter)-> File:A1B1C1
The second command chops up the output of the first command based on
the Filename field which indicates the original log source that the
data came from. This allows you to reconstruct File:A1, File:B1,
File:C1 given a file whose contents is A1B1C1 concatenated
Of course, it is possible that aggregating everything into memory all
at once is too much, in which case your incremental approach would be
necessary.
//David