Analyze without agents
Here are my notes toward building an “unsupervised” machine-learning framework to identify patterns in various logs.
Logs are produced by each program component:
- Operating system logs
- Web Server logs
- Linux top output
- Custom application logs that record specific events, such as an invoice being sent or another business transaction being processed.
https://sematext.github.io/logagent-js/parser/ detects log formats based on a pattern library (a YAML file) and converts each log line to a JSON object.
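The pattern-library idea can be sketched in a few lines of Python. This is only an illustration of the approach, not logagent-js itself: the `PATTERNS` dict, the two regular expressions, and the `detect_and_parse` name are all my own assumptions, and the sample line is made up.

```python
import json
import re

# Hypothetical pattern library: format name -> regex with named groups.
# logagent-js keeps its patterns in a YAML file; this dict only mimics the idea.
PATTERNS = {
    "syslog": re.compile(
        r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
        r"(?P<host>\S+) (?P<program>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
        r"(?P<message>.*)$"
    ),
    "key_value": re.compile(r"^(?P<level>[A-Z]+) (?P<pairs>(?:\w+=\S+\s*)+)$"),
}


def detect_and_parse(line):
    """Return a dict for the first pattern that matches the line, else None."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(line)
        if match:
            record = {"format": name}
            record.update(match.groupdict())
            return record
    return None


# Made-up sample line, for illustration only.
sample = "Mar  3 10:15:02 web01 sshd[4321]: Accepted publickey for deploy"
print(json.dumps(detect_and_parse(sample)))
```

Trying patterns in order keeps the sketch simple; a real library would need a way to rank patterns when more than one matches.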
The value of keeping logs lies in the insight they give into what is being logged.
That insight usually comes from the patterns and anomalies of occurrences over time.
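As a first pass at "anomalies of occurrences over time", here is a minimal sketch that counts events per minute and flags minutes whose count sits far from the mean. The z-score rule, the threshold of 2.0, and the function names are assumptions I picked for illustration, and the timestamps are synthetic.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def count_per_minute(timestamps):
    """Count events per one-minute bucket."""
    return Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)


def find_anomalies(counts, threshold=2.0):
    """Return buckets whose count is more than `threshold` std devs from the mean."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every bucket has the same count, so nothing stands out
    return [
        bucket
        for bucket, n in sorted(counts.items())
        if abs(n - mu) / sigma > threshold
    ]


# Synthetic data: 5 events per minute, except a burst of 50 in minute 7.
base = datetime(2024, 1, 1, 12, 0)
timestamps = []
for minute in range(10):
    burst = 50 if minute == 7 else 5
    timestamps += [base.replace(minute=minute, second=s) for s in range(burst)]

anomalies = find_anomalies(count_per_minute(timestamps))
print(anomalies)
```

A fixed z-score is crude (it ignores daily and weekly cycles), but it gives a baseline to compare an unsupervised model against.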
SIEM systems collect and analyze logs over time to detect persistent threats.
Microsoft System Logs can be parsed using http://logparserplus.com/Article
Web server logs
Web servers such as Apache, IIS, NGINX, etc. store an entry for each HTTP request for a file or other resource.
Apache by default writes the NCSA Common or Combined Log Format, while IIS defaults to the W3C Extended Log File Format.
A trivial sample is provided at data/apache.access.log.
A fuller example is provided at http://www.monitorware.com/en/logsamples/apache.php
A parser and model for the log file: See ApacheAccessLog.java.
A configuration file specifies what fields are output in the log.
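For reference, a minimal Python sketch of parsing one access-log line. It assumes the server writes Apache's Combined Log Format (or the shorter Common Log Format); a custom `LogFormat` directive would need a different expression. The `COMBINED` regex and the `parse_access_line` name are my own, not taken from ApacheAccessLog.java.

```python
import re

# Assumes Combined Log Format; referer and user agent are optional so that
# Common Log Format lines also match.
COMBINED = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_access_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Apache writes "-" when no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


line = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
entry = parse_access_line(line)
print(entry["host"], entry["status"], entry["size"])
```

The timestamp is left as a string here; converting it to a datetime is a separate step.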
https://github.com/rory/apache-log-parser is written in Python. A code review of a server log parser is at http://codereview.stackexchange.com/questions/68846/someone-thinks-poorly-of-my-server-log-parser
https://awstats.sourceforge.io/ is written in Perl with an architecture that enables plug-ins for additional functionality.
MS Log Parser for SQL
Microsoft Log Parser provides SQL-like query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows® operating system such as the Event Log, the Registry, the file system, and Active Directory®. It was created for Windows 2000, Windows Server 2003, and Windows XP Professional Edition.
The $31 Log Parser Lizard (http://lizard-labs.com/log_parser_lizard.aspx) provides a GUI on top of the command-line tool, which is described as a “Swiss Army Knife”.
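To get a feel for SQL-style queries over text data, the idea can be imitated with Python's built-in `sqlite3` and `csv` modules. This is not Log Parser and does not use its syntax; the table name, column names, and the CSV rows are invented for the sketch.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for a CSV export of a web log.
CSV_TEXT = """host,status,bytes
10.0.0.1,200,512
10.0.0.2,404,0
10.0.0.1,200,2048
10.0.0.3,500,0
"""


def load_csv(conn, text):
    """Load CSV text into a table named `log` (hypothetical schema)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute("CREATE TABLE log (host TEXT, status INTEGER, bytes INTEGER)")
    conn.executemany("INSERT INTO log VALUES (:host, :status, :bytes)", rows)


conn = sqlite3.connect(":memory:")
load_csv(conn, CSV_TEXT)

# Hits per status code, most frequent first.
query = (
    "SELECT status, COUNT(*) AS hits FROM log "
    "GROUP BY status ORDER BY hits DESC, status"
)
results = conn.execute(query).fetchall()
for status, hits in results:
    print(status, hits)
```

Loading everything into an in-memory table only suits small files; it is meant to show the query style, not to scale.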
Microsoft Log Parser Toolkit: A Complete Toolkit for Microsoft’s by Gabriele Giuseppini, Mark Burnett
QUESTION: What is its equivalent for Linux?
MS PAL (Performance Analysis of Logs)
PAL runs on PowerShell v2.0 or later and uses the Microsoft Chart Controls for Microsoft .NET Framework 3.5 Service Pack 1.
Custom application logs
Code to output logs
Because logs grow large, systems “rotate” them: a new file is started when the disk space allocated to the current file is used up.
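A minimal sketch of application code that writes a log and lets it rotate by size, using `RotatingFileHandler` from Python's standard `logging.handlers` module. The logger name `invoice_app`, the `invoice_sent` message, and the tiny 200-byte limit are assumptions chosen so the rotation is visible in a short run; a real limit would be megabytes.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_logger(path, max_bytes=200, backups=3):
    """Create a logger that rotates `path` once it reaches `max_bytes`."""
    logger = logging.getLogger("invoice_app")  # hypothetical application name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
logger = build_logger(os.path.join(log_dir, "app.log"))

# Record a made-up business event several times to force rotation.
for invoice_id in range(20):
    logger.info("invoice_sent id=%d", invoice_id)

files = sorted(os.listdir(log_dir))
print(files)
```

The handler renames the full file to `app.log.1`, shifts older backups up by one, and drops anything beyond `backupCount`, so an analyzer has to read the numbered files as well as the current one.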