Analyze without agents
Overview
Here are my notes toward building an “unsupervised” machine-learning framework to identify patterns in various logs.
Logs are produced by programs:
- Operating system logs
- Web Server logs
- Perfmon
- Linux top
- Custom application logs that record specific events, such as an invoice being sent or another business transaction being processed.
https://sematext.github.io/logagent-js/parser/ detects log formats based on a pattern library (a YAML file) and converts each log line to a JSON object.
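The idea of a pattern library can be sketched in a few lines of Python. This is only an illustration of the technique, not logagent-js's API or its YAML pattern format: the `PATTERNS` dict, the pattern names, and `detect_and_parse` are all made up for this sketch, and the sample line is illustrative.

```python
import json
import re

# Hypothetical pattern library: format name -> regex with named groups.
# A real tool would load these from a file; a dict stands in for that here.
PATTERNS = {
    "apache_common": re.compile(
        r'(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
    ),
    "key_value": re.compile(r'(?P<key>\w+)=(?P<value>\S+)'),
}


def detect_and_parse(line):
    """Return a dict for the first pattern that matches the line, else None."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(line)
        if match:
            return {"format": name, **match.groupdict()}
    return None


# Illustrative access-log line.
sample = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
print(json.dumps(detect_and_parse(sample)))
```

Patterns are tried in dict order, so more specific patterns should be listed before more general ones.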
Vendors
Commercial vendors include:
- Splunk
- QRadar (IBM)
- ArcSight (Micro Focus/HP)
- LogRhythm
- McAfee Enterprise Security Manager
- RSA Security Analytics
- Sentinel (Microsoft)
- Exabeam
- Securonix
- Rapid7
- Gurucul
- RSA
- FireEye
Strategy vs. Current Offering
In 2020, Forrester rated Microsoft as the market leader in strategy but rated IBM as having the strongest current offering. Splunk also ranked near the top.
Microsoft’s Sentinel
https://learn.microsoft.com/en-us/azure/sentinel/skill-up-resources
Visualizations
The value of keeping logs is the insight they provide into what is being logged.
That insight usually comes from the patterns and anomalies in occurrences over time.
SIEM systems collect and analyze logs over time to detect persistent threats.
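As a minimal sketch of spotting anomalies in occurrences over time, the code below buckets event timestamps by hour and flags hours whose count lies far from the mean. The z-score rule and the threshold of 2.0 are assumptions chosen for illustration, not how any particular SIEM works, and the timestamps are synthetic.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def hourly_counts(timestamps):
    """Count events per hour bucket."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)


def anomalous_hours(counts, threshold=2.0):
    """Return hours whose count is more than `threshold` std devs from the mean."""
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every hour has the same count, so nothing stands out
    return sorted(h for h, c in counts.items() if abs(c - mu) / sigma > threshold)


# Synthetic data: 5 events in each of hours 0-9, then a burst of 50 in hour 10.
events = [datetime(2024, 1, 1, h, m) for h in range(10) for m in range(5)]
events += [datetime(2024, 1, 1, 10, m) for m in range(50)]
flagged = anomalous_hours(hourly_counts(events))
print(flagged)
```

A fixed z-score threshold is crude: it ignores daily and weekly cycles, which real traffic usually has.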
System logs
Microsoft System Logs can be parsed using http://logparserplus.com/Article
Web server logs
Web servers such as Apache, IIS, and NGINX store a log entry for each HTTP request for a resource (file).
Apache by default writes the NCSA Common or Combined Log Format, while IIS defaults to the W3C Extended Log File Format.
A trivial sample is provided at data/apache.access.log.
A fuller example is provided at http://www.monitorware.com/en/logsamples/apache.php
A parser and model for the log file: See ApacheAccessLog.java.
See https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/spark.html
A configuration file specifies what fields are output in the log.
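Because the configuration decides which fields appear, any parser has to assume a layout. Here is a minimal Python sketch that assumes Apache's Combined Log Format (host, identity, user, time, request, status, size, then optional referer and user agent). The `AccessLogEntry` class and `parse_line` function are names made up for this sketch, and it is not a port of ApacheAccessLog.java.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumes the Combined Log Format; a custom LogFormat needs a different regex.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


@dataclass
class AccessLogEntry:
    host: str
    time: datetime
    request: str
    status: int
    size: int
    referer: Optional[str]
    agent: Optional[str]


def parse_line(line: str) -> Optional[AccessLogEntry]:
    """Parse one access-log line; return None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    return AccessLogEntry(
        host=m["host"],
        time=datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"),
        request=m["request"],
        status=int(m["status"]),
        size=0 if m["size"] == "-" else int(m["size"]),  # "-" means no body sent
        referer=m["referer"],  # None when the line is in the shorter Common format
        agent=m["agent"],
    )


# Illustrative line in Combined Log Format.
line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
        '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')
entry = parse_line(line)
print(entry)
```

The trailing referer and agent group is optional, so the same regex also accepts lines in the shorter Common Log Format.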
https://github.com/rory/apache-log-parser is written in Python. http://codereview.stackexchange.com/questions/68846/someone-thinks-poorly-of-my-server-log-parser
https://awstats.sourceforge.io/ is written in Perl with an architecture that enables plug-ins for additional functionality.
https://wiki.jenkins-ci.org/display/JENKINS/Log+Parser+Plugin
http://alvinalexander.com/scala/scala-apache-access-log-parser-library-java-jvm
https://easyengine.io/tutorials/nginx/log-parsing/
MS Log Parser for SQL
Microsoft Log Parser provides SQL-like query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows® operating system such as the Event Log, the Registry, the file system, and Active Directory®. It was created for Windows 2000, Windows Server 2003, and Windows XP Professional Edition.
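To illustrate what "SQL-like query access to text-based data" looks like, here is a sketch that uses Python's built-in sqlite3 module as a stand-in. It is not Log Parser and does not use Log Parser's query syntax. The CSV columns, the `hits` table name, and the `load_csv` helper are assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Illustrative CSV data; the column names are assumptions.
CSV_TEXT = """path,status,bytes
/index.html,200,2326
/missing,404,0
/index.html,200,2326
/admin,403,0
"""


def load_csv(conn, text, table="hits"):
    """Load CSV text into an SQLite table and return the number of rows loaded."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute(f"CREATE TABLE {table} (path TEXT, status INTEGER, bytes INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (:path, :status, :bytes)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
load_csv(conn, CSV_TEXT)

# Most-requested paths, similar in spirit to a "top URLs" report.
top = conn.execute(
    "SELECT path, COUNT(*) AS n FROM hits GROUP BY path ORDER BY n DESC, path LIMIT 3"
).fetchall()
print(top)
```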
Log Parser Lizard ($31) at http://lizard-labs.com/log_parser_lizard.aspx provides a GUI for the command-line “Swiss Army Knife”.
- https://blogs.msdn.microsoft.com/carlosag/2010/03/25/analyze-your-iis-log-files-favorite-log-parser-queries/
- https://blog.codinghorror.com/microsoft-logparser/
- http://www.symantec.com/connect/articles/forensic-log-parsing-microsofts-logparser
- https://technet.microsoft.com/en-us/library/ee692659.aspx
- https://www.codeproject.com/articles/13504/simple-log-parsing-using-ms-log-parser-in-c-ne
- Microsoft Log Parser Toolkit: A Complete Toolkit for Microsoft’s by Gabriele Giuseppini, Mark Burnett
- https://www.simple-talk.com/blogs/using-logparser-part-1/
QUESTION: Its equivalent for Linux?
Perfmon logs
MS PAL (Performance Analysis of Logs)
https://pal.codeplex.com/
PAL requires PowerShell v2.0 or greater and Microsoft Chart Controls for Microsoft .NET Framework 3.5 Service Pack 1.
Custom application logs
Code to output logs
https://www.arcgis.com/home/item.html?id=90134fb0f1c148a48c65319287dde2f7
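As a minimal sketch of code that outputs a custom application log, the example below uses Python's standard logging module to record a business event such as an invoice being sent. The logger name `billing`, the event name `invoice_sent`, the field names, and the `make_logger` and `log_event` helpers are all hypothetical. It writes to an in-memory buffer so it can run anywhere.

```python
import io
import json
import logging


def make_logger(stream, name="billing"):
    """Create a logger that writes timestamped lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_event(logger, event, **fields):
    """Log an event name followed by its fields as JSON, which is easy to parse later."""
    logger.info("%s %s", event, json.dumps(fields, sort_keys=True))


buffer = io.StringIO()
log = make_logger(buffer)
log_event(log, "invoice_sent", invoice_id="INV-0001", amount=125.50)
print(buffer.getvalue(), end="")
```

Writing the fields as JSON after a fixed event name means a later parser can use a single pattern instead of one regex per message.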
Log gathering
Because logs grow large, systems “rotate” them: a new file is started when the disk space allocated to the current file is used up.
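Size-based rotation can be sketched with `RotatingFileHandler` from Python's standard library. The tiny 200-byte limit, the file name `app.log`, and the `write_with_rotation` helper are assumptions chosen to make the rotation visible in a short run. Real systems use much larger limits, or a tool such as logrotate.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def write_with_rotation(directory, lines, max_bytes=200, backups=3):
    """Write `lines` log records with size-based rotation; return the file names created."""
    path = os.path.join(directory, "app.log")
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    logger = logging.getLogger("rotation_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        for i in range(lines):
            logger.info("event number %04d", i)
    finally:
        logger.removeHandler(handler)
        handler.close()
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    files = write_with_rotation(tmp, lines=50)
    print(files)
```

The newest records are always in `app.log`, and older ones move to `app.log.1`, `app.log.2`, and so on. A log gatherer therefore has to read the numbered files from highest to lowest to see events in order.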
References
Log parsing
http://stackoverflow.com/questions/3328688/need-some-ideas-on-how-to-code-my-log-parser
More on Security
This is one of a series on Security in DevSecOps:
- SOC2
- FedRAMP
- CAIQ (Consensus Assessment Initiative Questionnaire) by cloud vendors
- Git Signing
- Hashicorp Vault
- WebGoat known insecure Java app and vulnerability scanners
- AWS Security (certification exam)
- Cyber Security
- Security certifications