Assemble and filter data from Elastic Beats
Overview
This section describes how to install, configure, and use the Logstash component within the Elastic Stack.
Logstash usually receives its data from Beats agents.
Logstash Marketing
The product marketing page for Logstash is at:
https://www.elastic.co/products/logstash
Logstash was originally developed by Jordan Sissel when he was a system administrator at Dreamhost.
- https://github.com/jordansissel/
- https://twitter.com/jordansissel
- http://semicomplete.com (latest post in 2012)
- https://www.youtube.com/watch?v=fwMnb4-t8vo More Logstash Awesome - Jordan Sissel - PuppetConf 2013
- https://www.youtube.com/watch?v=RuUFnog29M4
Competitors
Competitors to Logstash include
- Apache Kafka (used at LinkedIn)
- Cloudera Flume + Elasticsearch + Kibana, or Flume + HDFS + Hive + Pig
- Graylog2
- Fluentd+MongoDB
- Stackify
- LOGalyze
- Scribe
Logstash Configuration
Before forwarding, Logstash can parse and normalize varying schemas and formats.
A basic Logstash configuration (logstash.conf) file contains 3 blocks: input, filter, and output.
Each block contains a plugin distributed as a RubyGem (to ease packaging and distribution).
Filters are applied in the order they are specified in the .conf file.
Field names are specified between %{ and }, as in %{fieldname}.
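As a sketch of that syntax (the mutate filter and its add_field option are standard, but the field names here are hypothetical):

```
filter {
  mutate {
    # Copy the event's host field into a new field named origin,
    # using the %{fieldname} reference syntax:
    add_field => { "origin" => "server %{host}" }
  }
}
```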
-
Create a configuration file using the vi editor (or your favorite):
vi logstash.conf
PROTIP: Associate .conf files with a text editor.
To save the file and quit the vi editor, press Esc, then type :wq.
-
Copy the following and paste into the .conf editor window:
input { stdin { } }
filter {
  grok {
    type => "apache"
    pattern => ["%{COMBINEDAPACHELOG}"]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch { embedded => true }
}
Logstash Sources
Logs into Logstash brokers can be from various shippers (origins):
- TCP/UDP
- Files
- Syslog
- Microsoft Windows Eventlogs
- STDIN
- WebSockets
- ZeroMQ
- SNMPTrap
Events from brokers flow into a Lucene index within the storage and search server (Elasticsearch), which is fronted by a web interface.
Logstash Log Lifecycle
The lifecycle of a log: Record, Transmit, Store, Delete.
Logstash STDIN
-
Since STDIN means the command line, type testing and press Enter for this debug response:
{
       "message" => "testing",
      "@version" => "1",
    "@timestamp" => "2015-08-02T02:02:06.903Z",
          "host" => "Wilsons-MacBook-Pro.local"
}
NOTE: The Z in the timestamp stands for GMT/UTC “Zulu” time: basically London time without Summer Time (the UK’s term for what the US calls Daylight Saving Time).
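The @timestamp value above is an ISO 8601 timestamp in UTC with millisecond precision. A minimal Python sketch that produces the same shape (the function name is illustrative, not part of Logstash):

```python
from datetime import datetime, timezone

def zulu(ts: datetime) -> str:
    """Format a datetime as an ISO 8601 UTC ('Zulu') timestamp,
    matching the shape of Logstash's @timestamp field."""
    # %f gives microseconds (6 digits); trim to milliseconds and append Z.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"

stamp = zulu(datetime(2015, 8, 2, 2, 2, 6, 903000, tzinfo=timezone.utc))
print(stamp)  # 2015-08-02T02:02:06.903Z
```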
Log Input Formats
A key benefit of using Logstash is that it normalizes input of different formats from different systems:
- JSON
- XML
- CSV
- Multi-line stack traces
- Regex
- Grok (Regex on steroids)
- Zabbix
- SQS (Amazon)
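Grok patterns such as %{COMBINEDAPACHELOG} are essentially a library of named regular expressions. A simplified Python sketch of the idea, where named groups play the role of grok's %{PATTERN:field} labels (this regex covers only part of the real pattern):

```python
import re

# A cut-down analogue of grok's COMBINEDAPACHELOG pattern:
# each named group becomes a field on the parsed event.
LOG_RE = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) \S+" (?P<response>\d{3}) (?P<bytes>\d+|-)'
)

line = '127.0.0.1 - - [02/Aug/2015:02:02:06 +0000] "GET /index.html HTTP/1.1" 200 2326'
event = LOG_RE.match(line).groupdict()
print(event["verb"], event["response"])  # GET 200
```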
Logstash Outputs
With the categories of output:
Relay:
- Redis
- RabbitMQ
- TCP/UDP socket
- Kafka
- Syslog
Storage:
- Elasticsearch
- MongoDB
- Amazon S3
- File
Notification:
- PagerDuty
- Nagios monitoring
- Zabbix
- Amazon Cloudwatch
- Alerting tools (Hipchat, SMS)
Metrics (graphics):
- StatsD
- Graphite
- Ganglia
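StatsD, for instance, accepts plain-text metrics over UDP in a name:value|type wire format. A minimal sketch of what a statsd output emits (the metric name, host, and helper function are assumptions for illustration; 8125 is StatsD's conventional port):

```python
import socket

def statsd_counter(name: str, value: int = 1) -> bytes:
    """Render a StatsD counter in the 'name:value|c' wire format."""
    return f"{name}:{value}|c".encode()

payload = statsd_counter("logstash.events", 1)
print(payload)  # b'logstash.events:1|c'

# Fire-and-forget over UDP; no server need be listening for the send to succeed.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 8125))
sock.close()
```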
Brokers
- AMQP (Advanced Message Queuing Protocol) http://www.amqp.org/
- zMQ at http://zeromq.org/
-
Redis (from http://redis.io/) receives the log event on the central server and acts as a buffer (port 6379); it should be tunneled through stunnel unless the information is public.
The front server would watch files based on this .conf, which uses just a few of the file plugin’s many options.
input {
  file {
    type => "syslog"
    path => ["/var/log/secure", "/var/log/messages"]
    exclude => ["*.gz"]
  }
}
output {
  stdout { }
  redis {
    host => "10.0.0.1"
    data_type => "list"
    key => "logstash"
  }
}
The backend:
input {
  redis {
    host => "10.0.0.1"
    type => "redis-input"
    data_type => "list"
    key => "logstash"
  }
}
output {
  stdout { }
  elasticsearch {
    cluster => "logstash"
  }
}
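The data_type => "list" setting makes Redis behave as a FIFO queue: the shipper pushes events onto the tail of the "logstash" key (RPUSH) and the indexer pops them off the head (BLPOP). A Python sketch of that queue discipline, using a deque in place of a live Redis server:

```python
from collections import deque

# Stand-in for the Redis list stored at key "logstash".
buffer = deque()

def rpush(event: str) -> None:
    """Shipper side: append an event to the tail of the queue (like RPUSH)."""
    buffer.append(event)

def blpop() -> str:
    """Indexer side: take the oldest event from the head of the queue (like BLPOP)."""
    return buffer.popleft()

rpush('{"message": "first"}')
rpush('{"message": "second"}')
print(blpop())  # events leave in arrival order: {"message": "first"}
```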
Logstash Filters
Filters let configurations use named labels instead of raw regex patterns:
- grok uses patterns to extract data into fields.
- date parses timestamps from fields to standardize into a “canonical” date format
- mutate rename, remove, replace, modify fields in events
- geoip determines geographic information from IP addresses (via MaxMind)
- csv parses comma separated values or other pattern or string
- kv key-value pairs in event data
- grep
- alter
- multiline
- ruby to run arbitrary Ruby-language code.
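Since a Logstash event is essentially a dictionary of fields, the mutate filter's operations amount to simple dictionary edits. A Python sketch of rename, replace, and remove_field (the field names are illustrative):

```python
event = {"message": "GET /index.html", "host": "web01", "tmp": "scratch"}

# mutate { rename => { "host" => "hostname" } }
event["hostname"] = event.pop("host")

# mutate { replace => { "message" => "redacted" } }
event["message"] = "redacted"

# mutate { remove_field => ["tmp"] }
del event["tmp"]

print(event)  # {'message': 'redacted', 'hostname': 'web01'}
```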
Integration with Alternatives
Logstash can work in sync with commercial products that also compete with it:
- https://github.com/IBM-ITOAdev/logstash-input-appdynamics
Logstash Forwarder
Configure for scale by using a Logstash Forwarder and RabbitMQ between a Logstash Producer and Logstash Consumer http://jakege.blogspot.in/2014/04/centralized-logging-system-based-on.html
Logstash Forwarder is written in the programming language Go.
Resources
- https://www.youtube.com/watch?v=Kqs7UcCJquM Visualizing Logs Using ElasticSearch, Logstash and Kibana by Jeff Sogolov.
This page describes the configuration of Logstash servers for capacity.
We would prefer to have a local Logstash server near servers issuing logs.
Logstash servers then forward logs to Elasticsearch servers.
To handle additional load ….
Logstash Install
A sample:
#install logstash (based on http://jakege.blogspot.in/2014/04/centralized-logging-system-based-on.html)
sudo wget https://download.elasticsearch.org/logstash/logstash/logstash-1.3.3-flatjar.jar
sudo mkdir /opt/logstash
sudo mv logstash-1.3.3-flatjar.jar /opt/logstash/logstash.jar
sudo wget http://logstash.net/docs/1.3.2/tutorials/10-minute-walkthrough/hello.conf
sudo wget http://logstash.net/docs/1.3.2/tutorials/10-minute-walkthrough/hello-search.conf
sudo mv hello.conf /opt/logstash/hello.conf
sudo mv hello-search.conf /opt/logstash/hello-search.conf
cd /opt/logstash/
#example configuration
java -jar logstash.jar agent -f hello.conf
java -jar logstash.jar agent -f hello-search.conf
The java here is a JRuby run-time (for performance). Logstash is extendable with Ruby.
Run Logstash
-
Run Logstash using a script in the bin folder and the .conf file just created:
bin/logstash agent --debug -f logstash.conf
See list of command line flags.
If the command includes --configtest (or just -t), Logstash checks the configuration and then stops.
If a folder is specified, such as /etc/logstash/conf.d, all .conf files in it are loaded.
In production mode, Logstash would be started as a service (Unix daemon):
sudo service logstash start
-
To stop on a Mac, hold down control and press C. On Windows, it’s Ctrl+C.
Logstash Logging
Logstash sends its own log output locally to /var/log/logstash/logstash.log by default.
This location can be changed at ???
Resources
- James Turnbull (at Kickstarter) wrote the $9.99 Logstash v1.5 Book Kindle Edition http://www.amazon.com/Logstash-Book-James-Turnbull-ebook/dp/B00B9JQTCO/ref=wilsonslifenotes
More
This is one of a series on Elastic Stack and monitoring:
- Elastic Stack ecosystem of people, websites, tutorials
- Elastic Stack architecture and installation
- Elastic Scaling (the database engine)
- Elastic Query (via REST API)
- Elastic Kibana (the visualization engine, like Grafana)
- Elastic Logstash to assemble and filter data from Beats
- Elastic Beats to collect data from servers