With Alibaba Cloud Log Service, there are several methods available for collecting upstream data. You can use the built-in LogSearch and LogAnalytics functions, or you can deploy the more familiar Elasticsearch, Logstash, and Kibana (ELK) stack. In this article, we will discuss how you can build your own ELK stack on Alibaba Cloud to analyze and monitor Apache logs.
Installing Logstash on an ECS Instance
First, we need to install and deploy Logstash on the ECS instance. When you subscribe to the ECS service, make sure the instance has JDK 1.8 or later available, as Logstash requires it. Then download Logstash:
wget https://artifacts.elastic.co/downloads/logstash/logstash-5.5.3.tar.gz
Next, decompress and install it:
tar -xzvf logstash-5.5.3.tar.gz
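Before moving on, it can be useful to confirm that the JDK is available and that Logstash starts correctly. The following is a minimal smoke test, assuming the logstash-5.5.3 directory created by the archive above; it starts a throwaway pipeline that simply echoes standard input back to standard output:
# check that JDK 1.8 or later is on the PATH
java -version
# start a throwaway pipeline; type a line and Logstash echoes it back as an event (Ctrl+C to stop)
cd logstash-5.5.3
bin/logstash -e 'input { stdin { } } output { stdout { } }'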
Establishing the Logstash Pipeline
To write data to Elasticsearch with Logstash, we first need to establish a Logstash pipeline, which has three parts:
input {
}
# the filter section is optional
filter {
}
output {
}
- Set input to the data source
- Set output to the target
- The filter is optional; you can use it to define data filtering or transformation logic
The configuration itself is quite simple. Create a .conf file in the Logstash directory, then set the input and output in the following format:
input {
  file {
    path => "/usr/local/demoData/*.log"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    hosts => ["http://*******************:9200"]
    user => "*******"
    password => "***********"
  }
}
Note: Because Alibaba Cloud Elasticsearch comes with the X-Pack plugin preinstalled, all access must be authenticated, so you need to set a username and password in the output section.
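Once the configuration file is saved, you can check it and start the pipeline from the Logstash directory. The file name apache.conf below is only an example, not something prescribed by Logstash:
# validate the example configuration file without starting the pipeline
bin/logstash -f apache.conf --config.test_and_exit
# start Logstash with the pipeline configuration
bin/logstash -f apache.conf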
Let us take a case where we need to send the Apache logs frequently generated by an Alibaba Cloud ECS instance to Elasticsearch. We can deploy Logstash on the ECS instance on which the web server is running. If there are concerns that this might affect the application running on that web server, you can instead deploy Logstash on any other ECS instance that is accessible over the network.
Note: Logstash supports many forms of input. If you have deployed Logstash on a separate, network-accessible ECS instance, you need to configure the http input plugin as follows:
input {
  http {
    host => "**********"
    port => "**********"
  }
}
Because Elasticsearch is deployed in a VPC environment, if the ECS instance on which Logstash is deployed is on the classic network, you need to connect it to the VPC using ClassicLink.
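Once the http input is listening, you can verify connectivity by posting a sample log line to it from another machine. The host and port below are placeholders standing in for the masked values in the configuration above:
# send a single Apache-style log line to the Logstash http input (host and port are placeholders)
curl -XPOST 'http://<logstash-host>:<port>' \
  -H 'Content-Type: text/plain' \
  -d '66.249.73.135 - - [04/Jan/2015:05:30:06 +0000] "GET /blog/web/firefox-scrolling-fix.html HTTP/1.1" 200 8956 "-" "Mozilla/5.0"'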
Analyzing Apache Logs Using Logstash Filter
Let us now see how we can quickly analyze Apache logs using a Logstash filter. An Apache access log entry typically records the client IP address, the timestamp, the request, the response code, the number of bytes sent, the referrer, and the user agent.
To extract user distribution information from the logs and present it in a way that is intuitive for non-technical users, we can use the grok filter to parse the Apache access logs.
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
We can take the following original log entry:
66.249.73.135 - - [04/Jan/2015:05:30:06 +0000] "GET /blog/web/firefox-scrolling-fix.html HTTP/1.1" 200 8956 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
The grok filter parses it into a standard JSON structure:
{
  "clientip" : "66.249.73.135",
  "ident" : "-",
  "auth" : "-",
  "timestamp" : "04/Jan/2015:05:30:06 +0000",
  "verb" : "GET",
  "request" : "/blog/web/firefox-scrolling-fix.html",
  "httpversion" : "HTTP/1.1",
  "response" : "200",
  "bytes" : "8956",
  "referrer" : "-",
  "agent" : "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
}
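If you also want Elasticsearch to index each event by the time the request was actually made, rather than the time Logstash received it, a common addition is to map the parsed timestamp field onto the event's @timestamp with the date filter. This is an optional sketch, not part of the original configuration above:
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # parse the Apache timestamp (e.g. 04/Jan/2015:05:30:06 +0000) into @timestamp
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}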
We can then use the geoip filter to look up the client IP and determine the user's location.
filter {
  geoip {
    source => "clientip"
  }
}
Once the IP address has been resolved, geoip adds a geoip field to the log event. Looking up the IP address above returns the following information:
"geoip":{
"timezone":"America/Los_Angeles",
"ip":"66.249.73.135",
"latitude":37.419200000000004,
"continent_code":"NA",
"city_name":"Mountain View",
"country_name":"United States",
"country_code2":"US",
"dma_code":807,
"country_code3":"US",
"region_name":"California",
"location":{
"lon":-122.0574,
"lat":37.419200000000004
},
"postal_code":"94043",
"region_code":"CA",
"longitude":-122.0574
},
In Kibana, we can use the coordinates stored in the location field added by geoip to create a visualization of the geographic distribution of users' access locations. Note that the location field needs to be mapped as a geo_point in Elasticsearch for map visualizations; the default index template that Logstash installs for logstash-* indices normally handles this.
With the above method, we can analyze ECS logs in batches and complete the corresponding configuration in Kibana.
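Putting the pieces together, a complete pipeline configuration along the lines described above might look like the following sketch. The log path, Elasticsearch endpoint, and credentials are placeholders rather than values from a real deployment:
input {
  file {
    # path to the Apache logs on the ECS instance (example path from above)
    path => "/usr/local/demoData/*.log"
    start_position => "beginning"
  }
}
filter {
  # parse the Apache combined log format into structured fields
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # look up the client IP to add geographic information
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    # placeholder endpoint and credentials for the Alibaba Cloud Elasticsearch instance
    hosts => ["http://your-elasticsearch-endpoint:9200"]
    user => "your-username"
    password => "your-password"
  }
}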
You can get more information on configuring Logstash in the official Logstash documentation.
Conclusion
You can analyze and monitor logs with LogSearch and LogAnalytics on Alibaba Cloud Log Service, or deploy your own Elasticsearch, Logstash, and Kibana (ELK) stack. Each option comes with its own set of benefits, and which is more effective depends largely on your application.
I hope this blog helped you understand how you can install Logstash on Alibaba Cloud ECS and use it for analysis of Apache logs. To know more about Alibaba Cloud Log Service, visit the official product page or the official product documentation.