By Jeff Cleverley, Alibaba Cloud Tech Share Author
In this series of tutorials, we will set up a horizontally scalable Server Cluster that is suitable for high traffic Web Applications and Enterprise business sites. It will consist of 3 Web Application Servers and 1 Load Balancing Server. Although we will be setting up and installing WordPress on the cluster, the cluster configuration detailed here is suitable for almost any PHP based Web Application. Each server will be running a LEMP Stack (Linux, Nginx, MySQL, PHP).
To complete this tutorial, you will need to have completed the first two tutorials in the series.
In the first tutorial, we provisioned 3 node servers and a server for load balancing. On the node servers we configured database and web application file system replication. We used Percona XtraDB Cluster Database as a drop-in replacement for MySQL to provide real-time database synchronization between the servers. For Web Application file replication and synchronization between servers, we set up a GlusterFS distributed filesystem.
In the second tutorial, we completed the installation of our LEMP stack by installing PHP7 and Nginx, configured Nginx on each of our Nodes and our Load Balancer, issued a Let’s Encrypt SSL certificate on the Load Balancer for our domain, and installed WordPress on the cluster.
We now have a WordPress cluster with equal load balancing between each node.
In this final tutorial we will look at a more advanced cluster architecture that directs administration traffic to node1 and general site traffic to node 2 and node 3. This will ensure that any behind-the-scenes CPU and resource intensive work being carried out in the administration of our web application will never affect any of our site traffic responses.
When this tutorial is completed, we will have a Cluster architecture like so:
<Three Node Cluster with Load Balancer redirecting Admin traffic and Site traffic>
In addition, we will add Nginx FastCGI caching to the mix to aid performance and ensure the cluster doesn’t sweat even under the most extreme loads, and we will harden our database cluster and distributed file system.
Throughout the series, I will be using the root user. If you are using your superuser, please remember to add the sudo command before any commands where necessary. I will also be using a test domain, ‘yet-another-example.com’. Remember to replace this with your own domain when issuing commands.
In the commands I will also be using my servers' private and public IP addresses. Please remember to use your own when following along.
As this tutorial directly follows the first two, the sequence of steps is numbered accordingly. Steps 1 to 3 are in the first tutorial, Steps 4 to 7 in the second tutorial. This tutorial begins at Step 8.
Advanced Configurations
Step 8: Configure Nginx FastCGI Caching
With the present configuration, the web application is being served from a cluster of 3 servers. This horizontal scaling allows the site to withstand heavy loads, and makes it easy to scale further with new servers or to swap out old ones.
We can improve the performance further using Nginx FastCGI caching.
If you visit your site, open the network tab of your browser's inspector, and reload the page, you will see the page load time:
<Inspect your site load in the network tab>
In my case the site is loading in 1.91 seconds.
On each of your nodes that will deal with site traffic, open the Virtual Host Nginx Configuration file for your WordPress site for editing.
In our example, node1 will be used for administration tasks so doesn’t require caching. Therefore, on node 2 and node 3 issue the following command:
# nano /etc/nginx/sites-available/yet-another-example.com
Above the server block add the following:
fastcgi_cache_path /var/run/nginx-fastcgi-cache levels=1:2 keys_zone=FASTCGICACHE:100m inactive=60m;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
fastcgi_cache_use_stale error timeout invalid_header http_500;
fastcgi_ignore_headers Cache-Control Expires Set-Cookie;
This creates the cache in the /var/run/ directory, which is mounted in RAM, and gives the cache a keys_zone identifier (FASTCGICACHE) with 100 MB of shared memory for the cache keys. The fastcgi_cache_use_stale directive also instructs your server to keep serving cached pages even in case of a PHP timeout or HTTP 500 errors. Nginx caching is really quite brilliant.
Inside the server block, below your error logs and above your first location block add the following:
set $skip_cache 0;
# POST requests and urls with a query string should always go to PHP
if ($request_method = POST) {
set $skip_cache 1;
}
if ($query_string != "") {
set $skip_cache 1;
}
# Don't cache uris containing the following segments
if ($request_uri ~* "/wp-admin/|/xmlrpc.php|wp-.*.php|/feed/|index.php|sitemap(_index)?.xml") {
set $skip_cache 1;
}
# Don't use the cache for logged in users or recent commenters
if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in") {
set $skip_cache 1;
}
These set specific cache omissions for different WordPress functionality.
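To get a feel for which requests these rules send straight to PHP, here is a rough sketch in shell that mimics the same checks with grep. It only approximates how Nginx evaluates the conditions, and the function name would_skip_cache is my own invention for illustration, not part of Nginx or WordPress.

```shell
# would_skip_cache is a hypothetical helper that approximates the
# $skip_cache rules above. It is an illustration, not how Nginx works inside.
# Usage: would_skip_cache METHOD REQUEST_URI [COOKIE_HEADER]
# Prints 1 if the request would bypass the cache, otherwise 0.
would_skip_cache() {
    method=$1
    uri=$2
    cookie=${3:-}

    # POST requests always go to PHP
    if [ "$method" = "POST" ]; then
        echo 1
        return
    fi

    # A non-empty query string (a "?" followed by something) disables caching
    case "$uri" in
        *\??*)
            echo 1
            return
            ;;
    esac

    # URI segments that are never cached (case-insensitive, like ~* in Nginx)
    if printf '%s\n' "$uri" | grep -Eiq '/wp-admin/|/xmlrpc.php|wp-.*.php|/feed/|index.php|sitemap(_index)?.xml'; then
        echo 1
        return
    fi

    # Cookies that identify logged in users or recent commenters
    if printf '%s\n' "$cookie" | grep -Eiq 'comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in'; then
        echo 1
        return
    fi

    echo 0
}
```

For example, `would_skip_cache GET /sample-page/` prints 0, meaning the page is cacheable, while `would_skip_cache GET /wp-admin/index.php` prints 1.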
Finally, within the ‘location ~ \.php$’ block add the following:
fastcgi_cache_bypass $skip_cache;
fastcgi_no_cache $skip_cache;
fastcgi_cache FASTCGICACHE;
fastcgi_cache_valid 60m;
add_header X-FastCGI-Cache $upstream_cache_status;
The fastcgi_cache directive must match the keys_zone from the code block above the server block. The fastcgi_cache_valid directive sets how long a cached response is held; you can make this longer if your content rarely changes or you get fewer visitors. The add_header directive adds a header to the Response Headers so we can verify whether a page is being served from the cache or not.
Your full configuration file should now look like this:
fastcgi_cache_path /var/run/nginx-fastcgi-cache levels=1:2 keys_zone=FASTCGICACHE:100m inactive=60m;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
fastcgi_cache_use_stale error timeout invalid_header http_500;
fastcgi_ignore_headers Cache-Control Expires Set-Cookie;
server {
listen 80;
listen [::]:80;
root /var/www/yet-another-example.com;
index index.php index.htm index.html;
server_name _;
access_log /var/log/nginx/yetanotherexample_access.log;
error_log /var/log/nginx/yetanotherexample_error.log;
set $skip_cache 0;
# POST requests and urls with a query string should always go to PHP
if ($request_method = POST) {
set $skip_cache 1;
}
if ($query_string != "") {
set $skip_cache 1;
}
# Don't cache uris containing the following segments
if ($request_uri ~* "/wp-admin/|/xmlrpc.php|wp-.*.php|/feed/|index.php|sitemap(_index)?.xml") {
set $skip_cache 1;
}
# Don't use the cache for logged in users or recent commenters
if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in") {
set $skip_cache 1;
}
location / {
try_files $uri $uri/ /index.php?$args;
}
location ~ \.php$ {
include snippets/fastcgi-php.conf;
fastcgi_pass unix:/run/php/php7.0-fpm.sock;
fastcgi_cache_bypass $skip_cache;
fastcgi_no_cache $skip_cache;
fastcgi_cache FASTCGICACHE;
fastcgi_cache_valid 60m;
add_header X-FastCGI-Cache $upstream_cache_status;
}
location ~ /\.ht {
deny all;
}
location = /favicon.ico { log_not_found off; access_log off; }
location = /robots.txt { log_not_found off; access_log off; allow all; }
location ~* \.(css|gif|ico|jpeg|jpg|js|png)$ {
expires max;
log_not_found off;
}
}
In your terminal it should look like this:
<Nginx Virtual Host Configuration File with FastCGI Cache enabled>
Save and exit the file, and as ever, check it for syntax errors before reloading:
# nginx -t
# service nginx reload
Now reload your site with the network inspector tab open:
<Reload and inspect the site with FastCGI caching>
As you can see, my site now loads in roughly one-third of the time it did before. Loading in 693ms, we have shaved about 1.2s from the loading time. You should see similar gains.
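The X-FastCGI-Cache header we added can also be checked from the command line rather than the browser inspector. The following is a minimal sketch; cache_status is a hypothetical helper name of my own, and it simply picks the header value out of whatever response headers it is given on standard input.

```shell
# cache_status is a hypothetical helper: it reads HTTP response headers on
# stdin and prints the value of the X-FastCGI-Cache header, if present.
cache_status() {
    # Strip carriage returns, then match the header name case-insensitively
    tr -d '\r' | awk -F': *' 'tolower($1) == "x-fastcgi-cache" { print $2 }'
}
```

Assuming curl is installed, you could feed it a real response with `curl -s -o /dev/null -D - https://yet-another-example.com/ | cache_status`. A first request would normally report MISS and a repeat request HIT, while requests excluded by the skip rules report BYPASS.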
Step 9: Configure Admin Node and Visitor Nodes
At the moment our cluster is configured in a balanced configuration. The Load Balancer will serve traffic equally to each of the node servers.
We could weight that traffic if we liked, to serve more traffic to some servers and less to others. However, we are going to leave node 2 and 3 to each be served equal site traffic, while reserving node1 for administration duties.
As mentioned earlier, many of the administration tasks involved in running a web application like WordPress can consume valuable resources and lead to a slowdown on the server. This can adversely affect visitors to the site if they are being served pages from the same server the administration tasks are being executed on. Our chosen cluster architecture ensures this never happens, and makes it easy to add extra site visitor nodes if we ever need to scale further.
Open another port in the security group
Visit your security group in the Alibaba Cloud Management Console, and open another inbound port:
- Port 9443/9443 - Authorization Object 0.0.0.0/0
<Open a Port for Admin access to Node1>
Reconfigure the load balancer's Nginx virtual host configuration file
On your load balancer open the Nginx Configuration file for editing:
# nano /etc/nginx/sites-available/yet-another-example.com
Inside the configuration file add a new upstream block, and add your node1 private IP:
# Cluster Admin - Only accessible by port 9443 - reserves this node for Admin activities
upstream clusterwpadmin {
server 172.20.62.56;
}
Now remove the node1 private IP from the clusternodes upstream block:
# Clusternodes - public facing for serving the site to visitors
upstream clusternodes {
ip_hash;
server 172.20.213.159;
server 172.20.213.160;
}
Below the existing server block, add another one for listening on the Admin Port with the following code:
# Admin connections to yourdomain.com:9443 will be directed to node 1.
server {
listen 9443 ssl;
server_name yet-another-example.com www.yet-another-example.com;
ssl_certificate /etc/letsencrypt/live/yet-another-example.com/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/yet-another-example.com/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
if ($scheme != "https") {
return 301 https://$host$request_uri;
} # managed by Certbot
location / {
proxy_pass http://clusterwpadmin;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host $host;
}
}
Make sure this block has access to the SSL directives from Certbot, and the proxy_pass is directed at the ‘clusterwpadmin’ upstream servers.
Now your entire Configuration file should include the following:
# Cluster Admin - Only accessible by port 9443 - reserves this node for Admin activities
upstream clusterwpadmin {
server 172.20.62.56;
}
# Clusternodes - public facing for serving the site to visitors
upstream clusternodes {
ip_hash;
server 172.20.213.159;
server 172.20.213.160;
}
server {
listen 80;
server_name yet-another-example.com www.yet-another-example.com;
location / {
proxy_pass http://clusternodes;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host $host;
}
listen 443 ssl; # managed by Certbot
ssl_certificate /etc/letsencrypt/live/yet-another-example.com/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/yet-another-example.com/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
if ($scheme != "https") {
return 301 https://$host$request_uri;
} # managed by Certbot
# Redirect non-https traffic to https
# if ($scheme != "https") {
# return 301 https://$host$request_uri;
# } # managed by Certbot
}
# Admin connections to yourdomain.com:9443 will be directed to node 1.
server {
listen 9443 ssl;
server_name yet-another-example.com www.yet-another-example.com;
ssl_certificate /etc/letsencrypt/live/yet-another-example.com/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/yet-another-example.com/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
if ($scheme != "https") {
return 301 https://$host$request_uri;
} # managed by Certbot
location / {
proxy_pass http://clusterwpadmin;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host $host;
}
}
In your terminal:
<Admin Node configuration for Load Balancers Nginx>
Now you can only visit node1 by appending :9443 to the end of the URL. To access node1 for Admin work, visit:
https://yet-another-example.com:9443/wp-admin
<WordPress administration on the node1 Admin server>
Of course, you can still visit the WordPress administration on any of the other nodes if necessary, but I would advise against it.
Step 10: Securing the Cluster Replication
In our cluster, each node’s Percona Database is communicating with the other nodes’ databases via the MySQL port 3306, alongside the Percona-specific ports 4444, 4567, and 4568. Likewise, our GlusterFS glustervolume is communicating with each of its nodes via standard open TCP ports.
At the moment, any external server can communicate with each of these components if they know their ports and volume details. We should secure them.
Securing Percona database replication ports
In our Security Group, we opened the following ports for access to all IP addresses 0.0.0.0/0:
- Port 3306 TCP (Inbound/Outbound)
- Port 4444 TCP (Inbound/Outbound)
- Port 4567 TCP (Inbound/Outbound)
- Port 4568 TCP (Inbound/Outbound)
We now need to replace these with individual rules for each port: one Inbound and one Outbound rule per port for each of our Private IP addresses. We need to add the following rules:
Port 3306 x 3 - Inbound & Outbound Rules
- Authorization Type: Address Field - Authorization Object: 172.20.62.56
- Authorization Type: Address Field - Authorization Object: 172.20.213.159
- Authorization Type: Address Field - Authorization Object: 172.20.213.160
Port 4444 x 3 - Inbound & Outbound Rules
- Authorization Type: Address Field - Authorization Object: 172.20.62.56
- Authorization Type: Address Field - Authorization Object: 172.20.213.159
- Authorization Type: Address Field - Authorization Object: 172.20.213.160
Port 4567 x 3 - Inbound & Outbound Rules
- Authorization Type: Address Field - Authorization Object: 172.20.62.56
- Authorization Type: Address Field - Authorization Object: 172.20.213.159
- Authorization Type: Address Field - Authorization Object: 172.20.213.160
Port 4568 x 3 - Inbound & Outbound Rules
- Authorization Type: Address Field - Authorization Object: 172.20.62.56
- Authorization Type: Address Field - Authorization Object: 172.20.213.159
- Authorization Type: Address Field - Authorization Object: 172.20.213.160
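That is 12 rules per direction (4 ports x 3 private IPs), which is easy to lose track of in the console. As a rough aid, the sketch below prints the full list as a checklist to tick off. It does not call any cloud API; list_rules and the output format are my own invention, and the IPs are the ones from this tutorial, so replace them with yours.

```shell
# Ports used by the Percona cluster and the nodes' private IPs from this
# tutorial. Replace the IPs with your own.
PORTS="3306 4444 4567 4568"
NODE_IPS="172.20.62.56 172.20.213.159 172.20.213.160"

# list_rules is a hypothetical helper that only prints a checklist of the
# security group rules to create. It does not change anything.
list_rules() {
    for direction in Inbound Outbound; do
        for port in $PORTS; do
            for ip in $NODE_IPS; do
                printf '%s TCP %s/%s %s\n' "$direction" "$port" "$port" "$ip"
            done
        done
    done
}
```

Running `list_rules` prints 24 lines in total, 12 Inbound and 12 Outbound. Adding a fourth node later would only mean appending its private IP to NODE_IPS.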
Now we need to delete the original rules for each port that allowed full access to 0.0.0.0/0.
Our Security Group Inbound rules should look like this:
<Secure the Percona Inbound Ports>
Our Security Group Outbound rules should look like this:
<Secure the Percona Outbound Ports>
Test Secured Percona Ports
We need to test communication between our nodes on their private IP addresses using these ports. Unfortunately we can’t use the ‘ping’ tool for this, as it doesn’t work with ports.
Luckily the ‘hping3’ tool does. Install it with:
# apt-get install hping3
Now on each of your nodes, run the following command for each of the other nodes’ IP addresses AND each of the ports. That means running the command 8 times on each node (2 other nodes x 4 ports):
# hping3 <other node ip> -S -V -p <port number>
For example, on my node1:
# hping3 172.20.213.159 -S -V -p 3306
# hping3 172.20.213.159 -S -V -p 4444
# hping3 172.20.213.159 -S -V -p 4567
# hping3 172.20.213.159 -S -V -p 4568
# hping3 172.20.213.160 -S -V -p 3306
# hping3 172.20.213.160 -S -V -p 4444
# hping3 172.20.213.160 -S -V -p 4567
# hping3 172.20.213.160 -S -V -p 4568
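Typing eight commands on every node gets tedious, so the same list can be generated with a nested loop. This is only a sketch: probe_commands is a hypothetical helper name, and I have added the -c 3 option as my own assumption so that each probe stops after three packets instead of running until interrupted.

```shell
PORTS="3306 4444 4567 4568"
# The other two nodes as seen from node1. Adjust this list on each node.
OTHER_NODES="172.20.213.159 172.20.213.160"

# probe_commands is a hypothetical helper that prints one hping3 command per
# node and port pair. It does not run anything itself.
probe_commands() {
    for ip in $OTHER_NODES; do
        for port in $PORTS; do
            printf 'hping3 %s -S -V -p %s -c 3\n' "$ip" "$port"
        done
    done
}
```

Review the output of `probe_commands` first, then run the probes as root with `probe_commands | sh`. A reply showing flags=SA means the port answered the SYN and is open.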
If all the ports are working on a node, you should get a response as follows:
<Successfully Testing Node 2 Port 3306 from Node 1>
<Successfully Testing Node 2 Port 4444 from Node 1>
<Successfully Testing Node 2 Port 4567 from Node 1>
<Successfully Testing Node 2 Port 4568 from Node 1>
With this completed the nodes in your Percona Cluster can only talk to each other via their open Ports and Private IP addresses.
Remember, if you add extra nodes, you will need to configure extra rules in your security group.
Secure your GlusterFS File System
At the moment, any computer can connect to our storage volume as long as it knows the volume name and our IP range, but it is easy to secure this.
On any of your nodes, issue the following command, using your nodes’ private IP addresses separated by commas:
# gluster volume set glustervolume auth.allow 172.20.62.56,172.20.213.159,172.20.213.160
You should receive a ‘Success’ message.
<Set GlusterFS volume authorizations>
At any point you can check whether you have security enabled, or for other details about your volume with the info command:
# gluster volume info
As you can see, our volume only authorizes access from our nodes’ private IP addresses.
<Gluster Volume Info - With Restricted Access>
If you want to turn this off and allow all access for any reason, do that with the following command:
# gluster volume set glustervolume auth.allow all
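Because the auth.allow list has to be updated whenever a node is added, it can help to script both building the command and checking the result. The sketch below assumes the volume name glustervolume and the IPs used in this tutorial; auth_allow_command and allowed_hosts are hypothetical helper names of my own.

```shell
# Private IPs of the nodes that may mount the volume. Replace with your own.
NODE_IPS="172.20.62.56 172.20.213.159 172.20.213.160"

# auth_allow_command (hypothetical) prints the gluster command with the IPs
# joined by commas. It does not run it.
auth_allow_command() {
    joined=$(printf '%s' "$NODE_IPS" | tr -s ' ' ',')
    printf 'gluster volume set glustervolume auth.allow %s\n' "$joined"
}

# allowed_hosts (hypothetical) reads `gluster volume info` output on stdin
# and prints each address from the auth.allow line on its own line.
allowed_hosts() {
    sed -n 's/^auth\.allow: *//p' | tr ',' '\n'
}
```

After running the printed command, `gluster volume info glustervolume | allowed_hosts` should list exactly the addresses in NODE_IPS, one per line.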
We Are Done!
And that is it, we are done. We have created and secured a highly performant WordPress cluster using 3 nodes. We are using a GlusterFS distributed network storage system, and Percona XtraDB Cluster Database.
We have set one node for administration of the Web Application, with its system crontab administering WordPress Cron scheduled tasks, while the other two nodes are left to handle Site traffic. Our site traffic nodes are using Nginx FastCGI caching to further enhance performance and stability under heavy loads.
This architecture can be scaled horizontally with ease to serve the most demanding of enterprise sites. We could even deconstruct the cluster to have it running with a cluster of Nginx web servers being served by a cluster of dedicated GlusterFS node file servers, and dedicated Percona Cluster database servers. We could even add external object caching via a Redis server, and move search functionality to a dedicated Elasticsearch server. These are topics for another tutorial.