Cloud native Wordpress with Docker on Kubernetes

Perhaps it's good to start with the definition of cloud native:

> Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach. These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.
>
> https://github.com/cncf/foundation/blob/master/charter.md

So, is Wordpress Cloud native?

Fuck no.

I will explain this further later on, but to be honest, no PHP application or even framework is made in a cloud native way or with that mindset. There are always a few fundamental architectural design problems that make the entire cloud native way impossible out of the box. Sometimes, even with loads of work altering the core, you won't be able to run it the way you wanted to. This might suck, but if you analyze your requirements and needs, you pick the right tools. If you want high availability, high performance, and ease of CI/CD, perhaps you should have picked another framework / CMS or even language...

It's not a joke

I work and talk with a lot of talented people who have done amazing things in the DevOps / k8s world. Imagine me telling them that I want to use Wordpress. I guess the general rule of thumb is that PHP is considered a joke, especially if you mention Wordpress.

I get that, on the technical level: PHP with its loosely typed "scripts", and Wordpress, which can become a cluster fuck of bad code because of its hundreds of plugins. I'm not going to defend PHP here by saying that it has become a bit more mature. What I'm going to defend is that Wordpress as a CMS is pretty good.

If you manage to manage Wordpress (see what I did there), you can run it perfectly fine and give end-users/customers the rich environment they desire, especially if you have a decent development team that leaves the core of WP alone and creates really fine themes, aka the frontend.

I can ignore the rants and hate by providing something that is just good: fully automated CI/CD on a Kubernetes platform, with independent containers, running fully stateless with best practices, enabling developers to create awesome stuff and providing customers with a solid platform.

Making something cloud native - Team effort.

It is always important to know the internals of an application. Obviously, if you work in DevOps / infra, you mainly provide a "stack" or tools for developers to work with. Nonetheless, there should be some communication about how an application should run. One really important thing I've learned: developers often have little concept of the true internals of their application, and especially of what "internal X" means for "server setup Y". A developer might know that they need a shared folder for "resources" used by every node. They do not know how a shared folder performs, nor the issues that arise when you hammer like mad on a log file that gets written and read by 100 nodes every 0.1 seconds (spoiler: it sucks).

The truth is that many applications still require extensive work to become cloud native, or even come close to it. That work has to come from both sides, with development and DevOps talking to each other.

I have the "luck" (cries in php) to have knowledge of both sides, which makes it a little bit easier to just yolo-solo this myself.

Wordpress: why not cloud native?

As promised, I will explain the "fuck no" part. This is just a summary of the things we should "fix". The UI is fairly rich: it allows administrators of the Wordpress website to install and configure plugins, and to do the same for themes. The core of this lies in the **wp-content** folder. There are obviously more files & folders, but to give you a simple example:

  • www
    • index.php
    • wp-config.php
    • wp-content
      • plugins
        • yoast-seo
      • themes
        • my-awesome-theme
      • uploads
        • selfie.png

We can and should eliminate the plugins and themes folders as persistent, writable storage. If we don't, we have to find a way to share them across all our nodes, because in the end we want to be able to scale. A shared folder means we need a volume that is RWX (ReadWriteMany): a volume that can be read and written by multiple containers at once. That means something like NFS, GlusterFS, Ceph or whatever you prefer.

_A small introduction: I'm talking about running this on Kubernetes. So we have "nodes" which host pods. Pods consist of one or more containers, which in turn can mount volumes. If we want some high availability, we run multiple pods containing our application. Therefore these volumes have to be RWX if we want to share them between the pods._

However great those products are (or not), they are inferior to a simple local disk. Not only can it be hard to maintain a stable platform, but the performance can also be a killer. If our storage solution is down, our pods won't be able to mount their volumes and start. If the performance goes south, every pod serving your application will experience issues.

I'm not saying you should never use such a solution, but it really depends on the case. If you have a high throughput of many simultaneous reads and writes: please re-think your options. And if you can eliminate the need for it, you should (and in this case we can).

The next thing to know is that certain plugins create folders and/or files outside their own plugin folder. This is just painful. If we want to make something stateless, we really, really, really do not want random plugins to make random changes in our stateless application. The worst case is that you run the plugin locally, see which files/folders it creates, and commit those to your project. The best case is that you just delete that plugin and find something better.
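
To find out what a plugin scribbles outside its own folder, you can snapshot the file tree before and after activating it. A minimal sketch, assuming a local checkout in a folder called www; the helper names snapshot and new_files are made up for this example and are not part of Wordpress or wp-cli:

```shell
#!/bin/sh
# Hypothetical helpers to spot files a plugin creates at runtime.

# snapshot DIR: print every file and folder below DIR, sorted, relative to DIR.
snapshot() {
    (cd "$1" && find . -mindepth 1 | LC_ALL=C sort)
}

# new_files BEFORE AFTER: print entries that only exist in the AFTER snapshot.
new_files() {
    LC_ALL=C comm -13 "$1" "$2"
}

# Usage (www is an assumed folder name):
#   snapshot www > before.txt
#   ... activate and configure the plugin in the UI ...
#   snapshot www > after.txt
#   new_files before.txt after.txt
```

Anything in that list that lives outside wp-content/plugins/<the plugin> is state you either have to commit to the project or get rid of.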

The same goes for our wp-config.php which is basically our configuration file. Some plugins/actions may alter our configuration. We also really do not want this. When we "start" our application we want to "bootstrap" it, initialize it, with our values and configurations. After that, it should be just stateless and do its shit.

Lastly, we want some form of caching. Every Wordpress website needs caching. There are different plugins for this, some with over 1 million downloads. Nonetheless, they all have one thing in common: they write the cache to files, in most cases to wp-content/cache. On top of that, they write config files to the wp-content root. Just to repeat: we don't want this.

Another obstacle to running Wordpress in a sensible way is the official Docker image: https://hub.docker.com/_/wordpress/ This is fine for development but seriously uncool for production. We want to know and define our state, so that we can use our container for dev, acceptance, production or on the moon, and its state will be 100% identical. The official image always downloads Wordpress in its Dockerfile, and its entrypoint script then decides what to do with it:

echo >&2 "WordPress not found in $PWD - copying now..."
if [ -n "$(ls -A)" ]; then
	echo >&2 "WARNING: $PWD is not empty! (copying anyhow)"
fi

It also has VOLUME /var/www/html, which basically says "Yolo, make everything a volume for my application". Most guides tell you to mount a volume for the entire wp-content folder. Some even do it for the entire /html folder.

This means Wordpress can live entirely on your persistent storage and the only real way to manage it is via ssh or the UI. In short, it means that the container is just a simple "fire and forget" method to get Wordpress running.

I hope you now understand why you do not want to run this in production.

How to fix it

We have to change the way we work and develop. With just a few basic rules we can fix a few issues.

  • We do not persist plugins or themes
  • Plugins and themes are provided by our project and run inside our container
    • One option is to run WP locally, download the plugin/theme and commit it to your project repository. Updating works the same way.
    • The other option is to use composer: https://wpackagist.org/ Simply define the plugins + versions and build your application when deploying.
  • We manage our wp-config.php, not the application
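
For the composer route, the manifest could look roughly like the output of the helper below. This is a sketch under assumptions: the function name composer_manifest, the project name example/wordpress-site and the app/plugins and app/themes target paths are made up for illustration, and the plugin version is whatever you pass in.

```shell
#!/bin/sh
# composer_manifest PLUGIN_SLUG VERSION
# Hypothetical helper: prints a composer.json that pulls one plugin from
# wpackagist.org and installs it into app/plugins (an assumed path).
composer_manifest() {
    cat <<EOF
{
    "name": "example/wordpress-site",
    "repositories": [
        { "type": "composer", "url": "https://wpackagist.org" }
    ],
    "require": {
        "composer/installers": "^1.0 || ^2.0",
        "wpackagist-plugin/$1": "$2"
    },
    "extra": {
        "installer-paths": {
            "app/plugins/{\$name}/": ["type:wordpress-plugin"],
            "app/themes/{\$name}/": ["type:wordpress-theme"]
        }
    },
    "config": {
        "allow-plugins": { "composer/installers": true }
    }
}
EOF
}

# Usage, with <version> being the plugin version you want to pin:
#   composer_manifest amazon-s3-and-cloudfront "<version>" > composer.json
#   composer install --no-dev
```

Pinning an exact version per plugin is what keeps the build reproducible; a wildcard would bring back the "who knows what is running" problem.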

For it to work we also have to make a docker image which can handle this. In short, it would be something like:

COPY app/plugins wp-content/plugins/
COPY app/themes  wp-content/themes/

It does not matter if the "app/plugins" is coming straight from raw files in your repository or from a build process with composer.

This leaves us with our uploads. We need something to persist our images. This is something we really require, and there is a neat solution: buckets. https://wordpress.org/plugins/amazon-s3-and-cloudfront/

With this plugin, we are able to offload our images to a few major cloud providers. It changes the URL of our assets to the bucket's URL. If we add a CDN, we can serve our assets from a subdomain of our website and have blazing fast resources, without the need to persist AND share the files across all our nodes.

For the cache we won't be using a plugin. We are going for Varnish.

> Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents.
>
> [https://varnish-cache.org/intro/index.html#intro](https://varnish-cache.org/intro/index.html#intro)

Varnish is just the bomb for caching. Our k8s setup will look like this:
Loadbalancer -> Ingress -> Varnish -> Application

We need to define the backend(s) for Varnish (our application). We also need to "whitelist" our backends so they are able to purge the cache (or parts of it). Without going into too many technical details: we can use our app-selector name for the backend, but not for the whitelist. For the whitelist we have to automatically update our VCL with these hosts every time a new pod appears. Good Varnish images that can do this out of the box are lacking. I will eventually make something public to address that.
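
Keeping the purge whitelist up to date can be scripted. A rough sketch, assuming the pods carry a label app=wordpress and that the main VCL includes the generated file; render_purge_acl and the path /etc/varnish/purge_acl.vcl are made-up names for this example, not something Varnish ships:

```shell
#!/bin/sh
# render_purge_acl: read one pod IP per line on stdin and print a VCL acl
# block named "purge". Empty lines are skipped.
render_purge_acl() {
    echo 'acl purge {'
    while IFS= read -r ip; do
        if [ -n "$ip" ]; then
            printf '    "%s";\n' "$ip"
        fi
    done
    echo '}'
}

# Usage (label and output path are assumptions):
#   kubectl get pods -l app=wordpress \
#     -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' \
#     | render_purge_acl > /etc/varnish/purge_acl.vcl
```

Varnish still has to load the new VCL before the list takes effect (for example with varnishadm vcl.load followed by vcl.use), and something has to trigger this every time a pod comes or goes.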

The Dockerfile

This is merely an example. If you are really going to run this in production, you should make your own. Also, do make your own nginx/php image. I will eventually provide mine as well; for now I used the richarvey/nginx-php-fpm image. It contains too many "things" for my taste, which is another reason to create your own ;)

# "Init" stage that prepares the Wordpress core
FROM alpine:latest AS downloader

ENV WORDPRESS_VERSION=5.1.1
ENV WORDPRESS_SHA1=f1bff89cc360bf5ef7086594e8a9b68b4cbf2192

# Download Wordpress, verify its checksum and unpack it to /tmp for further processing
RUN set -ex; \
    apk add --no-cache curl; \
    curl -o wordpress.tar.gz -fSL "https://wordpress.org/wordpress-${WORDPRESS_VERSION}.tar.gz"; \
    echo "$WORDPRESS_SHA1 *wordpress.tar.gz" | sha1sum -c -; \
    tar -xzf wordpress.tar.gz -C /tmp

WORKDIR /tmp/wordpress/

# Remove all default plugins and themes that are shipped with Wordpress, plus some other non-core files.
# Only the sub-directories are removed, so the default index.php in the themes and plugins dirs stays in place for security.
RUN set -ex; \
    find ./wp-content/themes/ -maxdepth 1 -mindepth 1 -type d -exec rm -r {} \;; \
    find ./wp-content/plugins/ -maxdepth 1 -mindepth 1 -type d -exec rm -r {} \;; \
    rm wp-config-sample.php; \
    rm wp-content/plugins/hello.php; \
    rm readme.html

# The actual image
FROM richarvey/nginx-php-fpm:1.6.8

# Basic requirements
RUN apk update && apk upgrade && \
    apk add --no-cache \
    less

# Set the workdir and copy in the Wordpress core, our config, plugins and themes
WORKDIR /var/www/html/

COPY --from=downloader /tmp/wordpress/ /var/www/html/
COPY wp-config.conf wp-config.php

COPY app/.htaccess .htaccess
COPY app/plugins wp-content/plugins/
COPY app/themes  wp-content/themes/

RUN chown -R www-data:www-data .

COPY conf/nginx-site.conf /etc/nginx/sites-available/default.conf

COPY docker-entrypoint.sh /usr/local/bin/

# The entrypoint fills in wp-config.php, then hands over to the base image's start script
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["/start.sh"]

What I'm trying to achieve with this multi-stage build is to download Wordpress and only place the essentials in the actual image. We don't want the default theme in our image for instance.

The wp-config.php

<?php

// Make this dynamic
define( 'AS3CF_SETTINGS', serialize( array(
    'provider' => 'gcp',
    'key-file-path' => '/etc/bucket/keys.json',
) ) );

// Every {PLACEHOLDER} below is replaced by docker-entrypoint when the container starts
define( 'DB_NAME', '{DB_NAME}' );

define( 'DB_USER', '{DB_USER}' );

define( 'DB_PASSWORD', '{DB_PASSWORD}' );

define( 'DB_HOST', '{DB_HOST}' );

define( 'DB_CHARSET', '{DB_CHARSET}' );

define( 'DB_COLLATE', '{DB_COLLATE}' );

define( 'AUTH_KEY',         '{AUTH_KEY}' );
define( 'SECURE_AUTH_KEY',  '{SECURE_AUTH_KEY}' );
define( 'LOGGED_IN_KEY',    '{LOGGED_IN_KEY}' );
define( 'NONCE_KEY',        '{NONCE_KEY}' );
define( 'AUTH_SALT',        '{AUTH_SALT}' );
define( 'SECURE_AUTH_SALT', '{SECURE_AUTH_SALT}' );
define( 'LOGGED_IN_SALT',   '{LOGGED_IN_SALT}' );
define( 'NONCE_SALT',       '{NONCE_SALT}' );

$table_prefix = '{table_prefix}';

define( 'WP_DEBUG', {WP_DEBUG} );

define( 'WP_AUTO_UPDATE_CORE', {WP_AUTO_UPDATE_CORE} );

// TLS is terminated in front of the pod, so trust X-Forwarded-Proto when it is present
if ( isset( $_SERVER['HTTP_X_FORWARDED_PROTO'] ) && strpos( $_SERVER['HTTP_X_FORWARDED_PROTO'], 'https' ) !== false ) {
    $_SERVER['HTTPS'] = 'on';
}

if ( ! defined( 'ABSPATH' ) ) {
    define( 'ABSPATH', dirname( __FILE__ ) . '/' );
}

require_once( ABSPATH . 'wp-settings.php' );

Nothing special, but we want to initialise our config before runtime and keep it like this. All the values are set via ENV vars (and secrets) by our docker-entrypoint.

docker-entrypoint

#!/bin/bash

# "KEY::default" pairs. An environment variable named KEY overrides the default.
array=(
    'WP_CACHE::false'
    'WPCACHEHOME::'
    'DB_NAME::dbname'
    'DB_USER::dbuser'
    'DB_PASSWORD::dbpass'
    'DB_HOST::localhost'
    'DB_CHARSET::utf8'
    'DB_COLLATE::'
    'AUTH_KEY::put your unique phrase here'
    'SECURE_AUTH_KEY::put your unique phrase here'
    'LOGGED_IN_KEY::put your unique phrase here'
    'NONCE_KEY::put your unique phrase here'
    'AUTH_SALT::put your unique phrase here'
    'SECURE_AUTH_SALT::put your unique phrase here'
    'LOGGED_IN_SALT::put your unique phrase here'
    'NONCE_SALT::put your unique phrase here'
    'table_prefix::wp_'
    'WP_DEBUG::false'
    'WP_AUTO_UPDATE_CORE::false'
)

# Escape / and & so a value can be used as sed replacement text
sed_escape() {
    printf '%s\n' "$*" | sed -e 's/[\/&]/\\&/g'
}

for index in "${array[@]}" ; do
    KEY="${index%%::*}"
    VALUE="${index#*::}"

    # Fall back to the default when the variable is unset or empty.
    # The quotes keep defaults that contain spaces in one piece.
    if [ -z "${!KEY}" ]
    then
        declare "${KEY}=${VALUE}"
    fi

    # Replace every {KEY} placeholder in wp-config.php
    sed -i -e 's/{'"$KEY"'}/'"$(sed_escape "${!KEY}")"'/g' wp-config.php

done

# Fetch wp-cli, apply pending database upgrades as www-data, then clean up.
# The absolute path is used because "su -" starts in the home directory of www-data.
cd /var/www && curl -fsSLO https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar
su - www-data -s /bin/bash -c 'php /var/www/wp-cli.phar core update-db --path=/var/www/html/'
rm wp-cli.phar

exec "$@"

We fill in our values for the wp-config but we also do something special with "wp-cli".

wp-cli is a tool that can do a lot of things, but only one matters at this stage: checking for DB changes. If we update our image by bumping the Wordpress core version, we might need DB changes. With this "hack" we can apply them.
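
The placeholder mechanism of the entrypoint is easier to follow in isolation. The sketch below is a stripped-down illustration, not a replacement for the script: resolve_value and fill_placeholder are made-up helper names, and it splits on the first "::" so that a default containing "::" survives.

```shell
#!/usr/bin/env bash
# resolve_value 'KEY::default': print the value of the environment variable
# KEY, or the default when KEY is unset or empty.
resolve_value() {
    local key="${1%%::*}"       # text before the first "::"
    local default="${1#*::}"    # text after the first "::"
    printf '%s' "${!key:-$default}"
}

# fill_placeholder KEY VALUE: replace every {KEY} read from stdin with VALUE.
fill_placeholder() {
    local key="$1" value="$2" escaped
    # Escape / and & so sed treats the value as literal replacement text.
    escaped="$(printf '%s\n' "$value" | sed -e 's/[\/&]/\\&/g')"
    sed -e "s/{${key}}/${escaped}/g"
}

# Usage:
#   value="$(resolve_value 'DB_HOST::localhost')"
#   fill_placeholder DB_HOST "$value" < wp-config.php
```

Because the values come from the environment, the same image can be pointed at a different database per environment purely through the pod spec and its secrets.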

The result

It does not seem that hard to make a few changes here and there. The truth is that it's much harder to keep it clean. The changes required for "this" to work mean you might have to alter your development flows, and perhaps even change the way you or the customer use the website. Truth be told, whether you want it "cloud native" or on a single VPS, you are going to need structure and decent guidelines in order to keep your Wordpress website "sane".

"My way" enforces statelessness. You can still install a plugin via the UI, but on a pod re-creation it is gone. This also means that IF you get hacked, the PHP shells left behind do not persist. Obviously, you still have to fix your code, but a restart of the pod will fix your website (temporarily).

I can scale my Wordpress pods without issues. Every "part" of my application is scalable and independent of the others. I can scale whatever I like, whenever I want. Deployments are easy and happen with zero downtime.

I know which version of Wordpress and of each plugin I'm running. I also know that my acceptance environment is 100% identical to production. Providing the image gives developers a setup within seconds, which is again 100% identical to production.

I have pushed some code to GitHub: https://github.com/wiardvanrij/wordpress-kubernetes . I intentionally did not push an easy setup including every part of k8s. I believe you should at least think about your own application and create the right tools and images, using best practices and understanding containers and persistence.

The future

I truly believe we have to re-think the way most PHP applications are designed. The core issue I notice is that too many things are generated at runtime. This happens with Symfony (a framework) but also with Magento (an e-commerce system) and many others. Add the fact that most "systems" are non-modular, and you end up with a monster of a backend. And because of, let's say, this randomness at runtime, it becomes challenging to provide a stable hosting infrastructure, cloud native style.

There is a shift towards progressive web applications, yet that still does not solve having a huge clump of a backend. We merely separated the frontend from the backend, making it possible to scale each part. Nonetheless, it can still be challenging to scale that backend.

Let's just pick one part: images. It would be so nice if image management were provided by a component that could run on its own: just a simple REST-based application with various options for storage backends. The same goes for caching, especially in Magento, which does a lot with different caches. Why not make a component that can generate and manage those caches on its own, again with various backends (Redis for example), and again able to run stand-alone?

Eventually, this may solve specific bottlenecks that I see a lot in various production environments: cronjobs processing data and putting the entire application on hold, because the cron has to run on the "main application". If this were separated, we could manage the cron on its own, reducing its effect on the rest of the application, while still being able to scale it indefinitely if we need that power.

If we stop making applications with such monolithic architecture, we will provide so much more power and stability for the future.
