Deploy a global, private CDN on your lunch break

An overview of Edgemesh Supernodes

Edgemesh gets the vast majority of its network throughput from browser-based clients, which diversifies the content distribution network while dramatically reducing page load times and the customer’s server load. This organic visitor mesh is fantastic and has the beautiful property of reverse scaling: the more visitors who hit the website, the larger the mesh capacity! Users get faster page loads transparently while site operators get increased network capacity. It’s a win-win.

But early in our development, a major customer asked us for a server-side version of the Edgemesh client. This server-side version would allow the customer to utilize their existing global infrastructure while simultaneously providing an always-online peer to accelerate their mesh content. Essentially, they were looking for a way to implement their own CDN services without the complexity of changing their current infrastructure.

And so, the Supernode was born.

Supernodes: self-bootstrapping caches in a box

To get started with a Supernode, you will need to have an active domain registered with Edgemesh. If you haven’t done so yet, go ahead and sign up and add Edgemesh to your site (it only takes 5 minutes 😊 ).

Once you’re good to go, have a look at the map overview. Here we can see your site visitors as they browse your new Edgemesh enhanced website and your mesh beginning to grow.

Map overview with site visitors (in purple clusters)

Under the covers, visitors are accessing the website and downloading the content (if it isn’t in their Edgemesh cache yet). This organic crawling of the website tells Edgemesh which assets (images, etc.) are the most requested and therefore the most important to replicate across the mesh network. Below is an example with debug enabled in the Chrome console showing this workflow.

You can see this for yourself by visiting the Edgemesh homepage and in your Chrome console type:

User visits page and downloads assets, then stores these for other peers to obtain
As the user stays online, it begins to replicate in other assets from other peers, effectively pre-caching the content

This workflow, where visitors crawl your site organically and register new assets on the mesh, is important for our Supernodes.

Unlike the client side services, Supernodes do not crawl a site. Instead Supernodes listen for any new asset registration and attempt to replicate that asset in from a browser peer as quickly as possible. Since Supernodes have a larger storage capacity (however much disk you allow) when compared to the browser based peers, they attempt to keep as many assets as possible in their cache.

Supernodes consolidate assets to larger cache points

Getting started with a Supernode

Getting going with a Supernode is as simple as running a docker instance. Go ahead and pull down the Supernode image (it’s rather large) and get yourself a shell.

docker run -it bash

Make sure you have a browser open to your Edgemesh enabled site; this will allow the Supernode to quickly discover a peer and begin to replicate in assets. To watch the magic happen, we’re going to run this Supernode in DEBUG mode. With your shell open just do:

DEBUG=supernode* npm run docker-start && forever logs 0 -f

If everything worked as designed, you should see something like the image below:

Supernode coming online

We should also see a Supernode in the portal if we look at the map overview.

Understanding your Supernode cache

When a Supernode comes online it receives instructions on what it should replicate in and from whom (peer ID) along with a checksum for each and every asset. This checksum is used to validate the integrity of the asset. The debug console shows these 4 components right after startup, as shown below.

The assets themselves are stored in the data directory. We can ctrl+c to stop the Supernode and have a look ourselves.

It is important to note that simply placing an asset in the data directory will not make that asset available for the mesh. Supernodes can only populate their caches via another peer (either a browser peer or another Supernode). The knowledge of what assets reside in which Supernodes sits on the Edgemesh signal server backplane.
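The integrity check described above can be illustrated with standard tools. This is only a minimal sketch, not Edgemesh's implementation: the article doesn't say which hash the Supernode uses, so SHA-256 and the helper name `verify_asset` are assumptions made for illustration.

```shell
# Hypothetical helper: succeed only if the file's SHA-256 digest matches the
# expected value. SHA-256 is an assumption; the Supernode's actual checksum
# algorithm is not stated in this article.
verify_asset() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  -" when reading stdin; keep only the digest.
  actual="$(sha256sum < "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}
```

Something like `verify_asset data/some-asset "$digest" && echo ok` would then accept the asset only when the bytes on disk match what the peer advertised (both the path and `$digest` are placeholders).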

The index.json file has some data on what each asset is. You can install the wonderful jq utility and have a look for yourself.
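The layout of index.json isn't described in this article, so a cautious first look is to use jq filters that work on any JSON document rather than guessing at field names. The sketch below does that; the helper name `summarize_index` is made up for this example.

```shell
# Print the top-level JSON type of a file and how many entries it holds.
# Works for an object or an array; no Edgemesh-specific fields are assumed.
summarize_index() {
  jq -r '"\(type) with \(length) top-level entries"' "$1"
}
```

Assuming the file sits in the data directory, `summarize_index data/index.json` gives a one-line summary, and `jq . data/index.json` pretty-prints the whole thing.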

Now that our Supernode is online and has some assets, browser peers will automatically discover it and use it as an always online high speed alternative peer. Unlike the browser client, Supernodes can rapidly service hundreds of browser peers simultaneously — allowing you to deploy high capacity caches wherever you can run docker. Best of all there are zero configuration changes required to your infrastructure. No DNS entries, no load balancer — Supernodes come online and discover your mesh and add capacity automagically.

Rolling a global mesh with an epic one-liner

Now that we understand the basics, let’s roll out a global CDN cache using a single line of bash. For this example we’re going to roll out our Supernode network across Google Cloud.

Google’s network is one of the best in the world, and includes access to a number of low latency transport (private) backbones to help our Supernodes rapidly replicate content across the globe. Of course at the major internet exchanges such as Equinix, this network meets with peering partners to deliver a truly massive footprint. To get a feeling of the scale have a look at Google (ASN 15169) on Caida.

I’m assuming you have created an account on Google Cloud and created a project. For a step-by-step guide check out this, and go ahead and install the gcloud command line utilities.

We can start small and deploy a single Supernode in a single Google datacenter. I’ve shown the script below with some comments:

# Edgemesh Supernode deployment script for Google Cloud
# This script assumes you have gcloud installed and configured
# <install>
# <configure> gcloud auth login
# This script also assumes you have a
# GCE project named `edgemesh-supernode`
# project name
EM_GCE_PROJ=edgemesh-supernode
# To create a single Supernode in us-west1-b:
#   --driver google          use the GCE driver
#   --google-project         pass our project name
#   --google-zone            which datacenter to run in
#   --google-machine-type    CPU is not that relevant
#   --google-preemptible     why not! Supernodes can come on and offline
#   --google-machine-image   expects an image URL (missing from this listing;
#                            supply one or drop the flag)
#   em-sn-west1b             what to name this host
docker-machine create --driver google \
--google-project "$EM_GCE_PROJ" \
--google-zone us-west1-b \
--google-machine-type f1-micro \
--google-preemptible \
--google-machine-image \
em-sn-west1b

That will create a docker host, and will kick off something like the below:

Docker-machine on GCE

Once the host is available, we can run our Supernode there: eval that machine’s environment so it becomes our docker target, then use our docker run command.

eval "$(docker-machine env em-sn-west1b)" \
&& docker run
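If the eval step looks opaque: `docker-machine env` only prints export statements (such as DOCKER_HOST and DOCKER_MACHINE_NAME), and eval runs them in the current shell so that later docker commands talk to the remote host. The sketch below shows the mechanism with a stand-in function, so nothing real is contacted. `fake_machine_env` is hypothetical and the address is a documentation placeholder.

```shell
# Stand-in for `docker-machine env <name>`: it prints export statements
# instead of setting variables itself. The address is a placeholder,
# not a real host.
fake_machine_env() {
  printf 'export DOCKER_HOST="tcp://203.0.113.10:2376"\n'
  printf 'export DOCKER_MACHINE_NAME="%s"\n' "$1"
}

# eval executes the printed exports in the current shell, so the variables
# are visible to every command that follows.
eval "$(fake_machine_env em-sn-west1b)"
echo "docker now targets: $DOCKER_MACHINE_NAME ($DOCKER_HOST)"
```

This is also why the eval has to happen in the same shell session as the docker run that follows it.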

Assuming everything worked, we can see our new Supernode on the map.

To remove this instance (and stop billing) we can do:

docker-machine rm em-sn-west1b -y

Turning it up to 11

With the trial run complete, let’s now roll out a Supernode in every Google zone whose name ends in -a and call it a day. To do this we’re going to use the following one-liner:

gcloud compute zones list --filter NAME:*-a|awk '{print $1}'|sed 1d | while read -r dc; do docker-machine create --driver google --google-project edgemesh-supernode --google-zone $dc --google-machine-type f1-micro --google-preemptible --google-machine-image em-sn-$dc; eval "$(docker-machine env em-sn-$dc)"; docker run -d; done

Now let’s break this down:

# step 1: get a list of every GCE zone that ends in *-a,
# pipe that to beautiful awk to keep the first column (the zone name)
# and then pipe that to sed to drop the header row
gcloud compute zones list --filter NAME:*-a|awk '{print $1}'|sed 1d
# output: one zone name per line
# step 2: set up a while loop and read in each row,
# making a variable "dc". Then do a docker-machine create
# in each datacenter, name our new host,
# then eval to it and start our supernode instance

| while read -r dc; do docker-machine create --driver google --google-project edgemesh-supernode --google-zone $dc --google-machine-type f1-micro --google-preemptible --google-machine-image em-sn-$dc; eval "$(docker-machine env em-sn-$dc)"; docker run -d; done
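To try the text processing from step 1 without touching Google Cloud, you can feed the same awk and sed stages a small sample table and print what the loop would do. This is a dry-run sketch: the sample table is illustrative and trimmed, and `zone_names` is a helper name invented here.

```shell
# Same text processing as step 1: keep column 1, then drop the header row.
zone_names() {
  awk '{print $1}' | sed 1d
}

# Illustrative sample of the table gcloud prints (trimmed; real output has
# more columns and rows).
sample='NAME          REGION      STATUS
us-west1-a    us-west1    UP
asia-east1-a  asia-east1  UP'

printf '%s\n' "$sample" | zone_names | while read -r dc; do
  # Dry run: print the host name each zone would get instead of creating it.
  echo "would create host em-sn-$dc in zone $dc"
done
```

Swapping the sample for the real `gcloud compute zones list` call gives a preview of every host name before anything is billed.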

From my house this global rollout took 48 minutes and 39 seconds. I also added a few Azure based nodes for good measure. You can of course run the host creation in parallel if that is too long for you.
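One way to parallelize host creation is to start each docker-machine create as a background job and wait for all of them. The sketch below is not something from the original run. The function names are made up, and the create call leaves out --google-machine-image so docker-machine uses its default image.

```shell
# Read one zone name per line on stdin and run the given command once per
# zone, all in the background; return only after every job has finished.
for_each_zone_parallel() {
  while read -r dc; do
    # stdin is detached so a job cannot swallow the remaining zone names.
    "$@" "$dc" < /dev/null &
  done
  wait
}

# Host creation for one zone, using the flags from the one-liner above.
# --google-machine-image is left out here (an assumption of this sketch),
# so docker-machine falls back to its default image.
create_host() {
  docker-machine create --driver google \
    --google-project edgemesh-supernode \
    --google-zone "$1" \
    --google-machine-type f1-micro \
    --google-preemptible \
    "em-sn-$1"
}

# Usage (not executed here):
# gcloud compute zones list --filter "NAME:*-a" | awk '{print $1}' | sed 1d \
#   | for_each_zone_parallel create_host
```

Only host creation runs in parallel here. Starting the Supernode container on each host still needs the eval and docker run steps per machine, since eval changes the docker target of the shell it runs in.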

Once we’re done we can check our map and see our new Supernode network. Not too bad for under an hour!

There are more configuration options available for Supernodes; check out the docs for details on restricting origins, disk size and more.

Also, to clean up the instances (and stop the billing), do:

docker-machine ls --filter DRIVER=google |awk '{print $1}'|sed 1d | while read -r dc; do docker-machine rm $dc -y; done

Until next time!

Like this article? Let us know @Edgemeshinc on Twitter