Mastodon the hard way

By Richard Crowley

Like a great many Twitter-addicted computer professionals, I’ve been giving Mastodon a much more serious look this week than I have in years past. I can’t quite put my finger on why. Anyway.

Inspired in somewhat equal parts by the lesson we’re all learning (again) with Twitter, by Simon, Jacob, Mastodon itself, and others paving the way, and by the magic productivity hack of procrastinating on something else, I set out to make @rcrowley@rcrowley.org my Mastodon handle.

I could absolutely have gone to Masto.Host. That would’ve been faster and possibly cheaper. I may still move there one day. The domain’s mine. I can do whatever I want. For now, I want to masochistically host open-source software on the Internet by myself.

My work these days has me preaching the virtues of using lots of AWS accounts, so I began by using Substrate to open an AWS account to host my Mastodon server. At the outset, my intention was to use the smallest EC2 instance I could to host Mastodon and its (ephemeral) Redis server, Aurora to host the (durable) Postgres database, and configuration at https://rcrowley.org/.well-known/webfinger to redirect to the Mastodon server. Let’s see how that worked out.

Aurora Postgres

Writing the Terraform to provision an Aurora Postgres cluster was straightforward and boring — almost a straight copy-paste from the AWS Terraform provider docs. I chose the Aurora Serverless v2 variant, which I anticipate will be far more expensive than something called “serverless” should be. I fully expect to have to change this but, for now, I’m enjoying the experiment.
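For reference, that cluster has roughly this shape in Terraform. This is a sketch, not my exact configuration; the identifiers, the password handling, and the scaling bounds here are illustrative:

```terraform
# Aurora Serverless v2 uses the "provisioned" engine mode; the serverless
# part lives in the scaling configuration and the instance class.
resource "aws_rds_cluster" "mastodon" {
  cluster_identifier = "mastodon"
  engine             = "aurora-postgresql"
  engine_mode        = "provisioned"
  database_name      = "mastodon"
  master_username    = "mastodon"
  master_password    = var.db_password # illustrative; manage this secret properly

  serverlessv2_scaling_configuration {
    min_capacity = 0.5
    max_capacity = 1
  }
}

resource "aws_rds_cluster_instance" "mastodon" {
  cluster_identifier = aws_rds_cluster.mastodon.id
  engine             = aws_rds_cluster.mastodon.engine
  instance_class     = "db.serverless"
}
```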

Mastodon, etc.

Mastodon’s install-from-source documentation is excellent. I made it harder on myself by insisting on Amazon Linux 2, which I did because I’ve standardized on it everywhere else in AWS. Here’s a (large) excerpt from the program I provide in the EC2 instance’s user-data to handle the majority of the installation, complete with the RedHat-style package names translated from Mastodon’s Debian-style documentation:

# Node.js and Yarn.
if ! which "node"
then curl -L -s "https://rpm.nodesource.com/setup_16.x" | bash -
fi
yum -y install "nodejs"
corepack enable
yarn set version "stable"

# Postgres (client, not server, since we're using Aurora).
amazon-linux-extras enable "postgresql14"
yum -y install "postgresql"

# C dependencies, compilers, etc.
amazon-linux-extras enable "epel"
yum -y install "epel-release"
yum -y install "autoconf" "bison" "gcc" "gcc-c++" "make" "pkgconfig"
yum -y install "ImageMagick"
# TODO ffmpeg
yum -y install "git"
yum -y install \
    "gdbm-devel" \
    "jemalloc-devel" \
    "libffi-devel" \
    "libicu-devel" \
    "libidn-devel" \
    "libxml2-devel" \
    "libxslt-devel" \
    "libyaml-devel" \
    "ncurses-devel" \
    "openssl-devel" \
    "postgresql-devel" \
    "protobuf-compiler" \
    "protobuf-devel" \
    "readline-devel" \
    "zlib-devel"

# Redis.
amazon-linux-extras enable "redis6"
yum -y install "redis"
systemctl enable --now "redis" || :

# Ruby via rbenv.
if [ ! -d "/usr/local/rbenv" ]
then git clone "https://github.com/rbenv/rbenv.git" "/usr/local/rbenv"
fi
if [ ! -d "/usr/local/rbenv/plugins/ruby-build" ]
then git clone "https://github.com/rbenv/ruby-build.git" "/usr/local/rbenv/plugins/ruby-build"
fi
cat >"/etc/profile.d/rbenv.sh" <<'EOF'
export RBENV_ROOT="/usr/local/rbenv"
export PATH="$RBENV_ROOT/bin:$RBENV_ROOT/shims:$PATH"
EOF
. "/etc/profile.d/rbenv.sh"
RUBY_CONFIGURE_OPTS="--with-jemalloc" rbenv install -s "3.0.3"
rbenv global "3.0.3"
gem install bundler --no-document

# Mastodon itself.
useradd -U "mastodon" || :
chmod +rx "/home/mastodon"
if [ ! -d "/home/mastodon/live" ]
then su -c'git clone "https://github.com/tootsuite/mastodon.git" "/home/mastodon/live"' "mastodon"
fi
(
    cd "/home/mastodon/live"
    su -c 'git checkout "$(git tag -l | grep -v "rc[0-9]*$" | sort -V | tail -n"1")"' "mastodon"
    su -c'bundle config deployment "true"' "mastodon"
    su -c'bundle config set force_ruby_platform "true"' "mastodon"
    su -c'bundle config without "development test"' "mastodon"
    su -c'bundle install -j"$(getconf "_NPROCESSORS_ONLN")"' "mastodon"
    su -c'sed -i "s/emoji-mart-lazyload\"/emoji-mart-lazyload@^3.0.1-j\"/" "package.json"' "mastodon" # <https://github.com/mastodon/mastodon/issues/19755>
    su -c'yarn install' "mastodon"
    cat >".env.production" <<EOF
# ... snip ...
DB_HOST="$(
    aws rds describe-db-clusters --db-cluster-identifier "mastodon" |
    jq -r ".DBClusters[].Endpoint"
)"
LOCAL_DOMAIN="rcrowley.org"
# ... snip ...
REDIS_HOST="127.0.0.1"
REDIS_PASSWORD=""
# ... snip ...
SINGLE_USER_MODE="true"
# ... snip ...
WEB_DOMAIN="mastodon.rcrowley.org"
# ... snip ...
EOF
    chown "mastodon:mastodon" ".env.production"
    su -c'RAILS_ENV="production" bundle exec rails assets:precompile' "mastodon"
)
find "/home/mastodon/live/dist" -name "*.service" |
xargs -I"_" cp "_" "/etc/systemd/system"
find "/etc/systemd/system" -name "mastodon-*.service" |
xargs sed -i "s/\/home\/mastodon\/.rbenv/\/usr\/local\/rbenv/"
systemctl daemon-reload
systemctl enable --now "mastodon-sidekiq" "mastodon-streaming" "mastodon-web" || :

# Nginx with a certificate from Let's Encrypt.
amazon-linux-extras enable "nginx1"
yum -y install "certbot" "nginx" "python2-certbot-nginx"
cp "/home/mastodon/live/dist/nginx.conf" "/etc/nginx/conf.d/mastodon.conf"
sed -i "s/example.com/mastodon.rcrowley.org/" "/etc/nginx/conf.d/mastodon.conf"
sed -i "s/# ssl_/ssl_/" "/etc/nginx/conf.d/mastodon.conf"
systemctl enable --now "nginx" || :
certbot -d"mastodon.rcrowley.org" -n --nginx --no-redirect
systemctl restart "nginx"

A diligent reader will notice that I didn’t bother to find RPMs for ffmpeg. I will live without being able to upload video. More importantly, I found that I had to patch package.json with a version number for emoji-mart-lazyload. Finally, I had to grab Redis from Amazon Linux Extras because the upstream version is too old for Sidekiq.

I’ve been having very good experiences with Linux on ARM. My Mastodon server began as a t4g.nano but got replaced over and over until I ended up on a t4g.medium to accommodate the memory demands of the asset compilation step. I’d like to get this back down to a smaller instance, which means I’ll soon separate building Mastodon (on a bigger instance) from running Mastodon (on a smaller instance).

The futile quest to use AWS Certificate Manager

I use AWS Certificate Manager in all sorts of places so I had aspirations to use it instead of Let’s Encrypt for my Mastodon server, too. The thing about ACM, though, is that you can never touch the private key. You use it by configuring CloudFront, ALB, API Gateway, or an EC2 Nitro Enclave and ACM handles the rest.

EC2 Nitro Enclaves

My first choice was to use an EC2 Nitro Enclave to make the private key available to Nginx without it ever sitting on the EC2 instance’s filesystem. However, Nitro Enclaves are only available on EC2 instances with more than 4 vCPUs, the cheapest of which is around $75 per month. So EC2 Nitro Enclaves are out.

API Gateway

My second choice was to use API Gateway to front my EC2 instance. This was actually going pretty well and I’m very impressed with API Gateway v2. Automatic deployment and the $default route, in particular, make it really easy to use.

However, you can’t just give API Gateway an EC2 instance as its backend/upstream. Instead, I had to register the EC2 instance with Cloud Map and configure API Gateway to discover it via Cloud Map. Fine, that works.

Unfortunately, even once that yak was shaved, I had trouble convincing Mastodon that it was being accessed via HTTPS because the hop from API Gateway to the EC2 instance was plaintext HTTP. I also had trouble convincing API Gateway to access the Mastodon server via HTTPS without a valid certificate. That is to say: Unlike ALB, API Gateway does actually verify upstream certificates. I tried reintroducing Let’s Encrypt to satisfy API Gateway but found it difficult to route the plaintext ACME challenge protocol.

Giving up

Removing API Gateway and wiring up Let’s Encrypt required the Mastodon server to be in a public subnet and directly accessible on the Internet. And that really didn’t motivate me to try adding CloudFront. Perhaps another time.

Removing API Gateway gave me the opportunity to remove Cloud Map, too, which I took. It seems like a fine service that I just don’t need in this situation. So I changed the EC2 instance’s bootstrapping to create A and AAAA DNS records instead of registering with Cloud Map.
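For what it’s worth, that record-creating bootstrap step looks roughly like this. The values below are placeholders: on the real instance the IP address comes from instance metadata and the hosted zone ID from elsewhere in my Terraform.

```shell
# Placeholder values; substitute real ones on the instance.
DOMAIN="mastodon.rcrowley.org"
IPV4="192.0.2.10"                   # placeholder (documentation range)
HOSTED_ZONE_ID="Z0123456789EXAMPLE" # placeholder

# Build the Route 53 change batch. An AAAA record works the same way with
# the instance's IPv6 address.
cat >"change-batch.json" <<EOF
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "$DOMAIN",
      "Type": "A",
      "TTL": 60,
      "ResourceRecords": [{"Value": "$IPV4"}]
    }
  }]
}
EOF

# On the real instance, this submits the change:
# aws route53 change-resource-record-sets \
#     --hosted-zone-id "$HOSTED_ZONE_ID" \
#     --change-batch "file://change-batch.json"
```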

Webfinger redirects

I want my fully-qualified Mastodon handle to be @rcrowley@rcrowley.org, not @rcrowley@mastodon.rcrowley.org. The easy part of this configuration is adding LOCAL_DOMAIN="rcrowley.org" to Mastodon’s configuration. The harder part is arranging for https://rcrowley.org/.well-known/webfinger to redirect to https://mastodon.rcrowley.org/.well-known/webfinger. I thought I’d be doing this with CloudFront, which has been serving rcrowley.org from S3 for years. Actually, though, redirects are in S3’s wheelhouse. Here’s the Terraform I added to my existing aws_s3_bucket_website_configuration resource:

routing_rule {
  condition {
    key_prefix_equals = ".well-known/host-meta"
  }
  redirect {
    host_name          = "mastodon.rcrowley.org"
    http_redirect_code = 301
    protocol           = "https"
  }
}
routing_rule {
  condition {
    key_prefix_equals = ".well-known/nodeinfo"
  }
  redirect {
    host_name          = "mastodon.rcrowley.org"
    http_redirect_code = 301
    protocol           = "https"
  }
}
routing_rule {
  condition {
    key_prefix_equals = ".well-known/webfinger"
  }
  redirect {
    host_name          = "mastodon.rcrowley.org"
    http_redirect_code = 301
    protocol           = "https"
  }
}

Everything appeared internally to be working, but @rcrowley@rcrowley.org was invisible to search on other Mastodon servers and refused to be followed. The problem was that CloudFront wasn’t forwarding query strings to S3, so the query strings never made it into the 301 Moved Permanently responses, so Mastodon was never able to respond to webfinger requests. A one-line change and a CloudFront redistribution later, I was in business.
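To make the failure concrete, here’s the redirect other servers need to see, with sed standing in for the S3 routing rule purely for illustration:

```shell
# The webfinger request every other Mastodon server makes:
request="https://rcrowley.org/.well-known/webfinger?resource=acct:rcrowley@rcrowley.org"

# The routing rule swaps the host and keeps the path, and S3 echoes the query
# string back into the Location header, but only if CloudFront forwarded the
# query string to S3 in the first place. (sed stands in for S3 here; the
# first match is the host.)
location="$(printf '%s\n' "$request" | sed 's/rcrowley\.org/mastodon.rcrowley.org/')"

echo "$location"
# Without ?resource=acct:..., mastodon.rcrowley.org has no way to know which
# account is being asked about, so lookups from other servers fail.
```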

See you on Mastodon. I’m @rcrowley@rcrowley.org.