Jobs

One script aggregates all build information from the Mozilla archives, and another keeps that information up to date.

Everything can be executed from the command line, but we use Amazon Lambda in production.

_images/overview.png

Currently we use Kinto as a generic database service. Its simple API lets us store and query records, and it comes with client libraries for JavaScript, Python, and other languages.

Initialization

Note

The user:pass in the command-line examples is the Basic auth for Kinto.
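To illustrate what the user:pass value stands for, here is a minimal Python sketch that turns such a string into an HTTP Basic Authorization header. The helper name basic_auth_header is hypothetical and not part of Buildhub; Kinto client libraries normally build this header for you.

```python
import base64


def basic_auth_header(auth: str) -> dict:
    """Build an HTTP Basic auth header from a "user:pass" string.

    Hypothetical helper for illustration; not part of Buildhub.
    """
    # Split on the first colon only: the password may itself contain colons.
    user, _, password = auth.partition(":")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

For example, basic_auth_header("user:pass") yields an Authorization header whose value starts with "Basic ".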

The following step is optional but recommended. Kinto can validate records against a JSON schema. To enable this, set the following to true in the server configuration file:

kinto.experimental_collection_schema_validation = true

Load latest S3 inventory

This command initializes the remote Kinto server, downloads the latest S3 manifests (which describe all files available on archive.mozilla.org), and sends that information to the remote Kinto server as buildhub records.

latest-inventory-to-kinto

The command goes through the list of files, picks release files, and deduces their metadata. It is meant to be executed on an empty server, or periodically to catch up with recent releases in case the other, event-based lambda has failed.

Its configuration is read from environment variables:

  • SERVER_URL (default: http://localhost:8888/v1)
  • BUCKET (default: build-hub)
  • COLLECTION (default: releases)
  • AUTH (default: user:pass)
  • CACHE_FOLDER (default: .)
  • NB_RETRY_REQUEST (default: 3)
  • BATCH_MAX_REQUESTS (default: taken from server)
  • TIMEOUT_SECONDS (default: 300)
  • INITIALIZE_SERVER (default: true): whether to initialize the destination bucket/collection.
  • SENTRY_DSN (default: empty/disabled. Example: https://<key>:<secret>@sentry.io/buildhub)
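The settings above could be collected along these lines. This is only a sketch of the pattern, using the variable names and defaults listed above; the read_config helper, the dictionary keys, and the accepted spellings for booleans are assumptions, not the project's actual code.

```python
import os


def read_config(environ=None) -> dict:
    """Collect the documented settings, falling back to the documented defaults.

    Hypothetical helper for illustration; not the project's actual code.
    """
    env = os.environ if environ is None else environ
    batch = env.get("BATCH_MAX_REQUESTS")
    return {
        "server_url": env.get("SERVER_URL", "http://localhost:8888/v1"),
        "bucket": env.get("BUCKET", "build-hub"),
        "collection": env.get("COLLECTION", "releases"),
        # Split on the first colon only, giving a (user, password) pair.
        "auth": tuple(env.get("AUTH", "user:pass").split(":", 1)),
        "cache_folder": env.get("CACHE_FOLDER", "."),
        "nb_retry_request": int(env.get("NB_RETRY_REQUEST", "3")),
        # None stands for "use the limit advertised by the server".
        "batch_max_requests": int(batch) if batch else None,
        "timeout_seconds": int(env.get("TIMEOUT_SECONDS", "300")),
        # Assumption: common truthy spellings are accepted.
        "initialize_server": env.get("INITIALIZE_SERVER", "true").lower()
        in ("1", "true", "yes"),
        # An empty or missing DSN means Sentry reporting is disabled.
        "sentry_dsn": env.get("SENTRY_DSN") or None,
    }
```

Passing an explicit mapping instead of relying on os.environ keeps the helper easy to exercise, for example read_config({"BUCKET": "staging"}).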

To use this script as an Amazon Lambda function, use the entry point:

  • buildhub.lambda_s3_inventory.lambda_handler

S3 Event lambda

This Amazon Lambda function is in charge of keeping the database up to date. It cannot be executed from the command line.

When releases are published on S3, an S3 Event is triggered and the lambda is invoked.

Use the following entry point:

  • buildhub.lambda_s3_event.lambda_handler
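To give an idea of what such a handler receives, the sketch below extracts the bucket name and object key from each record of an S3 event notification. It relies only on the standard S3 event layout (Records, s3.bucket.name, s3.object.key); the function published_keys and the handler body are illustrative stand-ins, not the code in buildhub.lambda_s3_event.

```python
from urllib.parse import unquote_plus


def published_keys(event: dict) -> list:
    """Return (bucket, key) pairs for the objects named in an S3 event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((s3["bucket"]["name"], key))
    return pairs


def lambda_handler(event, context):
    # Illustrative stand-in: the real handler would look up metadata and
    # update the Kinto record for each published file.
    for bucket, key in published_keys(event):
        print(f"published: s3://{bucket}/{key}")
```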

Note

Since release records contain information from JSON metadata files, we handle the case where the JSON metadata file is published before the actual archive, and vice versa.

The lambda accepts the following configuration (from environment variables):

  • SERVER_URL (default: http://localhost:8888/v1)
  • BUCKET (default: build-hub)
  • COLLECTION (default: releases)
  • CACHE_FOLDER (default: .)
  • AUTH (default: user:pass)
  • NB_RETRY_REQUEST (default: 3)
  • TIMEOUT_SECONDS (default: 300)
  • SENTRY_DSN (default: empty/disabled. Example: https://<key>:<secret>@sentry.io/buildhub)

Setup and configure Amazon Lambda

In order to build the AWS Lambda Zip archive in an isolated environment, we use Docker:

  • make lambda.zip

(...or most likely sudo make lambda.zip)

This produces a zip file that has to be uploaded in the AWS Lambda configuration panel.

_images/lambda-1.png _images/lambda-2.png _images/lambda-3.png _images/lambda-4.png

Using Docker

Some commands are exposed in the container entry-point command (docker run).

The full list of available commands and their descriptions is shown by running:

docker run -t mozilla/buildhub

For example, run tests:

docker run -t mozilla/buildhub test

Or load the latest S3 inventory:

docker run -e "SERVER_URL=https://buildhub.prod.mozaws.net/v1" -e "AUTH=user:pass" -t mozilla/buildhub latest-inventory-to-kinto

Load S3 inventory manually

In order to fetch inventories from S3, install the dedicated Amazon Services client:

sudo apt-get install awscli

We are interested in two listings: firefox and archive (thunderbird, mobile).

export LISTING=archive

List available manifests in the inventories folder:

aws --no-sign-request --region us-east-1 s3 ls "s3://net-mozaws-prod-delivery-inventory-us-east-1/public/inventories/net-mozaws-prod-delivery-$LISTING/delivery-$LISTING/"

Download the latest manifest into the current directory (the timestamp folder below is an example; use the most recent one from the listing):

aws --no-sign-request --region us-east-1 s3 cp "s3://net-mozaws-prod-delivery-inventory-us-east-1/public/inventories/net-mozaws-prod-delivery-$LISTING/delivery-$LISTING/2017-08-02T00-11Z/manifest.json" .

Download the associated files (using jq):

files=$(jq -r '.files[] | .key' < manifest.json)
for file in $files; do
    aws --no-sign-request --region us-east-1 s3 cp "s3://net-mozaws-prod-delivery-inventory-us-east-1/public/$file" .
done
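If jq is not available, the same list of keys can be extracted with a few lines of Python. This sketch assumes only what the jq filter above relies on, namely a top-level files array whose entries each have a key field; the function name manifest_keys is hypothetical.

```python
import json


def manifest_keys(manifest_text: str) -> list:
    """Return the `key` of every entry under `files` in a manifest.json document."""
    manifest = json.loads(manifest_text)
    return [entry["key"] for entry in manifest["files"]]
```

It can be fed the content of the downloaded file, for instance manifest_keys(open("manifest.json").read()), and each returned key can then be passed to aws s3 cp as in the loop above.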

Initialize the remote server from a manifest that defines the buckets, collections, record schema, and related permissions. This command is idempotent and only modifies existing objects if something has changed.

kinto-wizard load --server https://kinto/ --auth user:pass jobs/buildhub/initialization.yml

Parse the S3 inventory, fetch metadata, and print records as JSON to stdout (redirected here to records.data):

zcat *.csv.gz | inventory-to-records > records.data

Load records into Kinto:

cat records.data | to-kinto --server https://kinto/ --bucket build-hub --collection releases --auth user:pass

Repeat with LISTING=firefox.

Note

All three commands can be piped together with their respective parameters:

zcat *.csv.gz | inventory-to-records | to-kinto
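For readers who want to inspect the inventory files before piping them, the zcat step can be approximated in Python as below. This is a sketch only: iter_inventory_rows is a hypothetical name, and the meaning of each CSV column depends on how the S3 inventory was configured, so the rows are yielded as plain lists of strings.

```python
import csv
import gzip


def iter_inventory_rows(paths):
    """Yield each CSV row (a list of strings) from gzip-compressed files."""
    for path in paths:
        # Mode "rt" decompresses and decodes on the fly, much like zcat.
        with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
            yield from csv.reader(f)
```

Because it is a generator, rows are read lazily, so large inventories do not need to fit in memory.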