James Thomas

Notes on software.

Using Custom Domains With IBM Cloud Functions

In this tutorial, I’m going to show you how to use a custom domain for serverless functions exposed as APIs on IBM Cloud. API endpoints use a random sub-domain on IBM Cloud by default. Importing your own domain means endpoints can be accessed through custom URLs.

Registering a custom domain with IBM Cloud requires you to complete the following steps…

This tutorial assumes you already have actions on IBM Cloud Functions exposed as HTTP APIs using the built-in API service. If you haven’t done that yet, please see the documentation here before you proceed.

The instructions below set up a sub-domain (api.<YOUR_DOMAIN>) to access serverless functions.

Generating SSL/TLS Certificates with Let’s Encrypt

IBM Cloud APIs only supports HTTPS traffic with custom domains. Users need to upload valid SSL/TLS certificates for those domains to IBM Cloud before being able to use them.

Let’s Encrypt is a Certificate Authority which provides free SSL/TLS certificates for domains. Let’s Encrypt is trusted by all major root certificate programs. This means certificates generated by this provider will be trusted by all major operating systems, web browsers, and devices.

Using this service, valid certificates can be generated to support custom domains on IBM Cloud.

domain validation

Let’s Encrypt needs to verify you control the domain before generating certificates.

During the verification process, the user makes an authentication token available through the domain. The service supports numerous methods for exposing the authentication token, including HTTP endpoints, DNS TXT records or TLS SNI.

There is an application (certbot) which automates generating authentication tokens and certificates.

I’m going to use the DNS TXT record as the challenge mechanism. Using this approach, certbot will provide a random authentication token I need to create as the TXT record value under the _acme-challenge.<YOUR_DOMAIN> sub-domain before validation.

using certbot with dns txt validation

brew install certbot

certbot certonly --manual --preferred-challenges=dns -d *.<YOUR_DOMAIN>

I’m generating a wildcard certificate for any sub-domains under <YOUR_DOMAIN>. This allows me to use the same certificate with different sub-domains on IBM Cloud, rather than generating a certificate per sub-domain.

During the validation process, certbot should display the following message with the challenge token.

Please deploy a DNS TXT record under the name
_acme-challenge.<YOUR_DOMAIN> with the following value:

<CHALLENGE_TOKEN>

Before continuing, verify the record is deployed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Press Enter to Continue

setting challenge token

  • Take the challenge token from certbot and create a new TXT record with this value for the _acme-challenge.<YOUR_DOMAIN> sub-domain.

  • Use the dig command to verify the TXT record is available.

dig -t txt _acme-challenge.<YOUR_DOMAIN>

The challenge token should be available in the DNS response shown by dig.

;; ANSWER SECTION:
_acme-challenge.<YOUR_DOMAIN>. 3599 IN  TXT "<CHALLENGE_TOKEN>"
  • Press Enter in the terminal session running certbot when the challenge token is available.

retrieving domain certificates

certbot will now retrieve the TXT record for the sub-domain and verify it matches the challenge token. If the domain has been validated, certbot will show the directory containing the newly created certificates.

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/<YOUR_DOMAIN>/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/<YOUR_DOMAIN>/privkey.pem
   Your cert will expire on 2019-03-03.
...

certbot creates the following files.

  • cert.pem - public domain certificate
  • privkey.pem - private key for domain certificate
  • chain.pem - intermediate domain certificates
  • fullchain.pem - public and intermediate domain certificates in a single file.

Registering the domain with IBM Cloud will require the public, private and intermediate certificate files.

Registering Custom Domain with IBM Cloud

Certificates for custom domains in IBM Cloud are managed by the Certificate Manager service.

  • Create a new instance of the service from the IBM Cloud Catalog.
  • From the service homepage, click the ”Import Certificate” button.
  • Fill in the following fields in the import form. Use the generated certificate files in the upload fields.
    • Name
    • Certificate File (cert.pem)
    • Private key file (privkey.pem)
    • Intermediate certificate file (chain.pem)

After importing the certificate, check the certificate properties match the expected values.

Binding Domain to IBM Cloud Functions APIs

Custom domains for APIs on IBM Cloud are managed through the IBM Cloud APIs console.

  • Open the ”Custom Domains” section on the IBM Cloud APIs console.
  • Check the “Region” selector matches the region chosen for your actions and APIs.
  • Click the ··· icon on the row where “Organisation” and “Space” values match your APIs.
  • Click ”Change Settings” from the pop-up menu.

domain validation

IBM Cloud now needs to verify you control the custom domain being used.

Another DNS TXT record needs to be created before attempting to bind the domain.

  • From the ”Custom Domain Settings” menu, make a note of the ”Default domain / alias” value. This should be in the format: <APP_ID>.<REGION>.apiconnect.appdomain.cloud.
  • Create a new TXT record for the custom sub-domain (api.<YOUR_DOMAIN>) with the default domain alias as the record value (<APP_ID>.<REGION>.apiconnect.appdomain.cloud).
  • Use the dig command to check the sub-domain TXT record exists and contains the correct value.
dig -t txt api.<YOUR_DOMAIN>

The default domain alias value should be available in the DNS response shown by dig.

;; ANSWER SECTION:
api.<YOUR_DOMAIN>. 3599 IN  TXT "<APP_ID>.<REGION>.apiconnect.appdomain.cloud"

Having created the TXT record, fill in the Custom Domain Settings form.

custom domain settings

  • Select the ”Assign custom domain” checkbox in the ”Custom domain settings” form.
  • Fill in the following form fields.
    • Domain Name: use the custom sub-domain to bind (api.<YOUR-DOMAIN>).
    • Certificate Manager service: select the certificate manager instance.
    • Certificate: select the domain certificate from the drop-down menu.
  • Click the ”Save” button.

Once the domain has been validated, the form will redirect to the custom domains overview. The “Custom Domain” field will now show the sub-domain bound to the correct default domain alias.

add CNAME record

  • Remove the existing TXT record for the custom sub-domain (api.<YOUR-DOMAIN>).
  • Add a new CNAME record mapping the custom sub-domain (api.<YOUR-DOMAIN>) to the ”Default domain / alias” on IBM Cloud (<APP_ID>.<REGION>.apiconnect.appdomain.cloud).
  • Use the dig command to check the CNAME record is correct.
dig -t CNAME api.<YOUR_DOMAIN>

The default domain alias value should be available in the DNS response shown by dig.

;; ANSWER SECTION:
api.<YOUR_DOMAIN>.  3599    IN  CNAME   <APP_ID>.<REGION>.apiconnect.appdomain.cloud.

Testing It Out

Functions should now be accessible through both the default domain alias and the new custom domain.

  • Invoke the default domain alias API URL for the function.
curl https://<APP_ID>.<REGION>.apiconnect.appdomain.cloud/<BASE_PATH>/<SUB_PATH> 

Both the BASE_PATH and SUB_PATH values come from the API definitions configured by the user.

  • Invoke the custom domain API URL for the function.
curl https://api.<YOUR_DOMAIN>/<BASE_PATH>/<SUB_PATH> 

Make sure you use the HTTPS protocol in the URL. IBM Cloud does not support HTTP traffic with custom domains.

The responses from both URLs should be the same! Hurrah.

Finding Photos on Twitter Using Face Recognition With TensorFlow.js

As a developer advocate, I spend a lot of time at developer conferences (talking about serverless). Upon returning from each trip, I need to compile a “trip report” on the event for my bosses. This helps demonstrate the value in attending events and that I’m not just accruing air miles and hotel points for fun…

I always include any social media content people post about my talks in the trip report. This is usually tweets with photos of me on stage. If people are tweeting about your session, I assume they enjoyed it and wanted to share with their followers.

Finding tweets with photos about your talk from attendees is surprisingly challenging.

Attendees often forget to include your twitter username in their tweets. This means the only way to find those photos is to manually scroll through all the results from the conference hashtag. This is problematic at conferences with thousands of attendees all tweeting during the event. #devrelproblems.

Having become bored of manually trawling through all the tweets for each conference, I had a thought…

“Can’t I write some code to do this for me?”

This didn’t seem like too ridiculous an idea. Twitter has an API, which would allow me to retrieve all tweets for a conference hashtag. Once I had all the tweet photos, couldn’t I run some magic AI algorithm over the images to tell me if I was in them?

After a couple of weeks of hacking around (and overcoming numerous challenges) I had (to my own amazement) managed to build a serverless application which can find unlabelled photos of a person on twitter using machine learning with TensorFlow.js.

FindMe Example

If you just want to try this application yourself, follow the instructions in the Github repo: https://github.com/jthomas/findme

architecture

FindMe Architecture Diagram

This application has four serverless functions (two API handlers and two backend services) and a client-side application served from a static web page. Users log into the client-side application using Auth0 with their Twitter account. This provides the backend application with the user’s profile image and Twitter API credentials.

When the user invokes a search query, the client-side application invokes the API endpoint for the register_search function with the query terms and twitter credentials. This function registers a new search job in Redis and fires a new search_request trigger event with the query and job id. This job identifier is returned to the client to poll for real-time status updates.
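
Here is a rough sketch of what the register_search handler could look like. This is not the project source; the Redis key scheme, parameter names and job id format are assumptions made for illustration.

const openwhisk = require('openwhisk')
const { createClient } = require('redis')

async function main (params) {
  const ow = openwhisk()
  const redis = createClient({ url: params.redis_url })
  await redis.connect()

  // register a new search job in Redis (hypothetical key scheme)
  const job = `${Date.now()}`
  await redis.set(`job:${job}`, JSON.stringify({ status: 'searching', matches: [] }))
  await redis.quit()

  // fire the search_request trigger event with the query terms and job id
  await ow.triggers.invoke({
    name: 'search_request',
    params: { query: params.query, credentials: params.credentials, job }
  })

  // return the job identifier so the client can poll for status updates
  return { job }
}

exports.main = main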

The twitter_search function is connected to the search_request trigger and invoked for each event. It uses the Twitter Search API to retrieve all tweets for the search terms. If tweets retrieved from the API contain photos, those tweet ids (with photo urls) are fired as new tweet_image trigger events.

The compare_images function is connected to the tweet_image trigger. When invoked, it downloads the user’s twitter profile image along with the tweet image and runs face detection against both images, using the face-api.js library. If any faces in the tweet photo match the face in the user’s profile image, tweet ids are written to Redis before exiting.

The client-side web page polls the API endpoint for the search_status function with the search job id to retrieve real-time search results. Tweets with matching faces are displayed on the web page using the Twitter JS library.
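
A minimal sketch of that polling loop might look like this (the endpoint path and response fields shown are assumptions).

const poll = (job, render) => {
  const timer = setInterval(async () => {
    const res = await fetch(`/api/search_status?job=${job}`)
    const status = await res.json()

    render(status.matches) // display matching tweets as they arrive
    if (status.finished) clearInterval(timer)
  }, 1000)
}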

challenges

Since I had found an NPM library to handle face detection, couldn’t I just use it on a serverless platform by including the library within the zip file used to create my serverless application? Sounds easy, right?!

ahem - not so faas-t…

As discussed in previous blog posts, there are numerous challenges in using TF.js-based libraries on serverless platforms. From making the packages available in the runtime and loading model files, to converting images for classification, these libraries are not like normal NPM modules.

Here are the main challenges I had to overcome to make this serverless application work…

using tf.js libraries on a serverless platform

The Node.js backend drivers for TensorFlow.js use a native shared C++ library (libtensorflow.so) to execute models on the CPU or GPU. This native dependency is compiled for the platform during the npm install process. The shared library file is around 142MB, which is too large to include in the deployment package for most serverless platforms.

Normal workarounds for this issue store large dependencies in an object store. These files are dynamically retrieved during cold starts and stored in the runtime filesystem, as shown in this pseudo-code. This workaround does add an additional delay to cold start invocations.

let libraries_loaded = false

const library = 'libtensorflow.so'

if (!libraries_loaded) {
  // cold start: fetch the shared library and write it into the runtime filesystem
  const data = from_object_store(library)
  write_to_fs(library, data)
  libraries_loaded = true
}

// rest of function code…

Fortunately, I had a better solution using Apache OpenWhisk’s support for custom Docker runtimes!

This feature allows serverless applications to use custom Docker images as the runtime environment. Creating custom images with large libraries pre-installed means they can be excluded from deployment packages.

Apache OpenWhisk publishes all existing runtime images on Docker Hub. Using existing runtime images as base images means Dockerfiles for custom runtimes are minimal. Here’s the Dockerfile needed to build a custom runtime with the TensorFlow.js Node.js backend drivers pre-installed.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs-node

Once this image has been built and published on Docker Hub, you can use it when creating new functions.

I used this approach to build a custom TensorFlow.js runtime which is available on Docker Hub: jamesthomas/action-nodejs-v8:tfjs-faceapi

OpenWhisk actions created using the wsk command-line use a configuration flag (--docker) to specify custom runtime images.

wsk action create classify source.js --docker jamesthomas/action-nodejs-v8:tfjs-faceapi

The OpenWhisk provider plugin for The Serverless Framework also supports custom runtime images through a configuration parameter (image) under the function configuration.

service: machine-learning

provider:
  name: openwhisk

functions:
  classify:
    handler: source.main
    image: jamesthomas/action-nodejs-v8:tfjs-faceapi

Having fixed the issue of library loading on serverless platforms, I could move onto the next problem, loading the pre-trained models…

loading pre-trained models

Running the example code to load the pre-trained models for face recognition gave me this error:

ReferenceError: fetch is not defined

In the previous blog post, I discovered how to manually load TensorFlow.js models from the filesystem using the file:// URI prefix. Unfortunately, the face-api.js library doesn’t support this feature. Models are automatically loaded using the fetch HTTP client. This HTTP client is available in modern browsers but not in the Node.js runtime.

Overcoming this issue relies on providing an instance of a compatible HTTP client in the runtime. The node-fetch library is an implementation of the fetch client API for the Node.js runtime. By manually installing this module and exporting it as a global variable, the library can then use the HTTP client as expected.

// Make HTTP client available in runtime
global.fetch = require('node-fetch')

Model configuration and weight files can then be loaded from the library’s Github repository using this URL:

https://raw.githubusercontent.com/justadudewhohacks/face-api.js/master/weights/

faceapi.loadFaceDetectionModel('<GITHUB_URL>')

face detection in images

The face-api.js library has a utility function (models.allFaces) to automatically detect and calculate descriptors for all faces found in an image. Descriptors are a feature vector (of 128 32-bit float values) which uniquely describes the characteristics of a person’s face.

const results = await models.allFaces(input, minConfidence)

The input to this function is the input tensor with the RGB values from an image. In a previous blog post, I explained how to convert an image from the filesystem in Node.js to the input tensor needed by the model.

Finding a user by comparing their twitter profile against photos from tweets starts by running face detection against both images. By comparing computed descriptor values, a measure of similarity can be established between faces from the images.

face comparison

Once the face descriptors have been calculated, the library provides a utility function to compute the Euclidean distance between two descriptor vectors. If the distance between two face descriptors is less than a threshold value, this is used to identify the same person in both images.

const distance = faceapi.euclideanDistance(descriptor1, descriptor2)

if (distance < 0.6)
  console.log('match')
else
  console.log('no match')

I’ve no idea why 0.6 is chosen as the threshold value but this seemed to work for me! Even small changes to this value dramatically reduced the precision and recall rates for my test data. I’m calling it the Goldilocks value, just use it…

performance

Once I had the end to end application working, I wanted to make it as fast as possible. By optimising the performance, I could improve the application responsiveness and reduce compute costs for my backend. Time is literally money with serverless platforms.

baseline performance

Before attempting to optimise my application, I needed to understand the baseline performance. Setting up experiments to record invocation durations gave me the following average test results.

  • Warm invocations: ~5 seconds
  • Cold invocations: ~8 seconds

Instrumenting the code with console.time statements revealed execution time was comprised of five main sections.

                  Cold Starts              Warm Starts
Initialisation    1200 ms                  0 ms
Model Loading     3200 ms                  2000 ms
Image Loading     500 ms x 2               500 ms x 2
Face Detection    700 ms - 900 ms x 2      700 ms - 900 ms x 2
Everything Else   1000 ms                  500 ms
Total Duration    ~ 8 seconds              ~ 5 seconds

Initialisation was the delay during cold starts to create the runtime environment and load all the library files and application code. Model Loading recorded the time spent instantiating the TF.js models from the source files. Image Loading was the time spent converting the RGB values from images into input tensors; this happened twice, once for the Twitter profile picture and again for the tweet photo. Face Detection is the elapsed time to execute the models.allFaces method and faceapi.euclideanDistance methods for all the detected faces. Everything else is well… everything else.
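
The instrumentation behind those numbers is simple; something along these lines, where the label names are illustrative:

const logMemory = label => {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage()
  const mb = bytes => `${(bytes / 1024 / 1024).toFixed(2)} MB`
  console.log(`${label}: rss=${mb(rss)}, heapTotal=${mb(heapTotal)}, heapUsed=${mb(heapUsed)}, external=${mb(external)}`)
}

const timeSection = async (label, work) => {
  console.time(label)          // elapsed time printed by console.timeEnd
  const result = await work()
  console.timeEnd(label)
  logMemory(label)
  return result
}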

Since model loading was the largest section, this seemed like an obvious place to start optimising.

loading model files from disk

Overcoming the initial model loading issue relied on manually exposing the expected HTTP client in the Node.js runtime. This allowed models to be dynamically loaded (over HTTP) from the external Github repository. Model files were about 36MB.

My first idea was to load these model files from the filesystem, which should be much faster than downloading from Github. Since I was already building a custom Docker runtime, it was a one-line change to include the model files within the runtime filesystem.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs-node

COPY weights weights

Having re-built the image and pushed to Docker Hub, the classification function’s runtime environment now included the model files in the filesystem.

But how do we make the face-api.js library load model files from the filesystem when it is using an HTTP client?

My solution was to write a fetch client that proxied calls to retrieve files from an HTTP endpoint to the local filesystem. I’ll let you decide whether this is a brilliant or terrible idea!

global.fetch = async (file) => {
  return {
    json: () => JSON.parse(fs.readFileSync(file, 'utf8')),
    arrayBuffer: () => fs.readFileSync(file)
  }
}

const model = await models.load('/weights')

The face-api.js library only used two methods (json() & arrayBuffer()) from the HTTP client. Stubbing out these methods to proxy fs.readFileSync meant file paths were read from the filesystem. Amazingly, this seemed to just work, hurrah!

Implementing this feature and re-running performance tests revealed this optimisation saved about 500 ms from the Model Loading section.

                  Cold Starts              Warm Starts
Initialisation    1200 ms                  0 ms
Model Loading     2700 ms                  1500 ms
Image Loading     500 ms x 2               500 ms x 2
Face Detection    700 ms - 900 ms x 2      700 ms - 900 ms x 2
Everything Else   1000 ms                  500 ms
Total Duration    ~ 7.5 seconds            ~ 4.5 seconds

This was less of an improvement than I’d expected. Parsing all the model files and instantiating the internal objects was more computationally intensive than I realised. This performance improvement did improve both cold and warm invocations, which was a bonus.

Despite this optimisation, model loading was still the largest section in the classification function…

caching loaded models

There’s a good strategy to use when optimising serverless functions…

CACHE ALL THE THINGS

Serverless runtimes re-use runtime containers for consecutive requests, known as warm environments. Local state, like global variables or the runtime filesystem, can be used to cache data between requests and improve performance during those invocations.

Since model loading was such an expensive process, I wanted to cache initialised models. Using a global variable, I could control whether to trigger model loading or return the pre-loaded models. Warm environments would re-use pre-loaded models and remove model loading delay.

const faceapi = require('face-api.js')

let LOADED = false

exports.load = async location => {
  if (!LOADED) {
    await faceapi.loadFaceDetectionModel(location)
    await faceapi.loadFaceRecognitionModel(location)
    await faceapi.loadFaceLandmarkModel(location)

    LOADED = true
  }

  return faceapi
}

This performance improvement had a significant impact on the performance of warm invocations. Model loading became “free”.

                  Cold Starts              Warm Starts
Initialisation    1200 ms                  0 ms
Model Loading     2700 ms                  0 ms
Image Loading     500 ms x 2               500 ms x 2
Face Detection    700 ms - 900 ms x 2      700 ms - 900 ms x 2
Everything Else   1000 ms                  500 ms
Total Duration    ~ 7.5 seconds            ~ 3 seconds

caching face descriptors

In the initial implementation, the face comparison function was executing face detection against both the user’s twitter profile image and tweet photo for comparison. Since the twitter profile image was the same in each search request, running face detection against this image would always return the same results.

Rather than redundantly computing this work in each invocation, caching the computed face descriptor for the profile image meant it could be re-used across invocations. This halves the work needed in the Image Loading and Face Detection sections.

The face-api.js library returns the face descriptor as a typed array with 128 32-bit float values. Encoding these values as a hex string allows them to be stored in and retrieved from Redis. This code was used to convert the float values to hex strings, whilst maintaining the exact precision of those values.

const encode = typearr => {
  const encoded = Buffer.from(typearr.buffer).toString('hex')
  return encoded
}

const decode = encoded => {
  const decoded = Buffer.from(encoded, 'hex')
  const uints = new Uint8Array(decoded)
  const floats = new Float32Array(uints.buffer)
  return floats
}
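
As a sketch of how the cached descriptor might then be stored and retrieved, using the encode and decode functions above (the Node.js redis v4 client and the key scheme are assumptions):

const { createClient } = require('redis')

const cachedDescriptor = async (userId, computeDescriptor) => {
  const client = createClient()
  await client.connect()

  const key = `profile-descriptor:${userId}`
  const cached = await client.get(key)

  if (cached) {
    await client.quit()
    return decode(cached) // hex string back to Float32Array
  }

  // first search for this user: run face detection against the profile image
  const descriptor = await computeDescriptor()
  await client.set(key, encode(descriptor)) // Float32Array stored as a hex string
  await client.quit()

  return descriptor
}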

This optimisation improves the performance of most cold invocations and all warm invocations, removing over 1200 ms of computation time.

                  Cold Starts (Cached)     Warm Starts
Initialisation    1200 ms                  0 ms
Model Loading     2700 ms                  1500 ms
Image Loading     500 ms                   500 ms
Face Detection    700 ms - 900 ms          700 ms - 900 ms
Everything Else   1000 ms                  500 ms
Total Duration    ~ 6 seconds              ~ 2.5 seconds

final results + cost

Application performance was massively improved with all these optimisations. As demonstrated in the video above, the application could process tweets in real-time, returning almost instant results. Average invocation durations were now:

  • Warm invocations: ~2.5 seconds
  • Cold invocations (Cached): ~6 seconds

Serverless platforms charge for compute time by the millisecond, so these improvements led to cost savings of 25% for cold invocations (apart from the first classification for a user) and 50% for warm invocations.

Classification functions used 512MB of RAM which meant IBM Cloud Functions would provide 320,000 “warm” classifications or 133,333 “cold” classifications within the free tier each month. Ignoring the free tier, 100,000 “warm” classifications would cost $2.13 and 100,000 “cold” classifications $5.10.
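
For reference, the rough arithmetic behind those figures, using the durations above and the platform price of $0.000017 per GB-second:

const memory = 512 / 1024       // GB allocated to the classification functions
const price = 0.000017          // $ per GB-second on IBM Cloud Functions
const freeTier = 400000         // GB-seconds included in the free tier each month

const warm = 2.5                // seconds per warm invocation
const cold = 6                  // seconds per cold (cached) invocation

console.log(Math.floor(freeTier / (memory * warm))) // 320000 free warm classifications
console.log(Math.floor(freeTier / (memory * cold))) // 133333 free cold classifications
console.log(100000 * memory * warm * price)         // ~$2.13 per 100,000 warm classifications
console.log(100000 * memory * cold * price)         // ~$5.10 per 100,000 cold classifications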

conclusion

Using TensorFlow.js with serverless cloud platforms makes it easy to build scalable machine learning applications in the cloud. Using the horizontal scaling capabilities of serverless platforms, thousands of model classifications can be run in parallel. This can be more performant than having dedicated hardware with a GPU, especially with compute costs for serverless applications being so cheap.

TensorFlow.js is ideally suited to serverless applications due to the JS interface, (relatively) small library size and availability of pre-trained models. Despite having no prior experience in Machine Learning, I was able to use the library to build a face recognition pipeline, processing 100s of images in parallel, for real-time results. This amazing library opens up machine learning to a whole new audience!

Serverless Machine Learning With TensorFlow.js

In a previous blog post, I showed how to use TensorFlow.js on Node.js to run visual recognition on images from the local filesystem. TensorFlow.js is a JavaScript version of the open-source machine learning library from Google.

Once I had this working with a local Node.js script, my next idea was to convert it into a serverless function. Running this function on IBM Cloud Functions (Apache OpenWhisk) would turn the script into my own visual recognition microservice.

Sounds easy, right? It’s just a JavaScript library? So, zip it up and away we go… ahem.

Converting the image classification script to run in a serverless environment had the following challenges…

  • TensorFlow.js libraries need to be available in the runtime.
  • Native bindings for the library must be compiled against the platform architecture.
  • Models files need to be loaded from the filesystem.

Some of these issues were more challenging than others to fix! Let’s start by looking at the details of each issue, before explaining how Docker support in Apache OpenWhisk can be used to resolve them all.

Challenges

TensorFlow.js Libraries

TensorFlow.js libraries are not included in the Node.js runtimes provided by Apache OpenWhisk.

External libraries can be imported into the runtime by deploying applications from a zip file. Custom node_modules folders included in the zip file will be extracted in the runtime. Zip files are limited to a maximum size of 48MB.

Library Size

Running npm install for the TensorFlow.js libraries used revealed the first problem… the resulting node_modules directory was 175MB.

Looking at the contents of this folder, the tfjs-node module compiles a native shared library (libtensorflow.so) that is 135MB. This means no amount of JavaScript minification is going to get those external dependencies under the magic 48 MB limit.

Native Dependencies

The libtensorflow.so native shared library must be compiled for the platform runtime. Running npm install locally automatically compiles native dependencies against the host platform. Local environments may use a different operating system or CPU architecture (macOS vs Linux) or link against shared libraries not available in the serverless runtime.

MobileNet Model Files

TensorFlow model files need to be loaded from the filesystem in Node.js. Serverless runtimes do provide a temporary filesystem inside the runtime environment. Files from deployment zip files are automatically extracted into this environment before invocations. There is no external access to this filesystem outside the lifecycle of the serverless function.

Model files for the MobileNet model were 16MB. If these files are included in the deployment package, it leaves 32MB for the rest of the application source code. Although the model files are small enough to include in the zip file, what about the TensorFlow.js libraries? Is this the end of the blog post? Not so fast….

Apache OpenWhisk’s support for custom runtimes provides a simple solution to all these issues!

Custom Runtimes

Apache OpenWhisk uses Docker containers as the runtime environments for serverless functions (actions). All platform runtime images are published on Docker Hub, allowing developers to start these environments locally.

Developers can also specify custom runtime images when creating actions. These images must be publicly available on Docker Hub. Custom runtimes have to expose the same HTTP API used by the platform for invoking actions.

Using platform runtime images as parent images makes it simple to build custom runtimes. Users can run commands during the Docker build to install additional libraries and other dependencies. The parent image already contains source files with the HTTP API service handling platform requests.

TensorFlow.js Runtime

Here is the Docker build file for the Node.js action runtime with additional TensorFlow.js dependencies.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs @tensorflow-models/mobilenet @tensorflow/tfjs-node jpeg-js

COPY mobilenet mobilenet

openwhisk/action-nodejs-v8:latest is the Node.js action runtime image published by OpenWhisk.

TensorFlow libraries and other dependencies are installed using npm install in the build process. Native dependencies for the @tensorflow/tfjs-node library are automatically compiled for the correct platform by installing during the build process.

Since I’m building a new runtime, I’ve also added the MobileNet model files to the image. Whilst not strictly necessary, removing them from the action zip file reduces deployment times.

Want to skip the next step? Use this image jamesthomas/action-nodejs-v8:tfjs rather than building your own.

Building The Runtime

In the previous blog post, I showed how to download model files from the public storage bucket.

  • Download a version of the MobileNet model and place all files in the mobilenet directory.
  • Copy the Docker build file from above to a local file named Dockerfile.
  • Run the Docker build command to generate a local image.
docker build -t tfjs .

docker tag tfjs <USERNAME>/action-nodejs-v8:tfjs

Replace <USERNAME> with your Docker Hub username.

docker push <USERNAME>/action-nodejs-v8:tfjs

Once the image is available on Docker Hub, actions can be created using that runtime image.

Example Code

This source code implements image classification as an OpenWhisk action. Image files are provided as a Base64 encoded string using the image property on the event parameters. Classification results are returned as the results property in the response.
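
The handler interface looks roughly like this; the real source is linked above and the helper names here are assumptions.

async function main (params) {
  // the image property holds the Base64 encoded image bytes
  const buffer = Buffer.from(params.image, 'base64')

  const model = await loadModel()               // cached between warm invocations (see below)
  const results = await classify(model, buffer)

  // classification results are returned on the results property
  return { results }
}

exports.main = main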

Caching Loaded Models

Serverless platforms initialise runtime environments on-demand to handle invocations. Once a runtime environment has been created, it will be re-used for further invocations with some limits. This improves performance by removing the initialisation delay (“cold start”) from request processing.

Applications can exploit this behaviour by using global variables to maintain state across requests. This is often used to cache open database connections or store initialisation data loaded from external systems.

I have used this pattern to cache the MobileNet model used for classification. During cold invocations, the model is loaded from the filesystem and stored in a global variable. Warm invocations then use the existence of that global variable to skip the model loading process with further requests.

Caching the model reduces the time (and therefore cost) for classifications on warm invocations.
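
A sketch of the caching pattern, assuming the manual filesystem loading workaround from the previous blog post (the weights path baked into the runtime image is an assumption):

const mobilenet = require('@tensorflow-models/mobilenet')

let model // survives between requests in a warm runtime container

const loadModel = async () => {
  if (!model) {
    // cold start: load the MobileNet model from the filesystem inside the runtime image
    const mn = new mobilenet.MobileNet(1, 1)
    mn.path = 'file://mobilenet/model.json'
    await mn.load()
    model = mn
  }

  return model
}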

Memory Leak

Running the Node.js script from the previous blog post on IBM Cloud Functions was possible with minimal modifications. Unfortunately, performance testing revealed a memory leak in the handler function.

Reading more about how TensorFlow.js works on Node.js uncovered the issue…

TensorFlow.js’s Node.js extensions use a native C++ library to execute the Tensors on a CPU or GPU engine. Memory allocated for Tensor objects in the native library is retained until the application explicitly releases it or the process exits. TensorFlow.js provides a dispose method on the individual objects to free allocated memory. There is also a tf.tidy method to automatically clean up all allocated objects within a frame.

Reviewing the code, tensors were being created as model input from images on each request. These objects were not disposed before returning from the request handler. This meant native memory grew unbounded. Adding an explicit dispose call to free these objects before returning fixed the issue.
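
The fix, roughly, is an explicit dispose call once classification has finished (imageToInput here stands in for whatever builds the input tensor):

const classifyImage = async (model, image, numChannels) => {
  const input = imageToInput(image, numChannels) // tensor backed by native memory
  const results = await model.classify(input)

  input.dispose()                                // release memory held by the native library
  return results
}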

Profiling & Performance

Action code records memory usage and elapsed time at different stages in the classification process.

Recording memory usage allows me to modify the maximum memory allocated to the function for optimal performance and cost. Node.js provides a standard library API to retrieve memory usage for the current process. Logging these values allows me to inspect memory usage at different stages.

Timing different tasks in the classification process, i.e. model loading, image classification, gives me an insight into how efficient classification is compared to other methods. Node.js has a standard library API for timers to record and print elapsed time to the console.
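
The instrumentation itself is short; something along these lines (label names are illustrative):

const logMemory = () => {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage()
  const mb = bytes => `${(bytes / 1024 / 1024).toFixed(2)} MB`
  console.log(`memory used: rss=${mb(rss)}, heapTotal=${mb(heapTotal)}, heapUsed=${mb(heapUsed)}, external=${mb(external)}`)
}

const profiledClassify = async (model, input) => {
  console.time('mn_model.classify')           // elapsed time printed by console.timeEnd
  const results = await model.classify(input)
  console.timeEnd('mn_model.classify')

  logMemory()
  return results
}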

Demo

Deploy Action

  • Run the following command with the IBM Cloud CLI to create the action.
ibmcloud fn action create classify --docker <IMAGE_NAME> index.js

Replace <IMAGE_NAME> with the public Docker Hub image identifier for the custom runtime. Use jamesthomas/action-nodejs-v8:tfjs if you haven’t built this manually.

Testing It Out

  • Download a sample JPEG file to classify.
wget http://bit.ly/2JYSal9 -O panda.jpg
  • Invoke the action with the Base64 encoded image as an input parameter.
 ibmcloud fn action invoke classify -r -p image $(base64 panda.jpg)
  • The returned JSON message contains the classification probabilities.
{
  "results":  [{
    className: 'giant panda, panda, panda bear, coon bear',
    probability: 0.9993536472320557
  }]
}

Activation Details

  • Retrieve logging output for the last activation to show performance data.
ibmcloud fn activation logs --last

Profiling and memory usage details are logged to stdout.

prediction function called.
memory used: rss=150.46 MB, heapTotal=32.83 MB, heapUsed=20.29 MB, external=67.6 MB
loading image and model...
decodeImage: 74.233ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.05 MB, external=40.63 MB
imageByteArray: 5.676ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.05 MB, external=45.51 MB
imageToInput: 5.952ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.06 MB, external=45.51 MB
mn_model.classify: 274.805ms
memory used: rss=149.83 MB, heapTotal=24.33 MB, heapUsed=20.57 MB, external=45.51 MB
classification results: [...]
main: 356.639ms
memory used: rss=144.37 MB, heapTotal=24.33 MB, heapUsed=20.58 MB, external=45.51 MB

main is the total elapsed time for the action handler. mn_model.classify is the elapsed time for the image classification. Cold start requests print an extra log message with model loading time, loadModel: 394.547ms.

Performance Results

Invoking the classify action 1000 times for both cold and warm activations (using 256MB memory) generated the following performance results.

warm invocations

Classifications took an average of 316 milliseconds to process when using warm environments. Looking at the timing data, converting the Base64 encoded JPEG into the input tensor took around 100 milliseconds. Running the model classification task was in the 200 - 250 milliseconds range.

cold invocations

Classifications took an average of 1260 milliseconds to process when using cold environments. These requests incur penalties for initialising new runtime containers and loading models from the filesystem. Both of these tasks took around 400 milliseconds each.

One disadvantage of using custom runtime images in Apache OpenWhisk is the lack of pre-warmed containers. Pre-warming is used to reduce cold start times by starting runtime containers before they are needed. This is not supported for non-standard runtime images.

classification cost

IBM Cloud Functions provides a free tier of 400,000 GB-seconds per month. Each further second of execution is charged at $0.000017 per GB of memory allocated. Execution time is rounded up to the nearest 100ms.

If all activations were warm, a user could execute more than 4,000,000 classifications per month in the free tier using an action with 256MB. Once outside the free tier, around 600,000 further invocations would cost just over $1.

If all activations were cold, a user could execute more than 1,200,000 classifications per month in the free tier using an action with 256MB. Once outside the free tier, around 180,000 further invocations would cost just over $1.

Conclusion

TensorFlow.js brings the power of deep learning to JavaScript developers. Using pre-trained models with the TensorFlow.js library makes it simple to extend JavaScript applications with complex machine learning tasks with minimal effort and code.

Getting a local script to run image classification was relatively simple, but converting to a serverless function came with more challenges! Apache OpenWhisk restricts the maximum application size to 50MB and native library dependencies were much larger than this limit.

Fortunately, Apache OpenWhisk’s custom runtime support allowed us to resolve all these issues. By building a custom runtime with native dependencies and models files, those libraries can be used on the platform without including them in the deployment package.

Machine Learning in Node.js With TensorFlow.js

TensorFlow.js is a new version of the popular open-source library which brings deep learning to JavaScript. Developers can now define, train, and run machine learning models using the high-level library API.

Pre-trained models mean developers can now easily perform complex tasks like visual recognition, generating music or detecting human poses with just a few lines of JavaScript.

Having started as a front-end library for web browsers, recent updates added experimental support for Node.js. This allows TensorFlow.js to be used in backend JavaScript applications without having to use Python.

Reading about the library, I wanted to test it out with a simple task…

Use TensorFlow.js to perform visual recognition on images using JavaScript from Node.js

Unfortunately, most of the documentation and example code provided uses the library in a browser. Project utilities provided to simplify loading and using pre-trained models have not yet been extended with Node.js support. Getting this working ended up with me spending a lot of time reading the TypeScript source files for the library.

However, after a few days’ hacking, I managed to get this completed! Hurrah!

Before we dive into the code, let’s start with an overview of the different TensorFlow libraries.

TensorFlow

TensorFlow is an open-source software library for machine learning applications. TensorFlow can be used to implement neural networks and other deep learning algorithms.

Released by Google in November 2015, TensorFlow was originally a Python library. It used either CPU or GPU-based computation for training and evaluating machine learning models. The library was initially designed to run on high-performance servers with expensive GPUs.

Recent updates have extended the software to run in resource-constrained environments like mobile devices and web browsers.

TensorFlow Lite

Tensorflow Lite, a lightweight version of the library for mobile and embedded devices, was released in May 2017. This was accompanied by a new series of pre-trained deep learning models for vision recognition tasks, called MobileNet. MobileNet models were designed to work efficiently in resource-constrained environments like mobile devices.

TensorFlow.js

Following Tensorflow Lite, TensorFlow.js was announced in March 2018. This version of the library was designed to run in the browser, building on an earlier project called deeplearn.js. WebGL provides GPU access to the library. Developers use a JavaScript API to train, load and run models.

TensorFlow.js was recently extended to run on Node.js, using an extension library called tfjs-node.

The Node.js extension is an alpha release and still under active development.

Importing Existing Models Into TensorFlow.js

Existing TensorFlow and Keras models can be executed using the TensorFlow.js library. Models need converting to a new format using this tool before execution. Pre-trained and converted models for image classification, pose detection and k-nearest neighbours are available on Github.

Using TensorFlow.js in Node.js

Installing TensorFlow Libraries

TensorFlow.js can be installed from the NPM registry.

npm install @tensorflow/tfjs @tensorflow/tfjs-node
// or...
npm install @tensorflow/tfjs @tensorflow/tfjs-node-gpu

Both Node.js extensions use native dependencies which will be compiled on demand.

Loading TensorFlow Libraries

TensorFlow’s JavaScript API is exposed from the core library. Extension modules to enable Node.js support do not expose additional APIs.

const tf = require('@tensorflow/tfjs')
// Load the binding (CPU computation)
require('@tensorflow/tfjs-node')
// Or load the binding (GPU computation)
require('@tensorflow/tfjs-node-gpu')

Loading TensorFlow Models

TensorFlow.js provides an NPM library (tfjs-models) to ease loading pre-trained & converted models for image classification, pose detection and k-nearest neighbours.

The MobileNet model used for image classification is a deep neural network trained to identify 1000 different classes.

In the project’s README, the following example code is used to load the model.

import * as mobilenet from '@tensorflow-models/mobilenet';

// Load the model.
const model = await mobilenet.load();

One of the first challenges I encountered was that this does not work on Node.js.

Error: browserHTTPRequest is not supported outside the web browser.

Looking at the source code, the mobilenet library is a wrapper around the underlying tf.Model class. When the load() method is called, it automatically downloads the correct model files from an external HTTP address and instantiates the TensorFlow model.

The Node.js extension does not yet support HTTP requests to dynamically retrieve models. Instead, models must be manually loaded from the filesystem.

After reading the source code for the library, I managed to create a work-around…

Loading Models From a Filesystem

Rather than calling the module’s load method, the MobileNet class can be created manually and its auto-generated path variable, which contains the HTTP address of the model, overwritten with a local filesystem path. Having done this, calling the load method on the class instance will trigger the filesystem loader class, rather than trying to use the browser-based HTTP loader.

const path = "mobilenet/model.json"
const mn = new mobilenet.MobileNet(1, 1);
mn.path = `file://${path}`
await mn.load()

Awesome, it works!

But where do the model files come from?

MobileNet Models

Models for TensorFlow.js consist of two file types, a model configuration file stored in JSON and model weights in a binary format. Model weights are often sharded into multiple files for better caching by browsers.

Looking at the automatic loading code for MobileNet models, the model configuration and weight shards are retrieved from a public storage bucket at this address.

https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v${version}_${alpha}_${size}/

The template parameters in the URL refer to the model versions listed here. Classification accuracy results for each version are also shown on that page.

According to the source code, only MobileNet v1 models can be loaded using the tensorflow-models/mobilenet library.

The HTTP retrieval code loads the model.json file from this location and then recursively fetches all referenced model weight shards. These files are in the format groupX-shard1of1.
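
For reference, the weight shard references live under the weightsManifest property of model.json, in roughly this (abridged) form:

{
  "modelTopology": { ... },
  "weightsManifest": [
    { "paths": ["group1-shard1of1"], "weights": [ ... ] },
    { "paths": ["group2-shard1of1"], "weights": [ ... ] }
  ]
}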

Downloading Models Manually

Saving all model files to a filesystem can be achieved by retrieving the model configuration file, parsing out the referenced weight files and downloading each weight file manually.

I want to use the MobileNet V1 model with 1.0 alpha value and image size of 224 pixels. This gives me the following URL for the model configuration file.

https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/model.json

Once this file has been downloaded locally, I can use the jq tool to parse all the weight file names.

$ cat model.json | jq -r ".weightsManifest[].paths[0]"
group1-shard1of1
group2-shard1of1
group3-shard1of1
...

Using the sed tool, I can prefix these names with the HTTP URL to generate URLs for each weight file.

$ cat model.json | jq -r ".weightsManifest[].paths[0]" | sed 's/^/https:\/\/storage.googleapis.com\/tfjs-models\/tfjs\/mobilenet_v1_1.0_224\//'
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group1-shard1of1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group2-shard1of1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group3-shard1of1
...

Using the parallel and curl commands, I can then download all of these files to my local directory.

cat model.json | jq -r ".weightsManifest[].paths[0]" | sed 's/^/https:\/\/storage.googleapis.com\/tfjs-models\/tfjs\/mobilenet_v1_1.0_224\//' |  parallel curl -O

Classifying Images

This example code is provided by TensorFlow.js to demonstrate returning classifications for an image.

const img = document.getElementById('img');

// Classify the image.
const predictions = await model.classify(img);

This does not work on Node.js due to the lack of a DOM.

The classify method accepts numerous DOM elements (canvas, video, image) and will automatically retrieve and convert image bytes from these elements into a tf.Tensor3D class which is used as the input to the model. Alternatively, the tf.Tensor3D input can be passed directly.

Rather than trying to use an external package to simulate a DOM element in Node.js, I found it easier to construct the tf.Tensor3D manually.

Generating Tensor3D from an Image

Reading the source code for the method used to turn DOM elements into Tensor3D classes, the following input parameters are used to generate the Tensor3D class.

const values = new Int32Array(image.height * image.width * numChannels);
// fill pixels with pixel channel bytes from image
const outShape = [image.height, image.width, numChannels];
const input = tf.tensor3d(values, outShape, 'int32');

values is an Int32Array which contains a sequential list of channel values for each pixel. numChannels is the number of channel values per pixel.

Creating Input Values For JPEGs

The jpeg-js library is a pure JavaScript JPEG encoder and decoder for Node.js. Using this library, the RGB values for each pixel can be extracted.

const pixels = jpeg.decode(buffer, true);

This returns an object whose data property is a Uint8Array with four channel values (RGBA) for each pixel (width * height). The MobileNet model only uses the three colour channels (RGB) for classification, ignoring the alpha channel. This code converts the four channel array into the correct three channel version.

const numChannels = 3;
const numPixels = pixels.width * pixels.height;
const values = new Int32Array(numPixels * numChannels);

for (let i = 0; i < numPixels; i++) {
  for (let channel = 0; channel < numChannels; ++channel) {
    values[i * numChannels + channel] = pixels.data[i * 4 + channel];
  }
}

MobileNet Models Input Requirements

The MobileNet model being used classifies images of width and height 224 pixels. Input tensors must contain float values, between -1 and 1, for each of the three channels pixel values.

Input values for images of different dimensions need to be re-sized before classification. Additionally, pixel values from the JPEG decoder are in the range 0 - 255, rather than -1 to 1. These values also need converting prior to classification.

TensorFlow.js has library methods to make this process easier but, fortunately for us, the tfjs-models/mobilenet library automatically handles this issue!

Developers can pass in Tensor3D inputs of type int32 and different dimensions to the classify method and it converts the input to the correct format prior to classification. Which means there’s nothing to do… Super.

Obtaining Predictions

MobileNet models in TensorFlow are trained to recognise entities from the top 1000 classes in the ImageNet dataset. The models output the probabilities that each of those entities is in the image being classified.

The full list of trained classes for the model being used can be found in this file.

The tfjs-models/mobilenet library exposes a classify method on the MobileNet class to return the top X classes with highest probabilities from an image input.

const predictions = await mn_model.classify(input, 10);

predictions is an array of X classes and probabilities in the following format.

{
  className: 'panda',
  probability: 0.9993536472320557
}

Example

Having worked out how to use the TensorFlow.js library and MobileNet models on Node.js, this script will classify an image given as a command-line argument.

source code

  • Save this script file and package descriptor to local files.
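
For reference, here is a condensed sketch of what such a script could look like, assembled from the snippets above (error handling omitted; the linked script is the canonical version).

const tf = require('@tensorflow/tfjs')
require('@tensorflow/tfjs-node')                // native CPU bindings
const mobilenet = require('@tensorflow-models/mobilenet')
const jpeg = require('jpeg-js')
const fs = require('fs')

const NUMBER_OF_CHANNELS = 3

// build the int32 input tensor from the decoded JPEG, as described above
const imageToInput = (image, numChannels) => {
  const numPixels = image.width * image.height
  const values = new Int32Array(numPixels * numChannels)

  for (let i = 0; i < numPixels; i++) {
    for (let channel = 0; channel < numChannels; ++channel) {
      values[i * numChannels + channel] = image.data[i * 4 + channel]
    }
  }

  return tf.tensor3d(values, [image.height, image.width, numChannels], 'int32')
}

// load the MobileNet model from the local filesystem using the path override
const loadModel = async path => {
  const mn = new mobilenet.MobileNet(1, 1)
  mn.path = `file://${path}`
  await mn.load()
  return mn
}

const classify = async (modelPath, imagePath) => {
  const image = jpeg.decode(fs.readFileSync(imagePath), true)
  const input = imageToInput(image, NUMBER_OF_CHANNELS)

  const model = await loadModel(modelPath)
  const predictions = await model.classify(input, 10)

  console.log('classification results:', predictions)
}

classify(process.argv[2], process.argv[3])      // node script.js mobilenet/model.json panda.jpg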

testing it out

  • Download the model files to a mobilenet directory using the instructions above.

  • Install the project dependencies using NPM

npm install
  • Download a sample JPEG file to classify
wget http://bit.ly/2JYSal9 -O panda.jpg

  • Run the script with the model file and input image as arguments.
node script.js mobilenet/model.json panda.jpg

If everything worked, the following output should be printed to the console.

classification results: [ {
    className: 'giant panda, panda, panda bear, coon bear',
    probability: 0.9993536472320557
} ]

The image is correctly classified as containing a Panda with 99.93% probability!

Conclusion

TensorFlow.js brings the power of deep learning to JavaScript developers. Using pre-trained models with the TensorFlow.js library makes it simple to extend JavaScript applications with complex machine learning tasks with minimal effort and code.

Having been released as a browser-based library, TensorFlow.js has now been extended to work on Node.js, although not all of the tools and utilities support the new runtime. With a few days’ hacking, I was able to use the library with the MobileNet models for visual recognition on images from a local file.

Getting this working in the Node.js runtime means I can now move on to my next idea… making this run inside a serverless function! Come back soon to read about my next adventure with TensorFlow.js.

Monitoring Dashboards With Kibana for IBM Cloud Functions

Following all the events from the World Cup can be hard. So many matches, so many goals. Rather than manually refreshing BBC Football to check the scores, I decided to create a Twitter bot that would automatically tweet out each goal.

The Twitter bot runs on IBM Cloud Functions. It is called once a minute to check for new goals, using the alarm trigger feed. If new goals have been scored, it calls another action to send the tweet messages.

Once it was running, I needed to ensure it was working correctly for the duration of the tournament. Using the IBM Cloud Logging service, I built a custom monitoring dashboard to help me recognise and diagnose issues.

The dashboard showed counts for successful and failed activations, when they occurred and a list of failed activations. If issues have occurred, I can retrieve the failed activation identifiers and investigate further.

Let’s walk through the steps used to create this dashboard to help you create custom visualisations for serverless applications running on IBM Cloud Functions…

IBM Cloud Logging

IBM Cloud Logging can be accessed using the link on the IBM Cloud Functions dashboard. This will open the logging service for the current organisation and space.

All activation records and application logs are automatically forwarded to the logging service by IBM Cloud Functions.

Log Message Fields

Activation records and application log messages have a number of common record fields.

  • activationId_str - activation identifier for log message.
  • timestamp - log draining time.
  • @timestamp - message ingestion time.
  • action_str - fully qualified action name

Log records for different message types are identified using the type field. This is either activation_record or user_logs for IBM Cloud Functions records.

Activation records have the following custom fields.

  • duration_int - activation duration in milliseconds
  • status_str - activation status response (non-zero for errors)
  • message - activation response returned from action
  • time_date - activation record start time
  • end_date - activation record end time
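
Putting those fields together, an activation record in the logging service looks roughly like this (all values shown are placeholders rather than real output):

{
  "type": "activation_record",
  "activationId_str": "<ACTIVATION_ID>",
  "action_str": "<NAMESPACE>/<ACTION_NAME>",
  "status_str": "<STATUS>",
  "duration_int": "<DURATION_MS>",
  "message": "<ACTIVATION_RESPONSE>",
  "time_date": "<START_TIME>",
  "end_date": "<END_TIME>"
}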

Application log lines, written to stdout or stderr, are forwarded as individual records. One application log line per record. Log message records have the following custom fields.

  • message - single application log line output
  • stream_str - log message source, either stdout or stderr
  • time_date - timestamp parsed from application log line

Finding Log Messages For One Activation

Use this query string in the “Discover” tab to retrieve all log messages from a particular activation.

activationId_str: <ACTIVATION_ID>

Search queries are executed against log records within a configurable time window.

Monitoring Dashboard

This is the monitoring dashboard I created. It contains visualisations showing counts for successful and failed activations, histograms of when they occurred and a list of the recent failed activation identifiers.

It allows me to quickly review the previous 24 hours activations for issues. If there are notable issues, I can retrieve the failed activation identifiers and investigate further.

Before being able to create the dashboard, I needed to define two resources: saved searches and visualisations.

Saved Searches

Kibana supports saving and referring to search queries from visualisations using explicit names.

Using saved searches with visualisations, rather than explicit queries, removes the need to manually update visualisations’ configuration when queries change.

This dashboard uses two custom queries in visualisations. Queries are needed to find activation records from both successful and failed invocations.

  • Create a new “Saved Search” named “activation records (success)” using the following search query.
type: activation_record AND status_str: 0
  • Create a new “Saved Search” named “activation records (failed)” using the following search query.
type: activation_record AND NOT status_str: 0

The status_str field is set to a non-zero value for failures. Using the type field ensures log messages from other sources are excluded from the results.

Indexed Fields

Before referencing log record fields in visualisations, those fields need to be indexed correctly. Use these instructions to verify activation records fields are available.

  • Check IBM Cloud Functions logs are available in IBM Cloud Logging using the ”Discover” tab.
  • Click the “⚙️ (Management)” menu item on the left-hand drop-down menu in IBM Cloud Logging.
  • Click the ”Index Patterns” link.
  • Click the refresh button to reload the field list.

Visualisations

Three types of visualisation are used on the monitoring dashboard. Metric displays are used for the activation counts, vertical bar charts for the activation times and a data table to list failed activations.

Visualisations can be created by opening the “Visualize” menu item and select a new visualisation type under the “Create New Visualization” menu.

Create five different visualisations, using the instructions below, before moving on to create the dashboard.

Activation Counts

Counts for successful and failed activations are displayed as singular metric values.

  • Select the “Metric” visualisation from the visualisation type list.
  • Use the “activation records (success)” saved search as the data source.
  • Ensure the Metric Aggregation is set to “Count”
  • Set the “Font Size” under the Options menu to 120pt.
  • Save the visualisation as “Activation Counts (Success)”

  • Repeat this process to create the failed activation count visualisation.
  • Use the “activation records (failed)” saved search as the data source.
  • Save the visualisation as “Activation Counts (Failed)”.

Activation Times

Activation counts over time, for successful and failed invocations, are displayed in vertical bar charts.

  • Select the “Vertical bar chart” visualisation from the visualisation type list.
  • Use the “activation records (success)” saved search as the data source.
  • Set the “Custom Label” to Invocations
  • Add an “X-Axis” bucket type under the Buckets section.
  • Choose “Date Histogram” for the aggregation, “@timestamp” for the field and “Minute” for the interval.
  • Save the visualisation as “Activation Times (Success)”

  • Repeat this process to create the failed activation times visualisation.
  • Use the “activation records (failed)” saved search as the data source.
  • Save the visualisation as “Activation Times (Failed)”

Failed Activations List

Activation identifiers for failed invocations are shown using a data table.

  • Select the “Data table” visualisation from the visualisation type list.
  • Use the “activation records (failed)” saved search as the data source.
  • Add a “Split Rows” bucket type under the Buckets section.
  • Choose “Date Histogram” for the aggregation, “@timestamp” for the field and “Second” for the interval.
  • Add a “sub-bucket” with the “Split Rows” type.
  • Set sub aggregation to “Terms”, field to “activationId_str” and order by “Term”.
  • Save the visualisation as “Errors Table”

Creating the dashboard

Having created the individual visualisations components, the monitoring dashboard can be constructed.

  • Click the “Dashboard” menu item from the left-hand menu panel.
  • Click the “Add” button to import visualisations into the current dashboard.
  • Add each of the five visualisations created above.

Hovering the mouse cursor over visualisations will reveal icons for moving and re-sizing.

  • Re-order the visualisations into the following rows:
    • Activation Counts
    • Activation Times
    • Failed Activations List
  • Select the “Last 24 hours” time window, available from the relative time ranges menu.
  • Save the dashboard as ”Cloud Functions Monitoring”. Tick the ”store time with dashboard” option.

Having saved the dashboard with time window, re-opening the dashboard will show our visualisations with data for the previous 24 hours. This dashboard can be used to quickly review recent application issues.

Conclusion

Monitoring serverless applications is crucial to diagnosing issues on serverless platforms.

IBM Cloud Functions provides automatic integration with the IBM Cloud Logging service. All activation records and application logs from serverless applications are automatically forwarded as log records. This makes it simple to build custom monitoring dashboards using these records for serverless applications running on IBM Cloud Functions.

Using this service with the World Cup Twitter bot allowed me to easily monitor the application for issues. This was much easier than manually retrieving and reviewing activation records using the CLI!

Debugging Node.js OpenWhisk Actions

Debugging serverless applications is one of the most challenging issues developers face when using serverless platforms. How can you use debugging tools without any access to the runtime environment?

Last week, I worked out how to expose the Node.js debugger in the Docker environment used for the application runtime in Apache OpenWhisk.

Using the remote debugging service, we can set breakpoints and step through action handlers live, rather than just being reliant on logs and metrics to diagnose bugs.

So, how does this work?

Let’s find out more about how Apache OpenWhisk executes serverless functions…

Background

Apache OpenWhisk is the open-source serverless platform which powers IBM Cloud Functions. OpenWhisk uses Docker containers to create isolated runtime environments for executing serverless functions.

Containers are started on-demand as invocation requests arrive. Serverless function source files are dynamically injected into the runtime and executed for each invocation. Between invocations, containers are paused and kept in a cache for re-use with further invocations.

The benefit of using an open-source serverless platform is that the build files used to create runtime images are also open-source. OpenWhisk also automatically builds and publishes all runtime images externally on Docker Hub. Running containers using these images allows us to simulate the remote serverless runtime environment.

Runtime Images

All OpenWhisk runtime images are published externally on Docker Hub.

Runtime images start a HTTP server which listens on port 8080. This HTTP server must implement two API endpoints (/init & /run) accepting HTTP POST requests. The platform uses these endpoints to initialise the runtime with action code and then invoke the action with event parameters.

More details on the API endpoints can be found in this blog post on creating Docker-based actions.

Node.js Runtime Image

This repository contains the source code used to create the Node.js runtime environment image.

https://github.com/apache/incubator-openwhisk-runtime-nodejs

Both the Node.js 8 and 6 runtimes are built from a common base image. This base image contains an Express.js server which handles the platform API requests. The app.js file containing the server is executed when the container starts.

JavaScript code is injected into the runtime using the /init API. Actions created from source code are dynamically evaluated to instantiate the code in the runtime. Actions created from zip files are extracted into a temporary directory and imported as a Node.js module.

Once instantiated, actions are executed using the /run API. Event parameters come from the request body. Each time a new request is received, the server calls the action handler with event parameters. Returned values are serialised as the JSON body in the API response.

Starting Node.js Runtime Containers

Use this command to start the Node.js runtime container locally.

1
$ docker run -it -p 8080:8080 openwhisk/action-nodejs-v8

Once the container has started, port 8080 on localhost will be mapped to the HTTP service exposed by the runtime environment. This can be used to inject serverless applications into the runtime environment and invoke the serverless function handler with event parameters.

Node.js Remote Debugging

Modern versions of the Node.js runtime have a command-line flag (--inspect) to expose a remote debugging service. This service runs a WebSocket server on localhost which implements the Chrome DevTools Protocol.

1
2
$ node --inspect index.js
Debugger listening on 127.0.0.1:9229.

External tools can connect to this port to provide debugging capabilities for Node.js code.

Docker images for the OpenWhisk Node.js runtimes use the following command to start the internal Node.js process. Remote debugging is not enabled by default.

1
node --expose-gc app.js

Docker allows containers to override the default image start command using a command line argument.

This command will start the OpenWhisk Node.js runtime container with the remote debugging service enabled. Binding the HTTP API and WebSocket ports to the host machine allows us to access those services remotely.

1
docker run -p 8080:8080 -p 9229:9229 -it openwhisk/action-nodejs-v8 node --inspect=0.0.0.0:9229 app.js

Once a container from the runtime image has started, we can connect our favourite debugging tools…

Chrome Dev Tools

To connect Chrome Dev Tools to the remote Node.js debugging service, follow these steps.

  • Open the chrome://inspect page in Chrome.

Chrome Dev Tools is configured to open a connection on port 9229 on localhost. If the web socket connection succeeds, the debugging target should be listed in the “Remote Target” section.

  • Click the ”Open dedicated DevTools for Node” link.

In the “Sources” panel the JavaScript files loaded by the Node.js process are available.

Setting breakpoints in the runner.js file will allow you to halt execution for debugging upon invocations.

VSCode

Visual Studio Code supports remote debugging of Node.js code using the Chrome Dev Tools protocol. Follow these steps to connect the editor to the remote debugging service.

  • Click the “Debug -> Add Configuration” menu item.
  • Select “Node.js: Attach to Remote Program” from the Intellisense menu.
  • Edit the default configuration to have the following values.
1
2
3
4
5
6
7
8
{
  "type": "node",
  "request": "attach",
  "name": "Attach to Remote",
  "address": "127.0.0.1",
  "port": 9229,
  "localRoot": "${workspaceFolder}"
}

  • Choose the new ”attach to remote” debugging profile and click the Run button.

The ”Loaded Scripts” window will show all the JavaScript files loaded by the Node.js process.

Setting breakpoints in the runner.js file will allow you to halt execution for debugging upon invocations.

Breakpoint Locations

Here are some useful locations to set breakpoints to catch errors in your serverless functions for the OpenWhisk Node.js runtime environments.

Initialisation Errors - Source Actions

If you are creating OpenWhisk actions from JavaScript source files, the code is dynamically evaluated during the /init request at this location. Putting a breakpoint here will allow you to catch errors thrown during that eval() call.

Initialisation Errors - Binary Actions

If you are creating OpenWhisk actions from a zip file containing JavaScript modules, this location is where the archive is extracted in the runtime filesystem. Putting a breakpoint here will catch errors from the extraction call and runtime checks for a valid JavaScript module.

This code is where the JavaScript module is imported once it has been extracted. Putting a breakpoint here will catch errors thrown importing the module into the Node.js environment.

Action Handler Errors

For both source file and zipped module actions, this location is where the action handler is invoked on each /run request. Putting a breakpoint here will catch errors thrown from within action handlers.

Invoking OpenWhisk Actions

Once you have attached the debugger to the remote Node.js process, you need to send the API requests to simulate the platform invocations. Runtime containers use separate HTTP endpoints to import the action source code into the runtime environment (/init) and then fire the invocation requests (/run).

Generating Init Request Body - Source Files

If you are creating OpenWhisk actions from JavaScript source files, send the following JSON body in the HTTP POST to the /init endpoint.

1
2
3
4
5
6
{
  "value": {
    "main": "<FUNCTION NAME IN SOURCE FILE>",
    "code": "<INSERT SOURCE HERE>"
  }
}

code is the JavaScript source to be evaluated which contains the action handler. main is the function name in the source file used for the action handler.

Using the jq command-line tool, we can create the JSON body for the source code in file.js.

1
$ cat file.js | jq -sR  '{value: {main: "main", code: .}}'
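
Redirecting the output to a file makes it easy to use with the initialisation request sent below (init.json is simply the file name used in that later example):

$ cat file.js | jq -sR '{value: {main: "main", code: .}}' > init.json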

Generating Init Request Body - Zipped Modules

If you are creating OpenWhisk actions from a zip file containing JavaScript modules, send the following JSON body in the HTTP POST to the /init endpoint.

1
2
3
4
5
6
7
{
  "value": {
    "main": "<FUNCTION NAME ON JS MODULE>",
    "code": "<INSERT BASE64 ENCODED STRING FROM ZIP FILE HERE>",
    "binary": true
  }
}

code must be a Base64 encoded string for the zip file. main is the function name returned in the imported JavaScript module to call as the action handler.

Using the jq command-line tool, we can create the JSON body for the zip file in action.zip.

1
$ base64 action.zip | tr -d '\n' | jq -sR '{value: {main: "main", binary: true, code: .}}'

Sending Init Request

The HTTPie tool makes it simple to send HTTP requests from the command-line.

Using this tool, the following command will initialise the runtime container with an OpenWhisk action.

1
2
3
4
5
6
$ http post localhost:8080/init < init.json
HTTP/1.1 200 OK
...
{
    "OK": true
}

If this HTTP request returns without an error, the action is ready to be invoked.

No further initialisation requests are needed unless you want to modify the action deployed.

Generating Run Request Body

Invocations of the action handler functions are triggered from a HTTP POST to the /run API endpoint.

Invocation parameters are sent in the JSON request body, using a JSON object with a value field.

1
2
3
4
5
6
{
  "value": {
    "some-param-name": "some-param-value",
    "another-param-name": "another-param-value",
  }
}
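
Save the JSON body to a file before sending the request; run.json is just the file name used in the next command. For example, with a single parameter:

$ echo '{"value": {"some-param-name": "some-param-value"}}' > run.json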

Sending Run Request

Using the HTTPie tool, the following command will invoke the OpenWhisk action.

1
2
3
4
5
6
$ http post localhost:8080/run < run.json
HTTP/1.1 200 OK
...
{
    "msg": "Hello world"
}

Returned values from the action handler are serialised as the JSON body in the HTTP response. Issuing further HTTP POST requests to the /run endpoint allows us to re-invoke the action.

Conclusion

Lack of debugging tools is one of the biggest complaints from developers migrating to serverless platforms.

Using an open-source serverless platform helps with this problem, by making it simple to run the same containers locally that are used for the platform’s runtime environments. Debugging tools can then be started from inside these local environments to simulate remote access.

In this example, this approach was used to enable the remote debugging service from the OpenWhisk Node.js runtime environment. The same approach could be used for any language and debugging tool needing local access to the runtime environment.

Having access to the Node.js debugger is a huge improvement when debugging challenging issues, rather than just being reliant on logs and metrics collected by the platform.

Binding IAM Services to IBM Cloud Functions

Binding service credentials to actions and packages is a much better approach to handling authentication credentials in IBM Cloud Functions than manually updating (and maintaining) default parameters. 🔐

IBM Cloud Functions supports binding credentials from IAM-based and Cloud Foundry provisioned services.

Documentation and blog posts demonstrating service binding focus on traditional platform services, created using the Cloud Foundry service broker. As IBM Cloud integrates IAM across the platform, more platform services will migrate to use the IAM service for managing authentication credentials.

How do we bind credentials for IAM-based services to IBM Cloud Functions? 🤔

Binding IAM-based services to IBM Cloud Functions works the same as traditional platform services, but has some differences in how to retrieve details needed for the service bind command.

Let’s look at how this works…

Binding IAM Credentials

Requirements

Before binding an IAM-based service to IBM Cloud Functions, the service instance (and service credentials for it) must already have been provisioned.

You will need the following information to bind service credentials.

  • Service name.
  • (Optional) Instance name.
  • (Optional) Credentials identifier.

Using the CLI

Use the ibmcloud wsk service bind command to bind service credentials to actions or packages.

1
bx wsk service bind <SERVICE_NAME> <ACTION|PACKAGE> --instance <INSTANCE> --keyname <KEY>

This command supports the following (optional) flags: --instance and --keyname.

If the instance and/or key names are not specified, the CLI uses the first instance and credentials returned from the system for the service identifier.
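
In the simplest case, only the service name and the action or package name need to be provided:

bx wsk service bind <SERVICE_NAME> <ACTION|PACKAGE>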

Accessing from actions

Credentials are stored as default parameters on the action or package.

The command uses a special parameter name (__bx_creds) to store all credentials. Individual service credentials are indexed using the service name.

1
2
3
4
5
6
7
8
{
   "__bx_creds":{
      "service-name":{
         "apikey":"<API_KEY>",
         ...
      }
   }
}

Default parameters are automatically merged into the request parameters during invocations.
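
This means an action can read the bound credentials straight from the event parameters. A minimal sketch, using the service-name key from the example structure above:

function main (params) {
  // __bx_creds is merged into the invocation parameters as a default parameter
  const creds = params.__bx_creds['service-name']
  return { api_key_available: Boolean(creds && creds.apikey) }
}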

Common Questions

How can I tell whether a service instance uses IAM-based authentication?

Running the ibmcloud resource service-instances command will return the IAM-based service instances provisioned.

Cloud Foundry provisioned services are available using a different command: ibmcloud service list.

Both service types can be bound using the CLI but the commands to retrieve the necessary details are different.

How can I find the service name for an IAM-based service instance?

Run the ibmcloud resource service-instance <INSTANCE_NAME> command.

Service names are shown as the Service Name: field value.

How can I list available service credentials for an IAM-based service instance?

Use the ibmcloud resource service-keys --instance-name <NAME> command.

Replace the <NAME> value with the service instance name returned from the ibmcloud resource service-instances command.

How can I manually retrieve IAM-based credentials for an instance?

Use the ibmcloud resource service-key <CREDENTIALS_NAME> command.

Replace the <CREDENTIALS_NAME> value with a credential name returned from the ibmcloud resource service-keys command.

How can I create new service credentials?

Credentials can be created through the service management page on IBM Cloud.

You can also use the CLI to create credentials using the ibmcloud resource service-key-create command. This command needs a name for the credentials, IAM role and service instance identifier.
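
For example, to create new credentials with the Manager role for a named instance (a sketch of the command; run ibmcloud resource service-key-create --help to see all options):

$ ibmcloud resource service-key-create <CREDENTIALS_NAME> Manager --instance-name <INSTANCE_NAME>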

Example - Cloud Object Storage

Having explained how to bind IAM-based services to IBM Cloud Functions, let’s look at an example…

Cloud Object Storage is the service used to manage files for serverless applications on IBM Cloud. This service supports the newer IAM-based authentication service.

Let’s look at how to bind authentication credentials for an instance of this service to an action.

Using the CLI, we can check an instance of this service is available…

1
2
3
4
5
$ ibmcloud resource service-instances
Retrieving service instances in resource group default..
OK
Name                     Location   State    Type               Tags
my-cos-storage           global     active   service_instance

In this example, we have a single instance of IBM Cloud Object Storage provisioned as my-cos-storage.

Retrieving instance details will show us the service name to use in the service binding command.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
$ ibmcloud resource service-instance my-cos-storage
Retrieving service instance my-cos-storage in resource group default..
OK

Name:                  my-cos-storage
ID:                    crn:v1:bluemix:public:cloud-object-storage:global:<GUID>:
GUID:                  <GUID>
Location:              global
Service Name:          cloud-object-storage
Service Plan Name:     lite
Resource Group Name:   default
State:                 active
Type:                  service_instance
Tags:

The IBM Cloud Object Storage service name is cloud-object-storage.

Before we can bind service credentials, we need to verify service credentials are available for this instance.

1
2
3
4
5
$ ibmcloud resource service-keys --instance-name my-cos-storage
Retrieving service keys in resource group default...
OK
Name                     State    Created At
serverless-credentials   active   Tue Jun  5 09:11:06 UTC 2018

This instance has a single service key available, named serverless-credentials.

Retrieving the service key details shows us the API secret for this credential.

1
2
3
4
5
6
7
8
9
10
11
$ ibmcloud resource service-key serverless-credentials
Retrieving service key serverless-credentials in resource group default...
OK

Name:          serverless-credentials
ID:            <ID>
Created At:    Tue Jun  5 09:11:06 UTC 2018
State:         active
Credentials:
               ...
               apikey:                   <SECRET_API_KEY_VALUE>

apikey denotes the secret API key used to authenticate calls to the service API.

Having retrieved the service name, instance identifier and available credentials, we can use these values to bind credentials to an action.

1
2
$ bx wsk service bind cloud-object-storage params --instance my-cos-storage --keyname serverless-credentials
Credentials 'serverless-credentials' from 'cloud-object-storage' service instance 'my-cos-storage' bound to 'params'.

Retrieving action details shows default parameters bound to an action. These will now include the API key for the Cloud Object Storage service.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
$ bx wsk action get params
ok: got action params
{
  ...
  "parameters": [{
    "key": "__bx_creds",
    "value": {
      "cloud-object-storage": {
        "apikey": "<API_KEY_SECRET>",
        ...
      }
    }
  }]
}

Under the __bx_creds default parameter, there is a cloud-object-storage property with the API key amongst other service credential values.

Using Cloud Object Storage From IBM Cloud Functions (Node.js)

How do you manage files for a serverless application? 🤔

Previous blog posts discussed this common problem and introduced the most popular solution, using a cloud-based object storage service. 👍👍👍

Object stores provide elastic storage in the cloud, with a billing model which charges for capacity used. These services are the storage solution for serverless applications, which do not have access to a traditional file system. 👍

I’m now going to demonstrate how to use IBM Cloud Object Storage from IBM Cloud Functions.

This blog post will show you…

  • How to provision IBM Cloud Object Storage and create authentication tokens.
  • How to use client libraries to access IBM Cloud Object Storage from IBM Cloud Functions.
  • Example serverless functions for common use-cases, e.g. uploading files.

Code examples in this blog post will focus on the Node.js runtime.

Instructions on service provisioning and authentication credentials are relevant for any runtime.

IBM Cloud Accounts and Storage Services

IBM Cloud Object Storage is available to all IBM Cloud users.

IBM Cloud has three different account types: lite, pay-as-you-go or subscription.

Lite Accounts

Lite accounts do not require a credit card to register and do not expire after a limited time period.

Numerous platform services, including Cloud Object Storage, provide free resources for lite account users. IBM Cloud Object Storage’s free resource tier comes with the following monthly limits.

  • Store 25GB of new data.
  • Issue 20,000 GET and 2,000 PUT requests.
  • Use 10GB of public bandwidth.

Lite tier usage supports all resiliency and storage class options but is limited to a single service instance.

Users can sign up for a free “Lite” account here. Please follow the instructions to install the IBM Cloud CLI.

Pay-as-you-Go & Subscription Accounts

Lite accounts can be upgraded to Pay-As-You-Go or Subscription accounts. Upgraded accounts still have access to the free tiers provided in Lite accounts. Users with Pay-As-You-Go or Subscription accounts can access services and tiers not included in the Lite account.

Benefits of the additional service tiers for IBM Cloud Object Storage include unlimited instances of the object storage service. Costs are billed according to usage per month. See the pricing page for more details: https://www.ibm.com/cloud-computing/bluemix/pricing-object-storage#s3api

Provisioning IBM Cloud Object Storage

IBM Cloud Object Storage can be provisioned through the IBM Cloud service catalog.

From the Service Details page, follow these instructions to provision a new instance.

  • Give the service an identifying name.
  • Leave the resource group as ”default”.
  • Click the “Create” button.

Once the service has been provisioned, it will be shown under the “Services” section of the IBM Cloud Dashboard. IBM Cloud Object Storage services are global services and not bound to individual regions.

  • Click the service instance from the dashboard to visit the service management page.

Once the service has been provisioned, we need to create authentication credentials for external access…

Service Credentials

Service credentials for IBM Cloud Object Storage use IBM Cloud’s IAM service.

I’m just going to cover the basics of using IAM with Cloud Object Storage. Explaining all the concepts and capabilities of the IAM service would need a separate (and lengthy) blog post!

Auto-Binding Service Credentials

IBM Cloud Functions can automatically provision and bind service credentials to actions.

This feature is supported through the IBM Cloud CLI command: bx wsk service bind.

Bound service credentials are stored as default action parameters. Default parameters are automatically included as request parameters for each invocation.

Using this approach means users do not have to manually provision and manage service credentials. 👍

Service credentials provisioned in this manner use the following configuration options:

  • IAM Role: Manager
  • Optional Configuration Parameters: None.

If you need to use different configuration options, you will have to manually provision service credentials.

Manually Creating Credentials

  • Select the ”Service Credentials” menu item from the service management page.
  • Click the “New credential” button.

Fill in the details for the new credentials.

  • Choose an identifying name for the credentials.
  • Select an access role. Access roles define which operations applications using these credentials can perform. Permissions for each role are listed in the documentation.

    Note: If you want to make objects publicly accessible, make sure you use the Manager permission.

  • Leave the Service ID unselected.

If you need HMAC service keys, which are necessary for generating presigned URLs, add the following inline configuration parameters. Otherwise, leave this field blank.

1
{"HMAC": true}
  • Click the “Add” button.

๐Ÿ” Credentials shown in this GIF were deleted after the demo (before you get any ideas…) ๐Ÿ”

Once created, new service credentials will be shown in the credentials table.

IBM Cloud Object Storage API

Cloud Object Storage exposes a HTTP API for interacting with buckets and files.

This API implements the same interface as the AWS S3 API.

Service credentials created above are used to authenticate requests to the API endpoints. Full details on the API operations are available in the documentation.

HTTP Endpoints

IBM Cloud Object Storage’s HTTP API is available through region-based endpoints.

When creating new buckets to store files, the data resiliency for the bucket (and therefore the files within it) is based upon the endpoint used for the bucket create operation.

Current endpoints are listed in the external documentation and available through an external API: https://cos-service.bluemix.net/endpoints

Choosing an endpoint

IBM Cloud Functions is available in the following regions: US-South, United Kingdom and Germany.

Accessing Cloud Object Storage using regional endpoints closest to the Cloud Functions application region will result in better application performance.

IBM Cloud Object Storage lists public and private endpoints for each region (and resiliency) choice. IBM Cloud Functions only supports access using public endpoints.

In the following examples, IBM Cloud Functions applications will be hosted in the US-South region. Using the US Regional endpoint for Cloud Object Storage will minimise network latency when using the service from IBM Cloud Functions.

This endpoint will be used in all our examples: s3-api.us-geo.objectstorage.softlayer.net

Client Libraries

Rather than manually creating HTTP requests to interact with the Cloud Object Storage API, client libraries are available.

IBM Cloud Object Storage publishes modified versions of the Node.js, Python and Java AWS S3 SDKs, enhanced with IBM Cloud specific features.

Both the Node.js and Python COS libraries are pre-installed in the IBM Cloud Functions runtime environments for those languages. They can be used without bundling those dependencies in the deployment package.

We’re going to look at using the JavaScript client library from the Node.js runtime in IBM Cloud Functions.

JavaScript Client Library

When using the JavaScript client library for IBM Cloud Object Storage, endpoint and authentication credentials need to be passed as configuration parameters.

1
2
3
4
5
6
7
8
9
const COS = require('ibm-cos-sdk');

const config = {
    endpoint: '<endpoint>',
    apiKeyId: '<api-key>',
    serviceInstanceId: '<resource-instance-id>',
};

const cos = new COS.S3(config);

Hardcoding configuration values within source code is not recommended. IBM Cloud Functions allows default parameters to be bound to actions. Default parameters are automatically passed into action invocations within the event parameters.

Default parameters are recommended for managing application secrets for IBM Cloud Functions applications.

Having provisioned the storage service instance, learnt about service credentials, chosen an access endpoint and understood how to use the client library, there’s one final step before we can start creating functions…

Creating Buckets

IBM Cloud Object Storage organises files into a flat hierarchy of named containers, called buckets. Buckets can be created through the command-line, using the API or the web console.
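
For reference, here is a minimal sketch of creating a bucket programmatically with the client library configured above. It assumes the us-geo endpoint and uses the 'us-standard' location constraint, which I believe corresponds to the Standard storage class for Cross Region buckets (check the service documentation for the exact values):

// `cos` is an authenticated COS.S3 client, created as shown in the configuration example above
const params = {
  Bucket: '<MY_BUCKET_NAME>',
  // assumed location constraint for a Standard class bucket on the us-geo endpoint
  CreateBucketConfiguration: { LocationConstraint: 'us-standard' }
}

cos.createBucket(params).promise()
  .then(() => console.log('bucket created'))
  .catch(err => console.error(err))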

Let’s create a new bucket, to store all files for our serverless application, using the web console.

  • Open the ”Buckets” page from the COS management page.
  • Click the ”Create Bucket” link.

  • Choose a bucket name. Bucket names must be unique across the entire platform, not just within your account.

  • Select the following configuration options
    • Resiliency: Cross Region
    • Location: us-geo
    • Storage class: Standard
  • Click the ”Create” button.

Once the bucket has been created, you will be taken back to the bucket management page.

Test Files

We need to put some test files in our new bucket. Download the following image files.

Using the bucket management page, upload these files to the new bucket.

Using Cloud Object Storage from Cloud Functions

Having created a storage bucket containing test files, we can start to develop our serverless application.

Let’s begin with a serverless function that returns a list of files within a bucket. Once this works, we will extend the application to support retrieving, removing and uploading files to a bucket. We can also show how to make objects publicly accessible and generate pre-signed URLs, allowing external clients to upload new content directly.

Separate IBM Cloud Functions actions will be created for each storage operation.

Managing Default Parameters

Serverless functions will need the bucket name, service endpoint and authentication parameters to access the object storage service. Configuration parameters will be bound to actions as default parameters.

Packages can be used to share configuration values across multiple actions. Actions created within a package inherit all default parameters stored on that package. This removes the need to manually configure the same default parameters for each action.

Let’s create a new package (serverless-files) for our serverless application.

1
2
$ bx wsk package create serverless-files
ok: created package serverless-files

Update the package with default parameters for the bucket name (bucket) and service endpoint (cos_endpoint).

1
2
$ bx wsk package update serverless-files -p bucket <MY_BUCKET_NAME> -p cos_endpoint s3-api.us-geo.objectstorage.softlayer.net
ok: updated package serverless-files

Did you notice we didn’t provide authentication credentials as default parameters?

Rather than manually adding these credentials, the CLI can automatically provision and bind them. Let’s do this now for the cloud-object-storage service…

  • Bind service credentials to the serverless-files package using the bx wsk service bind command.
1
2
$ bx wsk service bind cloud-object-storage serverless-files
Credentials 'cloud-fns-key' from 'cloud-object-storage' service instance 'object-storage' bound to 'serverless-files'.
  • Retrieve package details to check default parameters contain expected configuration values.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
$ bx wsk package get serverless-files
ok: got package serverless-files
{
    ...
    "parameters": [
        {
            "key": "bucket",
            "value": "<MY_BUCKET_NAME>"
        },
        {
            "key": "cos_endpoint",
            "value": "s3-api.us-geo.objectstorage.softlayer.net"
        },
        {
            "key": "__bx_creds",
            "value": {
                "cloud-object-storage": {
                    ...
                }
            }
        }
    ]
}

List Objects Within the Bucket

  • Create a new file (actions.js) with the following contents.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
const COS = require('ibm-cos-sdk')

function cos_client (params) {
  const bx_creds = params['__bx_creds']
  if (!bx_creds) throw new Error('Missing __bx_creds parameter.')

  const cos_creds = bx_creds['cloud-object-storage']
  if (!cos_creds) throw new Error('Missing cloud-object-storage parameter.')

  const endpoint = params['cos_endpoint']
  if (!endpoint) throw new Error('Missing cos_endpoint parameter.')

  const config = {
    endpoint: endpoint,
    apiKeyId: cos_creds.apikey,
    serviceInstanceId: cos_creds.resource_instance_id
  }

  return new COS.S3(config);
}

function list (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  const client = cos_client(params)

  return client.listObjects({ Bucket: params.bucket }).promise()
    .then(results => ({ files: results.Contents }))
}

This action retrieves the bucket name, service endpoint and authentication credentials from invocation parameters. Errors are returned if those parameters are missing.

  • Create a new package action from this source file with the following command.
1
2
$ bx wsk action create serverless-files/list-files actions.js --main list --kind nodejs:8
ok: created action list-files

The --main flag sets the function name to call for each invocation. This defaults to main. Setting this to an explicit value allows us to use a single source file for multiple actions.

The --kind flag sets the action runtime. This optional flag ensures we use the Node.js 8 runtime rather than Node.js 6, which is the default for JavaScript actions. The IBM Cloud Object Storage client library is only included in the Node.js 8 runtime.

  • Invoke the new action to verify it works.
1
2
3
4
5
6
7
8
$ bx wsk action invoke serverless-files/list-files -r
{
    "files": [
        { "Key": "jumping pug.jpg", ... },
        { "Key": "pug blanket.jpg", ... },
        { "Key": "swimming pug.jpg", ... }
    ]
}

The action response should contain a list of the files uploaded before. 💯💯💯

Retrieve Object Contents From Bucket

Let’s add another action for retrieving object contents from a bucket.

  • Add a new function (retrieve) to the existing source file (actions.js) with the following source code.
1
2
3
4
5
6
7
8
function retrieve (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  const client = cos_client(params)

  return client.getObject({ Bucket: params.bucket, Key: params.name }).promise()
    .then(result => ({ body: result.Body.toString('base64') }))
}

Retrieving a file needs the file name in addition to the bucket name. File contents are encoded as a Base64 string so they can be returned in the JSON response from IBM Cloud Functions.

  • Create an additional action from this updated source file with the following command.
1
2
$ bx wsk action create serverless-files/retrieve-file actions.js --main retrieve --kind nodejs:8
ok: created action serverless-files/retrieve-file
  • Invoke this action to test it works, passing the parameter name for the file to retrieve.
1
2
3
4
$ bx wsk action invoke serverless-files/retrieve-file -r -p name "jumping pug.jpg"
{
    "body": "<BASE64 ENCODED STRING>"
}

If this is successful, a (very long) response body containing a base64 encoded image should be returned. 👍
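
To recover the original image locally, the Base64 string in the response can be decoded. A sketch using jq and base64 (the decode flag may vary between platforms; output.jpg is an arbitrary file name):

$ bx wsk action invoke serverless-files/retrieve-file -r -p name "jumping pug.jpg" | jq -r .body | base64 --decode > output.jpg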

Delete Objects From Bucket

Let’s finish this section by adding a final action that removes objects from our bucket.

  • Update the source file (actions.js) with this additional function.
1
2
3
4
5
6
7
function remove (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  const client = cos_client(params)

  return client.deleteObject({ Bucket: params.bucket, Key: params.name }).promise()
}
  • Create a new action (remove-file) from the updated source file.
1
2
$ bx wsk action create serverless-files/remove-file actions.js --main remove --kind nodejs:8
ok: created action serverless-files/remove-file
  • Test this new action by using it to remove a file from the bucket.
1
2
$ bx wsk action invoke serverless-files/remove-file -r -p name "jumping pug.jpg"
{}
  • Listing bucket files should now return two files, rather than three.
1
2
3
4
5
6
7
$ bx wsk action invoke serverless-files/list-files -r
{
    "files": [
        { "Key": "pug blanket.jpg", ... },
        { "Key": "swimming pug.jpg", ... }
    ]
}

Listing, retrieving and removing files using the client library is relatively simple. Functions just need to call the correct method passing the bucket and object name.

Let’s move on to a more advanced example, creating new files in the bucket from our action…

Create New Objects Within Bucket

File content will be passed into our action as a Base64-encoded string, since JSON does not support binary data.

When creating new objects, we should set the MIME type. This is necessary for public access from web browsers, something we’ll be doing later on. Node.js libraries can calculate the correct MIME type value, rather than requiring this as an invocation parameter.

  • Update the source file (actions.js) with the following additional code.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
const mime = require('mime-types');

function upload (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  if (!params.body) throw new Error("Missing object parameter.")

  const client = cos_client(params)
  const body = Buffer.from(params.body, 'base64')

  const ContentType = mime.contentType(params.name) || 'application/octet-stream'
  const object = {
    Bucket: params.bucket,
    Key: params.name,
    Body: body,
    ContentType
  }

  return client.upload(object).promise()
}

exports.upload = upload;

As this code uses an external NPM library, we need to create the action from a zip file containing source files and external dependencies.

  • Create a package.json file with the following contents.
1
2
3
4
5
6
7
{
  "name": "upload-files",
  "main": "actions.js",
  "dependencies": {
    "mime-types": "^2.1.18"
  }
}
  • Install external libraries in local environment.
1
2
$ npm install
added 2 packages in 0.804s
  • Bundle source file and dependencies into zip file.
1
2
3
4
$ zip -r upload.zip package.json actions.js node_modules
  adding: actions.js (deflated 72%)
  adding: node_modules/ (stored 0%)
  ...
  • Create a new action from the zip file.
1
2
$ bx wsk action create serverless-files/upload-file upload.zip --main upload --kind nodejs:8
ok: created action serverless-files/upload-file
  • Create the Base64-encoded string used to pass the new file’s content.
1
2
$ wget http://www.pugnow.com/wp-content/uploads/2016/04/fly-pug-300x300.jpg
$ base64 fly-pug-300x300.jpg > body.txt
  • Invoke the action with the file name and content as parameters.
1
$ bx wsk action invoke serverless-files/upload-file -r -p body $(cat body.txt) -p name "flying pug.jpg"

Object details should be returned if the file was uploaded correctly.

1
2
3
4
5
6
7
{
    "Bucket": "my-serverless-files",
    "ETag": "\"b2ae0fb61dc827c03d6920dfae58e2ba\"",
    "Key": "flying pug.jpg",
    "Location": "https://<MY_BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg",
    "key": "flying pug.jpg"
}

Accessing the object storage dashboard shows the new object in the bucket, with the correct file name and size.

Having actions to create, delete and access objects within a bucket, what’s left to do? 🤔

Expose Public Objects From Buckets

Users can also choose to make certain objects within a bucket public. Public objects can be retrieved, using the external HTTP API, without any further authentication.

Public file access allows external clients to access files directly. It removes the need to invoke (and pay for) a serverless function to serve content. This is useful for serving static assets and media files.

Objects have an explicit property (x-amz-acl) which controls access rights. Files default to having this value set as private, meaning all operations require authentication. Setting this value to public-read will enable GET operations without authentication.

Files can be created with an explicit ACL property using credentials with the Writer or Manager role. Modifying ACL values for existing files is only supported using credentials with the Manager role.
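
For example, the upload function shown earlier could create publicly readable objects by including the ACL property in the upload parameters. A sketch (the upload_public name is hypothetical; it assumes the cos_client and mime helpers already defined in actions.js):

function upload_public (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  if (!params.body) throw new Error("Missing object parameter.")

  const client = cos_client(params)
  const body = Buffer.from(params.body, 'base64')

  const object = {
    Bucket: params.bucket,
    Key: params.name,
    Body: body,
    ContentType: mime.contentType(params.name) || 'application/octet-stream',
    // create the object with public read access, rather than updating the ACL afterwards
    ACL: 'public-read'
  }

  return client.upload(object).promise()
}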

  • Add the following source code to the existing actions file (actions.js).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
function make_public (params) {
  return update_acl(params, 'public-read')
}

function make_private (params) {
  return update_acl(params, 'private')
}

function update_acl (params, acl) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  const client = cos_client(params)

  const options = {
    Bucket: params.bucket,
    Key: params.name,
    ACL: acl
  }

  return client.putObjectAcl(options).promise()
}
  • Create two new actions from the updated source file.
1
2
3
4
$ bx wsk action create serverless-files/make-public actions.js --main make_public --kind nodejs:8
ok: created action serverless-files/make-public
$ bx wsk action create serverless-files/make-private actions.js --main make_private --kind nodejs:8
ok: created action serverless-files/make-private

Bucket objects use the following URL scheme: https://<BUCKET_NAME>.<ENDPOINT_HOSTNAME>/<OBJECT_NAME>

We have been using the following endpoint hostname: s3-api.us-geo.objectstorage.softlayer.net.

  • Checking the status code returned when accessing an existing object confirms it defaults to private.
1
2
3
$ curl -I https://<BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg
HTTP/1.1 403 Forbidden
...
  • Invoke the make-public action to allow GET requests without authentication.
1
$ bx wsk action invoke serverless-files/make-public -r -p name "flying pug.jpg"
  • Retry file access using the external HTTP API. This time a 200 response is returned with the content.
1
2
3
4
$ curl -I https://<BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg
HTTP/1.1 200 OK
Content-Type: image/jpeg
...

Having set an explicit content type for the file, opening this URL in a web browser will show the image.

  • Disable public access using the other new action.
1
bx wsk action invoke serverless-files/make-private -r -p name "flying pug.jpg"
  • Re-issue the curl request to the file location.
1
2
3
$ curl -I https://<BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg
HTTP/1.1 403 Forbidden
...

HTTP requests to this file now return a 403 status. Authentication is required again. 🔑

In addition to allowing public read access, we can go even further in allowing clients to interact with buckets…

Provide Direct Upload Access To Buckets

Cloud Object Storage provides a mechanism (presigned URLs) to generate temporary links that allow clients to interact with buckets without further authentication. Passing these links to clients means they can access private objects or upload new files to buckets. Presigned URLs expire after a configurable time period.

Generating presigned URLs is only supported from HMAC authentication keys.

HMAC service credentials must be manually provisioned, rather than using the bx wsk service bind command. See above for instructions on how to do this.

  • Save provisioned HMAC keys into a file called credentials.json.

Let’s create an action that returns presigned URLs, allowing users to upload files directly. Users will call the action with a new file name. Returned URLs will support an unauthenticated PUT request for the next five minutes.

  • Create a new file called presign.js
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
'use strict';

const COS = require('ibm-cos-sdk');
const mime = require('mime-types');

function cos_client (params) {
  const creds = params.cos_hmac_keys
  if (!creds) throw new Error('Missing cos_hmac_keys parameter.')

  const endpoint = params.cos_endpoint
  if (!endpoint) throw new Error('Missing cos_endpoint parameter.')

  const config = {
    endpoint: endpoint,
    accessKeyId: creds.access_key_id,
    secretAccessKey: creds.secret_access_key
  }

  return new COS.S3(config);
}

function presign (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")

  const client = cos_client(params)

  const options = {
    Bucket: params.bucket,
    Key: params.name,
    Expires: 300,
    ContentType: mime.contentType(params.name) || 'application/octet-stream'
  }

  return { url: client.getSignedUrl('putObject', options) }
}

exports.presign = presign;
  • Update the package.json file with the following contents.
1
2
3
4
5
6
7
{
  "name": "presign",
  "main": "presign.js",
  "dependencies": {
    "mime-types": "^2.1.18"
  }
}
  • Bundle source file and dependencies into zip file.
1
2
3
4
$ zip -r presign.zip package.json presign.js node_modules
  adding: presign.js (deflated 72%)
  adding: node_modules/ (stored 0%)
  ...
  • Create a new action from the zip file.
1
2
$ bx wsk action create serverless-files/presign presign.zip --main presign --kind nodejs:8 -P credentials.json
ok: created action serverless-files/presign
  • Invoke the action to return a presigned URL for a new file.
1
2
3
4
$ bx wsk action invoke serverless-files/presign -r -p name pug.jpg
{
    "url": "https://<BUCKET>.s3-api.us-geo.objectstorage.softlayer.net/pug.jpg?AWSAccessKeyId=<SECRET>&Content-Type=image%2Fjpeg&Expires=<TIME>&Signature=<KEY>"
}

Using this URL we can upload a new image without providing authentication credentials.

  • This curl command (--upload-file) will send a HTTP PUT, with the image file as the request body, to that URL.
1
$ curl --upload-file "my pug.jpg" <URL> --header "Content-Type: image/jpeg"

The HTTP request must include the correct “Content-Type” header. Use the value provided when creating the presigned URL. If these values do not match, the request will be rejected.

Exploring the objects in our bucket confirms we have uploaded a file! 🕺💃

Presigned URLs are a brilliant feature of Cloud Object Storage. Allowing users to upload files directly overcomes the payload limit for cloud functions. It also reduces the cost for uploading files, removing the cloud functions’ invocation cost.

Conclusion

Object storage services are the solution for managing files with serverless applications.

IBM Cloud provides both a serverless runtime (IBM Cloud Functions) and an object storage service (IBM Cloud Object Storage). In this blog post, we looked at how to integrate these services to provide a file storage solution for serverless applications.

We showed you how to provision new COS services, create and manage authentication credentials, access files using a client library and even allow external clients to interact directly with buckets. Sample serverless functions using the Node.js runtime were also provided.

Do you have any questions, comments or issues about the content above? Please leave a comment below, find me on the openwhisk slack or send me a tweet.

File Storage for Serverless Applications

“Where do you store files without a server?”

…is the most common question I get asked during Q&A after one of my “Introduction to Serverless Platforms” conference talks. Searching for this question online, this is the answer you will often find.

“Use an object store for file storage and access using the S3-compatible interface. Provide direct access to files by making buckets public and return pre-signed URLs for uploading content. Easy, right?”

Responding to people with this information often leads to the following response:

🤔🤔🤔

Developers who are not familiar with cloud platforms can often understand the benefits and concepts behind serverless, but don’t know the other cloud services needed to replicate application services from traditional (or server-full) architectures.

In this blog post, I want to explain why we do not use the file system for files in serverless applications and introduce the cloud services used to handle this.

serverless runtime file systems

Serverless runtimes do provide access to a filesystem with a (small) amount of ephemeral storage.

Serverless application deployment packages are extracted into this filesystem prior to execution. Uploading files into the environment relies on them being included within the application package. Serverless functions can read, modify and create files within this local file system.

These temporary file systems come with the following restrictions…

  • Maximum application package size limits additional files that can be uploaded.
  • Serverless platforms usually limit total usable space to around 512MB.
  • Modifications to the file system are lost once the environment is not used for further invocations.
  • Concurrent executions of the same function use independent runtime environments and do not share filesystem storage.
  • There is no access to these temporary file systems outside the runtime environment.

All these limitations make the file system provided by serverless platforms unsuitable as a scalable storage solution for serverless applications.

So, what is the alternative?

object stores

Object stores manage data as objects, as opposed to other storage architectures like file systems which manage data as a file hierarchy. Object-storage systems allow retention of massive amounts of unstructured data, with simple retrieval and search capabilities.

https://en.wikipedia.org/wiki/Object_storage

Object stores provide “storage-as-a-service” solutions for cloud applications.

These services are used for file storage within serverless applications.

Unlike traditional block storage devices, data objects in object storage services are organised using flat hierarchies of containers, known as ”buckets”. Objects within buckets are identified by unique identifiers, known as ”keys”. Metadata can also be stored alongside data objects for additional context.

Object stores provide simple access to files by applications, rather than users.

advantages of an object store

scalable and elastic storage

Rather than having a disk drive, with a fixed amount of storage, object stores provide scalable and elastic storage for data objects. Users are charged based upon the amount of data stored, API requests and bandwidth used. Object stores are built to scale as storage needs grow towards the petabyte range.

simple http access

Object stores provide a HTTP-based API endpoint to interact with the data objects.

Rather than using standard library methods to access the file system, which translate into system calls to the operating system, files are available over a standard HTTP endpoint.

Client libraries provide a simple interface for interacting with the remote endpoints.

expose direct access to files

Files stored in object storage can be made publicly accessible. Client applications can access files directly without needing to use an application backend as a proxy.

Special URLs can also be generated to provide temporary access to files for external clients. Clients can even use these URLs to directly upload and modify files. URLs are set to expire after a fixed amount of time.

ibm cloud object storage

IBM Cloud provides an object storage service called IBM Cloud Object Storage. This service provides the following features concerning resiliency, reliability and cost.

data resiliency

Buckets’ contents can be stored with the following automatic data resiliency choices.

  • Cross Region. Store data across three regions within a geographic area.
  • Regional. Store data in multiple data centres within a single geographic region.
  • Single Data Centre. Store data across multiple devices in a single data centre.

Cross Region is the best choice for “regional concurrent access and highest availability”. Regional is used for “high availability and performance”. Single Data Centre is appropriate “when data locality matters most”.

storage classes

Data access patterns can be used to save costs by choosing the appropriate storage class for data storage.

IBM Cloud Object Storage offers the following storage classes: Standard, Vault, Cold Vault, Flex.

Standard class is used for workloads with frequent data access. Vault and Cold Vault are used with infrequent data retrieval and data archiving workloads. Flex is a mixed storage class for workloads where access patterns are more difficult to predict.

costs

Storage class and data resiliency options are used to calculate the cost of service usage.

Storage is charged based upon the amount of data storage used, operational requests (GET, POST, PUT…) and outgoing public bandwidth.

Storage classes affect the price of data retrieval operations and storage costs. Storage classes used for archiving, e.g. cold vault, charge less for data storage and more for operational requests. Storage classes used for frequent access, e.g. standard, charge more for data storage and less for operational requests.

Higher resiliency data storage is more expensive than lower resiliency storage.

lite plan

IBM Cloud Object Storage provides a generous free tier (25GB storage per month, 5GB public bandwidth) for Lite account users. IBM Cloud Lite accounts provide perpetual access to a free set of IBM Cloud resources. Lite accounts do not expire after a time period or need a credit card to sign up.

conclusion

Serving files from serverless runtimes is often accomplished using object storage services.

Object stores provide a scalable and cost-effective service for managing files without using storage infrastructure directly. Storing files in an object store provides simple access from serverless runtimes and even allows the files to be made directly accessible to end users.

In the next blog posts, I’m going to show you how to set up IBM Cloud Object Storage and access files from serverless applications on IBM Cloud Functions. I’ll be demonstrating this approach for both the Node.js and Swift runtimes.

Configuring Alert Notifications Using Serverless Metrics

This blog post is the final part of a series on “Monitoring Serverless Applications Metrics”. See the introduction post for details and links to other posts.

In previous blog posts, we showed how to capture serverless metrics from IBM Cloud Functions, send those values into the IBM Cloud Monitoring service and build visualisation dashboards using Grafana.

Dashboards are a great way to monitor metrics but rely on someone watching them! We need a way to be alerted to issues without having to manually review dashboards.

Fortunately, IBM Cloud Monitoring service comes with an automatic alerting mechanism. Users configure rules that define metrics to monitor and expected values. When values fall outside normal ranges, alerts are sent using installed notification methods.

Let’s finish off this series on monitoring serverless applications by setting up a sample alert notification to monitor errors from our serverless applications…

Alerting in IBM Cloud Monitoring

IBM Cloud Monitoring service supports defining custom monitoring alerts. Users define rules to identify metric values to monitor and expected values. Alerts are triggered when metric values fall outside thresholds. Notification methods including email, webhooks and PagerDuty are supported.

Let’s set up a sample monitoring alert for IBM Cloud Functions applications.

We want to be notified when actions start to return error codes, rather than successful responses. The monitoring library already records boolean values for error responses from each invocation.

Creating monitoring alerts requires using the IBM Cloud Monitoring API.

Using the IBM Cloud Monitoring API requires authentication credentials and a space domain identifier. In a previous blog post, we showed how to retrieve these values.

Monitoring Rules API

Monitoring rules can be registered by sending an HTTP POST request to the /alert/rule endpoint.

Configuration parameters are included in the JSON body. This includes the metric query, threshold values and monitoring time window. Rules are connected to notification methods using notification identifiers.

This is an example rule configuration for monitoring errors from IBM Cloud Functions applications.

{
  "name": "ibm_cloud_functions",
  "description": "Monitor errors from all actions",
  "expression": "sumSeries(ibm.public.cloud-functions.<region>.<namespace>.*.*.error)",
  "enabled": true,
  "from": "-5min",
  "until": "now",
  "comparison": "above",
  "comparison_scope": "last",
  "error_level" : 10,
  "warning_level" : 1,
  "frequency": "1min",
  "dashboard_url": "https://metrics.ng.bluemix.net",
  "notifications": [
    "email_alert"
  ]
}

The expression parameter defines the query used to monitor values.

sumSeries(ibm.public.cloud-functions.<region>.<namespace>.*.*.error)

Error metric values use 0 for normal responses and 1 for errors. sumSeries adds up all error values recorded within the monitoring window.

Using a wildcard for the sixth field means all actions are monitored. Replacing this field value with an action name will restrict monitoring to just that action. Region and namespace templates need substituting with actual values for your application.
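
For example, restricting the rule to a single hypothetical action named fails (the action created later in this post) would use the following expression.

sumSeries(ibm.public.cloud-functions.<region>.<namespace>.fails.*.error)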

Threshold values for triggering alerts are defined using the warning_level and error_level parameters. Warning messages are triggered after a single action failure and error messages after ten failures.

Notification identifiers, registered using the API, are provided in the notifications field. Rules may include more than one notification identifier.

Notifications API

Notifications can be registered by sending an HTTP POST request to the /alert/notification endpoint. Configuration parameters are included in the JSON body.

This is an example configuration for email notifications.

{
  "name": "email_alert",
  "type": "Email",
  "description" : "Email alerting notifications",
  "detail": "email@address.com"
}

Notifications are configured using the type parameter in the body. Valid values for this field include Email, Webhook or PagerDuty. The detail field is used to include the email address, webhook endpoint or PagerDuty API key. The name field is used to reference this notification method when setting up rules.
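
For comparison, this is what a webhook notification might look like, with a hypothetical endpoint in the detail field.

{
  "name": "webhook_alert",
  "type": "Webhook",
  "description" : "Webhook alerting notifications",
  "detail": "https://your-webhook-endpoint.example.com/alerts"
}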

Setting up alerts for serverless errors

Creating an email notification

  • Create the notify.json file from the template above.
$ cat notify.json
{
  "name": "email_alert",
  "type": "Email",
  "description" : "Email alerting notifications",
  "detail": "your_email@address.com"
}
  • Send the following HTTP request using curl. Include scope and auth token values in the headers.
$ curl --request POST \
    --url https://metrics.ng.bluemix.net/v1/alert/notification \
    --header 'x-auth-scope-id: s-<YOUR_DOMAIN_SPACE_ID>' \
    --header 'x-auth-user-token: apikey <YOUR_API_KEY>' \
    --data @notify.json
{
  "status": 200,
  "message": "Created notification 'email_alert'"
}

Testing email notification

  • Send the following HTTP request using curl to generate a test email.
$ curl --request POST \
    --url https://metrics.ng.bluemix.net/v1/alert/notification/test/email_alert \
    --header 'x-auth-scope-id: s-<YOUR_DOMAIN_SPACE_ID>' \
    --header 'x-auth-user-token: apikey <YOUR_API_KEY>'
  • This returns the test notification message which will be emailed to the address.
{
    "status": 200,
    "message": "Triggered test for notification 'email_alert'",
    "content": {
      "rule_name": "test_rule_name",
      "description": "test_rule_description",
      "notification_name": "email_alert",
      "scope_id": "s-<YOUR_DOMAIN_SPACE_ID>",
      "expression": "test_rule_expression",
      "warning_level": "80",
      "error_level": "90.9",
      "dashboard_url": "https://metrics.ng.bluemix.net",
      "alert_messages": [
        {
          "target": "test_alert_target",
          "from_type": "OK",
          "to_type": "ERROR",
          "current_value": "95.0",
          "comparison": "above",
          "timestamp": "2018-01-25T12:36:05Z"
        }
      ]
    }
}
  • Check the email inbox to verify the message has arrived.

Create monitoring rule for errors

  • Create the rule.json file from the template above, replacing region and namespace values.

  • Send the following HTTP request using curl. Include scope and auth token values in the headers.

$ curl --request POST --url https://metrics.ng.bluemix.net/v1/alert/rule \
    --header 'x-auth-scope-id: s-<YOUR_DOMAIN_SPACE_ID>' \
    --header 'x-auth-user-token: apikey <YOUR_API_KEY>' \
    --data @rule.json
{
  "status": 200,
  "message": "Created rule 'ibm_cloud_functions'"
}

Testing alerts for serverless errors

Let’s generate some errors in a sample action to check the monitoring rule works.

Create failing action

  • Create a new Node.js library called “fails”.
$ mkdir fails && cd fails && npm init
  • Install the openwhisk-metrics library.
$ npm install openwhisk-metrics
  • Edit the index.js file to have the following source code.
const metrics = require('openwhisk-metrics')

// action handler which always returns an error response
const main = params => {
  return { error: 'Oh dear, this action failed...' }
}

// wrap the handler so invocation metrics (including the error value) are recorded
exports.main = metrics(main)
  • Create a zip archive containing these files and use it to create a new action.
$ zip -r action.zip *
  adding: index.js (deflated 22%)
  ...
$ bx wsk action create fails action.zip --kind nodejs:8
ok: created action fails
  • Invoke the action. Check the activation response is an error.
$ bx wsk action invoke fails -b
ok: invoked /_/fails with id cbee42f77c6543c6ae42f77c6583c6a7
{
  "activationId": "cbee42f77c6543c6ae42f77c6583c6a7",
  "response": {
    "result": {
      "error": "Oh dear, this action failed..."
    },
    "status": "application error",
    "success": false
  },
  ...
}

The response.success field should be false.

  • Update the actions parameter for the metric-forwarder action to include the fails action name.
$ cat params.json
{
  "actions": ["fails"],
  "service": {
    "api_key": "<API_KEY>",
    "host": "metrics.ng.bluemix.net",
    "scope": "s-<SPACE_ID>"
  },
  "since": 1516894777975
}
$ wsk action update metric-forwarder -P params.json

Generate serverless errors

Invoking the fails action should now trigger an email notification. Let’s test this out and trace metrics values through the platform.

  • Fire an action invocation using the CLI.
$ bx wsk action invoke fails -b
ok: invoked /_/fails with id 524b27044fd84b6a8b27044fd84b6ad8
...
  • Review the activation logs to show the error metric was recorded.
$ wsk activation logs 524b27044fd84b6a8b27044fd84b6ad8
...
stdout: METRIC <namespace>.fails.524b27044fd84b6a8b27044fd84b6ad8.error 1 1516895270
  • Invoke the metric-forwarder action to push metric values into the IBM Cloud Monitoring service.
$ bx wsk action invoke metric-forwarder -b
ok: invoked /_/metric-forwarder with id 295c47f05ea042849c47f05ea08284f0
  • Review activation logs to verify metric values were retrieved.
$ bx wsk activation logs 295c47f05ea042849c47f05ea08284f0
2018-01-25T15:51:47.160135346Z stdout: actions being monitored: [ 'fails' ]
2018-01-25T15:51:47.160177305Z stdout: retrieving logs since: 1516894777975
2018-01-25T15:51:47.290529179Z stdout: found 11 metric values from 1 activations
2018-01-25T15:51:47.291234046Z stdout: saving to metrics service -> metrics.ng.bluemix.net
2018-01-25T15:51:48.232790321Z stdout: saving metrics to service took: 941.169ms
2018-01-25T15:51:48.233334982Z stdout: updating since parameter: 1516895270458
  • Use the IBM Cloud Monitoring dashboard to show the error has been recorded.

  • Check your email inbox for the message showing the error notification!

  • Using the Cloud Monitoring API, we can retrieve the notification history to show this message was sent.
$ curl --request POST --url https://metrics.ng.bluemix.net/v1/alert/history \
    --header 'x-auth-scope-id: s-<YOUR_DOMAIN_SPACE_ID>' \
    --header 'x-auth-user-token: apikey <YOUR_API_KEY>'
[
  {
    "from_level": "OK",
    "metric_name": "sumSeries(ibm.public.cloud-functions.<region>.<namespace>.*.*.error)",
    "notification_names": [
      "email_alert"
    ],
    "rule_name": "ibm_cloud_functions",
    "timestamp": "2018-01-23T15:29:48Z",
    "to_level": "WARN",
    "value": 1
  }
]

Invoking the fails action more than ten times will trigger a second alert when the rule moves from warning to error thresholds.
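
If you don't want to run the CLI command repeatedly, a short Node.js script using the openwhisk npm package can fire the invocations. This is a minimal sketch; <API_HOST> and <API_KEY> are placeholders for your own platform credentials, and the metric-forwarder action still needs to run afterwards for the values to reach the monitoring service.

// sketch: invoke the fails action eleven times to push the error count past the threshold
// <API_HOST> and <API_KEY> are placeholders for your OpenWhisk platform credentials
const openwhisk = require('openwhisk')

const ow = openwhisk({ apihost: '<API_HOST>', api_key: '<API_KEY>' })

// non-blocking invocations; each activation still records an error metric
const invocations = Array.from({ length: 11 }, () => ow.actions.invoke({ name: 'fails' }))

Promise.all(invocations)
  .then(ids => console.log(`fired ${ids.length} invocations of the fails action`))
  .catch(err => console.error(err))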

Conclusion

IBM Cloud Monitoring service supports sending notification alerts based upon application metric values. Configuring notification rules, based upon our serverless metrics, ensures we will be alerted immediately when issues occur with our serverless applications. Notifications can be sent over email, webhooks or using PagerDuty.

In this series on “Monitoring Serverless Application Metrics”, we have shown you how to monitor serverless applications using IBM Cloud. Starting with capturing runtime metrics from IBM Cloud Functions, we then showed how to forward metrics into the IBM Cloud Monitoring service. Once metric values were being recorded, visualisation dashboards were built to help diagnose and resolve application issues. Finally, we configured automatic alerting rules to notify us over email as soon as issues developed.

Serverless applications are not “No Ops”, but “Different Ops”. Monitoring runtime metrics is still crucial. IBM Cloud provides a comprehensive set of tools for monitoring cloud applications. Utilising these services, you can create a robust monitoring solution for IBM Cloud Functions applications.