James Thomas

Notes on software.

Loosely-coupled Serverless Functions With Apache OpenWhisk

Just like in traditional software engineering, best practices for serverless applications advise keeping functions small and focused on a single task, aka “do one thing and do it well”. Small single-purpose functions are easier to develop, test and debug. 👍

But what happens when you need to execute multiple asynchronous tasks (implemented as separate functions) from an incoming event, like an API request? 🤔

Functions Calling Functions?

Functions can invoke other functions directly, using asynchronous calls through the client SDK. This works at the cost of introducing tighter coupling between functions, which is generally avoided in software engineering! Disadvantages of this approach include…

  • Functions which call other functions can be more difficult to test. Test cases need to mock out the client SDK to remove side-effects during unit or integration tests.
  • It can lead to repetitive code if you want to fire multiple tasks with the same event. Each invocation needs to manually handle error conditions and re-tries on network or other issues, which complicates the business logic.
  • The functions being invoked cannot be changed dynamically. The calling function has to be re-deployed with updated code.

Some people have even labelled “functions calling functions” an anti-pattern in serverless development! 😱

Hmmm… so what should we do?

Apache OpenWhisk has an awesome feature to help with this problem: triggers and rules! 👍

OpenWhisk Triggers & Rules

Triggers and Rules in OpenWhisk are similar to the Observer pattern from software engineering.

Users can fire “events” in OpenWhisk by invoking a named trigger with parameters. Rules are used to “subscribe” actions to all events for a given trigger name. Actions are invoked with event parameters when a trigger is fired. Multiple rules can be configured to support multiple “listeners” to the same trigger events. Event senders are decoupled from event receivers.

Developers using OpenWhisk are most familiar with triggers when used with feed providers, which subscribe actions to external event sources. The feed provider is responsible for listening to the event source and automatically firing trigger events with event details.

But triggers can be fired manually from actions to provide custom event streams! 🙌

const openwhisk = require('openwhisk')
const ow = openwhisk()
const params = {msg: 'event parameters'}

// replace code like this...
const result = await ow.actions.invoke({name: "some-action", params})

// ...with this
const result = await ow.triggers.invoke({name: "some-trigger", params})

This allows applications to move towards an event-driven architecture and promotes loose-coupling between functions with all the associated benefits for testing, deployment and scalability. 👌

creating triggers

Triggers are managed through the platform API. They can be created, deleted, retrieved and fired using HTTP requests. Users normally interact with triggers through the CLI or platform SDKs.

Triggers can be created using the following CLI command.

wsk trigger create <TRIGGER_NAME>

default parameters

Triggers support default parameters like actions. Default parameters are stored in the platform and included in all trigger events. If the event object includes parameters with the same key, default parameter values are ignored.

wsk trigger create <TRIGGER_NAME> -p <PARAM> <PARAM_VALUE> -p <PARAM_2> <PARAM_VALUE> ...
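
For example, a hypothetical alert trigger could be created with a default severity parameter and then fired with an event that overrides it.

wsk trigger create alert -p severity low
wsk trigger fire alert -p msg "disk full" -p severity high

Events fired without a severity parameter include the default value (low); the event above overrides it with high.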

binding triggers to actions with rules

Rules bind triggers to actions. When triggers are fired, all actions connected via rules are invoked with the trigger event. Multiple rules can refer to the same trigger supporting multiple listeners to the same event.

Rules can also be created using the following CLI command.

wsk rule create RULE_NAME TRIGGER_NAME ACTION_NAME

Tools like The Serverless Framework and wskdeploy allow users to configure triggers and rules declaratively through YAML configuration files.
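
As a rough sketch (service, function and trigger names are illustrative, not from a real project), the Serverless Framework's OpenWhisk provider lets a function subscribe to a trigger with a trigger event, creating the trigger and the binding rule during deployment.

service: my-app

provider:
  name: openwhisk

functions:
  twitter:
    handler: twitter.main
    events:
      - trigger: goal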

firing triggers

The JS SDK can be used to fire triggers programmatically from applications.

const openwhisk = require('openwhisk')
const ow = openwhisk()
const name = 'sample-trigger'
const params = {msg: 'event parameters'}
const result = await ow.triggers.invoke({name, params})

CLI commands (wsk trigger fire) can fire triggers manually with event parameters for testing.

wsk trigger fire sample-trigger -p msg "event parameters"

activation records for triggers

Activation records are created for trigger events. These activation records contain event parameters, rules fired, activation ids and invocation status for each action invoked. This is useful for debugging trigger events when issues occur.

$ wsk trigger fire sample-trigger -p hello world
ok: triggered /_/sample-trigger with id <ACTIVATION_ID>
$ wsk activation get <ACTIVATION_ID>
ok: got activation <ACTIVATION_ID>
{
 ...
}

The response.result property in the activation record contains the fired trigger event (combining default and event parameter values).

Rules fired by the trigger are recorded in activation records as the JSON values under the logs parameter.

{
  "statusCode": 0,
  "success": true,
  "activationId": "<ACTION_ACTIVATION_ID>",
  "rule": "<RULE_NAME>",
  "action": "<ACTION_NAME>"
}
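
Since each entry under logs is a JSON string, a small helper can turn an activation record (retrieved with wsk activation get or the SDK) into per-rule results. This is only a sketch; the sample record below is illustrative.

// sample activation record for a trigger, as returned by the platform
const record = {
  logs: ['{"statusCode":0,"success":true,"activationId":"<ID>","rule":"<RULE>","action":"<ACTION>"}']
}

// parse the per-rule results from a trigger activation record
const ruleResults = activation => activation.logs.map(entry => JSON.parse(entry))

// e.g. log any rules whose action invocation failed
ruleResults(record)
  .filter(result => !result.success)
  .forEach(result => console.log(`rule ${result.rule} failed with status ${result.statusCode}`))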

Activation records are only generated when triggers have enabled rules with valid actions attached.

Example - WC Goal Bot

This is great in theory but what about in practice?

Goal Bot was a small serverless application I built in 2018 for the World Cup. It was a Twitter bot which tweeted out all goals scored in real-time. The application used the “actions connected via trigger events” architecture pattern. This made development and testing easier and faster.

The application has two functions: goals and twitter.

goals was responsible for detecting new goals scored using an external API. When invoked, it would retrieve all goals currently scored in the World Cup. New goals were calculated by comparing the API response to a previously cached version. This function was connected to the alarm event source to run once a minute.

twitter was responsible for sending tweets from the @WC_Goals account. Twitter’s API was used to create goal tweets constructed from the event parameters.

Goal events detected in the goals function needed to invoke the twitter function.

Rather than the goals function invoking the twitter function directly, a trigger event (goal) was fired. The twitter function was bound to the goal trigger using a custom rule.
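
A sketch of what that looks like inside the goals action (function names and event fields are illustrative, not the bot's exact code):

const openwhisk = require('openwhisk')

// placeholder: the real bot compares the football API response with a cached copy
const checkForNewGoals = async () => []

const main = async () => {
  const ow = openwhisk()
  const newGoals = await checkForNewGoals()

  // fire one 'goal' trigger event per goal; the twitter action is bound by a rule
  for (const goal of newGoals) {
    await ow.triggers.invoke({ name: 'goal', params: goal })
  }

  return { fired: newGoals.length }
}

exports.main = main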

De-coupling the two tasks in my application (checking for new goals and creating tweets) using triggers and rules had the following benefits…

  • The goals function could be invoked in testing without tweets being sent. By disabling the rule binding the twitter function to the trigger, the goals function can fire events without causing side-effects.

  • Compared to having a “mono-function” combining both tasks, splitting tasks into functions means the twitter function can be tested with manual events, rather than having to manipulate the database and stub API responses to generate the correct test data.

  • It would also be easy to extend this architecture with additional notification services, like slack bots. New notification services could be attached to the same trigger source with an additional rule. This would not require any changes to the goals or twitter functions.

Triggers versus Queues

Another common solution to de-coupling functions in serverless architectures is using message queues.

Functions push events to external queues, rather than invoking triggers directly. Event sources are responsible for firing the registered functions with new messages. Apache OpenWhisk supports Kafka as an event source which could be used with this approach.

How does firing triggers directly compare to pushing events into an external queue (or other event source)?

Both queues and triggers can be used to achieve the same goal (“connect functions via events”) but have different semantics. It is important to understand the benefits of both to choose the most appropriate architecture for your application.

benefits of using triggers against queues

Triggers are built into the Apache OpenWhisk platform. There is no configuration needed to use them. External event sources like queues need to be provisioned and managed as additional cloud services.

Trigger invocations are free in IBM Cloud Functions. IBM Cloud Functions charges only for execution time and memory used in functions. Queues will incur additional usage costs based on the service’s pricing plan.

disadvantages of using triggers against queues

Triggers are not queues. Triggers are not queues. Triggers are not queues. 💯

If a trigger is fired and no actions are connected, the event is lost. Trigger events are not persisted until listeners are attached. If you need event persistence, message priorities, disaster recovery and other advanced features provided by message queues, use a message queue!

Triggers are subject to rate limiting in Apache OpenWhisk. In IBM Cloud Functions, this defaults to 1000 concurrent invocations and 5000 total invocations per namespace per minute. These limits can be raised through a support ticket but there are practical limits to the maximum rates allowed. Queues have support for much higher throughput rates.

External event providers are also responsible for handling the retries when triggers have been rate-limited due to excess events. Invoking triggers manually relies on the invoking function to handle this. Emulating retry behaviour from an event provider is impractical due to costs and limits on function duration.
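
If you do fire triggers manually, a small retry helper can smooth over transient failures or rate-limiting. This is only a sketch, assuming the openwhisk client shown earlier.

const openwhisk = require('openwhisk')
const ow = openwhisk()

// fire a trigger, retrying with exponential backoff on failure
const fireWithRetry = async (name, params, retries = 3, delay = 500) => {
  try {
    return await ow.triggers.invoke({ name, params })
  } catch (err) {
    if (retries === 0) throw err
    await new Promise(resolve => setTimeout(resolve, delay))
    return fireWithRetry(name, params, retries - 1, delay * 2)
  }
}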

Other hints and tips

Want to invoke an action which fires triggers without setting off listeners?

Rules can be dynamically disabled without having to remove them. This can be used during integration testing or debugging issues in production.

wsk rule disable RULE_NAME
wsk rule enable RULE_NAME

Want to verify triggers are fired with correct events without mocking client libraries?

Trigger events are not logged unless there is at least one enabled rule. Create a new rule which binds the /whisk.system/utils/echo action to the trigger. This built-in function just returns input parameters as the function response. This means the activation records with trigger events will now be available.
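
For example, a temporary debugging rule (the rule name is arbitrary) can be created, exercised and removed with the CLI.

wsk rule create debug-sample-trigger sample-trigger /whisk.system/utils/echo
wsk trigger fire sample-trigger -p msg "test event"
wsk rule delete debug-sample-trigger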

conclusion

Building event-driven serverless applications from loosely-coupled functions has numerous benefits including development speed, improved testability, deployment velocity, lower costs and more.

Decomposing “monolithic” apps into independent serverless functions often requires event-handling functions to trigger multiple backend operations, implemented as separate serverless functions. Developers unfamiliar with serverless often resort to direct function invocations.

Whilst this works, it introduces tight coupling between those functions, which is normally avoided in software engineering. This approach has even been highlighted as a “serverless” anti-pattern.

Apache OpenWhisk has an awesome feature to help with this problem: triggers and rules!

Triggers provide a lightweight event firing mechanism in the platform. Rules bind actions to triggers to automate invoking actions when events are fired. Applications can fire trigger events to invoke other operations, rather than using direct invocations. This keeps the event sender and receivers de-coupled from each other. 👍

Highly Available Serverless Apps With Cloudant’s Cross-Region Replication

Building highly available serverless applications relies on eliminating “single points of failure” from application architectures.

Existing tutorials showed how to deploy the same serverless application on IBM Cloud in different regions. Using the Global Load Balancer from IBM Cloud Internet Services, traffic is distributed across multiple applications from the same hostname. The Global Load Balancer automatically detects outages in the regional applications and redirects traffic as necessary.

But what if all instances rely on the same database service and that has issues? 😱🔥

In addition to running multiple instances of the application, independent databases in different regions are also necessary for a highly available serverless application. Maintaining consistent application state across regions needs all database changes to be automatically synchronised between instances. ๐Ÿค”

In this blog post, we’re going to look at using IBM Cloudant’s replication service to set up “multi-master” replication between regional database instances.

Once this is enabled, database changes will automatically be synchronised in real-time between all database instances. Serverless applications can use their regional database instance and be confident application state will be consistent globally (for some definition of consistent…). 💯

example serverless application - todo backend

This serverless application implements a TODO backend using IBM Cloud Functions and IBM Cloudant.

It provides a REST API for interacting with a TODO service. This can be used with the front-end client to add, complete and remove todos from a list.

Let’s make this example serverless application “highly available”. 👍

The application will be deployed to two different IBM Cloud regions (London and Dallas). Separate database instances will be provisioned in each region. Applications will use their regional database instance but share global state via replication.

deploy serverless app to multiple regions

This Github repo has an automatic deployment script to deploy the serverless application (using wskdeploy) and application services (using terraform).

Install the prerequisites listed here before proceeding with these instructions.

download example application

  • Clone the Git repository to a local directory.
git clone https://github.com/IBM/ibm-cloud-functions-refarch-serverless-apis
  • Enter the source code directory.
cd ibm-cloud-functions-refarch-serverless-apis

create IAM key for serverless app

Have you already signed up for an IBM Cloud account and installed the CLI? If not, please do that before proceeding.

  • Create an IAM key which will be used to deploy the serverless application.
ibmcloud iam api-key-create serverless_api --file serverless_api.apikey

configure deployment variables

  • Create the local.env file in the current directory with the following contents.
IBMCLOUD_API_KEY=<IAM_API_KEY>
IBMCLOUD_ORG=<YOUR_ORG>
IBMCLOUD_SPACE=<REGION_SPACE>
IBMCLOUD_REGION=
PROVISION_INFRASTRUCTURE=true
API_USE_APPID=false
  • Replace the <IAM_API_KEY> value with the apikey value from the serverless_api.apikey file.
  • Replace the <YOUR_ORG> value with an IBM Cloud organisation.
  • Replace the <REGION_SPACE> value with an IBM Cloud space.

The PROVISION_INFRASTRUCTURE parameter makes the deployment script automatically provision all application resources using Terraform.

Secured API endpoints are not required for this demonstration. Setting the API_USE_APPID parameter to false disables authentication on the endpoints and provisioning the AppID service.

deploy to london

  • Set the IBMCLOUD_REGION to eu-gb in the local.env file.
  • Run the following command to deploy the application and provision all application resources.
./deploy.sh --install

If the deployment has succeeded, the following message should be printed to the console.

2019-01-08 10:51:51 All done.
ok: APIs
Action                                      Verb  API Name  URL
/<ORG>_<SPACE>/todo_package/todo/get_todo   get   todos     https://<UK_APIGW_URL>/todo
...

deploy to dallas

  • Rename the terraform.tfstate file in the infra folder to terraform.tfstate.london

  • Set the IBMCLOUD_REGION to us-south in the local.env file.

  • Run the following command to deploy the application and provision all application resources.
./deploy.sh --install

If the deployment has succeeded, the following message should be printed to the console.

2019-01-08 10:51:51 All done.
ok: APIs
Action                                      Verb  API Name  URL
/<ORG>_<SPACE>/todo_package/todo/get_todo   get   todos     https://<US_APIGW_URL>/todo
...

configure cloudant cross-region replication

There are now multiple copies of the same serverless application in different regions. Each region has an independent instance of Cloudant provisioned.

Cloudant replication is a one-way synchronisation from a source to a destination database. To set up a bi-directional data synchronisation, two different replications will need to be configured.
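
The dashboard steps below create these replication jobs for you. Under the covers each job is equivalent to a document in the host's _replicator database, roughly like the sketch below (hosts and credentials are placeholders); a second document with source and target swapped completes the bi-directional setup.

{
  "_id": "todos-london-to-dallas",
  "source": "https://<API_KEY>:<PASSWORD>@<LONDON_CLOUDANT_HOST>/todos",
  "target": "https://<API_KEY>:<PASSWORD>@<DALLAS_CLOUDANT_HOST>/todos",
  "continuous": true
}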

create api keys for replication access

Before configuring replication between the regional databases, API keys need to be created to allow remote access on both hosts. API keys need to be created per regional instance.

  • Open the Cloudant Dashboard for each service instance.

Follow these instructions on both hosts to generate API keys for replication with the correct permissions.

  • Click the “Databases” icon to show all the databases on this instance.
  • Click the 🔒 icon in the “todos” database row in the table to open the permissions page.

Can’t find the “todos” database in the Cloudant dashboard? Make sure you interact with the TODO backend from the front-end application. This will automatically create the database if it doesn’t exist.

  • Click “Generate API Key” on the permissions page.
  • Make a note of the key identifier and password.
  • Set the _reader, _writer and _replicator permissions for the newly created key.

set up cross-region replication

Replication jobs need to be configured on both database hosts. These can be created from the Cloudant dashboard. Repeat these instructions on both hosts.

  • Open the Cloudant Dashboard for each service instance.
  • Click the “Replication” icon from the panel menu.
  • Click the “New Replication” button.
  • Set the following “Source” values in the “Job configuration” panel.
    • Type: “Local Database”
    • Name: “todos”
    • Authentication: “Cloudant username or API Key”
    • Fill in the API key and password for this local database host in the input fields.

  • Set the following “Target” values in the “Job configuration” panel.
    • Type: “Existing Remote Database”
    • Name: “https://<REMOTE_CLOUDANT_HOST>/todos”
    • Authentication: “Cloudant username or API Key”
    • Fill in the API key and password for the remote database host in the input fields.

Wondering what the REMOTE_CLOUDANT_HOST is? Use the hostname from the Cloudant dashboard, e.g. XXXX-bluemix.cloudant.com

  • Set the following “Options” values in the “Job configuration” panel.
    • Replication type: “Continuous”

  • Click “Start Replication”
  • Verify the replication table shows the new replication task state as “Running”. 👍

test it out!

Use the TODO front-end application with the APIGW URLs for each region simultaneously. Interactions with the todo list in one region should automatically propagate to the other region.

The “Active Tasks” panel on the Cloudant Dashboard shows the documents replicated between instances and pending changes. If there are errors synchronising changes to the replication target, the host uses exponential backoff to re-try the replication tasks.

Conflicts between document changes are handled using CouchDB’s conflict mechanism. Applications are responsible for detecting and resolving document conflicts in the front-end.
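
A sketch of how an application might check a document for conflicts using the standard CouchDB API (node-fetch, the hostname and the document id are illustrative):

const fetch = require('node-fetch')

// retrieve a todo with any conflicting revisions included
const conflictsFor = async (host, credentials, id) => {
  const auth = Buffer.from(credentials).toString('base64') // credentials as "apikey:password"
  const response = await fetch(`https://${host}/todos/${id}?conflicts=true`, {
    headers: { Authorization: `Basic ${auth}` }
  })
  const doc = await response.json()

  // _conflicts lists the losing revisions that still need resolving
  return doc._conflicts || []
}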

conclusion

Running the same serverless application in multiple regions, using the GLB to proxy traffic, allows applications to manage regional outages. But what if all the application instances rely on the same database service? The “single point of failure” has shifted from the application runtime to the database host. 👎

Provisioning independent databases in each application region is one solution. Applications use their regional database instance and are protected from issues in other regions. This strategy relies on database changes being synchronised between instances to keep the application state consistent. 👍

IBM Cloudant has a built-in replication service to synchronise changes between source and target databases. Setting up bi-directional replication tasks between all instances enables a “multi-master” replication strategy. This allows applications to access any database instance and have the same state available globally. 🕺🕺🕺

Using Custom Domains With IBM Cloud Functions

In this tutorial, I’m going to show you how to use a custom domain for serverless functions exposed as APIs on IBM Cloud. API endpoints use a random sub-domain on IBM Cloud by default. Importing your own domains means endpoints can be accessible through custom URLs.

Registering a custom domain with IBM Cloud requires completing the steps covered below.

This tutorial assumes you already have actions on IBM Cloud Functions exposed as HTTP APIs using the built-in API service. If you haven’t done that yet, please see the documentation here before you proceed.

The instructions below set up a sub-domain (api.<YOUR_DOMAIN>) to access serverless functions.

Generating SSL/TLS Certificates with Let’s Encrypt

IBM Cloud APIs only supports HTTPS traffic with custom domains. Users need to upload valid SSL/TLS certificates for those domains to IBM Cloud before being able to use them.

Let’s Encrypt is a Certificate Authority which provides free SSL/TLS certificates for domains. Let’s Encrypt is trusted by all root identity providers. This means certificates generated by this provider will be trusted by all major operating systems, web browsers, and devices.

Using this service, valid certificates can be generated to support custom domains on IBM Cloud.

domain validation

Let’s Encrypt needs to verify you control the domain before generating certificates.

During the verification process, the user makes an authentication token available through the domain. The service supports numerous methods for exposing the authentication token, including HTTP endpoints, DNS TXT records or TLS SNI.

There is an application (certbot) which automates generating authentication tokens and certificates.

I’m going to use the DNS TXT record as the challenge mechanism. Using this approach, certbot will provide a random authentication token I need to create as the TXT record value under the _acme-challenge.<YOUR_DOMAIN> sub-domain before validation.

using certbot with dns txt validation

brew install certbot
certbot certonly --manual --preferred-challenges=dns -d *.<YOUR_DOMAIN>

I’m generating a wildcard certificate for any sub-domains under <YOUR_DOMAIN>. This allows me to use the same certificate with different sub-domains on IBM Cloud, rather than generating a certificate per sub-domain.

During the validation process, certbot should display the following message with the challenge token.

Please deploy a DNS TXT record under the name
_acme-challenge.<YOUR_DOMAIN> with the following value:

<CHALLENGE_TOKEN>

Before continuing, verify the record is deployed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Press Enter to Continue

setting challenge token

  • Take the challenge token from certbot and create a new TXT record with this value for the _acme-challenge.<YOUR_DOMAIN> sub-domain.

  • Use the dig command to verify the TXT record is available.

dig -t txt _acme-challenge.<YOUR_DOMAIN>

The challenge token should be available in the DNS response shown by dig. 👍

;; ANSWER SECTION:
_acme-challenge.<YOUR_DOMAIN>. 3599 IN  TXT "<CHALLENGE_TOKEN>"
  • Press Enter in the terminal session running certbot when the challenge token is available.

retrieving domain certificates

certbot will now retrieve the TXT record for the sub-domain and verify it matches the challenge token. If the domain has been validated, certbot will show the directory containing the newly created certificates.

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/<YOUR_DOMAIN>/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/<YOUR_DOMAIN>/privkey.pem
   Your cert will expire on 2019-03-03.
...

certbot creates the following files.

  • cert.pem - public domain certificate
  • privkey.pem - private key for domain certificate
  • chain.pem - intermediate domain certificates
  • fullchain.pem - public and intermediate domain certificates in a single file.

Registering the domain with IBM Cloud will require the public, private and intermediate certificate files.

Registering Custom Domain with IBM Cloud

Certificates for custom domains in IBM Cloud are managed by the Certificate Manager service.

  • Create a new instance of the service from the IBM Cloud Catalog.
  • From the service homepage, click the “Import Certificate” button.
  • Fill in the following fields in the import form. Use the generated certificate files in the upload fields.
    • Name
    • Certificate File (cert.pem)
    • Private key file (privkey.pem)
    • Intermediate certificate file (chain.pem)

After importing the certificate, check the certificate properties match the expected values.

Binding Domain to IBM Cloud Functions APIs

Custom domains for APIs on IBM Cloud are managed through the IBM Cloud APIs console.

  • Open the “Custom Domains” section on the IBM Cloud APIs console.
  • Check the “Region” selector matches the region chosen for your actions and APIs.
  • Click the ··· icon on the row where “Organisation” and “Space” values match your APIs.
  • Click “Change Settings” from the pop-up menu.

domain validation

IBM Cloud now needs to verify you control the custom domain being used.

Another DNS TXT record needs to be created before attempting to bind the domain.

  • From the “Custom Domain Settings” menu, make a note of the “Default domain / alias” value. This should be in the format: <APP_ID>.<REGION>.apiconnect.appdomain.cloud.
  • Create a new TXT record for the custom sub-domain (api.<YOUR_DOMAIN>) with the default domain alias as the record value (<APP_ID>.<REGION>.apiconnect.appdomain.cloud).
  • Use the dig command to check the sub-domain TXT record exists and contains the correct value.
dig -t txt api.<YOUR_DOMAIN>

The default domain alias value should be available in the DNS response shown by dig. 👍

;; ANSWER SECTION:
api.<YOUR_DOMAIN>. 3599 IN  TXT "<APP_ID>.<REGION>.apiconnect.appdomain.cloud"

Having created the TXT record, fill in the Custom Domain Settings form.

custom domain settings

  • Select the “Assign custom domain” checkbox in the “Custom domain settings” form.
  • Fill in the following form fields.
    • Domain Name: use the custom sub-domain to bind (api.<YOUR-DOMAIN>).
    • Certificate Manager service: select the certificate manger instance.
    • Certificate: select the domain certificate from the drop-down menu.
  • Click the “Save” button.

Once the domain has been validated, the form will redirect to the custom domains overview. The “Custom Domain” field will now show the sub-domain bound to the correct default domain alias.

add CNAME record

  • Remove the existing TXT record for the custom sub-domain (api.<YOUR-DOMAIN>).
  • Add a new CNAME record mapping the custom sub-domain (api.<YOUR-DOMAIN>) to the “Default domain / alias” on IBM Cloud (<APP_ID>.<REGION>.apiconnect.appdomain.cloud).
  • Use the dig command to check the CNAME record is correct.
dig -t CNAME api.<YOUR_DOMAIN>

The default domain alias value should be available in the DNS response shown by dig. 👍

;; ANSWER SECTION:
api.<YOUR_DOMAIN>.  3599    IN  CNAME   <APP_ID>.<REGION>.apiconnect.appdomain.cloud.

Testing It Out

Functions should now be accessible through both the default domain alias and the new custom domain. 👍

  • Invoke the default domain alias API URL for the function.
curl https://<APP_ID>.<REGION>.apiconnect.appdomain.cloud/<BASE_PATH>/<SUB_PATH> 

Both the BASE_PATH and SUB_PATH values come from the API definitions configured by the user.

  • Invoke the custom domain API URL for the function.
curl https://api.<YOUR_DOMAIN>/<BASE_PATH>/<SUB_PATH> 

Make sure you use HTTPS protocol in the URL. IBM Cloud does not support HTTP traffic with custom domains.

Both responses for these URLs should be the same! Hurrah. 😎

Finding Photos on Twitter Using Face Recognition With TensorFlow.js

As a developer advocate, I spend a lot of time at developer conferences (talking about serverless 😎). Upon returning from each trip, I need to compile a “trip report” on the event for my bosses. This helps demonstrate the value in attending events and that I’m not just accruing air miles and hotel points for fun… 🛫🏨

I always include any social media content people post about my talks in the trip report. This is usually tweets with photos of me on stage. If people are tweeting about your session, I assume they enjoyed it and wanted to share with their followers.

Finding tweets with photos about your talk from attendees is surprisingly challenging.

Attendees often forget to include your twitter username in their tweets. This means the only way to find those photos is to manually scroll through all the results from the conference hashtag. This is problematic at conferences with thousands of attendees all tweeting during the event. #devrelproblems.

Having become bored of manually trawling through all the tweets for each conference, I had a thought…

“Can’t I write some code to do this for me?”

This didn’t seem like too ridiculous an idea. Twitter has an API, which would allow me to retrieve all tweets for a conference hashtag. Once I had all the tweet photos, couldn’t I run some magic AI algorithm over the images to tell me if I was in them? 🤔

After a couple of weeks of hacking around (and overcoming numerous challenges) I had (to my own amazement) managed to build a serverless application which can find unlabelled photos of a person on twitter using machine learning with TensorFlow.js.

FindMe Example

If you just want to try this application yourself, follow the instructions in the Github repo: https://github.com/jthomas/findme

architecture

FindMe Architecture Diagram

This application has four serverless functions (two API handlers and two backend services) and a client-side application from a static web page. Users log into the client-side application using Auth0 with their Twitter account. This provides the backend application with the user’s profile image and Twitter API credentials.

When the user invokes a search query, the client-side application invokes the API endpoint for the register_search function with the query terms and twitter credentials. This function registers a new search job in Redis and fires a new search_request trigger event with the query and job id. This job identifier is returned to the client to poll for real-time status updates.

The twitter_search function is connected to the search_request trigger and invoked for each event. It uses the Twitter Search API to retrieve all tweets for the search terms. If tweets retrieved from the API contain photos, those tweet ids (with photo urls) are fired as new tweet_image trigger events.

The compare_images function is connected to the tweet_image trigger. When invoked, it downloads the user’s twitter profile image along with the tweet image and runs face detection against both images, using the face-api.js library. If any faces in the tweet photo match the face in the user’s profile image, tweet ids are written to Redis before exiting.

The client-side web page retrieves real-time search results by polling the API endpoint for the search_status function with the search job id. Tweets with matching faces are displayed on the web page using the Twitter JS library.

challenges

Since I had found an NPM library to handle face detection, couldn’t I just use it on a serverless platform by including the library within the zip file used to create my serverless application? Sounds easy, right?!

ahem - not so faas-t…. ✋

As discussed in previous blog posts, there are numerous challenges in using TF.js-based libraries on serverless platforms. Starting with making the packages available in the runtime and loading model files to converting images for classification, these libraries are not like using normal NPM modules.

Here are the main challenges I had to overcome to make this serverless application work…

using tf.js libraries on a serverless platform

The Node.js backend drivers for TensorFlow.js use a native shared C++ library (libtensorflow.so) to execute models on the CPU or GPU. This native dependency is compiled for the platform during the npm install process. The shared library file is around 142MB, which is too large to include in the deployment package for most serverless platforms.

Normal workarounds for this issue store large dependencies in an object store. These files are dynamically retrieved during cold starts and stored in the runtime filesystem, as shown in this pseudo-code. This workaround does add an additional delay to cold start invocations.

let cold_start = true

const library = 'libtensorflow.so'

// on the first (cold) invocation, download the shared library into the filesystem
if (cold_start) {
  const data = from_object_store(library)
  write_to_fs(library, data)
  cold_start = false
}

// rest of function code…

Fortunately, I had a better solution using Apache OpenWhisk’s support for custom Docker runtimes!

This feature allows serverless applications to use custom Docker images as the runtime environment. Creating custom images with large libraries pre-installed means they can be excluded from deployment packages. 💯

Apache OpenWhisk publishes all existing runtime images on Docker Hub. Using existing runtime images as base images means Dockerfiles for custom runtimes are minimal. Here’s the Dockerfile needed to build a custom runtime with the TensorFlow.js Node.js backend drivers pre-installed.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs-node

Once this image has been built and published on Docker Hub, you can use it when creating new functions.

I used this approach to build a custom TensorFlow.js runtime which is available on Docker Hub: jamesthomas/action-nodejs-v8:tfjs-faceapi

OpenWhisk actions created using the wsk command-line use a configuration flag (--docker) to specify custom runtime images.

wsk action create classify source.js --docker jamesthomas/action-nodejs-v8:tfjs-faceapi

The OpenWhisk provider plugin for The Serverless Framework also supports custom runtime images through a configuration parameter (image) under the function configuration.

service: machine-learning

provider:
  name: openwhisk

functions:
  classify:
    handler: source.main
    image: jamesthomas/action-nodejs-v8:tfjs-faceapi

Having fixed the issue of library loading on serverless platforms, I could move on to the next problem, loading the pre-trained models… 💽

loading pre-trained models

Running the example code to load the pre-trained models for face recognition gave me this error:

ReferenceError: fetch is not defined

In the previous blog post, I discovered how to manually load TensorFlow.js models from the filesystem using the file:// URI prefix. Unfortunately, the face-api.js library doesn’t support this feature. Models are automatically loaded using the fetch HTTP client. This HTTP client is available in modern browsers but not in the Node.js runtime.

Overcoming this issue relies on providing an instance of a compatible HTTP client in the runtime. The node-fetch library is an implementation of the fetch client API for the Node.js runtime. By manually installing this module and exporting it as a global variable, the library can then use the HTTP client as expected.

// Make HTTP client available in runtime
global.fetch = require('node-fetch')

Model configuration and weight files can then be loaded from the library’s Github repository using this URL:

https://raw.githubusercontent.com/justadudewhohacks/face-api.js/master/weights/

faceapi.loadFaceDetectionModel('<GITHUB_URL>')

face detection in images

The face-api.js library has a utility function (models.allFaces) to automatically detect and calculate descriptors for all faces found in an image. Descriptors are a feature vector (of 128 32-bit float values) which uniquely describes the characteristics of a person’s face.

const results = await models.allFaces(input, minConfidence)

The input to this function is the input tensor with the RGB values from an image. In a previous blog post, I explained how to convert an image from the filesystem in Node.js to the input tensor needed by the model.
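
For reference, here is a minimal sketch of that conversion using jpeg-js (the helper name and approach are illustrative rather than the exact code from that post).

const fs = require('fs')
const jpeg = require('jpeg-js')
const tf = require('@tensorflow/tfjs-node')

// decode a JPEG file into the [height, width, 3] int32 tensor expected by the models
const imageToInput = path => {
  const pixels = jpeg.decode(fs.readFileSync(path))
  const values = new Int32Array(pixels.width * pixels.height * 3)

  for (let i = 0; i < pixels.width * pixels.height; i++) {
    for (let channel = 0; channel < 3; channel++) {
      values[i * 3 + channel] = pixels.data[i * 4 + channel] // drop the alpha channel
    }
  }

  return tf.tensor3d(values, [pixels.height, pixels.width, 3], 'int32')
}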

Finding a user by comparing their twitter profile against photos from tweets starts by running face detection against both images. By comparing computed descriptor values, a measure of similarity can be established between faces from the images.

face comparison

Once the face descriptors have been calculated, the library provides a utility function to compute the Euclidean distance between two descriptor vectors. If the difference between two face descriptors is less than a threshold value, this is used to identify the same person in both images.

const distance = faceapi.euclideanDistance(descriptor1, descriptor2)

if (distance < 0.6)
  console.log('match')
else
  console.log('no match')

I’ve no idea why 0.6 is chosen as the threshold value but this seemed to work for me! Even small changes to this value dramatically reduced the precision and recall rates for my test data. I’m calling it the Goldilocks value, just use it…

performance

Once I had the end to end application working, I wanted to make it as fast as possible. By optimising the performance, I could improve the application responsiveness and reduce compute costs for my backend. Time is literally money with serverless platforms.

baseline performance

Before attempting to optimise my application, I needed to understand the baseline performance. Setting up experiments to record invocation durations gave me the following average test results.

  • Warm invocations: ~5 seconds
  • Cold invocations: ~8 seconds

Instrumenting the code with console.time statements revealed execution time was comprised of five main sections.

                  Cold Starts           Warm Starts
Initialisation    1200 ms               0 ms
Model Loading     3200 ms               2000 ms
Image Loading     500 ms x 2            500 ms x 2
Face Detection    700 ms - 900 ms x 2   700 ms - 900 ms x 2
Everything Else   1000 ms               500 ms
Total Duration    ~ 8 seconds           ~ 5 seconds

Initialisation was the delay during cold starts to create the runtime environment and load all the library files and application code. Model Loading recorded the time spent instantiating the TF.js models from the source files. Image Loading was the time spent converting the RGB values from images into input tensors; this happened twice, once for the twitter profile picture and again for the tweet photo. Face Detection is the elapsed time to execute the models.allFaces and faceapi.euclideanDistance methods for all the detected faces. Everything Else is, well… everything else.

Since model loading was the largest section, this seemed like an obvious place to start optimising. 📈📉

loading model files from disk

Overcoming the initial model loading issue relied on manually exposing the expected HTTP client in the Node.js runtime. This allowed models to be dynamically loaded (over HTTP) from the external Github repository. Model files were about 36MB.

My first idea was to load these model files from the filesystem, which should be much faster than downloading from Github. Since I was already building a custom Docker runtime, it was a one-line change to include the model files within the runtime filesystem.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs-node

COPY weights weights

Having re-built the image and pushed it to Docker Hub, the classification function’s runtime environment now included the model files in the filesystem.

But how do we make the face-api.js library load model files from the filesystem when it is using an HTTP client?

My solution was to write a fetch client that proxied calls to retrieve files from a HTTP endpoint to the local filesystem. 😱 I’ll let you decide whether this is a brilliant or terrible idea!

const fs = require('fs')

// proxy fetch calls to the local filesystem
global.fetch = async (file) => {
  return {
    json: () => JSON.parse(fs.readFileSync(file, 'utf8')),
    arrayBuffer: () => fs.readFileSync(file)
  }
}

const model = await models.load('/weights')

The face-api.js library only used two methods (json() & arrayBuffer()) from the HTTP client. Stubbing out these methods to proxy fs.readFileSync meant file paths were loaded from the filesystem. Amazingly, this seemed to just work, hurrah!

Implementing this feature and re-running performance tests revealed this optimisation saved about 500 ms from the Model Loading section.

                  Cold Starts           Warm Starts
Initialisation    1200 ms               0 ms
Model Loading     2700 ms               1500 ms
Image Loading     500 ms x 2            500 ms x 2
Face Detection    700 ms - 900 ms x 2   700 ms - 900 ms x 2
Everything Else   1000 ms               500 ms
Total Duration    ~ 7.5 seconds         ~ 4.5 seconds

This was less of an improvement than I’d expected. Parsing all the model files and instantiating the internal objects was more computationally intensive than I realised. This performance improvement did improve both cold and warm invocations, which was a bonus.

Despite this optimisation, model loading was still the largest section in the classification function…

caching loaded models

There’s a good strategy to use when optimising serverless functions…

CACHE ALL THE THINGS

Serverless runtimes re-use runtime containers for consecutive requests, known as warm environments. Using local state, like global variables or the runtime filesystem, to cache data between requests can be used to improve performance during those invocations.

Since model loading was such an expensive process, I wanted to cache initialised models. Using a global variable, I could control whether to trigger model loading or return the pre-loaded models. Warm environments would re-use pre-loaded models and remove model loading delay.

const faceapi = require('face-api.js')

let LOADED = false

exports.load = async location => {
  if (!LOADED) {
    await faceapi.loadFaceDetectionModel(location)
    await faceapi.loadFaceRecognitionModel(location)
    await faceapi.loadFaceLandmarkModel(location)

    LOADED = true
  }

  return faceapi
}

This performance improvement had a significant impact on the performance of warm invocations. Model loading became “free”. 👍

                  Cold Starts           Warm Starts
Initialisation    1200 ms               0 ms
Model Loading     2700 ms               0 ms
Image Loading     500 ms x 2            500 ms x 2
Face Detection    700 ms - 900 ms x 2   700 ms - 900 ms x 2
Everything Else   1000 ms               500 ms
Total Duration    ~ 7.5 seconds         ~ 3 seconds

caching face descriptors

In the initial implementation, the face comparison function was executing face detection against both the user’s twitter profile image and tweet photo for comparison. Since the twitter profile image was the same in each search request, running face detection against this image would always return the same results.

Rather than redundantly computing this work in each invocation, caching the computed face descriptor for the profile image meant it could be re-used across invocations. This removed half of the image loading and face detection work from each search.

The face-api.js library returns the face descriptor as a typed array with 128 32-bit float values. Encoding these values as a hex string allows them to be stored in and retrieved from Redis. This code was used to convert float values to hex strings, whilst maintaining the exact precision of those float values.

const encode = typearr => {
  const encoded = Buffer.from(typearr.buffer).toString('hex')
  return encoded
}

const decode = encoded => {
  const decoded = Buffer.from(encoded, 'hex')
  const uints = new Uint8Array(decoded)
  const floats = new Float32Array(uints.buffer)
  return floats
}

This optimisation improves the performance of most cold invocations and all warm invocations, removing over 1200 ms of computation time.

                  Cold Starts (Cached)  Warm Starts
Initialisation    1200 ms               0 ms
Model Loading     2700 ms               1500 ms
Image Loading     500 ms                500 ms
Face Detection    700 ms - 900 ms       700 ms - 900 ms
Everything Else   1000 ms               500 ms
Total Duration    ~ 6 seconds           ~ 2.5 seconds

final results + cost

Application performance was massively improved with all these optimisations. As demonstrated in the video above, the application could process tweets in real-time, returning almost instant results. Average invocation durations were now as follows.

  • Warm invocations: ~2.5 seconds
  • Cold invocations (Cached): ~6 seconds

Serverless platforms charge for compute time by the millisecond, so these improvements led to cost savings of 25% for cold invocations (apart from the first classification for a user) and 50% for warm invocations.

Classification functions used 512MB of RAM which meant IBM Cloud Functions would provide 320,000 “warm” classifications or 133,333 “cold” classifications within the free tier each month. Ignoring the free tier, 100,000 “warm” classifications would cost around $2.13 and 100,000 “cold” classifications around $5.10.
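
The rough arithmetic behind those numbers, using the platform price of $0.000017 per GB-second and the 400,000 GB-second monthly free tier:

const price = 0.000017      // $ per GB-second on IBM Cloud Functions
const freeTier = 400000     // free GB-seconds per month
const memory = 0.5          // 512MB in GB

const warm = memory * 2.5   // 1.25 GB-seconds per warm classification
const cold = memory * 6     // 3 GB-seconds per cold classification

console.log(freeTier / warm)        // 320,000 free warm classifications
console.log(freeTier / cold)        // ~133,333 free cold classifications
console.log(100000 * warm * price)  // ~$2.13 for 100,000 warm classifications
console.log(100000 * cold * price)  // $5.10 for 100,000 cold classifications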

conclusion

Using TensorFlow.js with serverless cloud platforms makes it easy to build scalable machine learning applications in the cloud. Using the horizontal scaling capabilities of serverless platforms, thousands of model classifications can be run in parallel. This can be more performant than having dedicated hardware with a GPU, especially with compute costs for serverless applications being so cheap.

TensorFlow.js is ideally suited to serverless applications due to the JS interface, (relatively) small library size and availability of pre-trained models. Despite having no prior experience in Machine Learning, I was able to use the library to build a face recognition pipeline, processing hundreds of images in parallel, for real-time results. This amazing library opens up machine learning to a whole new audience!

Serverless Machine Learning With TensorFlow.js

In a previous blog post, I showed how to use TensorFlow.js on Node.js to run visual recognition on images from the local filesystem. TensorFlow.js is a JavaScript version of the open-source machine learning library from Google.

Once I had this working with a local Node.js script, my next idea was to convert it into a serverless function. Running this function on IBM Cloud Functions (Apache OpenWhisk) would turn the script into my own visual recognition microservice.

Sounds easy, right? It’s just a JavaScript library? So, zip it up and away we go… ahem 👊

Converting the image classification script to run in a serverless environment had the following challenges…

  • TensorFlow.js libraries need to be available in the runtime.
  • Native bindings for the library must be compiled against the platform architecture.
  • Model files need to be loaded from the filesystem.

Some of these issues were more challenging than others to fix! Let’s start by looking at the details of each issue, before explaining how Docker support in Apache OpenWhisk can be used to resolve them all.

Challenges

TensorFlow.js Libraries

TensorFlow.js libraries are not included in the Node.js runtimes provided by Apache OpenWhisk.

External libraries can be imported into the runtime by deploying applications from a zip file. Custom node_modules folders included in the zip file will be extracted in the runtime. Zip files are limited to a maximum size of 48MB.

Library Size

Running npm install for the TensorFlow.js libraries used revealed the first problem… the resulting node_modules directory was 175MB. 😱

Looking at the contents of this folder, the tfjs-node module compiles a native shared library (libtensorflow.so) that is 135MB. This means no amount of JavaScript minification is going to get those external dependencies under the magic 48 MB limit. 👎

Native Dependencies

The libtensorflow.so native shared library must be compiled for the same platform as the serverless runtime. Running npm install locally automatically compiles native dependencies against the host platform. Local environments may use different CPU architectures (Mac vs Linux) or link against shared libraries not available in the serverless runtime.

MobileNet Model Files

TensorFlow model files need to be loaded from the filesystem in Node.js. Serverless runtimes do provide a temporary filesystem inside the runtime environment. Files from deployment zip files are automatically extracted into this environment before invocations. There is no external access to this filesystem outside the lifecycle of the serverless function.

Model files for the MobileNet model were 16MB. If these files are included in the deployment package, it leaves 32MB for the rest of the application source code. Although the model files are small enough to include in the zip file, what about the TensorFlow.js libraries? Is this the end of the blog post? Not so fast….

Apache OpenWhisk’s support for custom runtimes provides a simple solution to all these issues!

Custom Runtimes

Apache OpenWhisk uses Docker containers as the runtime environments for serverless functions (actions). All platform runtime images are published on Docker Hub, allowing developers to start these environments locally.

Developers can also specify custom runtime images when creating actions. These images must be publicly available on Docker Hub. Custom runtimes have to expose the same HTTP API used by the platform for invoking actions.

Using platform runtime images as parent images makes it simple to build custom runtimes. Users can run commands during the Docker build to install additional libraries and other dependencies. The parent image already contains source files with the HTTP API service handling platform requests.

TensorFlow.js Runtime

Here is the Docker build file for the Node.js action runtime with additional TensorFlow.js dependencies.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs @tensorflow-models/mobilenet @tensorflow/tfjs-node jpeg-js

COPY mobilenet mobilenet

openwhisk/action-nodejs-v8:latest is the Node.js action runtime image published by OpenWhisk.

TensorFlow libraries and other dependencies are installed using npm install in the build process. Native dependencies for the @tensorflow/tfjs-node library are automatically compiled for the correct platform by installing during the build process.

Since I’m building a new runtime, I’ve also added the MobileNet model files to the image. Whilst not strictly necessary, removing them from the action zip file reduces deployment times.

Want to skip the next step? Use this image jamesthomas/action-nodejs-v8:tfjs rather than building your own.

Building The Runtime

In the previous blog post, I showed how to download model files from the public storage bucket.

  • Download a version of the MobileNet model and place all files in the mobilenet directory.
  • Copy the Docker build file from above to a local file named Dockerfile.
  • Run the Docker build command to generate a local image.
docker build -t tfjs .
  • Tag the local image with your Docker Hub username.
docker tag tfjs <USERNAME>/action-nodejs-v8:tfjs

Replace <USERNAME> with your Docker Hub username.

  • Push the tagged image to Docker Hub.
docker push <USERNAME>/action-nodejs-v8:tfjs

Once the image is available on Docker Hub, actions can be created using that runtime image. 😎

Example Code

This source code implements image classification as an OpenWhisk action. Image files are provided as a Base64 encoded string using the image property on the event parameters. Classification results are returned as the results property in the response.
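
The repository has the full source, but a stripped-down sketch of the handler's shape might look like the following. The model loading call and tf.node.decodeImage are assumptions for brevity; the actual action loads the bundled MobileNet files from the filesystem and uses jpeg-js for decoding.

const tf = require('@tensorflow/tfjs-node')
const mobilenet = require('@tensorflow-models/mobilenet')

let model = null // cached between warm invocations (see next section)

const main = async params => {
  if (!model) {
    model = await mobilenet.load() // assumption: the real action loads the bundled model files instead
  }

  const buffer = Buffer.from(params.image, 'base64') // Base64-encoded JPEG from the request
  const input = tf.node.decodeImage(buffer, 3)       // decode into an RGB tensor

  const results = await model.classify(input)
  input.dispose() // release the native memory backing the tensor (see the memory leak section)

  return { results }
}

exports.main = main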

Caching Loaded Models

Serverless platforms initialise runtime environments on-demand to handle invocations. Once a runtime environment has been created, it will be re-used for further invocations with some limits. This improves performance by removing the initialisation delay (“cold start”) from request processing.

Applications can exploit this behaviour by using global variables to maintain state across requests. This is often used to cache open database connections or store initialisation data loaded from external systems.

I have used this pattern to cache the MobileNet model used for classification. During cold invocations, the model is loaded from the filesystem and stored in a global variable. Warm invocations then use the existence of that global variable to skip the model loading process with further requests.

Caching the model reduces the time (and therefore cost) for classifications on warm invocations.

Memory Leak

Running the Node.js script from the previous blog post on IBM Cloud Functions was possible with minimal modifications. Unfortunately, performance testing revealed a memory leak in the handler function. 😢

Reading more about how TensorFlow.js works on Node.js uncovered the issue…

TensorFlow.js’s Node.js extensions use a native C++ library to execute the Tensors on a CPU or GPU engine. Memory allocated for Tensor objects in the native library is retained until the application explicitly releases it or the process exits. TensorFlow.js provides a dispose method on the individual objects to free allocated memory. There is also a tf.tidy method to automatically clean up all allocated objects within a frame.

Reviewing the code, tensors were being created as model input from images on each request. These objects were not disposed before returning from the request handler. This meant native memory grew unbounded. Adding an explicit dispose call to free these objects before returning fixed the issue.

Profiling & Performance

Action code records memory usage and elapsed time at different stages in the classification process.

Recording memory usage allows me to modify the maximum memory allocated to the function for optimal performance and cost. Node.js provides a standard library API to retrieve memory usage for the current process. Logging these values allows me to inspect memory usage at different stages.

Timing different tasks in the classification process, i.e. model loading, image classification, gives me an insight into how efficient classification is compared to other methods. Node.js has a standard library API for timers to record and print elapsed time to the console.

Demo

Deploy Action

  • Run the following command with the IBM Cloud CLI to create the action.
ibmcloud fn action create classify --docker <IMAGE_NAME> index.js

Replace <IMAGE_NAME> with the public Docker Hub image identifier for the custom runtime. Use jamesthomas/action-nodejs-v8:tfjs if you haven’t built this manually.

Testing It Out

  • Download a sample image to test the classification.
wget http://bit.ly/2JYSal9 -O panda.jpg
  • Invoke the action with the Base64 encoded image as an input parameter.
 ibmcloud fn action invoke classify -r -p image $(base64 panda.jpg)
  • The returned JSON message contains classification probabilities. 🐼🐼🐼
{
  "results":  [{
    className: 'giant panda, panda, panda bear, coon bear',
    probability: 0.9993536472320557
  }]
}

Activation Details

  • Retrieve logging output for the last activation to show performance data.
ibmcloud fn activation logs --last

Profiling and memory usage details are logged to stdout.

prediction function called.
memory used: rss=150.46 MB, heapTotal=32.83 MB, heapUsed=20.29 MB, external=67.6 MB
loading image and model...
decodeImage: 74.233ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.05 MB, external=40.63 MB
imageByteArray: 5.676ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.05 MB, external=45.51 MB
imageToInput: 5.952ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.06 MB, external=45.51 MB
mn_model.classify: 274.805ms
memory used: rss=149.83 MB, heapTotal=24.33 MB, heapUsed=20.57 MB, external=45.51 MB
classification results: [...]
main: 356.639ms
memory used: rss=144.37 MB, heapTotal=24.33 MB, heapUsed=20.58 MB, external=45.51 MB

main is the total elapsed time for the action handler. mn_model.classify is the elapsed time for the image classification. Cold start requests print an extra log message with model loading time, loadModel: 394.547ms.

Performance Results

Invoking the classify action 1000 times for both cold and warm activations (using 256MB memory) generated the following performance results.

warm invocations

Classifications took an average of 316 milliseconds to process when using warm environments. Looking at the timing data, converting the Base64 encoded JPEG into the input tensor took around 100 milliseconds. Running the model classification task took between 200 and 250 milliseconds.

cold invocations

Classifications took an average of 1260 milliseconds to process when using cold environments. These requests incur penalties for initialising new runtime containers and loading models from the filesystem. Each of these tasks took around 400 milliseconds.

One disadvantage of using custom runtime images in Apache OpenWhisk is the lack of pre-warmed containers. Pre-warming is used to reduce cold start times by starting runtime containers before they are needed. This is not supported for non-standard runtime images.

classification cost

IBM Cloud Functions provides a free tier of 400,000 GB-seconds per month. Each further second of execution is charged at $0.000017 per GB of memory allocated. Execution time is rounded up to the nearest 100ms.

If all activations were warm, a user could execute more than 4,000,000 classifications per month in the free tier using an action with 256MB. Once outside the free tier, around 600,000 further invocations would cost just over $1.

If all activations were cold, a user could execute more than 1,200,000 classifications per month in the free tier using an action with 256MB. Once outside the free tier, around 180,000 further invocations would cost just over $1.
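
Working through those numbers: a warm 316 millisecond activation rounds up to 400 milliseconds, so a 256MB (0.25 GB) action consumes 0.1 GB-seconds per invocation, giving 400,000 / 0.1 = 4,000,000 free invocations and a cost of 0.1 × $0.000017 ≈ $0.0000017 per further invocation (around $1 per 600,000). A cold 1260 millisecond activation rounds up to 1.3 seconds, or 0.325 GB-seconds per invocation, giving roughly 1,230,000 free invocations and around $1 per 180,000 further invocations.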

Conclusion

TensorFlow.js brings the power of deep learning to JavaScript developers. Using pre-trained models with the TensorFlow.js library makes it simple to extend JavaScript applications with complex machine learning tasks with minimal effort and code.

Getting a local script to run image classification was relatively simple, but converting it to a serverless function came with more challenges! Apache OpenWhisk restricts the maximum application size to 50MB and the native library dependencies were much larger than this limit.

Fortunately, Apache OpenWhisk’s custom runtime support allowed us to resolve all these issues. By building a custom runtime with the native dependencies and model files, those libraries can be used on the platform without including them in the deployment package.

Machine Learning in Node.js With TensorFlow.js

TensorFlow.js is a new version of the popular open-source library which brings deep learning to JavaScript. Developers can now define, train, and run machine learning models using the high-level library API.

Pre-trained models mean developers can now easily perform complex tasks like visual recognition, generating music or detecting human poses with just a few lines of JavaScript.

Having started as a front-end library for web browsers, recent updates added experimental support for Node.js. This allows TensorFlow.js to be used in backend JavaScript applications without having to use Python.

Reading about the library, I wanted to test it out with a simple task… ๐Ÿง

Use TensorFlow.js to perform visual recognition on images using JavaScript from Node.js

Unfortunately, most of the documentation and example code provided uses the library in a browser. Project utilities provided to simplify loading and using pre-trained models have not yet been extended with Node.js support. Getting this working ended up requiring a lot of time spent reading the TypeScript source files for the library. ๐Ÿ‘Ž

However, after a few days’ hacking, I managed to get this completed! Hurrah! ๐Ÿคฉ

Before we dive into the code, let’s start with an overview of the different TensorFlow libraries.

TensorFlow

TensorFlow is an open-source software library for machine learning applications. TensorFlow can be used to implement neural networks and other deep learning algorithms.

Released by Google in November 2015, TensorFlow was originally a Python library. It used either CPU or GPU-based computation for training and evaluating machine learning models. The library was initially designed to run on high-performance servers with expensive GPUs.

Recent updates have extended the software to run in resource-constrained environments like mobile devices and web browsers.

TensorFlow Lite

TensorFlow Lite, a lightweight version of the library for mobile and embedded devices, was released in May 2017. This was accompanied by a new series of pre-trained deep learning models for vision recognition tasks, called MobileNet. MobileNet models were designed to work efficiently in resource-constrained environments like mobile devices.

TensorFlow.js

Following TensorFlow Lite, TensorFlow.js was announced in March 2018. This version of the library was designed to run in the browser, building on an earlier project called deeplearn.js. The library uses WebGL for GPU access. Developers use a JavaScript API to train, load and run models.

TensorFlow.js was recently extended to run on Node.js, using an extension library called tfjs-node.

The Node.js extension is an alpha release and still under active development.

Importing Existing Models Into TensorFlow.js

Existing TensorFlow and Keras models can be executed using the TensorFlow.js library. Models need converting to a new format using this tool before execution. Pre-trained and converted models for image classification, pose detection and k-nearest neighbours are available on Github.

Using TensorFlow.js in Node.js

Installing TensorFlow Libraries

TensorFlow.js can be installed from the NPM registry.

1
2
3
npm install @tensorflow/tfjs @tensorflow/tfjs-node
// or...
npm install @tensorflow/tfjs @tensorflow/tfjs-node-gpu

Both Node.js extensions use native dependencies which will be compiled on demand.

Loading TensorFlow Libraries

TensorFlow’s JavaScript API is exposed from the core library. Extension modules to enable Node.js support do not expose additional APIs.

1
2
3
4
5
const tf = require('@tensorflow/tfjs')
// Load the binding (CPU computation)
require('@tensorflow/tfjs-node')
// Or load the binding (GPU computation)
require('@tensorflow/tfjs-node-gpu')

Loading TensorFlow Models

TensorFlow.js provides an NPM library (tfjs-models) to ease loading pre-trained & converted models for image classification, pose detection and k-nearest neighbours.

The MobileNet model used for image classification is a deep neural network trained to identify 1000 different classes.

In the project’s README, the following example code is used to load the model.

1
2
3
4
import * as mobilenet from '@tensorflow-models/mobilenet';

// Load the model.
const model = await mobilenet.load();

One of the first challenges I encountered was that this does not work on Node.js.

1
Error: browserHTTPRequest is not supported outside the web browser.

Looking at the source code, the mobilenet library is a wrapper around the underlying tf.Model class. When the load() method is called, it automatically downloads the correct model files from an external HTTP address and instantiates the TensorFlow model.

The Node.js extension does not yet support HTTP requests to dynamically retrieve models. Instead, models must be manually loaded from the filesystem.

After reading the source code for the library, I managed to create a work-around…

Loading Models From a Filesystem

Rather than calling the module’s load method, the MobileNet class can be instantiated manually. This allows the auto-generated path property, which contains the HTTP address of the model, to be overwritten with a local filesystem path. Having done this, calling the load method on the class instance will trigger the filesystem loader class, rather than the browser-based HTTP loader.

1
2
3
4
const path = "mobilenet/model.json"
const mn = new mobilenet.MobileNet(1, 1);
mn.path = `file://${path}`
await mn.load()

Awesome, it works!

But where do the model files come from?

MobileNet Models

Models for TensorFlow.js consist of two file types, a model configuration file stored in JSON and model weights in a binary format. Model weights are often sharded into multiple files for better caching by browsers.

Looking at the automatic loading code for MobileNet models, the model configuration and weight shards are retrieved from a public storage bucket at this address.

1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v${version}_${alpha}_${size}/

The template parameters in the URL refer to the model versions listed here. Classification accuracy results for each version are also shown on that page.

According to the source code, only MobileNet v1 models can be loaded using the tensorflow-models/mobilenet library.

The HTTP retrieval code loads the model.json file from this location and then recursively fetches all referenced model weights shards. These files are in the format groupX-shard1of1.

Downloading Models Manually

Saving all model files to a filesystem can be achieved by retrieving the model configuration file, parsing out the referenced weight files and downloading each weight file manually.

I want to use the MobileNet V1 model with an alpha value of 1.0 and an image size of 224 pixels. This gives me the following URL for the model configuration file.

1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/model.json

Once this file has been downloaded locally, I can use the jq tool to parse all the weight file names.

1
2
3
4
5
$ cat model.json | jq -r ".weightsManifest[].paths[0]"
group1-shard1of1
group2-shard1of1
group3-shard1of1
...

Using the sed tool, I can prefix these names with the HTTP URL to generate URLs for each weight file.

1
2
3
4
5
$ cat model.json | jq -r ".weightsManifest[].paths[0]" | sed 's/^/https:\/\/storage.googleapis.com\/tfjs-models\/tfjs\/mobilenet_v1_1.0_224\//'
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group1-shard1of1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group2-shard1of1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group3-shard1of1
...

Using the parallel and curl commands, I can then download all of these files to my local directory.

1
cat model.json | jq -r ".weightsManifest[].paths[0]" | sed 's/^/https:\/\/storage.googleapis.com\/tfjs-models\/tfjs\/mobilenet_v1_1.0_224\//' |  parallel curl -O

Classifying Images

This example code is provided by TensorFlow.js to demonstrate returning classifications for an image.

1
2
3
4
const img = document.getElementById('img');

// Classify the image.
const predictions = await model.classify(img);

This does not work on Node.js due to the lack of a DOM.

The classify method accepts numerous DOM elements (canvas, video, image) and will automatically retrieve and convert image bytes from these elements into a tf.Tensor3D class which is used as the input to the model. Alternatively, the tf.Tensor3D input can be passed directly.

Rather than trying to use an external package to simulate a DOM element in Node.js, I found it easier to construct the tf.Tensor3D manually.

Generating Tensor3D from an Image

Reading the source code for the method used to turn DOM elements into Tensor3D classes, the following input parameters are used to generate the Tensor3D class.

1
2
3
4
const values = new Int32Array(image.height * image.width * numChannels);
// fill pixels with pixel channel bytes from image
const outShape = [image.height, image.width, numChannels];
const input = tf.tensor3d(values, outShape, 'int32');

values is a flat array of type Int32Array which contains a sequential list of channel values for each pixel. numChannels is the number of channel values per pixel.

Creating Input Values For JPEGs

The jpeg-js library is a pure JavaScript JPEG encoder and decoder for Node.js. Using this library, the RGB values for each pixel can be extracted.

1
const pixels = jpeg.decode(buffer, true);

This will return a Uint8Array with four channel values (RGBA) for each pixel (width * height). The MobileNet model only uses the three colour channels (RGB) for classification, ignoring the alpha channel. This code converts the four channel array into the correct three channel version.

1
2
3
4
5
6
7
8
9
const numChannels = 3;
const numPixels = image.width * image.height;
const values = new Int32Array(numPixels * numChannels);

for (let i = 0; i < numPixels; i++) {
  for (let channel = 0; channel < numChannels; ++channel) {
    values[i * numChannels + channel] = pixels[i * 4 + channel];
  }
}

MobileNet Models Input Requirements

The MobileNet model being used classifies images of width and height 224 pixels. Input tensors must contain float values between -1 and 1 for each of the three channels’ pixel values.

Input values for images of different dimensions need to be re-sized before classification. Additionally, pixel values from the JPEG decoder are in the range 0 - 255, rather than -1 to 1. These values also need converting prior to classification.

TensorFlow.js has library methods to make this process easier but, fortunately for us, the tfjs-models/mobilenet library automatically handles this issue! ๐Ÿ‘

Developers can pass in Tensor3D inputs of type int32 and different dimensions to the classify method and it converts the input to the correct format prior to classification. Which means there’s nothing to do… Super ๐Ÿ•บ๐Ÿ•บ๐Ÿ•บ.

Obtaining Predictions

MobileNet models in TensorFlow are trained to recognise entities from the top 1000 classes in the ImageNet dataset. The models output the probabilities that each of those entities is in the image being classified.

The full list of trained classes for the model being used can be found in this file.

The tfjs-models/mobilenet library exposes a classify method on the MobileNet class to return the top X classes with highest probabilities from an image input.

1
const predictions = await mn_model.classify(input, 10);

predictions is an array of X classes and probabilities in the following format.

1
2
3
4
{
  className: 'panda',
  probability: 0.9993536472320557
}

Example

Having worked out how to use the TensorFlow.js library and MobileNet models on Node.js, this script will classify an image given as a command-line argument.

source code

  • Save this script file and package descriptor to local files (a sketch of the script is shown below).
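
A sketch of the script (my reconstruction from the snippets above, not the exact published source):

const tf = require('@tensorflow/tfjs')
require('@tensorflow/tfjs-node')
const mobilenet = require('@tensorflow-models/mobilenet')
const jpeg = require('jpeg-js')
const fs = require('fs')

const NUMBER_OF_CHANNELS = 3

// convert RGBA pixel bytes from the decoder into an RGB Int32Array
const imageByteArray = (image, numChannels) => {
  const pixels = image.data
  const numPixels = image.width * image.height
  const values = new Int32Array(numPixels * numChannels)
  for (let i = 0; i < numPixels; i++) {
    for (let channel = 0; channel < numChannels; ++channel) {
      values[i * numChannels + channel] = pixels[i * 4 + channel]
    }
  }
  return values
}

// build the Tensor3D input expected by the classify method
const imageToInput = (image, numChannels) => {
  const values = imageByteArray(image, numChannels)
  const outShape = [image.height, image.width, numChannels]
  return tf.tensor3d(values, outShape, 'int32')
}

const classify = async (modelPath, imagePath) => {
  const image = jpeg.decode(fs.readFileSync(imagePath), true)
  const input = imageToInput(image, NUMBER_OF_CHANNELS)

  // load the MobileNet model from the local filesystem
  const mn = new mobilenet.MobileNet(1, 1)
  mn.path = `file://${modelPath}`
  await mn.load()

  const predictions = await mn.classify(input)
  console.log('classification results:', predictions)
}

classify(process.argv[2], process.argv[3])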

testing it out

  • Download the model files to a mobilenet directory using the instructions above.

  • Install the project dependencies using NPM

1
npm install
  • Download a sample JPEG file to classify
1
wget http://bit.ly/2JYSal9 -O panda.jpg

  • Run the script with the model file and input image as arguments.
1
node script.js mobilenet/model.json panda.jpg

If everything worked, the following output should be printed to the console.

1
2
3
4
classification results: [ {
    className: 'giant panda, panda, panda bear, coon bear',
    probability: 0.9993536472320557
} ]

The image is correctly classified as containing a Panda with 99.93% probability! ๐Ÿผ๐Ÿผ๐Ÿผ

Conclusion

TensorFlow.js brings the power of deep learning to JavaScript developers. Using pre-trained models with the TensorFlow.js library makes it simple to extend JavaScript applications with complex machine learning tasks with minimal effort and code.

Having been released as a browser-based library, TensorFlow.js has now been extended to work on Node.js, although not all of the tools and utilities support the new runtime. With a few days’ hacking, I was able to use the library with the MobileNet models for visual recognition on images from a local file.

Getting this working in the Node.js runtime means I now move on to my next idea… making this run inside a serverless function! Come back soon to read about my next adventure with TensorFlow.js. ๐Ÿ‘‹

Monitoring Dashboards With Kibana for IBM Cloud Functions

Following all the events from the World Cup can be hard. So many matches, so many goals. Rather than manually refreshing BBC Football to check the scores, I decided to create a Twitter bot that would automatically tweet out each goal.

The Twitter bot runs on IBM Cloud Functions. It is called once a minute to check for new goals, using the alarm trigger feed. If new goals have been scored, it calls another action to send the tweet messages.

Once it was running, I needed to ensure it was working correctly for the duration of the tournament. Using the IBM Cloud Logging service, I built a custom monitoring dashboard to help me recognise and diagnose issues.

The dashboard showed counts for successful and failed activations, when they occurred and a list of failed activations. If issues had occurred, I could retrieve the failed activation identifiers and investigate further.

Let’s walk through the steps used to create this dashboard to help you create custom visualisations for serverless applications running on IBM Cloud Functions…

IBM Cloud Logging

IBM Cloud Logging can be accessed using the link on the IBM Cloud Functions dashboard. This will open the logging service for the current organisation and space.

All activation records and application logs are automatically forwarded to the logging service by IBM Cloud Functions.

Log Message Fields

Activation records and application log messages have a number of common record fields.

  • activationId_str - activation identifier for log message.
  • timestamp - log draining time.
  • @timestamp - message ingestion time.
  • action_str - fully qualified action name

Log records for different message types are identified using the type field. This is either activation_record or user_logs for IBM Cloud Functions records.

Activation records have the following custom fields (an illustrative record is sketched after the list).

  • duration_int - activation duration in milliseconds
  • status_str - activation status response (non-zero for errors)
  • message - activation response returned from action
  • time_date - activation record start time
  • end_date - activation record end time
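
For illustration only (placeholder values, not a captured record), an activation record combines these fields roughly as follows:

{
  "type": "activation_record",
  "activationId_str": "<ACTIVATION_ID>",
  "action_str": "<NAMESPACE>/<ACTION_NAME>",
  "duration_int": 1250,
  "status_str": "0",
  "message": "<ACTIVATION_RESPONSE>",
  "time_date": "<ACTIVATION_START_TIME>",
  "end_date": "<ACTIVATION_END_TIME>",
  "@timestamp": "<MESSAGE_INGESTION_TIME>"
}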

Application log lines, written to stdout or stderr, are forwarded as individual records (one log line per record). Log message records have the following custom fields.

  • message - single application log line output
  • stream_str - log message source, either stdout or stderr
  • time_date - timestamp parsed from application log line

Finding Log Messages For One Activation

Use this query string in the ”Discover” tab to retrieve all log messages from a particular activation.

1
activationId_str: <ACTIVATION_ID>

Search queries are executed against log records within a configurable time window.

Monitoring Dashboard

This is the monitoring dashboard I created. It contains visualisations showing counts for successful and failed activations, histograms of when they occurred and a list of the recent failed activation identifiers.

It allows me to quickly review activations from the previous 24 hours for issues. If there are notable issues, I can retrieve the failed activation identifiers and investigate further.

Before being able to create the dashboard, I needed to define two resources: saved searches and visualisations.

Saved Searches

Kibana supports saving and referring to search queries from visualisations using explicit names.

Using saved searches with visualisations, rather than explicit queries, removes the need to manually update visualisations’ configuration when queries change.

This dashboard uses two custom queries in visualisations. Queries are needed to find activation records from both successful and failed invocations.

  • Create a new “Saved Search” named “activation records (success)” using the following search query.
1
type: activation_record AND status_str: 0
  • Create a new “Saved Search” named “activation records (failed)” using the following search query.
1
type: activation_record AND NOT status_str: 0

The status_str field is set to a non-zero value for failures. Using the type field ensures log messages from other sources are excluded from the results.

Indexed Fields

Before referencing log record fields in visualisations, those fields need to be indexed correctly. Use these instructions to verify activation record fields are available.

  • Check IBM Cloud Functions logs are available in IBM Cloud Logging using the ”Discover” tab.
  • Click the “โš™๏ธ (Management)” menu item on the left-hand drop-down menu in IBM Cloud Logging.
  • Click the ”Index Patterns” link.
  • Click the ๐Ÿ”„ button to refresh the field list.

Visualisations

Three types of visualisation are used on the monitoring dashboard. Metric displays are used for the activation counts, vertical bar charts for the activation times and a data table to list failed activations.

Visualisations can be created by opening the “Visualize” menu item and selecting a new visualisation type under the “Create New Visualization” menu.

Create five different visualisations, using the instructions below, before moving on to create the dashboard.

Activation Counts

Counts for successful and failed activations are displayed as singular metric values.

  • Select the “Metric” visualisation from the visualisation type list.
  • Use the “activation records (success)” saved search as the data source.
  • Ensure the Metric Aggregation is set to “Count”
  • Set the “Font Size” under the Options menu to 120pt.
  • Save the visualisation as “Activation Counts (Success)”

  • Repeat this process to create the failed activation count visualisation.
  • Use the “activation records (failed)” saved search as the data source.
  • Save the visualisation as “Activation Counts (Failed)”.

Activation Times

Activation counts over time, for successful and failed invocations, are displayed in vertical bar charts.

  • Select the “Vertical bar chart” visualisation from the visualisation type list.
  • Use the “activation records (success)” saved search as the data source.
  • Set the “Custom Label” to Invocations
  • Add an “X-Axis” bucket type under the Buckets section.
  • Choose “Date Histogram” for the aggregation, “@timestamp” for the field and “Minute” for the interval.
  • Save the visualisation as “Activation Times (Success)”

  • Repeat this process to create the failed activation times visualisation.
  • Use the “activation records (failed)” saved search as the data source.
  • Save the visualisation as “Activation Times (Failed)”

Failed Activations List

Activation identifiers for failed invocations are shown using a data table.

  • Select the “Data table” visualisation from the visualisation type list.
  • Use the “activation records (failed)” saved search as the data source.
  • Add a “Split Rows” bucket type under the Buckets section.
  • Choose “Date Histogram” for the aggregation, “@timestamp” for the field and “Second” for the interval.
  • Add a “sub-bucket” with the “Split Rows” type.
  • Set sub aggregation to “Terms”, field to “activationId_str” and order by “Term”.
  • Save the visualisation as “Errors Table”

Creating the dashboard

Having created the individual visualisation components, the monitoring dashboard can be constructed.

  • Click the “Dashboard” menu item from the left-hand menu panel.
  • Click the “Add” button to import visualisations into the current dashboard.
  • Add each of the five visualisations created above.

Hovering the mouse cursor over visualisations will reveal icons for moving and re-sizing.

  • Re-order the visualisations into the following rows:
    • Activations Metrics
    • Activation Times
    • Failed Activations List
  • Select the “Last 24 hours” time window, available from the relative time ranges menu.
  • Save the dashboard as ”Cloud Functions Monitoring”. Tick the ”store time with dashboard” option.

Having saved the dashboard with a time window, re-opening the dashboard will show our visualisations with data for the previous 24 hours. This dashboard can be used to quickly review recent application issues.

Conclusion

Monitoring serverless applications is crucial to diagnosing issues on serverless platforms.

IBM Cloud Functions provides automatic integration with the IBM Cloud Logging service. All activation records and application logs from serverless applications are automatically forwarded as log records. This makes it simple to build custom monitoring dashboards using these records for serverless applications running on IBM Cloud Functions.

Using this service with the World Cup Twitter bot allowed me to easily monitor the application for issues. This was much easier than manually retrieving and reviewing activation records using the CLI!

Debugging Node.js OpenWhisk Actions

Debugging serverless applications is one of the most challenging issues developers face when using serverless platforms. How can you use debugging tools without any access to the runtime environment?

Last week, I worked out how to expose the Node.js debugger in the Docker environment used for the application runtime in Apache OpenWhisk.

Using the remote debugging service, we can set breakpoints and step through action handlers live, rather than just being reliant on logs and metrics to diagnose bugs.

So, how does this work?

Let’s find out more about how Apache OpenWhisk executes serverless functions…

Background

Apache OpenWhisk is the open-source serverless platform which powers IBM Cloud Functions. OpenWhisk uses Docker containers to create isolated runtime environments for executing serverless functions.

Containers are started on-demand as invocation requests arrive. Serverless function source files are dynamically injected into the runtime and executed for each invocation. Between invocations, containers are paused and kept in a cache for re-use with further invocations.

The benefit of using an open-source serverless platform is that the build files used to create runtime images are also open-source. OpenWhisk also automatically builds and publishes all runtime images externally on Docker Hub. Running containers using these images allows us to simulate the remote serverless runtime environment.

Runtime Images

All OpenWhisk runtime images are published externally on Docker Hub.

Runtime images start an HTTP server which listens on port 8080. This HTTP server must implement two API endpoints (/init & /run) accepting HTTP POST requests. The platform uses these endpoints to initialise the runtime with action code and then invoke the action with event parameters.

More details on the API endpoints can be found in this blog post on creating Docker-based actions.

Node.js Runtime Image

This repository contains the source code used to create the Node.js runtime environment image.

https://github.com/apache/incubator-openwhisk-runtime-nodejs

Both Node.js 8 and 6 runtimes are built from a common base image. This base image contains an Express.js server which handles the platform API requests. The app.js file containing the server is executed when the container starts.

JavaScript code is injected into the runtime using the /init API. Actions created from source code are dynamically evaluated to instantiate the code in the runtime. Actions created from zip files are extracted into a temporary directory and imported as a Node.js module.

Once instantiated, actions are executed using the /run API. Event parameters come from the request body. Each time a new request is received, the server calls the action handler with event parameters. Returned values are serialised as the JSON body in the API response.

Starting Node.js Runtime Containers

Use this command to start the Node.js runtime container locally.

1
$ docker run -it -p 8080:8080 openwhisk/action-nodejs-v8

Once the container has started, port 8080 on localhost will be mapped to the HTTP service exposed by the runtime environment. This can be used to inject serverless applications into the runtime environment and invoke the serverless function handler with event parameters.

Node.js Remote Debugging

Modern versions of the Node.js runtime have a command-line flag (--inspect) to expose a remote debugging service. This service runs a WebSocket server on localhost which implements the Chrome DevTools Protocol.

1
2
$ node --inspect index.js
Debugger listening on 127.0.0.1:9229.

External tools can connect to this port to provide debugging capabilities for Node.js code.

Docker images for the OpenWhisk Node.js runtimes use the following command to start the internal Node.js process. Remote debugging is not enabled by default.

1
node --expose-gc app.js

Docker allows containers to override the default image start command using a command line argument.

This command will start the OpenWhisk Node.js runtime container with the remote debugging service enabled. Binding the HTTP API and WebSocket ports to the host machine allows us to access those services remotely.

1
docker run -p 8080:8080 -p 9229:9229 -it openwhisk/action-nodejs-v8 node --inspect=0.0.0.0:9229 app.js

Once a container from the runtime image has started, we can connect our favourite debugging tools…

Chrome Dev Tools

To connect Chrome Dev Tools to the remote Node.js debugging service, open the chrome://inspect page in the browser.

Chrome Dev Tools is configured to connect to port 9229 on localhost by default. If the web socket connection succeeds, the debugging target should be listed in the “Remote Target” section.

  • Click the ”Open dedicated DevTools for Node” link.

In the “Sources” panel the JavaScript files loaded by the Node.js process are available.

Setting breakpoints in the runner.js file will allow you to halt execution for debugging upon invocations.

VSCode

Visual Studio Code supports remote debugging of Node.js code using the Chrome Dev Tools protocol. Follow these steps to connect the editor to the remote debugging service.

  • Click the menu item ”Debug -> Add Configuration”.
  • Select the ”Node.js: Attach to Remote Program” from the Intellisense menu.
  • Edit the default configuration to have the following values.
1
2
3
4
5
6
7
8
{
  "type": "node",
  "request": "attach",
  "name": "Attach to Remote",
  "address": "127.0.0.1",
  "port": 9229,
  "localRoot": "${workspaceFolder}"
}

  • Choose the new ”attach to remote” debugging profile and click the Run button.

The ”Loaded Scripts” window will show all the JavaScript files loaded by the Node.js process.

Setting breakpoints in the runner.js file will allow you to halt execution for debugging upon invocations.

Breakpoint Locations

Here are some useful locations to set breakpoints to catch errors in your serverless functions for the OpenWhisk Node.js runtime environments.

Initialisation Errors - Source Actions

If you are creating OpenWhisk actions from JavaScript source files, the code is dynamically evaluated during the /init request at this location. Putting a breakpoint here will allow you to catch errors thrown during that eval() call.

Initialisation Errors - Binary Actions

If you are creating OpenWhisk actions from a zip file containing JavaScript modules, this location is where the archive is extracted in the runtime filesystem. Putting a breakpoint here will catch errors from the extraction call and runtime checks for a valid JavaScript module.

This code is where the JavaScript module is imported once it has been extracted. Putting a breakpoint here will catch errors thrown importing the module into the Node.js environment.

Action Handler Errors

For both source file and zipped module actions, this location is where the action handler is invoked on each /run request. Putting a breakpoint here will catch errors thrown from within action handlers.

Invoking OpenWhisk Actions

Once you have attached the debugger to the remote Node.js process, you need to send the API requests to simulate the platform invocations. Runtime containers use separate HTTP endpoints to import the action source code into the runtime environment (/init) and then fire the invocation requests (/run).

Generating Init Request Body - Source Files

If you are creating OpenWhisk actions from JavaScript source files, send the following JSON body in the HTTP POST to the /init endpoint.

1
2
3
4
5
6
{
  "value": {
    "main": "<FUNCTION NAME IN SOURCE FILE>",
    "code": "<INSERT SOURCE HERE>"
  }
}

code is the JavaScript source to be evaluated which contains the action handler. main is the function name in the source file used for the action handler.

Using the jq command-line tool, we can create the JSON body for the source code in file.js.

1
$ cat file.js | jq -sR  '{value: {main: "main", code: .}}'

Generating Init Request Body - Zipped Modules

If you are creating OpenWhisk actions from a zip file containing JavaScript modules, send the following JSON body in the HTTP POST to the /init endpoint.

1
2
3
4
5
6
7
{
  "value": {
    "main": "<FUNCTION NAME ON JS MODULE>",
    "code": "<INSERT BASE64 ENCODED STRING FROM ZIP FILE HERE>",
    "binary": true
  }
}

code must be a Base64 encoded string for the zip file. main is the function name returned in the imported JavaScript module to call as the action handler.

Using the jq command-line tool, we can create the JSON body for the zip file in action.zip.

1
$ base64 action.zip | tr -d '\n' | jq -sR '{value: {main: "main", binary: true, code: .}}'

Sending Init Request

The HTTPie tool makes it simple to send HTTP requests from the command-line.

Using this tool, the following command will initialise the runtime container with an OpenWhisk action.

1
2
3
4
5
6
$ http post localhost:8080/init < init.json
HTTP/1.1 200 OK
...
{
    "OK": true
}

If this HTTP request returns without an error, the action is ready to be invoked.

No further initialisation requests are needed unless you want to modify the action deployed.

Generating Run Request Body

Invocations of the action handler functions are triggered from an HTTP POST to the /run API endpoint.

Invocation parameters are sent in the JSON request body, using a JSON object with a value field.

1
2
3
4
5
6
{
  "value": {
    "some-param-name": "some-param-value",
    "another-param-name": "another-param-value",
  }
}

Sending Run Request

Using the HTTPie tool, the following command will invoke the OpenWhisk action.

1
2
3
4
5
6
$ http post localhost:8080/run < run.json
HTTP/1.1 200 OK
...
{
    "msg": "Hello world"
}

Returned values from the action handler are serialised as the JSON body in the HTTP response. Issuing further HTTP POST requests to the /run endpoint allows us to re-invoke the action.

Conclusion

Lack of debugging tools is one of the biggest complaints from developers migrating to serverless platforms.

Using an open-source serverless platform helps with this problem, by making it simple to run the same containers locally that are used for the platform’s runtime environments. Debugging tools can then be started from inside these local environments to simulate remote access.

In this example, this approach was used to enable the remote debugging service from the OpenWhisk Node.js runtime environment. The same approach could be used for any language and debugging tool needing local access to the runtime environment.

Having access to the Node.js debugger is a huge improvement when debugging challenging issues, rather than just being reliant on logs and metrics collected by the platform.

Binding IAM Services to IBM Cloud Functions

Binding service credentials to actions and packages is a much better approach to handling authentication credentials in IBM Cloud Functions than manually updating (and maintaining) default parameters ๐Ÿ”.

IBM Cloud Functions supports binding credentials from IAM-based and Cloud Foundry provisioned services.

Documentation and blog posts demonstrating service binding focus on traditional platform services, created using the Cloud Foundry service broker. As IBM Cloud integrates IAM across the platform, more platform services will migrate to use the IAM service for managing authentication credentials.

How do we bind credentials for IAM-based services to IBM Cloud Functions? ๐Ÿค”

Binding IAM-based services to IBM Cloud Functions works the same as for traditional platform services, but differs in how to retrieve the details needed for the service bind command.

Let’s look at how this works…

Binding IAM Credentials

Requirements

Before binding an IAM-based service to IBM Cloud Functions, you will need the following information.

  • Service name.
  • (Optional) Instance name.
  • (Optional) Credentials identifier.

Using the CLI

Use the ibmcloud wsk service bind command to bind service credentials to actions or packages.

1
bx wsk service bind <SERVICE_NAME> <ACTION|PACKAGE> --instance <INSTANCE> --keyname <KEY>

This command supports the following (optional) flags: --instance and --keyname.

If the instance and/or key names are not specified, the CLI uses the first instance and credentials returned from the system for the service identifier.

Accessing from actions

Credentials are stored as default parameters on the action or package.

The command uses a special parameter name (__bx_creds) to store all credentials. Individual service credentials are indexed using the service name.

1
2
3
4
5
6
7
8
{
   "__bx_creds":{
      "service-name":{
         "apikey":"<API_KEY>",
         ...
      }
   }
}

Default parameters are automatically merged into the request parameters during invocations.
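
For example (a minimal sketch, assuming the action is bound to a service whose credentials are stored under the service-name key shown above), an action handler can read the bound API key directly from the event parameters:

function main (params) {
  // bound credentials arrive as a default parameter, merged into each request
  const credentials = params['__bx_creds']['service-name']
  const apikey = credentials.apikey

  // use apikey to authenticate calls to the service API...
  return { keyAvailable: Boolean(apikey) }
}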

Common Questions

How can I tell whether a service instance uses IAM-based authentication?

Running the ibmcloud resource service-instances command will return the IAM-based service instances provisioned.

Cloud Foundry provisioned services are available using a different command: ibmcloud service list.

Both service types can be bound using the CLI but the commands to retrieve the necessary details are different.

How can I find the service name for an IAM-based service instance?

Run the ibmcloud resource service-instance <INSTANCE_NAME> command.

Service names are shown as the Service Name: field value.

How can I list available service credentials for an IAM-based service instance?

Use the ibmcloud resource service-keys --instance-name <NAME> command.

Replace the <NAME> value with the service instance name returned from the ibmcloud resource service-instances command.

How can I manually retrieve IAM-based credentials for an instance?

Use the ibmcloud resource service-key <CREDENTIALS_NAME> command.

Replace the <CREDENTIALS_NAME> value with a credential name returned from the ibmcloud resource service-keys command.

How can I create new service credentials?

Credentials can be created through the service management page on IBM Cloud.

You can also use the CLI to create credentials using the ibmcloud resource service-key-create command. This command needs a name for the credentials, IAM role and service instance identifier.

Example - Cloud Object Storage

Having explained how to bind IAM-based services to IBM Cloud Functions, let’s look at an example…

Cloud Object Storage is the service used to manage files for serverless applications on IBM Cloud. This service supports the newer IAM-based authentication service.

Let’s look at how to bind authentication credentials for an instance of this service to an action.

Using the CLI, we can check an instance of this service is available…

1
2
3
4
5
$ ibmcloud resource service-instances
Retrieving service instances in resource group default..
OK
Name                     Location   State    Type               Tags
my-cos-storage           global     active   service_instance

In this example, we have a single instance of IBM Cloud Object Storage provisioned as my-cos-storage.

Retrieving instance details will show us the service name to use in the service binding command.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
$ ibmcloud resource service-instance my-cos-storage
Retrieving service instance my-cos-storage in resource group default..
OK

Name:                  my-cos-storage
ID:                    crn:v1:bluemix:public:cloud-object-storage:global:<GUID>:
GUID:                  <GUID>
Location:              global
Service Name:          cloud-object-storage
Service Plan Name:     lite
Resource Group Name:   default
State:                 active
Type:                  service_instance
Tags:

The IBM Cloud Object Storage service name is cloud-object-storage.

Before we can bind service credentials, we need to verify service credentials are available for this instance.

1
2
3
4
5
$ ibmcloud resource service-keys --instance-name my-cos-storage
Retrieving service keys in resource group default...
OK
Name                     State    Created At
serverless-credentials   active   Tue Jun  5 09:11:06 UTC 2018

This instance has a single service key available, named serverless-credentials.

Retrieving the service key details shows us the API secret for this credential.

1
2
3
4
5
6
7
8
9
10
11
$ ibmcloud resource service-key serverless-credentials
Retrieving service key serverless-credentials in resource group default...
OK

Name:          serverless-credentials
ID:            <ID>
Created At:    Tue Jun  5 09:11:06 UTC 2018
State:         active
Credentials:
               ...
               apikey:                   <SECRET_API_KEY_VALUE>

apikey denotes the secret API key used to authenticate calls to the service API.

Having retrieved the service name, instance identifier and available credentials, we can use these values to bind credentials to an action.

1
2
$ bx wsk service bind cloud-object-storage params --instance my-cos-storage --keyname serverless-credentials
Credentials 'serverless-credentials' from 'cloud-object-storage' service instance 'my-cos-storage' bound to 'params'.

Retrieving action details shows default parameters bound to an action. These will now include the API key for the Cloud Object Storage service.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
$ bx wsk action get params
ok: got action params
{
  ...
  "parameters": [{
    "key": "__bx_creds",
    "value": {
      "cloud-object-storage": {
        "apikey": "<API_KEY_SECRET>",
        ...
      }
    }
  }]
}

Under the __bx_creds default parameter, there is a cloud-object-storage property with the API key amongst other service credential values.

Using Cloud Object Storage From IBM Cloud Functions (Node.js)

How do you manage files for a serverless application? ๐Ÿค”

Previous blog posts discussed this common problem and introduced the most popular solution, using a cloud-based object storage service. ๐Ÿ‘๐Ÿ‘๐Ÿ‘

Object stores provide elastic storage in the cloud, with a billing model which charges for capacity used. These services are the storage solution for serverless applications, which do not have access to a traditional file system. ๐Ÿ‘

I’m now going to demonstrate how to use IBM Cloud Object Storage from IBM Cloud Functions.

This blog post will show you…

  • How to provision IBM Cloud Object Storage and create authentication tokens.
  • How to use client libraries to access IBM Cloud Object Storage from IBM Cloud Functions.
  • Example serverless functions for common use-cases, e.g. uploading files.

Code examples in this blog post will focus on the Node.js runtime.

Instructions on service provisioning and authentication credentials are relevant for any runtime.

IBM Cloud Accounts and Storage Services

IBM Cloud Object Storage is available to all IBM Cloud users.

IBM Cloud has three different account types: lite, pay-as-you-go or subscription.

Lite Accounts

Lite accounts do not require a credit card to register and do not expire after a limited time period.

Numerous platform services, including Cloud Object Storage, provide free resources for lite account users. IBM Cloud Object Storage’s free resource tier comes with the following monthly limits.

  • Store 25GB of new data.
  • Issue 20,000 GET and 2,000 PUT requests.
  • Use 10GB of public bandwidth.

Lite tier usage supports all resiliency and storage class options but is limited to a single service instance.

Users can sign up for a free “Lite” account here. Please follow the instructions to install the IBM Cloud CLI.

Pay-as-you-Go & Subscription Accounts

Lite accounts can be upgraded to Pay-As-You-Go or Subscription accounts. Upgraded accounts still have access to the free tiers provided in Lite accounts. Users with Pay-As-You-Go or Subscription accounts can access services and tiers not included in the Lite account.

Benefits of the additional service tiers for IBM Cloud Object Storage include unlimited instances of the object storage service. Costs are billed according to usage per month. See the pricing page for more details: https://www.ibm.com/cloud-computing/bluemix/pricing-object-storage#s3api

Provisioning IBM Cloud Object Storage

IBM Cloud Object Storage can be provisioned through the IBM Cloud service catalog.

From the Service Details page, follow these instructions to provision a new instance.

  • Give the service an identifying name.
  • Leave the resource group as ”default”.
  • Click the “Create” button.

Once the service has been provisioned, it will be shown under the “Services” section of the IBM Cloud Dashboard. IBM Cloud Object Storage services are global services and not bound to individual regions.

  • Click the service instance from the dashboard to visit the service management page.

Once the service has been provisioned, we need to create authentication credentials for external access…

Service Credentials

Service credentials for IBM Cloud Object Storage use IBM Cloud’s IAM service.

I’m just going to cover the basics of using IAM with Cloud Object Storage. Explaining all the concepts and capabilities of the IAM service would need a separate (and lengthy) blog post!

Auto-Binding Service Credentials

IBM Cloud Functions can automatically provision and bind service credentials to actions.

This feature is supported through the IBM Cloud CLI command: bx wsk service bind.

Bound service credentials are stored as default action parameters. Default parameters are automatically included as request parameters for each invocation.

Using this approach means users do not have to manually provision and manage service credentials. ๐Ÿ‘

Service credentials provisioned in this manner use the following configuration options:

  • IAM Role: Manager
  • Optional Configuration Parameters: None.

If you need to use different configuration options, you will have to manually provision service credentials.

Manually Creating Credentials

  • Select the ”Service Credentials” menu item from the service management page.
  • Click the “New credential” button.

Fill in the details for the new credentials.

  • Choose an identifying name for the credentials.
  • Select an access role. Access roles define which operations applications using these credentials can perform. Permissions for each role are listed in the documentation.

    Note: If you want to make objects publicly accessible, make sure you use the manager permission.

  • Leave the Service ID unselected.

If you need HMAC service keys, which are necessary for generating presigned URLs, use the following inline configuration parameters. Otherwise, leave this field blank.

1
{"HMAC": true}
  • Click the “Add” button.

๐Ÿ” Credentials shown in this GIF were deleted after the demo (before you get any ideas…) ๐Ÿ”

Once created, new service credentials will be shown in the credentials table.

IBM Cloud Object Storage API

Cloud Object Storage exposes an HTTP API for interacting with buckets and files.

This API implements the same interface as AWS S3 API.

Service credentials created above are used to authenticate requests to the API endpoints. Full details on the API operations are available in the documentation.

HTTP Endpoints

IBM Cloud Object Storage’s HTTP API is available through region-based endpoints.

When creating new buckets to store files, the data resiliency for the bucket (and therefore the files within it) is based upon the endpoint used for the bucket create operation.

Current endpoints are listed in the external documentation and available through an external API: https://cos-service.bluemix.net/endpoints

Choosing an endpoint

IBM Cloud Functions is available in the following regions: US-South, United Kingdom and Germany.

Accessing Cloud Object Storage using regional endpoints closest to the Cloud Functions application region will result in better application performance.

IBM Cloud Object Storage lists public and private endpoints for each region (and resiliency) choice. IBM Cloud Functions only supports access using public endpoints.

In the following examples, IBM Cloud Functions applications will be hosted in the US-South region. Using the US Regional endpoint for Cloud Object Storage will minimise network latency when using the service from IBM Cloud Functions.

This endpoint will be used in all our examples: s3-api.us-geo.objectstorage.softlayer.net

Client Libraries

Rather than manually creating HTTP requests to interact with the Cloud Object Storage API, client libraries are available.

IBM Cloud Object Storage publishes modified versions of the Node.js, Python and Java AWS S3 SDKs, enhanced with IBM Cloud specific features.

Both the Node.js and Python COS libraries are pre-installed in the IBM Cloud Functions runtime environments for those languages. They can be used without bundling those dependencies in the deployment package.

We’re going to look at using the JavaScript client library from the Node.js runtime in IBM Cloud Functions.

JavaScript Client Library

When using the JavaScript client library for IBM Cloud Object Storage, endpoint and authentication credentials need to be passed as configuration parameters.

1
2
3
4
5
6
7
8
9
const COS = require('ibm-cos-sdk');

const config = {
    endpoint: '<endpoint>',
    apiKeyId: '<api-key>',
    serviceInstanceId: '<resource-instance-id>',
};

const cos = new COS.S3(config);

Hardcoding configuration values within source code is not recommended. IBM Cloud Functions allows default parameters to be bound to actions. Default parameters are automatically passed into action invocations within the event parameters.

Default parameters are recommended for managing application secrets for IBM Cloud Functions applications.

Having provisioned the storage service instance, learnt about service credentials, chosen an access endpoint and understood how to use the client library, there’s one final step before we can start creating functions…

Creating Buckets

IBM Cloud Object Storage organises files into a flat hierarchy of named containers, called buckets. Buckets can be created through the command-line, using the API or the web console.

Let’s create a new bucket, to store all files for our serverless application, using the web console.

  • Open the ”Buckets” page from the COS management page.
  • Click the ”Create Bucket” link.

  • Create a bucket name. Bucket names must be unique across the entire platform, rather than just your account.

  • Select the following configuration options
    • Resiliency: Cross Region
    • Location: us-geo
    • Storage class: Standard
  • Click the ”Create” button.

Once the bucket has been created, you will be taken back to the bucket management page.
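
Buckets can also be created from code. A minimal sketch using the client (cos) configured earlier in this post (the bucket name is a placeholder and must be globally unique):

// create a new bucket using the S3-compatible client; the endpoint used by
// the client determines the bucket's resiliency and location
cos.createBucket({ Bucket: '<MY_BUCKET_NAME>' }).promise()
  .then(() => console.log('bucket created'))
  .catch(err => console.error('failed to create bucket', err))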

Test Files

We need to put some test files in our new bucket. Download the following image files.

Using the bucket management page, upload these files to the new bucket.

Using Cloud Object Storage from Cloud Functions

Having created a storage bucket containing test files, we can start to develop our serverless application.

Let’s begin with a serverless function that returns a list of files within a bucket. Once this works, we will extend the application to support retrieving, removing and uploading files to a bucket. We can also show how to make objects publicly accessible and generate pre-signed URLs, allowing external clients to upload new content directly.

Separate IBM Cloud Functions actions will be created for each storage operation.

Managing Default Parameters

Serverless functions will need the bucket name, service endpoint and authentication parameters to access the object storage service. Configuration parameters will be bound to actions as default parameters.

Packages can be used to share configuration values across multiple actions. Actions created within a package inherit all default parameters stored on that package. This removes the need to manually configure the same default parameters for each action.

Let’s create a new package (serverless-files) for our serverless application.

1
2
$ bx wsk package create serverless-files
ok: created package serverless-files

Update the package with default parameters for the bucket name (bucket) and service endpoint (cos_endpoint).

1
2
$ bx wsk package update serverless-files -p bucket <MY_BUCKET_NAME> -p cos_endpoint s3-api.us-geo.objectstorage.softlayer.net
ok: updated package serverless-files

Did you notice we didn’t provide authentication credentials as default parameters?

Rather than manually adding these credentials, the CLI can automatically provision and bind them. Let’s do this now for the cloud-object-storage service…

  • Bind service credentials to the serverless-files package using the bx wsk service bind command.
1
2
$ bx wsk service bind cloud-object-storage serverless-files
Credentials 'cloud-fns-key' from 'cloud-object-storage' service instance 'object-storage' bound to 'serverless-files'.
  • Retrieve package details to check default parameters contain expected configuration values.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
$ bx wsk package get serverless-files
ok: got package serverless-files
{
    ...
    "parameters": [
        {
            "key": "bucket",
            "value": "<MY_BUCKET_NAME>"
        },
        {
            "key": "cos_endpoint",
            "value": "s3-api.us-geo.objectstorage.softlayer.net"
        },
        {
            "key": "__bx_creds",
            "value": {
                "cloud-object-storage": {
                    ...
                }
            }
        }
    ]
}

List Objects Within the Bucket

  • Create a new file (actions.js) with the following contents.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
const COS = require('ibm-cos-sdk')

function cos_client (params) {
  const bx_creds = params['__bx_creds']
  if (!bx_creds) throw new Error('Missing __bx_creds parameter.')

  const cos_creds = bx_creds['cloud-object-storage']
  if (!cos_creds) throw new Error('Missing cloud-object-storage parameter.')

  const endpoint = params['cos_endpoint']
  if (!endpoint) throw new Error('Missing cos_endpoint parameter.')

  const config = {
    endpoint: endpoint,
    apiKeyId: cos_creds.apikey,
    serviceInstanceId: cos_creds.resource_instance_id
  }

  return new COS.S3(config);
}

function list (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  const client = cos_client(params)

  return client.listObjects({ Bucket: params.bucket }).promise()
    .then(results => ({ files: results.Contents }))
}

This action retrieves the bucket name, service endpoint and authentication credentials from invocation parameters. Errors are returned if those parameters are missing.

  • Create a new package action from this source file with the following command.
1
2
$ bx wsk action create serverless-files/list-files actions.js --main list --kind nodejs:8
ok: created action list-files

The --main flag sets the function name to call for each invocation. This defaults to main. Setting this to an explicit value allows us to use a single source file for multiple actions.

The --kind flag sets the action runtime. This optional flag ensures we use the Node.js 8 runtime rather than Node.js 6, which is the default for JavaScript actions. The IBM Cloud Object Storage client library is only included in the Node.js 8 runtime.

  • Invoke the new action to verify it works.
1
2
3
4
5
6
7
8
$ bx wsk action invoke serverless-files/list-files -r
{
    "files": [
        { "Key": "jumping pug.jpg", ... },
        { "Key": "pug blanket.jpg", ... },
        { "Key": "swimming pug.jpg", ... }
    ]
}

The action response should contain a list of the files uploaded before. ๐Ÿ’ฏ๐Ÿ’ฏ๐Ÿ’ฏ

Retrieve Object Contents From Bucket

Let’s add another action for retrieving object contents from a bucket.

  • Add a new function (retrieve) to the existing source file (actions.js) with the following source code.
1
2
3
4
5
6
7
8
function retrieve (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  const client = cos_client(params)

  return client.getObject({ Bucket: params.bucket, Key: params.name }).promise()
    .then(result => ({ body: result.Body.toString('base64') }))
}

Retrieving a file needs the file name in addition to the bucket name. File contents are encoded as a Base64 string so they can be returned in the JSON response from IBM Cloud Functions.

  • Create an additional action from this updated source file with the following command.
$ bx wsk action create serverless-files/retrieve-file actions.js --main retrieve --kind nodejs:8
ok: created action serverless-files/retrieve-file
  • Invoke this action to test it works, passing the file name to retrieve as the name parameter.
$ bx wsk action invoke serverless-files/retrieve-file -r -p name "jumping pug.jpg"
{
    "body": "<BASE64 ENCODED STRING>"
}

If this is successful, a (very long) response body containing a Base64-encoded image should be returned. 👍
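
To turn that response back into an image locally, the Base64 string just needs decoding into a binary buffer. A minimal sketch (the result object below is a hypothetical placeholder for the JSON returned by the invocation):

const fs = require('fs')

// Hypothetical placeholder for the JSON result returned by retrieve-file.
const result = { body: '<BASE64 ENCODED STRING>' }

// Decode the Base64 string back into binary data and write it to disk.
fs.writeFileSync('jumping pug.jpg', Buffer.from(result.body, 'base64'))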

Delete Objects From Bucket

Let’s finish this section by adding a final action that removes objects from our bucket.

  • Update the source file (actions.js) with this additional function.
function remove (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  const client = cos_client(params)

  return client.deleteObject({ Bucket: params.bucket, Key: params.name }).promise()
}
  • Create a new action (remove-file) from the updated source file.
$ bx wsk action create serverless-files/remove-file actions.js --main remove --kind nodejs:8
ok: created action serverless-files/remove-file
  • Test this new action by using it to remove a file from the bucket.
$ bx wsk action invoke serverless-files/remove-file -r -p name "jumping pug.jpg"
{}
  • Listing bucket files should now return two files, rather than three.
$ bx wsk action invoke serverless-files/list-files -r
{
    "files": [
        { "Key": "pug blanket.jpg", ... },
        { "Key": "swimming pug.jpg", ... }
    ]
}

Listing, retrieving and removing files with the client library is relatively simple. Functions just need to call the correct client method, passing the bucket and object names.

Let’s move on to a more advanced example, creating new files in the bucket from our action…

Create New Objects Within Bucket

File content will be passed into our action as a Base64-encoded string, since JSON does not support binary data.

When creating new objects, we should set the MIME type. This is necessary for public access from web browsers, something we’ll be doing later on. Node.js libraries can calculate the correct MIME type from the file name, rather than requiring it as an invocation parameter.
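
The mime-types library used below derives the value from the file extension. A quick sketch of the behaviour we rely on (example return values; unknown extensions fall back to a generic binary type):

const mime = require('mime-types')

// MIME types are looked up from the file extension.
console.log(mime.contentType('flying pug.jpg'))  // 'image/jpeg'
console.log(mime.contentType('notes.txt'))       // 'text/plain; charset=utf-8'

// Unknown extensions return false, so we fall back to application/octet-stream.
console.log(mime.contentType('mystery') || 'application/octet-stream')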

  • Update the source file (actions.js) with the following additional code.
const mime = require('mime-types');

function upload (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  if (!params.body) throw new Error("Missing object parameter.")

  const client = cos_client(params)
  const body = Buffer.from(params.body, 'base64')

  const ContentType = mime.contentType(params.name) || 'application/octet-stream'
  const object = {
    Bucket: params.bucket,
    Key: params.name,
    Body: body,
    ContentType
  }

  return client.upload(object).promise()
}

exports.upload = upload;

As this code uses an external NPM library, we need to create the action from a zip file containing source files and external dependencies.

  • Create a package.json file with the following contents.
{
  "name": "upload-files",
  "main": "actions.js",
  "dependencies": {
    "mime-types": "^2.1.18"
  }
}
  • Install the external libraries in the local environment.
$ npm install
added 2 packages in 0.804s
  • Bundle source file and dependencies into zip file.
$ zip -r upload.zip package.json actions.js node_modules
  adding: actions.js (deflated 72%)
  adding: node_modules/ (stored 0%)
  ...
  • Create a new action from the zip file.
$ bx wsk action create serverless-files/upload-file upload.zip --main upload --kind nodejs:8
ok: created action serverless-files/upload-file
  • Create the Base64-encoded string used to pass the new file’s content.
$ wget http://www.pugnow.com/wp-content/uploads/2016/04/fly-pug-300x300.jpg
$ base64 fly-pug-300x300.jpg > body.txt
  • Invoke the action with the file name and content as parameters.
$ bx wsk action invoke serverless-files/upload-file -r -p body $(cat body.txt) -p name "flying pug.jpg"

Object details should be returned if the file was uploaded correctly.

{
    "Bucket": "my-serverless-files",
    "ETag": "\"b2ae0fb61dc827c03d6920dfae58e2ba\"",
    "Key": "flying pug.jpg",
    "Location": "https://<MY_BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg",
    "key": "flying pug.jpg"
}

Accessing the object storage dashboard shows the new object in the bucket, with the correct file name and size.

With actions to create, delete and access objects within a bucket, what’s left to do? 🤔

Expose Public Objects From Buckets

Users can also choose to make certain objects within a bucket public. Public objects can be retrieved, using the external HTTP API, without any further authentication.

Public file access allows external clients to access files directly. It removes the need to invoke (and pay for) a serverless function to serve content. This is useful for serving static assets and media files.

Objects have an explicit property (x-amz-acl) which controls access rights. This value defaults to private, meaning all operations require authentication. Setting it to public-read enables GET operations without authentication.

Files can be created with an explicit ACL property using credentials with the Writer or Manager role. Modifying ACL values for existing files is only supported using credentials with the Manager role.
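
As a sketch of that first option (an assumption based on the client library's standard S3-style parameters, not code from the original walkthrough), the ACL can be supplied when the object is created by adding it to the parameters passed to client.upload(). This variant reuses the cos_client() helper from actions.js; ContentType is omitted for brevity (see upload() above).

// Sketch: create an object that is publicly readable from the moment it exists.
// Assumes it lives in actions.js alongside the existing cos_client() helper.
function upload_public (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  if (!params.body) throw new Error("Missing body parameter.")

  const client = cos_client(params)

  const object = {
    Bucket: params.bucket,
    Key: params.name,
    Body: Buffer.from(params.body, 'base64'),
    ACL: 'public-read'  // requires credentials with the Writer or Manager role
  }

  return client.upload(object).promise()
}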

  • Add the following source code to the existing actions file (action.js).
function make_public (params) {
  return update_acl(params, 'public-read')
}

function make_private (params) {
  return update_acl(params, 'private')
}

function update_acl (params, acl) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")
  const client = cos_client(params)

  const options = {
    Bucket: params.bucket,
    Key: params.name,
    ACL: acl
  }

  return client.putObjectAcl(options).promise()
}
  • Create two new actions from the updated source file.
$ bx wsk action create serverless-files/make-public actions.js --main make_public --kind nodejs:8
ok: created action serverless-files/make-public
$ bx wsk action create serverless-files/make-private actions.js --main make_private --kind nodejs:8
ok: created action serverless-files/make-private

Bucket objects use the following URL scheme: https://<bucket_name>.<endpoint_hostname>/<object_name>

We have been using the following endpoint hostname: s3-api.us-geo.objectstorage.softlayer.net.

  • Checking the status code returned when accessing an existing object confirms it defaults to private.
$ curl -I https://<BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg
HTTP/1.1 403 Forbidden
...
  • Invoke the make-public action to allow GET requests without authentication.
$ bx wsk action invoke serverless-files/make-public -r -p name "flying pug.jpg"
  • Retry file access using the external HTTP API. This time a 200 response is returned with the content.
$ curl -I https://<BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg
HTTP/1.1 200 OK
Content-Type: image/jpeg
...

Having set an explicit content type for the file, opening this URL in a web browser will show the image.

  • Disable public access using the other new action.
$ bx wsk action invoke serverless-files/make-private -r -p name "flying pug.jpg"
  • Re-issue the curl request to the file location.
$ curl -I https://<BUCKET_NAME>.s3-api.us-geo.objectstorage.softlayer.net/flying%20pug.jpg
HTTP/1.1 403 Forbidden
...

HTTP requests to this file now return a 403 status. Authentication is required again. 🔑

In addition to allowing public read access, we can go even further in letting clients interact with buckets…

Provide Direct Upload Access To Buckets

Cloud Object Storage provides a mechanism (presigned URLs) to generate temporary links that allow clients to interact with buckets without further authentication. Passing these links to clients means they can access private objects or upload new files to buckets. Presigned URLs expire after a configurable time period.

Generating presigned URLs is only supported using HMAC authentication keys.

HMAC service credentials must be manually provisioned, rather than using the bx wsk service bind command. See above for instructions on how to do this.

  • Save provisioned HMAC keys into a file called credentials.json.
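
The exact layout of this file is up to you; the only requirement is that it provides the parameters the presign action below reads. One plausible shape, with placeholder values and field names chosen to match the cos_hmac_keys structure returned when HMAC service credentials are created:

{
  "cos_hmac_keys": {
    "access_key_id": "<ACCESS_KEY_ID>",
    "secret_access_key": "<SECRET_ACCESS_KEY>"
  }
}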

Let’s create an action that returns presigned URLs, allowing users to upload files directly. Users will call the action with a new file name. Returned URLs will support an unauthenticated PUT request for the next five minutes.

  • Create a new file called presign.js with the following contents.
'use strict';

const COS = require('ibm-cos-sdk');
const mime = require('mime-types');

function cos_client (params) {
  const creds = params.cos_hmac_keys
  if (!creds) throw new Error('Missing cos_hmac_keys parameter.')

  const endpoint = params.cos_endpoint
  if (!endpoint) throw new Error('Missing cos_endpoint parameter.')

  const config = {
    endpoint: endpoint,
    accessKeyId: creds.access_key_id,
    secretAccessKey: creds.secret_access_key
  }

  return new COS.S3(config);
}

function presign (params) {
  if (!params.bucket) throw new Error("Missing bucket parameter.")
  if (!params.name) throw new Error("Missing name parameter.")

  const client = cos_client(params)

  const options = {
    Bucket: params.bucket,
    Key: params.name,
    Expires: 300,
    ContentType: mime.contentType(params.name) || 'application/octet-stream'
  }

  return { url: client.getSignedUrl('putObject', options) }
}

exports.presign = presign;
  • Update the package.json file with the following contents.
{
  "name": "presign",
  "main": "presign.js",
  "dependencies": {
    "mime-types": "^2.1.18"
  }
}
  • Bundle source file and dependencies into zip file.
$ zip -r presign.zip package.json presign.js node_modules
  adding: presign.js (deflated 72%)
  adding: node_modules/ (stored 0%)
  ...
  • Create a new action from the zip file.
$ bx wsk action create serverless-files/presign presign.zip --main presign --kind nodejs:8 -P credentials.json
ok: created action serverless-files/presign
  • Invoke the action to return a presigned URL for a new file.
$ bx wsk action invoke serverless-files/presign -r -p name pug.jpg
{
    "url": "https://<BUCKET>.s3-api.us-geo.objectstorage.softlayer.net/pug.jpg?AWSAccessKeyId=<SECRET>&Content-Type=image%2Fjpeg&Expires=<TIME>&Signature=<KEY>"
}

Using this URL we can upload a new image without providing authentication credentials.

  • This curl command uses the --upload-file flag to send an HTTP PUT request, with the image file as the request body, to that URL.
$ curl --upload-file "my pug.jpg" <URL> --header "Content-Type: image/jpeg"

The HTTP request must include the correct “Content-Type” header. Use the value provided when creating the presigned URL. If these values do not match, the request will be rejected.
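
Putting the two steps together, here is a minimal client-side sketch (an illustration, not part of the original walkthrough): it invokes the presign action through the openwhisk SDK and then PUTs a local file to the returned URL with a matching Content-Type header. It assumes the SDK can pick up your OpenWhisk credentials (for example from the __OW_API_HOST and __OW_API_KEY environment variables), a recent Node.js version, and that the file exists locally.

const openwhisk = require('openwhisk')
const https = require('https')
const fs = require('fs')
const mime = require('mime-types')

async function upload_via_presign (name, path) {
  // Assumes OpenWhisk credentials are available to the SDK via environment variables.
  const ow = openwhisk()

  // Ask the presign action for a temporary upload URL for this file name.
  const { url } = await ow.actions.invoke({
    name: 'serverless-files/presign',
    params: { name },
    blocking: true,
    result: true
  })

  const body = fs.readFileSync(path)
  // Must match the ContentType used when the presigned URL was generated.
  const contentType = mime.contentType(name) || 'application/octet-stream'

  // PUT the file contents to the presigned URL.
  return new Promise((resolve, reject) => {
    const req = https.request(url, {
      method: 'PUT',
      headers: { 'Content-Type': contentType, 'Content-Length': body.length }
    }, res => resolve(res.statusCode))
    req.on('error', reject)
    req.end(body)
  })
}

upload_via_presign('my pug.jpg', 'my pug.jpg')
  .then(status => console.log('upload returned HTTP status', status))
  .catch(console.error)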

Exploring the objects in our bucket confirms we have uploaded a file! 🕺💃

Presigned URLs are a brilliant feature of Cloud Object Storage. Allowing users to upload files directly overcomes the payload limit for cloud functions. It also reduces the cost of uploading files, by removing the cloud function invocation charge.

conclusion

Object storage services are the solution for managing files with serverless applications.

IBM Cloud provides both a serverless runtime (IBM Cloud Functions) and an object storage service (IBM Cloud Object Storage). In this blog post, we looked at how to integrate these services to provide a file storage solution for serverless applications.

We showed you how to provision new COS services, create and manage authentication credentials, access files using a client library and even allow external clients to interact directly with buckets. Sample serverless functions using the Node.js runtime were also provided.

Do you have any questions, comments or issues about the content above? Please leave a comment below, find me on the openwhisk slack or send me a tweet.