Capturing Runtime Metrics For OpenWhisk Applications

This blog post is part one of a series on “Monitoring Serverless Applications Metrics”. See the introduction post for details and links to other posts.

Serverless platforms pose a unique challenge for collecting application metrics. Runtime environments are ephemeral, existing only to process requests.

Using a background process to capture and export metrics to an external service is impossible. With such a restricted runtime environment, we have to look at other options… 🤔

exporting serverless runtime metrics

console logs with alarm trigger

Apache OpenWhisk captures console logs written to stdout or stderr by actions. Logs are available in activation records through the platform API.

Runtime metrics written to the console will be saved in the activation records.

An additional action, triggered from the alarm package, can be scheduled to collect these values from the logs and forward them to the metrics service.

This approach is simple and does not interfere with request processing. However, it does add a delay to the metric values being available in the collection service. This delay is dependent on the schedule chosen for the alarm trigger feed.

send before returning

Another option is to instrument the serverless functions to automatically collect and push metrics to the metric service before returning from the function.

The function will have to wait for the external metrics service to respond before returning.

This method pushes metric values into the collection service in real-time. There is no waiting for the background collection action to run. The disadvantage of this approach is that it adds a delay to each request. This delay is dependent on the response time from the collection service.
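To make the trade-off concrete, here is a minimal sketch of the "send before returning" pattern. It is not taken from any particular library: `withMetrics` and `sendMetrics` are hypothetical names, and the sender used here is a stand-in that only records what it is given.

```javascript
// Sketch only: `withMetrics` and `sendMetrics` are hypothetical names.
// Wraps a handler so metrics are pushed before the result is returned.
const withMetrics = (handler, sendMetrics) => async params => {
  const start = Date.now()
  const result = await handler(params)
  const end = Date.now()

  // The invocation cannot finish until the metrics service has responded,
  // so its response time is added to every request.
  await sendMetrics({ start, end, duration: end - start })

  return result
}

// Usage with a stand-in sender that just records the values it receives.
const sent = []
const fakeSend = async values => { sent.push(values) }

const main = withMetrics(params => ({ message: 'Hello World' }), fakeSend)
const pending = main({})
```

The second `await` is where the extra latency comes from. Replacing it with a plain `console.log` turns this into the first approach.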

capturing runtime metrics

Metric values will have to be captured using a runtime library, due to the restrictions on running background processes.

The library should automatically capture registered metrics during each invocation. Values will be forwarded to the collection service using the configured forwarding method.

openwhisk-metrics

There is a Node.js library to ease the process of capturing runtime metrics for OpenWhisk actions.

https://github.com/jthomas/openwhisk-metrics

Node.js actions are wrapped with a proxy to automate recording metrics during invocations.

Metric values for cpu, memory, time, error and coldstart are collected by default. The library also supports adding custom metric collectors.
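To give a feel for what such a wrapper has to do, here is a rough sketch built only on Node.js built-ins. This is not the openwhisk-metrics implementation. The names `wrap`, `collect` and `isColdStart` are made up for illustration, and treating a thrown error as the `error` metric is an assumption.

```javascript
// Sketch only: NOT the openwhisk-metrics implementation.
// All names below (`wrap`, `collect`, `isColdStart`) are hypothetical.

// Module scope survives between invocations in a warm container, so this
// flag is true only for the first invocation after the runtime starts.
let isColdStart = true

const collect = (startTime, startCpu, failed) => {
  const endTime = Date.now()
  const cpu = process.cpuUsage(startCpu) // usage since startCpu, in microseconds
  const memory = process.memoryUsage()   // sizes in bytes

  const values = {
    'time.start': startTime,
    'time.end': endTime,
    'time.duration': endTime - startTime,
    'cpu.user': cpu.user,
    'cpu.system': cpu.system,
    'memory.rss': memory.rss,
    'memory.heapUsed': memory.heapUsed,
    'error': failed ? 1 : 0,
    'coldstart': isColdStart ? 1 : 0
  }
  isColdStart = false
  return values
}

const wrap = handler => async params => {
  const startTime = Date.now()
  const startCpu = process.cpuUsage()
  let failed = false
  try {
    return await handler(params)
  } catch (err) {
    failed = true // assumption: a thrown error counts as a failed invocation
    throw err
  } finally {
    // Runs for both successful and failed invocations.
    const values = collect(startTime, startCpu, failed)
    const timestamp = Math.floor(Date.now() / 1000)
    for (const [name, value] of Object.entries(values)) {
      console.log(`METRIC ${name} ${value} ${timestamp}`)
    }
  }
}

const main = wrap(params => ({ message: 'Hello World' }))
```

The sketch leaves out the workspace, action name and activation identifier that the real library puts in front of each metric name.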

usage

Wrap action handlers with the metrics library.

const metrics = require('openwhisk-metrics')

const main = params => {
  return { message: "Hello World" }
}

module.exports.main = metrics(main) 

Metric values are logged to stdout for each invocation of the serverless function.

METRIC <workspace>.<action_name>.<activation>.memory.rss 53018624 1512489781
METRIC <workspace>.<action_name>.<activation>.memory.heapTotal 34463744 1512489781
METRIC <workspace>.<action_name>.<activation>.memory.heapUsed 16955224 1512489781
METRIC <workspace>.<action_name>.<activation>.memory.external 987361 1512489781
METRIC <workspace>.<action_name>.<activation>.error 0 1512489781
METRIC <workspace>.<action_name>.<activation>.coldstart 0 1512489781
METRIC <workspace>.<action_name>.<activation>.cpu.user 177 1512489781
METRIC <workspace>.<action_name>.<activation>.cpu.system 2 1512489781
METRIC <workspace>.<action_name>.<activation>.time.start 1511605588388 1512489781
METRIC <workspace>.<action_name>.<activation>.time.end 1511605588468 1512489781
METRIC <workspace>.<action_name>.<activation>.time.duration 80 1512489781
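Each line follows the pattern `METRIC <name> <value> <timestamp>`, which makes the output easy to parse again later. As a small sketch (the helper name `parseMetricLine` is hypothetical, and the regular expression assumes values are plain numbers as in the sample above):

```javascript
// Sketch only: `parseMetricLine` is a hypothetical helper.
// Parses "METRIC <name> <value> <timestamp>" into an object,
// returning null for any line that does not match that shape.
const parseMetricLine = line => {
  const match = /^METRIC (\S+) (-?\d+(?:\.\d+)?) (\d+)$/.exec(line.trim())
  if (!match) return null
  const [, name, value, timestamp] = match
  return { name, value: Number(value), timestamp: Number(timestamp) }
}

const parsed = parseMetricLine(
  'METRIC <workspace>.<action_name>.<activation>.time.duration 80 1512489781'
)
console.log(parsed)
```

In the sample output, `time.duration` (80) is simply `time.end` minus `time.start`, both of which are in milliseconds.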

Once you are collecting metrics, you need a monitoring service to forward them to…

monitoring service

We’re going to look at collecting and visualising metrics using the IBM Cloud Monitoring Service.

Use the IBM® Cloud Monitoring service to expand your collection and retention capabilities when working with metrics, and to be able to define rules and alerts that notify you of conditions that require attention.

IBM Cloud Monitoring service comes with a metric collection service, Grafana-based visualisation dashboard and an alerting system.

Let’s demonstrate how to use the approaches listed above for exporting metrics to the IBM Cloud Monitoring service.

There’s an additional Node.js library to integrate the OpenWhisk Metrics library with the IBM Cloud Monitoring Service. It can forward metrics in real-time or in batches (using a scheduled action fired by a timer).

provisioning

An instance of the service must be provisioned before being used.

The service is available in multiple regions. Choose the same region as the IBM Cloud Functions instance for best performance.

Instances can be provisioned through the IBM Cloud dashboard or the IBM Cloud CLI tool.

$ bx cf create-service Monitoring lite my_monitoring_svc

For more details on provisioning instances of this service, please see the documentation: https://console.bluemix.net/docs/services/cloud-monitoring/how-to/provision.html#provision

authentication

IBM Cloud Monitoring supports the following authentication methods: API Key, UAA Token or IAM Token.

An API key will be used to provide authentication credentials in the examples below.

Keys can be created through the IBM Cloud dashboard or the IBM Cloud CLI tool.

$ bx iam api-key-create metrics-key -d "API Key For Serverless Metrics"

Note: The API key is only shown at the time of creation. If the API key is lost, you must create a new API key.

For more details on creating API keys, please see the documentation here: https://console.bluemix.net/docs/services/cloud-monitoring/security/auth_api_key.html#auth_api_key

space domain

The GUID of the account space is also required to use the metrics service.

Retrieve this value using the IBM Cloud CLI tool. Replace SpaceName with the name of the space.

$ bx iam space SpaceName --guid

The GUID for the space is returned.

$ bx iam space dev --guid
667fadfc-jhtg-1234-9f0e-cf4123451095

Note: Space GUIDs must be prefixed with s- when being used with the monitoring service.

"667fadfc-jhtg-1234-9f0e-cf4123451095" => "s-667fadfc-jhtg-1234-9f0e-cf4123451095"

example (real-time forwarding)

Let’s start with an example of using real-time forwarding of metrics values. Metric values will be automatically collected by the runtime library. Before each invocation finishes, the library will send the values to the external collection service.

This example assumes you already have the CLI tool for IBM Cloud Functions installed and configured. If you need to do this, please follow the instructions here.

create new directory

$ mkdir hello-world
$ cd hello-world

initialise npm package

$ npm init -y

install libraries

$ npm install openwhisk-metrics cloud-functions-metrics-service

update action handler source

Create a file called index.js with the following code.

const metrics = require('openwhisk-metrics')
const service = require('cloud-functions-metrics-service')

metrics.service = service.client({  
  host: 'metrics.<???>.bluemix.net',
  scope: 's-<???>',
  api_key: '<???>'
})

const main = params => {
  return { message: "Hello World" }
}

module.exports.main = metrics(main)

Replace the host, scope and api_key parameters with custom values. Host names for the regional monitoring service endpoints are listed here.

Space GUIDs must be prefixed with s- to identify a space in the scope parameter.
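Forgetting the prefix is an easy mistake to make, so a tiny helper can add it when building the configuration. This is just a sketch and `toScope` is a hypothetical name, not part of either library.

```javascript
// Sketch only: `toScope` is a hypothetical helper name.
// Adds the "s-" prefix the monitoring service expects for space GUIDs,
// leaving values that already carry the prefix unchanged.
const toScope = spaceGuid =>
  spaceGuid.startsWith('s-') ? spaceGuid : `s-${spaceGuid}`

console.log(toScope('667fadfc-jhtg-1234-9f0e-cf4123451095'))
// prints "s-667fadfc-jhtg-1234-9f0e-cf4123451095"
```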

create new action

$ zip -r action.zip index.js package.json node_modules
$ wsk action create hello-world --kind nodejs:8 action.zip

invoke action

$ wsk action invoke hello-world -b
ok: invoked /_/hello-world with id 74add543b3b94bbbadd543b3b9dbbb17

use api to list metrics

Use curl to manually list the ingested metric labels for IBM Cloud Functions.

Replace the <???> values in the command with configuration values from above.

$ curl --url 'https://metrics.<???>.bluemix.net/v1/metrics/list?query=ibm.public.cloud-functions.*.*.*.*' --header 'x-auth-scope-id: s-<???>' --header 'X-Auth-User-Token: apikey <???>'
[{
  "leaf": 0,
  "context": {},
  "text": "72df4bc809c04fae9f4bc809c01fae77",
  "expandable": 1,
  "id": "ibm.public.cloud-functions.*.*.*.72df4bc809c04fae9f4bc809c01fae77",
  "allowChildren": 1
}]

Each activation identifier should be listed as a label value in the metrics service.
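The same request can be made from Node.js if you would rather script this check. The sketch below copies the path and header names from the curl command. The helper names `listMetricsRequest` and `listMetrics` are hypothetical, and the response is assumed to be JSON as in the output shown.

```javascript
const https = require('https')

// Sketch only: `listMetricsRequest` and `listMetrics` are hypothetical helpers.
// The URL path and header names are copied from the curl command.
const listMetricsRequest = ({ host, scope, api_key }) => ({
  method: 'GET',
  host,
  path: '/v1/metrics/list?query=' +
    encodeURIComponent('ibm.public.cloud-functions.*.*.*.*'),
  headers: {
    'x-auth-scope-id': scope,
    'x-auth-user-token': `apikey ${api_key}`
  }
})

// Performs the request and resolves with the parsed JSON response body.
const listMetrics = config => new Promise((resolve, reject) => {
  const req = https.request(listMetricsRequest(config), res => {
    let body = ''
    res.setEncoding('utf8')
    res.on('data', chunk => { body += chunk })
    res.on('end', () => {
      try { resolve(JSON.parse(body)) } catch (err) { reject(err) }
    })
  })
  req.on('error', reject)
  req.end()
})
```

Call `listMetrics` with the same host, scope and api_key values used to configure the service client.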

example (batch forwarding)

If we don’t want to add a (short) delay to each invocation, metric values can be forwarded asynchronously. An explicit action will be created to forward metric values from invocation logs. This action will be triggered on a periodic schedule using the alarm trigger feed.

This example assumes you already have the CLI tool for IBM Cloud Functions installed and configured. If you need to do this, please follow the instructions here.

create sample action

$ mkdir hello-world
$ cd hello-world
$ npm init -y
$ npm install openwhisk-metrics

Create a file called index.js with the following code.

const metrics = require('openwhisk-metrics')

const main = params => {
  return { message: "Hello World" }
}

module.exports.main = metrics(main) 

deploy and test sample action

Package and deploy hello-world action.

$ zip -r action.zip index.js package.json node_modules
$ wsk action create hello-world --kind nodejs:8 action.zip

Metric values are written to the console for each invocation.

$ wsk action invoke hello-world -b
ok: invoked /_/hello-world with id 28da39d219df436a9a39d219df036a30
$ wsk activation logs 28da39d219df436a9a39d219df036a30
2017-12-18T14:38:50.751615113Z stdout: METRIC user@host_dev.hello-world.28da39d219df436a9a39d219df036a30.cpu.user 0 1513607930
2017-12-18T14:38:50.751672372Z stdout: METRIC user@host_dev.hello-world.28da39d219df436a9a39d219df036a30.cpu.system 0 1513607930
2017-12-18T14:38:50.751685034Z stdout: METRIC user@host_dev.hello-world.28da39d219df436a9a39d219df036a30.time.start 1513607930749 1513607930
...
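These log lines are what a forwarding action has to work from. As a sketch of that step (not the metric-forwarder's actual code, and `metricsFromLogs` is a hypothetical name), the platform's timestamp and stream prefix can be stripped and the metric values recovered like this:

```javascript
// Sketch only: `metricsFromLogs` is a hypothetical helper, not the
// metric-forwarder's actual code.
// Strips the "<timestamp> <stream>: " prefix added by the platform and
// keeps only lines in the "METRIC <name> <value> <timestamp>" format.
const metricsFromLogs = logs => logs
  .map(line => /^\S+\s+(?:stdout|stderr): (METRIC .*)$/.exec(line))
  .filter(match => match !== null)
  .map(match => {
    const [, name, value, timestamp] = match[1].split(' ')
    return { name, value: Number(value), timestamp: Number(timestamp) }
  })

const logs = [
  '2017-12-18T14:38:50.751615113Z stdout: METRIC user@host_dev.hello-world.28da39d219df436a9a39d219df036a30.cpu.user 0 1513607930',
  '2017-12-18T14:38:50.751672372Z stdout: METRIC user@host_dev.hello-world.28da39d219df436a9a39d219df036a30.cpu.system 0 1513607930',
  '2017-12-18T14:38:50.751685034Z stdout: METRIC user@host_dev.hello-world.28da39d219df436a9a39d219df036a30.time.start 1513607930749 1513607930',
  '2017-12-18T14:38:50.751700000Z stdout: some other application output'
]

const metrics = metricsFromLogs(logs)
console.log(metrics.length) // prints 3
```

Ordinary application output is ignored, so actions can keep logging as usual.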

create metric-forwarder action

Clone the project repository and install NPM dependencies.

$ git clone https://github.com/jthomas/cloud-functions-metrics-service
$ cd cloud-functions-metrics-service
$ npm install

Update action configuration file (config.json) with the following parameter values.

{
  "actions": ["hello-world"],
  "service": {
    "host": "metrics.<???>.bluemix.net",
    "scope": "s-<???>",
    "api_key": "<???>"          
  }
}

Replace the <???> fields in the configuration file with values from above.

Package metric-forwarder action.

$ zip -r action.zip index.js package.json lib node_modules

deploy metric-forwarder action

Create new metric-forwarder action from deployment package and configuration file.

$ wsk action create metric-forwarder --kind nodejs:8 action.zip --param-file config.json

Create trigger feed for alarm package to run metric-forwarder on periodic schedule.

$ wsk trigger create interval \
  --feed /whisk.system/alarms/interval \
  --param minutes 1

Bind trigger to action using rule.

$ wsk rule create forward-metrics-on-interval interval metric-forwarder

invoke sample action

$ wsk action invoke hello-world -b
ok: invoked /_/hello-world with id 28da39d219df436a9a39d219df036a30

This will generate activation records containing metric values. When the interval trigger is next fired, metric values from these records will be forwarded to the collection service.

Logs from the metric-forwarder action will show the activation records that have been retrieved.

$ wsk activation poll metric-forwarder

use api to list metrics

Use curl to manually list the ingested metric labels for IBM Cloud Functions.

Replace the <???> values in the command with configuration values from above.

$ curl --url 'https://metrics.<???>.bluemix.net/v1/metrics/list?query=ibm.public.cloud-functions.*.*.*.*' --header 'x-auth-scope-id: s-<???>' --header 'x-auth-user-token: apikey <???>'
[{
  "leaf": 0,
  "context": {},
  "text": "28da39d219df436a9a39d219df036a30",
  "expandable": 1,
  "id": "ibm.public.cloud-functions.*.*.*.28da39d219df436a9a39d219df036a30",
  "allowChildren": 1
}]

Each activation identifier should be listed as a label value in the metrics service.

next steps

Metric values from our serverless applications are now being collected by the IBM Cloud Monitoring service. 👏👏👏

Application metrics are automatically recorded by the runtime library for each invocation. Metric values are exported to the monitoring service in real-time or using a timed action to forward in batches.

Setting up monitoring dashboards from the collected values will allow us to identify and resolve issues with our serverless applications. In the next blog post, we’ll look at using Grafana to visualise metric values being collected…