HY300E 002 v004

System Insight
8.0
Copyright © 2018 Digital Route AB

The contents of this document are subject to revision without further notice due to continued progress in methodology, design, and
manufacturing.
Digital Route AB shall have no liability for any errors or damage of any kind resulting from the use of this document.
DigitalRoute® and MediationZone® are registered trademarks of Digital Route AB. All other trade names and marks mentioned
herein are the property of their respective holders.
Table of Contents
1. System Insight Overview
2. Preparation of System Insight
2.1 Installing System Insight using Scripts
2.1.1 Install System Insight with InfluxDB using Scripts
2.1.2 Install System Insight with Cloudwatch using Scripts
2.1.3 Access Grafana via Desktop or Web UI
2.2 Install System Insight Manually
2.2.1 Configure System Insight Services
2.2.2 Configure System Insight without InfluxDB Instances
3. Configuring System Insight
3.1 System Insight Metrics Compaction
3.2 Setting Retention Policies
3.3 Configuring System Insight with Multiple InfluxDB Instances
3.4 System Insight Properties
4. Managing System Insight
4.1 Displaying Metrics using System Insight
4.1.1 Managing System Insight Filters
4.1.2 Metrics Naming Conventions
4.2 Grafana Dashboards
4.3 Using System Insight for Batch Workflows
4.4 REST APIs for System Insight
5. System Insight Example
6. System Insight Backup and Maintenance
System Insight
This document describes how to configure and use System Insight to store and/or visualize system data, and data processed in
MediationZone workflows. This allows you to view how your system is performing, how your data flow is progressing, how capacity
trends look, as well as correlate events for trouble-shooting, or monitor a service chain.
MediationZone contains a rich set of system metrics, in the form of MIM parameters, and System Insight provides a means to
graphically display this data by gathering numeric MIMs from workflows, and JVM metrics.
1. System Insight Overview
System Insight provides a standardized means to visualize complex processing and can be used with visualization and analytics
tools that already exist, e.g. InfluxDB and Grafana.
A probe is a point in MediationZone or a workflow where metrics are created by sampling or aggregating events. Using internal
probes, data flow probes and customized probes, System Insight generates metrics from MediationZone with searchable tags. The
metrics are then sent to the System Insight service, from which they are stored in a time-series database, based on the
open-source InfluxDB project. These metrics can then be displayed in a web-based dashboard builder, based on the open-source
Grafana project, which shows real-time as well as historic metrics. You can customize the dashboards as required, based on the
stored data.
System Insight overview
You can use this functionality to visualize MediationZone, the metrics for data flow through MediationZone and the metrics for
assets managed by MediationZone.
System Insight can collect metrics from external probes in order to visualize them, and can also forward metrics to an external
analytics or visualization tool, using MZ workflows. You can create custom metrics using the System Insight forwarding agent in a
workflow. The agent sends metrics samples from the workflow to the System Insight service for visualization. Equally, you can use
the System Insight collection agent to collect data from the running System Insight service, and send the data using any protocol
supported in MediationZone.
System Insight is an Akka-based service which runs on one or several SCs. For further information on Akka clusters, see 1.7 Akka
Cluster in the System Administrator's Guide.
2. Preparation of System Insight
For the purpose of demonstrating how you can use System Insight to visualize metrics, the configurations and examples provided
in the documentation show System Insight using InfluxDB 1.2 and Grafana 4.3.1, unless stated otherwise, to store and visualize
data from MediationZone.
You can run predefined scripts to configure MediationZone, and install the embedded versions of InfluxDB and Grafana provided.
Using Cloudwatch as storage is also supported. If you choose to use Cloudwatch, there is also a predefined script provided which
you can use to configure System Insight with Cloudwatch.
Alternatively, you can manually configure MediationZone, and manually install InfluxDB, Cloudwatch or another database, and
Grafana or another visualization tool.
2.1 Installing System Insight using Scripts

There are scripts in place if you choose to install an embedded version of System Insight, or if you choose to install System Insight
with Cloudwatch as storage.
If you use the script provided for installing InfluxDB, there is also the option to access Grafana via the Desktop or Web GUI.
2.1.1 Install System Insight with InfluxDB using Scripts

For a basic setup of System Insight, you require the following:
An InfluxDB instance to store metrics with a minimum of 10 GB of disk space

A Grafana dashboard to visualize the metrics that you want to display
An SC to run the system insight service
Note!
The minimum requirement of 10 GB is based on a basic setup for system statistics with the default retention policy of
one week for short-term storage, and one year for downsampled long-term storage.
To get System Insight up and running for test purposes, there are a number of scripts and configuration files available:
A script for a default setup of InfluxDB

A script for a default setup of Grafana
A sample script to set up System Insight on SCs
Note!
The scripts to set up InfluxDB and Grafana are supported on Ubuntu/Debian and Centos/Redhat only.
If you need to configure a System Insight setup using InfluxDB and Grafana offline, copy the grafana<version>.deb and
influxdb.conf files and the scripts from $MZ_HOME/scripts/str-templates/system-insight, save them in the
same directory, and then follow the instructions provided below, starting from step 2.
If you are working online, proceed to the instructions below.
Note!
Due to the service manager used in Ubuntu 14.04, if you are using this version of Ubuntu, before you run the script to set
up InfluxDB as instructed in step 2, you must rename the systemctl command as follows:
mv systemctl systemctl-bak
Before you proceed to step 3, rename systemctl-bak to systemctl again:
mv systemctl-bak systemctl
The steps are as follows:
1. Navigate to the directory $MZ_HOME/scripts/str-templates/system-insight.
2. Run the following script to set up InfluxDB:
$ sudo ./si_influxdb_setup.sh
If required, you can modify the default username and password, and you can also change the database name before
running the script. The variables to change in the script are INFLUX_ADMIN_USR, INFLUX_ADMIN_PWD and
INFLUX_SCHEMA.
3. To ensure that the InfluxDB instance works as it should, use the following influx command:
$ sudo influx -username <username> -password <password>
> show databases
The output should be the following:
name: databases
name
----
mz
_internal
Alternatively, you can use the following cURL command:
$ curl -XPOST -u <username>:<password> http://<host name>:8086/query --data-urlencode "q=SHOW DATABASES"
The output should be the following:
{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["mz"],["_internal"]]}]}]}
4. Run the following script to set up Grafana:
$ sudo ./si_grafana_setup.sh
If required, you can modify the default username and password before running the script. The variables to change in the
script are GRAFANA_USR and GRAFANA_PWD.
5. If you want to add sample dashboards, run the script again with the flag add-dashboards:
$ sudo ./si_grafana_setup.sh add-dashboards
Once successfully installed, browse to http://<host name>:3000.
Note!
By default, Grafana is installed using http. If you want to use Grafana over https, see the section, Grafana Over
https, in 2.1.3 Access Grafana via Desktop or Web UI.
6. Run the following script on the Platform instance, with the si-topo flag to run the topo commands required to set up
MediationZone with System Insight:
$ ./si_basic_setup.sh si-topo
If you want to access Grafana from the Desktop via Tools, or via the MediationZone Web UI, http://<platform
host>:<web interface port>/mz/, modify the property GRAFANA_URL as follows before running the script:
GRAFANA_URL='http://<host name>:3000'
For further information on this method of accessing Grafana from MediationZone, see 2.1.3 Access Grafana via Desktop
or Web UI.
7. You are then prompted to restart the Platform and picos, and start up the services:
$ mzsh restart platform

$ mzsh system restart
$ mzsh service start
8. To set up the filters, run the script with the si-basic-filters flag and your credentials. This step is not obligatory but
provides a setup in which system metrics are produced for InfluxDB:
$ ./si_basic_setup.sh si-basic-filters <username> <password>
Steps 6 to 8 are required because the System Insight service must be up and running before profiles and filters can be created.
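The verification in step 3 can also be scripted. The following is a minimal sketch, assuming the response body has the JSON shape shown in step 3; the function name list_databases is hypothetical and is not part of System Insight or InfluxDB.

```python
import json


def list_databases(response_text):
    """Return the database names found in a SHOW DATABASES response body."""
    body = json.loads(response_text)
    names = []
    for result in body.get("results", []):
        for series in result.get("series", []):
            # The database names are returned in a series named "databases".
            if series.get("name") == "databases":
                names.extend(row[0] for row in series.get("values", []))
    return names


# Sample body in the shape shown in step 3 (not fetched from a live instance).
sample = (
    '{"results":[{"statement_id":0,"series":[{"name":"databases",'
    '"columns":["name"],"values":[["mz"],["_internal"]]}]}]}'
)

databases = list_databases(sample)
print("mz database present:", "mz" in databases)
```

If the mz database is missing from the list, the InfluxDB setup script has most likely not completed, or INFLUX_SCHEMA was changed before the script was run.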
2.1.2 Install System Insight with Cloudwatch using Scripts

For a basic setup of System Insight, you require the following:
An SC to run the system insight service
To get System Insight up and running for test purposes, there are a number of scripts and configuration files available:
A script for a default setup of Cloudwatch
A sample script to set up System Insight on SCs
Note!
The script to set up Cloudwatch is supported on Ubuntu/Debian and Centos/Redhat only.
The steps required to install System Insight with Cloudwatch are as follows:
1. Navigate to the directory $MZ_HOME/scripts/str-templates/system-insight.

2. Run the following script on the Platform instance, with the si-topo flag to run the topo commands required to set up
MediationZone with System Insight:
$ ./si_basic_setup.sh si-topo-cloudwatch
Enter the AWS region when prompted.
3. You are then prompted to restart the Platform and picos, and start up the services:
$ mzsh restart platform
$ mzsh system restart
$ mzsh service start
4. To set up the filters, run the script with the si-basic-filters flag and your credentials. This step is not obligatory but
provides a setup in which system metrics are produced:
$ ./si_basic_setup.sh si-basic-filters <username> <password>
Steps 2 to 4 are required because the System Insight service must be up and running before profiles and filters can be created.
2.1.3 Access Grafana via Desktop or Web UI

If you want to be able to access Grafana via the Desktop from Tools or via the MediationZone Web UI (http://<platform
host>:<web interface port>/mz/), you have 2 options:
1. You can modify the property GRAFANA_URL in the si_basic_setup.sh script before running the script, as described in step 6
in 2.1.1 Install System Insight with InfluxDB using Scripts.
OR
2. At any time after installing System Insight, you can use the mzsh topo command as shown below:
$ mzsh topo set services:custom/val:system-insight.si-instance.config.grafana-url http://<host name>:3000
Grafana via Desktop
To be able to access Grafana from the Desktop, the System Insight service must be running and the property grafana-url
mentioned above must be set when starting the Desktop.
In Desktop, go to Tools, and select System Insight. You are directed to the Grafana login page.
Grafana via Web UI
For System Insight to be visible from the MediationZone Web UI, the property grafana-url mentioned above must be set.
Go to the MediationZone Web UI, located at http://<platform host>:<web interface port>/mz/, and select System
Insight from the Dashboard.
System Insight in the Web UI
Grafana Over https
If you require extra security, the option to use Grafana over https is available. However, you require a certificate which is not
provided. When you have your certificate in place, take the following steps:
1. Go to the directory $MZ_HOME/scripts/str-templates/system-insight, and open the grafana.ini file.
2. In the Server section, modify the text as follows and save:

Change this line:
;protocol = http
to the following:
protocol = https
Change these lines:
# https certs & key file

;cert_file =
;cert_key =
to the following:
# https certs & key file

cert_file = /path/to/file/server.crt
cert_key = /path/to/file/server.key
3. Restart the grafana service.
4. Go to https://<your grafana installation url>:3000/ and verify that https is working.
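Before restarting the service, the edited Server section can be sanity-checked. The following is a minimal sketch that uses the Python standard library configparser module; the function name https_problems is hypothetical, and the sketch assumes that the file has no uncommented keys before the first section header, which configparser would reject.

```python
import configparser


def https_problems(ini_text):
    """Return a list of problems found in the [server] section (empty if none)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("server"):
        return ["missing [server] section"]
    server = parser["server"]
    problems = []
    # Lines starting with ';' or '#' are comments, so ';protocol = http' is ignored.
    if server.get("protocol", "http") != "https":
        problems.append("protocol is not https")
    for key in ("cert_file", "cert_key"):
        if not server.get(key, "").strip():
            problems.append(key + " is not set")
    return problems


# Sample text mirroring the edited lines in step 2 (paths are placeholders).
edited = """
[server]
protocol = https
# https certs & key file
cert_file = /path/to/file/server.crt
cert_key = /path/to/file/server.key
"""

print(https_problems(edited))
```

An empty list means that the three settings from step 2 are present; the sketch does not check that the certificate files exist or are valid.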
2.2 Install System Insight Manually

If you choose to install System Insight without the scripts provided, there are several steps that you must take. However, which
steps you take depends on whether you use InfluxDB, Cloudwatch and/or Grafana. Irrespective of which database or
visualization tool you use, you must configure MediationZone to activate the system insight service on an SC as described in 2.2.1
Configure System Insight Services.
You have several options:
1. If you choose to run your own instance of InfluxDB, see https://docs.influxdata.com/influxdb/v1.2/.
2. If you choose to run Cloudwatch, see https://aws.amazon.com/documentation/cloudwatch/.
3. If you choose to run your own instance of Grafana, see http://docs.grafana.org/.
4. If you choose to use System Insight with the System Insight collection or forwarding agent, see 9.67 System Insight
Agents, or save the data to file. See 2.2.2 Configure System Insight without InfluxDB Instances.
2.2.1 Configure System Insight Services

System insight services (SI services) are run on one or more SCs. Take the following steps to configure an SC and to add the akka
and system insight services to that SC.
1. Use the mzsh topo command to add the akka service to the custom.conf for services. You must specify a name for
the akka service, e.g. si. The startup-natures must be si.
$ mzsh topo set topo://services:custom/obj:akka '{

<akka service name> {
template: "1/standard/basic"
config {
startup-natures: [si]
}
}
}'
See the example below, where the akka service is named si.
Example - Adding the akka service
$ mzsh topo set topo://services:custom/obj:akka '{

si {
config {
startup-natures: [si]
}
}
}'
2. Use the mzsh topo command to create an SC/SCs on which to run System Insight.
$ mzsh topo set topo://container:<container name>/pico:<sc> '{

template:mz.standard-sc
config {
properties {
mz.servicehost.natures : si
mz.servicehost.port.range : <port range>
}
}
}'
See the example below, where 3 SCs are created with a respective port range:
Example - Adding 3 SCs to run System Insight
$ mzsh topo set topo://container:main1/pico:sc4 '{

config {
properties {
mz.servicehost.port.range : 6001-6050
}
}
}'

config {
properties {
}
}
}'

config {
properties {
}
}
}'
Note!
If you require high volumes of System Insight metrics (> 10'), add the following parameters to the relevant SC
configuration(s) to ensure that there is enough memory to handle the inflow of metrics. For further information on
how to add these jdkarg values to the relevant SC conf file, refer to 2.4 Managing Pico Configurations.
<jdkarg value="-Xmx2G"/>
<jdkarg value="-server" vendor="sun,hp"/>
<jdkarg value="-Xms2G"/>
<jdkarg value="-XX:MaxMetaspaceSize=196M"/>
<jdkarg value="-XX:NewSize=1G"/>
3. Use the mzsh topo command to add the system insight service to the custom.conf for services as shown below.
You must use the same akka service name that you enter for the akka configuration in step 1, which you must also enter
as the value for the akka-cluster, shown below as <akka service name>.
InfluxDB
If you are using InfluxDB as storage, ensure that you complete the relevant username, password and http url for the
InfluxDB instance that you are using. See the Cloudwatch section below if you are using Cloudwatch as storage.
$ mzsh topo set topo://services:custom/obj:system-insight '{
si-instance {
start-after=["akka/si"]
config {
storage-backend=influxdb
akka-cluster: "<akka service name>"
influxdb {
url="<http url>"
user="<influxdb username>"
password="<influxdb password>"
database="<database name>"
}
}
}
}'
Example - Adding the System Insight service to the custom.conf when using InfluxDB
$ mzsh topo set topo://services:custom/obj:system-insight '{
si-instance {
config {
storage-backend=influxdb
akka-cluster: "si"
influxdb {
url="http://influx:8086"
user="mzadmin"
password="dr"
database="mz"
}
}
}
}'
If you want to be able to access Grafana via Desktop from Tools or via the MediationZone Web UI (http://<platform
host>:<web interface port>/mz/), you can use the mzsh topo command as shown below:
$ mzsh topo set services:custom/val:system-insight.si-instance.config.grafana-url http://<host name>:3000
For further information on this method of accessing Grafana from MediationZone, see the section below, Access Grafana
via Desktop or Web UI.
Cloudwatch
If you are using Cloudwatch as storage, ensure that you complete the relevant AWS user access key, AWS user access
secret, AWS region and a namespace prefix. See the table below for information on how to configure the System Insight
service when using Cloudwatch.
Property | Description
aws-access-key and aws-access-secret | AWS credentials. You must encrypt the relevant key and secret using the command mzsh encryptpassword. If you do not provide AWS credentials, the IAM policy is used for authentication. For further information on the command mzsh encryptpassword, see 2.1.4 encryptpassword in Command Line Tool User's Guide.
region | The AWS region. If you do not enter a region, the default region is used. For further information on how to identify which region to enter, see x.
namespace-prefix | The root namespace to add metrics to. You can use forward slashes ("/") to add multiple levels.
batch | Add this block to tune batch settings by entering values for size and interval. See below.
size | The number of measurements per batch. The default is 10000. Use the default setting unless advised otherwise by DigitalRoute.
interval | The time in milliseconds between batches. The default is 1000. Use the default setting unless advised otherwise by DigitalRoute.

$ mzsh topo set topo://services:custom/obj:system-insight '{
si-instance {
config {
akka-cluster: "<akka service name>"
storage-backend=cloudwatch
cloudwatch {
aws-access-key = "<IAM user access key>"
aws-access-secret = "<IAM user access secret>"
region = "<AWS region>"
namespace-prefix = "<prefix>"
batch {
interval = 1000
}
}
}
}
}'
Example - Adding the System Insight service to the custom.conf when using Cloudwatch

$ mzsh topo set topo://services:custom/obj:system-insight '{
si-instance {
config {
akka-cluster: "si"
storage-backend=cloudwatch
cloudwatch {
aws-access-key = "DR_DEFAULT_KEY-A093A586B07F76B80ADC1344F9A37878"
aws-access-secret = "DR_DEFAULT_KEY-373D9EA882C035182B05C7CFC6614C14"
region = "eu-west-1"
namespace-prefix = "myprefix"
batch {
interval = 1000
}
}
}
}
}'
4. Use the mzsh topo command to enable System Insight at cell level:
$ mzsh topo set val:common.mz.system.insight true
5. Restart the Platform and then start or restart the ECs and SCs:
$ mzsh restart platform
$ mzsh system start
or
$ mzsh system restart
6. Start the service using the mzsh service command:
$ mzsh service start
Access Grafana via Desktop or Web UI
To be able to access Grafana via Desktop from Tools or via the MediationZone Web UI (http://<platform host>:<web
interface port>/mz/) as mentioned in step 3 above, you can use the following mzsh topo command at any time after
installing System Insight:
$ mzsh topo set services:custom/val:system-insight.si-instance.config.grafana-url http://<host name>:3000
Grafana via Desktop
The first time you access Grafana via Desktop, the System Insight service must be running and the property grafana-url
mentioned above must be set.
In Desktop, go to Tools, and select System Insight. You are directed to the Grafana login page.
Grafana via Web UI
For System Insight to be visible from the MediationZone Web UI, the property grafana-url mentioned above must be set.
Go to the MediationZone Web UI, located at http://<platform host>:<web interface port>/mz/, and select System
Insight from the Dashboard.
System Insight in the Web UI
2.2.2 Configure System Insight without InfluxDB Instances

If you want to run System Insight using the System Insight collection or forwarding agent, you can disable the storage backend (i.e.
InfluxDB or CloudWatch) by taking the steps provided below.
1. Use the following mzsh topo set command to disable the storage backend:
$ mzsh topo set topo://services:custom/val:system-insight.si-instance.config.storage-backend none
2. Restart the Platform and SC(s):

3. Start the system insight service:
For information on how to use the System Insight agents, see 9.67 System Insight Agents.
3. Configuring System Insight
After installing System Insight, there are several configuration modifications that you can make depending on how you are using
System Insight, i.e. with or without InfluxDB and/or Grafana, and what data you want to produce using System Insight.
Configuring metrics compaction, retention policies and multiple instances of InfluxDB only applies if you are using System Insight
with InfluxDB.
Note!
The versions of InfluxDB and Grafana included in System Insight are not highly available. If you want to retain the data
produced using System Insight, see 6. System Insight Backup and Maintenance.
3.1 System Insight Metrics Compaction
Note!
This section only applies if you are using System Insight with InfluxDB for data storage.
System Insight gathers a large amount of data per second in the form of metrics, so the storage required grows continuously. When
InfluxDB is used to handle this data, the growth is addressed by downsampling: high precision raw data is kept for only a limited
period of time, and lower precision data is kept for a longer period of time. Two InfluxDB features automate the process of
downsampling data and expiring old data: Continuous Queries (CQ) and Retention Policies (RP).
For details on this solution, see https://docs.influxdata.com/influxdb/v1.2/query_language/continuous_queries/ and
https://docs.influxdata.com/influxdb/v1.2/guides/downsampling_and_retention/.
If you have installed InfluxDB using the script provided, the default setup includes three predefined retention policies and one
predefined continuous query.
The predefined retention policies are:
one_week - this is set as the default retention policy for the database
six_months
one_year
The predefined continuous query is named cq_six_months. This is a generic continuous query, downsampling all metrics as mean
values over a period of 10 minutes from the default retention policy of one week into the retention policy of six months.
You must implement the data compaction solution provided by InfluxDB to store the data produced using System Insight. See the
examples below.
Metrics Compaction Examples
Implementing Default Metrics Compaction
In this scenario, you have installed System Insight using the scripts provided so that you have the default setup of InfluxDB. This
means you have a database named "mz" and a default retention policy named "one_week", and InfluxDB is up and running.
The steps below create a retention policy that stores data for one month, and a continuous query that runs every five minutes,
calculates the mean of idle_cpu for the measurements during that time, and stores the result as a new measurement named
host.compute in the retention policy created.
1. Create a retention policy for one month. Measurements with this retention policy are stored for a month:
CREATE RETENTION POLICY "one_month" ON "mz" DURATION 30d REPLICATION 1
2. Create the continuous query. In this instance, a metric named host.compute with tags={time,host_name} and the values
{idle_cpu, user_time_cpu, idle_proc, sleep_proc, sys_cpu, total_proc, up_time, user_cpu,
wait_cpu} is sampled to InfluxDB with the default retention policy (one_week):
CREATE CONTINUOUS QUERY "5min_cq" ON "mz" BEGIN SELECT mean("idle_cpu") as "idle_cpu" INTO
"mz"."one_month"."host.compute" FROM "host.compute" GROUP BY time(5m) END
The new measurement is stored in the database for one month, after which it is removed.
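The effect of the continuous query above, a mean per five-minute window, can be illustrated outside InfluxDB. The following is a minimal sketch, assuming that windows are aligned to multiples of the window length; the function name downsample_mean and the sample values are hypothetical.

```python
from collections import defaultdict


def downsample_mean(points, window_seconds):
    """Group (epoch_seconds, value) points into fixed windows and average each window."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each point to the start of the window that contains it.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical idle_cpu samples: (seconds since epoch, percentage).
raw_idle_cpu = [(0, 90.0), (60, 80.0), (290, 70.0), (300, 50.0), (599, 60.0)]

compacted = downsample_mean(raw_idle_cpu, 300)
print(compacted)
```

Five raw points are reduced to two stored values, one per five-minute window, which is why the downsampled data can be kept for a longer period than the raw data.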
Removing Default Metrics Compaction
Depending on how you want to use System Insight, the predefined continuous query cq_six_months might store too much data.
To disable the query, remove the predefined continuous query as follows:
DROP CONTINUOUS QUERY "cq_six_months" ON "mz"
3.2 Setting Retention Policies
Note!
The retention policy that you set in System Insight determines how long the filter data is kept. It can be set at two levels: profile
level and InfluxDB instance level.
If you do not set a specific retention policy, the default retention policies that exist at InfluxDB instance level are the policies that
apply.
If you have installed System Insight with InfluxDB and Grafana using the scripts provided, as described in 2.1 Installing System
Insight using Scripts, the default retention policy is one week. Retention policies of six months or one year are also available
for selection.
At InfluxDB Instance Level
If you want to set a specific retention policy at InfluxDB instance level, you add a retention policy via InfluxDB.
If a retention policy is left empty at profile level, but the default is set at InfluxDB instance level, the data retention adheres to the
default policy set on each InfluxDB instance.
At Profile Level
If you set a retention policy for a profile, as described in 2.2.32 systeminsight, the data retention adheres to that policy, overriding
the default InfluxDB instance retention policy.
Note!
If you set a retention policy at profile level, which does not exist in the InfluxDB instances in place, an error message is
thrown.
For further information on configuring System Insight with multiple InfluxDB instances, see 3.3 Configuring System Insight with
Multiple InfluxDB Instances.
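The precedence between the two levels described in this section can be summarized in a few lines. The following is a minimal sketch of the rules as stated above, not of the actual System Insight implementation; the function name effective_retention_policy is hypothetical.

```python
def effective_retention_policy(profile_policy, instance_policies, instance_default):
    """Return the retention policy that applies to data written for a profile."""
    if not profile_policy:
        # Nothing set at profile level: the InfluxDB instance default applies.
        return instance_default
    if profile_policy not in instance_policies:
        # A profile policy that does not exist in InfluxDB is reported as an error.
        raise ValueError("retention policy does not exist: " + profile_policy)
    # A policy set at profile level overrides the instance default.
    return profile_policy


# The three policies created by the InfluxDB setup script.
policies = {"one_week", "six_months", "one_year"}

print(effective_retention_policy("", policies, "one_week"))
print(effective_retention_policy("one_year", policies, "one_week"))
```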
3.3 Configuring System Insight with Multiple InfluxDB Instances
Note!
If you require multiple InfluxDB databases that can also be written to as a backup, you can add additional databases to the
System Insight service configuration as follows:
backupDBs="influxdb2"
influxdb2 {
database=mz
password=dr
url="http://127.0.0.1:18086"
user=mzadmin
}
For information on how to modify the System Insight service configuration using the mzsh topo command, see 2.7.2 Updating
Service Configurations.
You can specify more than one database using a comma as a separator, as shown in the example below:
Example - How to add additional InfluxDB databases
system-insight {
si-instance {
config {
akka-cluster=si
influxdb {
database=mz
password=dr
url="http://127.0.0.1:8086"
user=mzadmin
}
backupDBs="influxdb2,influxdb3"
influxdb2 {
database=mz
password=dr
url="http://127.0.0.1:18086"
user=mzadmin
}
influxdb3 {
database=mz
password=dr
url="http://127.0.0.1:28086"
user=mzadmin
}
}
start-after=[
"akka/si"
]
template="1/standard/basic"
}
}
If all InfluxDB instances are unreachable, System Insight runs in gated mode. In gated mode, metrics are dropped until at least one
InfluxDB instance is reachable.
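The gated mode described above can be pictured as follows. This is a minimal sketch of the described behaviour only, under the assumption that a write is attempted against every configured instance; the class GatedWriter and the backend callables are hypothetical and do not reflect how the System Insight service is implemented.

```python
class GatedWriter:
    """Writes each metric to all backends; drops it if none is reachable."""

    def __init__(self, backends):
        self.backends = backends  # callables that raise ConnectionError when unreachable
        self.gated = False
        self.dropped = 0

    def write(self, metric):
        delivered = 0
        for send in self.backends:
            try:
                send(metric)
                delivered += 1
            except ConnectionError:
                continue
        # Gated mode is active only while no instance at all accepts the metric.
        self.gated = delivered == 0
        if self.gated:
            self.dropped += 1
        return delivered


stored = []


def reachable_instance(metric):
    stored.append(metric)


def unreachable_instance(metric):
    raise ConnectionError("InfluxDB instance unreachable")


writer = GatedWriter([unreachable_instance, unreachable_instance])
writer.write("sample-1")          # dropped, writer is gated
writer.backends[1] = reachable_instance
writer.write("sample-2")          # stored, writer is no longer gated
print(writer.gated, writer.dropped, stored)
```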
3.4 System Insight Properties
In the standard template.conf file for the system insight service, there are a number of properties, which are described in the
table below:
Property | Description
measurement.client.sender.buffer-size | Default value: 100. The buffer size of the client sender.
measurement.client.receiver.buffer-size | Default value: 100. The buffer size of the client receiver.
measurement.client.receiver.batch-size | Default value: 16. The batch size of incoming messages to the client.
measurement.client.ack-timeout | Default value: 3 seconds. The maximum time in seconds that an incoming message to the client can be in transit before timing out.
measurement.server.sender.buffer-size | Default value: 1000. The buffer size of a server sender.
measurement.server.receiver.buffer-size | Default value: 100. The buffer size of a server receiver.
measurement.server.receiver.batch-size | Default value: 16. The batch size of incoming messages to the server.
measurement.server.ack-timeout | Default value: 3 seconds. The maximum time in seconds that an incoming message to the server can be in transit before timing out.
measurement.server.throttle.limit | Default value: 1000. The maximum number of messages per the unit of time set for the measurement server throttle interval, that can be sent to one server.
measurement.server.throttle.interval | Default value: 1 second. The time interval in seconds to apply to a measurement server throttle interval.
measurement.server.path | Default value: measurement router. The logical name of the measurement communication channel.
measurement.server.pushed-back-log-interval | Default value: 5 seconds. The interval in seconds to log when messages going from the server to clients are subject to backpressure.
It is recommended that you keep the default values for these properties. If you need to modify any value(s), use the mzsh
topo command, and then restart the SCs and the system insight service. See the example below:
Example - Setting a property for the System Insight service
If you want to change the value for the property measurement.server.throttle.interval to 10:
1. Use the following mzsh topo set command:
$ mzsh topo set topo://services:custom/val:system-insight.si-instance.config.measurement.server.throttle.interval 10
2. Restart the SCs:
3. Start the custom system insight service:
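To see how measurement.server.throttle.limit and measurement.server.throttle.interval interact, consider the following minimal sketch. It assumes a simple fixed-window throttle, which is one possible reading of the property descriptions and may differ from what the service does internally; the class FixedWindowThrottle is hypothetical.

```python
class FixedWindowThrottle:
    """Allows at most `limit` messages per `interval` seconds."""

    def __init__(self, limit=1000, interval=1.0):
        self.limit = limit
        self.interval = interval
        self.window_start = None
        self.count = 0

    def allow(self, now):
        # Open a new window when none exists or the current one has expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


# Small limit so that the effect is visible; the defaults are 1000 and 1 second.
throttle = FixedWindowThrottle(limit=2, interval=1.0)
decisions = [throttle.allow(t) for t in (0.0, 0.1, 0.2, 1.0)]
print(decisions)
```

Under this reading, raising the interval to 10 while leaving the limit at 1000, as in the example above, lowers the sustained rate from 1000 messages per second to 100 messages per second.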
4. Managing System Insight
System Insight displays metrics based on the filters that you create. You add the filters to a profile, and determine the metrics that
you want to display by configuring the filters. This process is described in 4.1 Displaying Metrics using System Insight.
After you have specified filters, you use a visualization tool to display metrics. For the purpose of demonstrating how you can use
System Insight, sample Grafana dashboards are provided. See 4.2 Grafana Dashboards.
4.1 Displaying Metrics using System Insight

System Insight displays metrics based on filters that are defined with metrics and tags. You define a filter by the
metric that you want to display, and you can refine it further with the tag name and tag value that you want to visualize.
For example, if you want to visualize all of the MIMs in a specific workflow, define the filter as
mim.<workflow type>.<agent type>.<agent> without adding a tag to the metric.
The following data is included for a metric:
A timestamp for when the metric was created or sampled
Tags that carry metadata on the metric, for example, which workflow or pico the metric originated from
Values for the actual data carried by the metric, for example, the number of UDRs or the amount of data
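As a rough illustration of this structure, the sketch below models a single metric sample in Python. The class name MetricSample and the example tag and field values are hypothetical and are not part of any System Insight API; only the three parts (timestamp, tags, values) come from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetricSample:
    """Hypothetical in-memory model of one metric sample."""
    name: str                                    # metric name, e.g. "host.compute"
    timestamp: datetime                          # when the metric was created or sampled
    tags: dict = field(default_factory=dict)     # metadata, e.g. the originating pico
    values: dict = field(default_factory=dict)   # the actual data carried by the metric


# Example values are invented for illustration only.
sample = MetricSample(
    name="host.compute",
    timestamp=datetime(2018, 1, 1, tzinfo=timezone.utc),
    tags={"host_name": "host1", "reporting_pico": "ec1"},
    values={"idle_cpu": 93.5, "total_proc": 212},
)
```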
For information on how to create a System Insight filter, see 4.1.1 Managing System Insight Filters. For information on the naming
conventions in place for metrics, see 4.1.2 Metrics Naming Conventions.
System Metrics
The system metrics are host.compute, host.network, host.storage.iostats, host.storage.usage,
host.storage.swap, pico.events, pico.jvm, pico.workflow, service.akka.router.receiver,
service.akka.router.sender and service.systeminsight.dispatcher. The default fields for
these metrics are listed below. To see these system metrics, you must create a System Insight profile. See 2.2.32 systeminsight
in the Command Line Tool documentation.
Note!
Depending on the filesystem access privileges that you have, the metrics might not be reported for some filesystems.
host.compute
Field Description
hi_cpu The percentage of CPU spent servicing/handling hardware interrupts
idle_cpu The percentage of CPU in idle state
idle_proc The total number of processes in idle state
nice_cpu The percentage of CPU spent running user space processes that have a positive nice value (low-priority processes)
running_proc The total number of processes in run state
si_cpu The percentage of CPU spent servicing/handling software interrupts
sleep_proc The total number of processes in sleep state
stolen_cpu The percentage of CPU 'stolen' from this virtual machine by the hypervisor for other tasks, e g running another virtual machine. This is 0 on a desktop or server without a virtual machine.
sys_cpu The percentage of CPU used by the system
total_proc The total number of processes
user_cpu The percentage of CPU used by the user
up_time The amount of time in seconds that has passed since the machine started
wait_cpu The percentage of CPU in wait state
host.network
Field Description
rx_bytes The total amount of RX bytes
rx_dropped The total amount of RX dropped
rx_errors The total amount of RX errors
rx_frame The total amount of RX frames
rx_overruns The total amount of RX overruns
rx_packets The total amount of RX packets
speed The speed in bits per second
tx_bytes The total amount of TX bytes
tx_carrier The total amount of TX carriers
tx_collisions The total amount of TX collisions
tx_dropped The total amount of TX dropped
tx_errors The total amount of TX errors
tx_overruns The total amount of TX overruns
tx_packets The total amount of TX packets
host.storage.iostats
Field Description
disk_read_bytes The number of physical disk bytes read
disk_reads The number of physical disk reads
disk_write_bytes The number of physical disk bytes written
disk_writes The number of physical disk writes
host.storage.swap
Field Description
pages_swapped_in The total number of pages swapped in
pages_swapped_out The total number of pages swapped out
swap_free The amount of swap space free in bytes
swap_total The total amount of swap space in bytes
swap_used The amount of swap space used in bytes
host.storage.usage
Field Description
available_space The total amount of free space available for use on the filesystem in 1024-byte units
free_file_count The number of free file nodes on the filesystem
free_space The total amount of free space on the filesystem in 1024-byte units
total_file_count The total number of file nodes on the filesystem
total_size The total size of the filesystem in 1024-byte units
used_percentage The percentage of disk used
used_space The total amount of space used on the filesystem in 1024-byte units
pico.events
This metric reports events that occur on the picos.
pico.jvm
Field Description
active_thread_count The current number of live threads including both daemon and non-daemon threads
collection_count The total number of garbage collections
collection_count_diff The difference between the collection_count and the previous value of collection_count
collection_count_per_gc The approximate time spent per garbage collection since the last measurement
collection_time The approximate time (in milliseconds) that has elapsed for accumulated garbage collection
committed_memory The amount of memory in bytes that is committed for the JVM to use
cpu_time The CPU time used (in nanoseconds) by the process on which the JVM is running
cpu_time_diff The difference between the cpu_time and the previous value of cpu_time
cpu_time_percent The CPU used (in percent) since the last measurement by the process on which the JVM is running
loaded_file_count The number of classes that are currently loaded in the JVM
max_memory The maximum amount of memory in bytes that can be used for memory management
open_file_count The number of open file descriptors
up_time The uptime of the JVM in milliseconds
used_memory The amount of memory used in bytes
Any other fields for pico.jvm report memory pool metrics, which are an estimate of the memory usage in bytes of each memory
pool in the JVM.
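The fields ending in _diff are differences between two consecutive samples of a cumulative counter. The Python sketch below shows that calculation, together with one plausible way to derive the time per garbage collection from two samples. The function names and the per-collection formula are assumptions made for illustration; this document does not state how System Insight computes these fields internally.

```python
def counter_diff(previous: int, current: int) -> int:
    """Difference between the current and the previous sample of a counter."""
    return current - previous


def time_per_gc(prev_count: int, cur_count: int,
                prev_time_ms: int, cur_time_ms: int) -> float:
    """Average milliseconds per garbage collection between two samples.

    Assumed formula: (collection_time difference) / (collection_count difference).
    Returns 0.0 when no collection happened between the samples.
    """
    collections = counter_diff(prev_count, cur_count)
    if collections <= 0:
        return 0.0
    return counter_diff(prev_time_ms, cur_time_ms) / collections


# Two consecutive samples: 40 -> 44 collections, 1200 ms -> 1300 ms in total.
print(counter_diff(40, 44))             # 4
print(time_per_gc(40, 44, 1200, 1300))  # 25.0
```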
pico.workflow
Field Description
throughput The throughput of the workflow
service.akka.router.receiver
Field Description
received The messages received by the actor
uptime The uptime of the actor
service.akka.router.sender
Field Description
delivered The messages delivered downstream
not_acknowledged The messages not acknowledged from downstream
uptime The uptime of the actor
service.systeminsight.dispatcher
Field Description
custom The number of metrics of the category custom that are handled
host The number of metrics of the category host that are handled
mim The number of metrics of the category mim that are handled
pico The number of metrics of the category pico that are handled
service The number of metrics of the category service that are handled
total The total number of metrics handled
Note!
If you have activated InfluxDB, the metric service.systeminsight.mediator is also listed. This metric has the
default tags actor_system, host_system and pico_instance.
service.systeminsight.mediator
Field Description
custom The number of metrics of the category custom that are handled
host The number of metrics of the category host that are handled
mim The number of metrics of the category mim that are handled
pico The number of metrics of the category pico that are handled
service The number of metrics of the category service that are handled
Default Tag Names

There are default tags in place for some of the metrics.
Metric Default Tags
host.compute host_name, reporting_pico, reporting_pico_type
host.network host_name, nic, reporting_pico, reporting_pico_type
host.storage.iostats host_name, mount_dir, reporting_pico, reporting_pico_type
host.storage.swap host_name, reporting_pico, reporting_pico_type
host.storage.usage host_name, mount_dir, reporting_pico, reporting_pico_type
pico.events The tags depend on the events that are run on the pico.
pico.jvm host_name, pico_instance, pico_type
pico.workflow time, host, pico_instance, throughput, workflow_folder, workflow_instance, workflow_name
service.akka.router.receiver actor_system, host_system, pico_instance
service.akka.router.sender actor_system, host_system, pico_instance
service.systeminsight.dispatcher actor_system, host_system, pico_instance
The default tags on a mim metric are the following:
Tags Description
agent_category The category of the agent, namely, collection, processing or forwarding
agent_name The name of the agent
host_name The name of the host
pico_instance The pico instance
workflow_folder The name of the folder in which the workflow is saved
workflow_instance The name of the running instance of a workflow
workflow_name The name of the workflow configuration
workflow_type The type of workflow, namely, batch or realtime
4.1.1 Managing System Insight Filters

You can manage System Insight filters in two ways: with the mzsh systeminsight command or with the System
Insight profile in Desktop.
When you configure filters, you use regexp syntax to name the metrics that you want to visualize. For the metrics naming
conventions, see 4.1.2 Metrics Naming Conventions.
systeminsight Command
Use the mzsh systeminsight command to manage System Insight metrics by adding and removing filters for the metrics that
you want to produce. You can also use the command to list the metrics that are available on the running system and to which you
can apply filters, to list the retention policies in place, and to test which filters apply to a metric.
For details on how to use the systeminsight command, its subcommands, and options, see 2.2.32 systeminsight in the
Command Line Tool documentation.
System Insight Profile
The System Insight profile allows you to create, edit or remove profiles and filters that you want to use to display or store statistics
using the system insight service.
The System Insight profile consists of two tabs: Filters and Detected Metrics.
In the Filters tab, you can add filters to a profile. In the Detected Metrics tab, the possible metrics, tags and tag values detected for
your setup since you started the system insight service are displayed to help you create a filter that you can then add to the filters
in the Filters tab.
For details on how to use the System Insight profile, see 9.67.2 System Insight Profile.
4.1.2 Metrics Naming Conventions

When you specify which metrics you want to display on a visualization dashboard, you must follow specific naming conventions
to generate the output you require. The naming convention is designed to enable the use of
regexp syntax to filter metrics. A metric name is made up of categories and tags that determine the data that you want to output.
All metric names are converted to lower case. Spaces in names are replaced with an underscore, for example "UDR Count"
becomes "udr_count".
The name of a metric begins with the category, of which there are five: host, pico, service, custom and mim. Each category
can be further defined with a subcategory, which is then followed by the name:
<category>.<subcategory.subcategory>.metric_name
pico
The pico category has one possible subcategory, which is jvm. For example, pico.jvm.metaspace_usage,
pico.jvm.thread_count
host
The host category has three possible subcategories:
host.compute, e g host.compute.cpu_usage, host.compute.load_average
host.network, e g host.network.rx_packets, host.network.tx_total_bytes
host.storage, e g host.storage.disk_usage, host.storage.page_faults
service
The service category must be further defined by the service for which you want a metric to be shown:
service.<service name>, e g service.kafka
custom
Custom metrics are defined with a user specified name combined with the custom prefix: custom.<user_defined> e g
custom.pcrf.policy_requests, custom.airline.fuel_consumption
This naming convention is used for metrics produced using the System Insight forwarding agent.
mim
The mim category must have a specific structure as shown below:
MIM   Workflow type    Type of Agent                      Agent     MIM Value
mim   batch|realtime   collection|processing|forwarding   <agent>   <mim value>
Examples of how to name a mim metric:
For a metric on the Outbound UDRs MIM value for an Analysis agent (a processing agent) in a real-time workflow, the metric name
is: mim.realtime.processing.analysis.outbound_udrs
For a metric on the Inbound UDRs MIM value for an ECS collection agent in a batch workflow, the metric name is
mim.batch.collection.ecs.inbound_udrs
The minimum specification for a mim metric is mim.batch.workflow or mim.realtime.workflow. These metric names
provide output on all the MIM values generated in all the batch workflows or real-time workflows respectively.
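The naming rules in this section (lower case, underscores for spaces, and the fixed structure of a mim metric name) can be sketched in Python as follows. The helper names normalize and mim_metric_name are hypothetical and only illustrate the convention; they are not part of MediationZone.

```python
WORKFLOW_TYPES = {"batch", "realtime"}
AGENT_CATEGORIES = {"collection", "processing", "forwarding"}


def normalize(part: str) -> str:
    """Convert a name part to lower case and replace spaces with underscores."""
    return part.lower().replace(" ", "_")


def mim_metric_name(workflow_type: str, agent_category: str,
                    agent: str, mim_value: str) -> str:
    """Build mim.<workflow type>.<type of agent>.<agent>.<mim value>."""
    if workflow_type not in WORKFLOW_TYPES:
        raise ValueError(f"unknown workflow type: {workflow_type}")
    if agent_category not in AGENT_CATEGORIES:
        raise ValueError(f"unknown agent category: {agent_category}")
    return ".".join(["mim", workflow_type, agent_category,
                     normalize(agent), normalize(mim_value)])


print(normalize("UDR Count"))
# udr_count
print(mim_metric_name("realtime", "processing", "Analysis", "Outbound UDRs"))
# mim.realtime.processing.analysis.outbound_udrs
print(mim_metric_name("batch", "collection", "ECS", "Inbound UDRs"))
# mim.batch.collection.ecs.inbound_udrs
```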
4.2 Grafana Dashboards

If you used the scripts provided to install System Insight, the sample dashboards are included in this installation.
If you installed System Insight manually, and have your own installation of Grafana, you can import the dashboards provided in the
directory $MZ_HOME/scripts/str-templates/system-insight/dashboards into your Grafana installation.
After installing the sample dashboards, go to https://<your grafana installation url>:3000/.
Dashboards Design
The sample dashboards rely heavily on the Grafana concept of Templates to provide filtering possibilities to narrow down the
scope of the data displayed. Examples of this are templates to enable filtering on service, pico instance or workflow name. For
further information on Grafana Templates, see http://docs.grafana.org/reference/templating/.
Another feature frequently used in the dashboards is Repeat Row/Repeat Panel, where it is possible to design a row or panel and
then reuse it by replicating it using a Template. An example of this can be seen in the Host dashboard, where the rows are
repeated once per server selected in the Server template.
All graphs and panels provide a tooltip with a summary of the intent of the graph, the data from which it is derived, and any specific
configurations that have been made for the display.
Sample Dashboards
Six sample dashboards are provided, and each graph and panel has tooltips which provide information to help you determine how
you want to customize the view of the graphs and panels for your requirements. To display a tooltip, hover your cursor over the i at
the top left hand corner of each table and graph.
The sample dashboards provided are:
Overview
The Overview dashboard
This dashboard provides an overview of MediationZone focusing on high level statistics. Using templates, you can filter on server,
pico type and pico instance. The dashboard includes the following graphs and panels:
Platform uptime
Pico uptimes
Throughput per execution context
CPU usage per host
JVM Memory usage
Network I/O per host
Storage I/O per host
Host
The Host dashboard
The Host dashboard provides data for the servers hosting MediationZone with regards to CPU and storage utilization. Using
templates, you can filter on servers and mount directories. The dashboard includes the following graphs and panels:
CPU Utilization
CPU Over Time
Server Uptime
Pico Uptimes
Swap Space Usage
Swap Activity
Disk Usage
Disk I/O
Network
The Network dashboard
The Network dashboard provides I/O information on the network interfaces of the servers running MediationZone. Using templates,
you can filter on server and network interfaces. The dashboard includes the following graphs and panels:
Traffic (bytes)
Traffic (packets)
Packets dropped and errors
Network statistics on <host name>
Pico
The Pico dashboard
The Pico dashboard provides pico related data with focus on JVM details, e g, uptime and garbage collections. Using templates, it
provides filtering options as well as the option to specify an interval for garbage collection details. The dashboard includes the
following graphs and panels:
Pico Uptime
Garbage Collection last 5m
Average Duration of Garbage Collections last 5m
Memory Usage
Active Threads
Trends
The Trends dashboard
The Trends dashboard provides a comparison between high and low resolution data to see trends in CPU and JVM memory
utilization. Using templates, you can filter on server, pico type and pico instance. The dashboard includes the following graphs and
panels:
CPU Utilization Hourly

JVM Memory Usage Last Hour
CPU Utilization Weekly
JVM Memory Weekly
Workflows
The Workflows dashboard
The Workflows dashboard provides basic information about running workflows with focus on throughput. Using templates, it
provides filter options on Execution Context and workflow details. The dashboard includes the following graphs and panels:
Workflow Throughput (panel)

Running Workflows
Throughput per Execution Context
Workflow Throughput (graph)
4.3 Using System Insight for Batch Workflows
Note!
In batch workflows, metrics are not published to InfluxDB after every batch. An aggregation task which runs every 10 seconds
aggregates all the batch results generated up to that point and sends them to InfluxDB.
Only metrics that are a Number are aggregated. MIMs of other types, such as String or timestamp, cannot be used as metrics
and are neither aggregated nor sent to InfluxDB.
For workflow MIM parameters, only batch_duration is aggregated.
This means that individual batches cannot be tracked using System Insight. If you need to track individual batches, use Audit;
see 8.1 Audit Profile. System Insight provides aggregate metrics only.
Throughput for Batch Workflows

If you want to display the throughput of a batch workflow using System Insight, you can determine the throughput depending on
what data you require and on your workflow.
Example of how you can determine throughput for a batch workflow

If you have the workflow shown below, you choose how you want to determine the throughput data, which you want to
display using System Insight.
Example batch workflow
If you want the throughput to be determined by the number of UDRs decoded within the batch duration period, you can
use the Outbound UDRs MIM parameter as a metric and divide it by the batch_duration to get the throughput in UDRs
per second. In this example the metric name is mim.batch.processing.decoder.outbound_udrs.
If you want the throughput to be determined by the number of bytes encoded within the batch duration period, you can
use the Outbound Bytes MIM parameter as a metric and divide it by the batch_duration to get the throughput in bytes per
second. In this example the metric name is mim.batch.processing.encoder.outbound_bytes.
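The division described in this example can be sketched in Python as shown below. The function name batch_throughput is hypothetical, and the sketch assumes that batch_duration is expressed in seconds, which is what the per-second results above imply. In practice you would typically perform the same division in the query of your visualization tool.

```python
def batch_throughput(mim_total: float, batch_duration: float) -> float:
    """Throughput as the aggregated MIM value divided by batch_duration.

    Assumes batch_duration is in seconds, so the result is per second.
    Returns 0.0 for a zero or negative duration to avoid dividing by zero.
    """
    if batch_duration <= 0:
        return 0.0
    return mim_total / batch_duration


# 12000 outbound UDRs over an aggregated batch_duration of 8 seconds
print(batch_throughput(12000, 8))  # 1500.0
```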
Example MIM Browser
4.4 REST APIs for System Insight
You can also use the REST APIs of InfluxDB and Grafana when you use them with System Insight.
See the REST API examples below.
Examples of REST API for InfluxDB

Use the following command to show measurement names in InfluxDB:
curl -XPOST -u <user>:<password> 'http://influxdb.url:<port>/query?db=mz' --data-urlencode "q=SHOW MEASUREMENTS"
Use the following command to show databases:
curl -XPOST -u <user>:<password> 'http://influxdb.url:<port>/query?' --data-urlencode "q=SHOW DATABASES"
Use the following command to query a measurement for the latest 10 rows:
curl -XPOST -u <user>:<password> 'http://influxdb.url:<port>/query?db=mz' --data-urlencode "q=select * from \"measurement.name\" order by time desc limit 10"
For further information on REST API for InfluxDB, see https://docs.influxdata.com/influxdb/v1.2/guides/.
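The same queries can be issued from a script. The Python sketch below builds, but does not send, a request equivalent to the first cURL command, using only the standard library. The function name influx_query_request, the host influxdb.example, the port 8086 and the credentials are placeholders chosen for illustration; replace them with the values for your installation.

```python
import base64
import urllib.parse
import urllib.request


def influx_query_request(base_url, user, password, query, db=None):
    """Build a POST request for the InfluxDB 1.x /query endpoint."""
    url = base_url.rstrip("/") + "/query"
    if db:
        url += "?" + urllib.parse.urlencode({"db": db})
    # The query is sent form-encoded in the body, as --data-urlencode does.
    body = urllib.parse.urlencode({"q": query}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Authorization": "Basic " + token},
    )


req = influx_query_request("http://influxdb.example:8086", "user", "secret",
                           "SHOW MEASUREMENTS", db="mz")
print(req.full_url)  # http://influxdb.example:8086/query?db=mz
# To send the request: urllib.request.urlopen(req).read()
```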
Examples of REST APIs for Grafana

Use the following REST API to add an InfluxDB data source to Grafana:
curl -XPOST -H "Content-Type: application/json" -u <user>:<password> http://grafana.url:<port>/api/datasources --data-binary @<name_of_datasource>.json
Use the following REST API to add a dashboard from a file:
curl -XPOST -u <user>:<password> http://grafana.url:<port>/api/dashboards/db --data-binary @<dashboard_name>.json -H "Content-Type: application/json"
Use the following REST API to export a dashboard to a file:
curl -u <user>:<password> http://grafana.url:<port>/api/dashboards/db/<name_of_dashboard> > <dashboard_name>.json
For further information on REST API for Grafana, see http://docs.grafana.org/http_api/.
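The data source file posted to /api/datasources is a JSON document. The Python sketch below generates a minimal definition for an InfluxDB data source. The function name, the data source name mz-influxdb, the host influxdb.example, the port and the credentials are placeholders, and the set of fields is an assumption based on the Grafana HTTP API documentation; check the fields against the Grafana version that you use.

```python
import json


def influxdb_datasource(name, url, database, user, password):
    """Return a Grafana data source definition for an InfluxDB database."""
    return {
        "name": name,
        "type": "influxdb",
        "access": "proxy",   # the Grafana server queries InfluxDB for the browser
        "url": url,
        "database": database,
        "user": user,
        "password": password,
    }


# All values below are placeholders for illustration.
document = json.dumps(
    influxdb_datasource("mz-influxdb", "http://influxdb.example:8086",
                        "mz", "user", "secret"),
    indent=2,
)
print(document)
```

Save the printed document to a file and pass that file to the cURL command with --data-binary.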
5. System Insight Example
This section provides an example of how you can use System Insight to display throughput in a MediationZone workflow. The
example shows the stages required to use System Insight to display the throughput of a workflow in a Grafana dashboard.
For the purpose of this example, it is assumed that you have installed System Insight with InfluxDB and Grafana.
Configuration in MediationZone
Example workflow
The workflow includes a System Insight forwarding agent, which means the metrics sent to the System Insight service have the
category of custom which is assigned in the Measurement UDR.
The Analysis agent contains the following APL code:
consume {
    map<string,string> tags = mapCreate(string, string);
    map<string,string> fields = mapCreate(string, string);
    PulseUDR mock_data = (PulseUDR)input;
    si.Measurement coll = udrCreate(si.Measurement);
    mapSet(tags, "WF", "SI_TEST");
    mapSet(fields, "SEQ", (string)mock_data.Sequence);
    mapSet(fields, "DATA", baToStr(mock_data.Data, "utf-8"));
    coll.tags = tags;
    coll.fields = fields;
    coll.name = "filter.test";
    udrRoute(coll);
}
1. You create a System Insight profile in the Desktop. In the Filters tab, in this case, the profile description is Custom data,
the retention policy of one week is selected, and you select the System Insight Profile Enabled check box.
Filters tab - Creating a System Insight profile
2. You can use the Detected Metrics tab to create a filter that sends all the custom-related data to the System Insight service.
The filter created is custom\..*, which is listed in the Filters tab when you click Create Filter.
Detected Metrics tab - example of creating a filter
The data is sent to InfluxDB and you can visualize the data throughput in Grafana.
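To check which metric names a filter such as custom\..* selects, you can test the regular expression outside the system. The Python sketch below assumes that a filter must match the whole metric name and that the measurement created by the APL code is named custom.filter.test (the custom category followed by the name set in the Measurement UDR). Both are assumptions made for illustration; use the test option of the mzsh systeminsight command to verify filters against the running system.

```python
import re

FILTER = r"custom\..*"   # the filter created in the Detected Metrics tab


def filter_matches(pattern: str, metric_name: str) -> bool:
    """Return True when the whole metric name matches the filter pattern."""
    return re.fullmatch(pattern, metric_name) is not None


print(filter_matches(FILTER, "custom.filter.test"))  # True
print(filter_matches(FILTER, "host.compute"))        # False
print(filter_matches(FILTER, "customer.count"))      # False, the dot is escaped
```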
Visualization in Grafana
To visualize the workflow throughput in Grafana, you add a dashboard with a graph component as follows:
1. You go to your instance of Grafana, and click the logo to the top left. Select Dashboards, then + New.
2. In this instance the Graph panel is selected to visualize data.
3. You click the panel title on the graph and then select Edit.
4. You can add metrics based on those available in InfluxDB. To create the query that Grafana uses for plotting, click the
three dash button to the right. By clicking Toggle Edit Mode, you can choose to edit by entering free text or by selecting
values from the drop-down boxes.
5. In this example, MIMs are enabled. To see the forwarding agent's inbound UDR throughput, create the query
according to the image below:
Editing the Grafana panel
In this instance, the retention policy is set to one_week. The field selected is SEQ, which is an incremented value. Using
the function last retrieves the last value to be plotted in a graph.
6. If everything is OK, the data is visible in the graph. You can modify the graph to show the latest 5
minutes and to refresh every 5 seconds to get continuous data in the graph.
This provides you with a visualization of the SEQ field sent to the System Insight forwarding agent, so you can keep track of
how many increments have been produced in a workflow:
Grafana dashboard example
6. System Insight Backup and Maintenance
The instances of InfluxDB and Grafana which are provided with MediationZone are not highly available. This means that you must
take certain steps to secure file storage, dashboards created and metrics data.
Backup and Maintenance of Embedded Instances of InfluxDB and Grafana

Any upgrades or maintenance required for the embedded versions of InfluxDB and Grafana shall be provided by DigitalRoute.
InfluxDB
If you use the embedded setup of System Insight provided with MediationZone, only the metrics data model used internally by
MediationZone can be hosted on the embedded instance of InfluxDB. External writes or queries are not supported. To prevent the
loss of metrics data, you are required to store the InfluxDB database on secure file storage, i e, file storage that can be replicated
or backed up. One option is to have multiple InfluxDB instances, see 3.3 Configuring System Insight with Multiple InfluxDB
Instances. In addition you must monitor the InfluxDB instance, for example, to ensure that the disk does not become full.
The embedded open source version of InfluxDB 1.2 is supported.
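The disk monitoring mentioned above can be scripted in many ways. The Python sketch below is one minimal approach that uses only the standard library. The function names and the 90 percent threshold are assumptions, and the path "." is a stand-in for the InfluxDB data directory, whose location depends on your installation.

```python
import shutil


def disk_used_percent(path: str) -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def disk_is_nearly_full(path: str, threshold_percent: float = 90.0) -> bool:
    """True when usage on the filesystem holding `path` reaches the threshold."""
    return disk_used_percent(path) >= threshold_percent


# "." stands in for the InfluxDB data directory.
print(f"{disk_used_percent('.'):.1f}% used")
if disk_is_nearly_full("."):
    print("Warning: the disk is nearly full")
```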
Grafana
The instance of Grafana provided with MediationZone is only supported when connected to the embedded InfluxDB
instance. Grafana stores the dashboards created and user data to disk, by default via sqlite3. For further information, see
'database' in http://docs.grafana.org/installation/configuration/. Instead of sqlite3, you can use an external PostgreSQL or MySQL
database, which you then must support and maintain. To prevent the loss of data and the dashboards created, the disk to which
dashboards and data are stored must be secured, i e replicated or backed up.
The embedded open source version of Grafana 4.3.1 is supported.
Backup and Maintenance of External Instances of InfluxDB and Grafana

You can also use System Insight with external instances of InfluxDB and Grafana, which you must maintain and update.
InfluxDB
If you use your own instance of InfluxDB with System Insight, it can be used to store and query any metrics, not only those that
originated from MediationZone. If this is the case, you must back up your instance of InfluxDB as recommended by InfluxData,
see https://docs.influxdata.com/influxdb/v1.2/.
System Insight can be used with the following versions of InfluxDB: InfluxDB version 1.x, InfluxCloud version 1.x, InfluxEnterprise
version 1.x. For support of these versions, contact InfluxData.
Grafana
If you use your own version of Grafana, it can be connected to any external metrics source.
System Insight can be used with any version of Grafana that is compatible with the InfluxDB version being used. For support of any
other version of Grafana (i e not the embedded version of Grafana 4.3.1), contact Grafana.