Performance monitoring using Node.js, socket.io and MarkLogic

I have worked with sockets a lot in the past, and recently I had the opportunity to work with them again. I needed a good example application to demonstrate sockets, something other than a chat application. The idea I came up with was to gather various metrics from Node.js and plot their values on a chart. Taking this a bit further, I thought: wouldn't it be nice to save the data in a database so that someone could later run historical reports on it? After a few hours of work, the app was born.

The source code for the app can be found here: https://github.com/tpiros/system-information

Please note that this application has been written using ES2015.

Requirements

First, let's start with the requirements. The application should gather data using the built-in os module from Node.js and display it on a chart. For charting I decided to use Google Charts, as it is very simple to set up. The other requirement is that the app should save the collected data to a MarkLogic database.

Logically, the first thing to set up is the Node.js application, so let's start there.

Setting up socket.io

As always, when I work on a Node.js application that needs to display something in a browser, I also use Express, a web framework for Node.js. If you haven't worked with Express before, have a look at its documentation for useful resources. A baseline (boilerplate) Express app that already has socket.io added to it looks something like this:

'use strict';
const express = require('express');
const app = express();
const server = require('http').Server(app);
const io = require('socket.io')(server);
const os = require('os');
const socket = require('./socket').connect(io, os);
const hbs = require('express-handlebars');
const router = express.Router();

app.set('port', 8080);
app.use('/', router);
app.use('/bower_components', express.static(__dirname + '/bower_components'));
app.use('/js', express.static(__dirname + '/js'));
app.engine('.hbs', hbs());
app.set('view engine', '.hbs');

let indexRoute = (req, res) => {
  res.render(__dirname + '/index');
};

router.route('/').get(indexRoute);

server.listen(app.get('port'), () => {
  console.log('Magic happens on port ' + app.get('port'));
});

These few lines of code load all the dependencies for the application, set the view engine to Handlebars, render the index.hbs file and start the HTTP server on port 8080.

Notice that on line 7 we include a file called socket.js and call its connect() method with two parameters, io and os.

Separate socket.js

Because I like to keep my application logic clean and well separated, I created the aforementioned file to hold all socket.io-related logic. In other words, socket.js is the file that handles the actual socket connection.

The logic in that file is, again, fairly simple. When a client connects to the socket, system information is collected using the Node.js os module, and every 5 seconds a message containing the collected metrics is emitted using setInterval() and socket.emit():

const os = require('os');
var connect = (io, os) => {
  io.on('connection', (socket) => {
    var load = os.loadavg()[0];
    var totalMemory = os.totalmem();
    var freeMemory = os.freemem();
    var usedMemory = Number((totalMemory - freeMemory) / 1073741824).toFixed(4);

    socket.emit('resources', { cpu: load, memory: usedMemory });
    setInterval(() => {
      load = os.loadavg()[0];
      freeMemory = os.freemem();
      usedMemory = Number((totalMemory - freeMemory) / 1073741824).toFixed(4);
      socket.emit('resources', { cpu: load, memory: usedMemory });
    }, 5000);
  });
};

module.exports.connect = connect;

You may be wondering why there are two socket.emit() calls. The answer is simple: the first one, on line 9, is emitted as soon as a client connects, and the second one, on line 14, is emitted 5 seconds later and every 5 seconds thereafter. socket.emit() allows the application to emit data, and that data can then be accessed by the clients that connect to this socket and listen for the resources message.
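The two socket.emit() calls repeat the same calculation, so one option is to factor it into a helper. The following is only a sketch, not the code from the repository: collectMetrics() and startEmitting() are hypothetical names, and clearing the timer on the socket's disconnect event is an addition that the original socket.js does not make.

```javascript
const os = require('os');

const BYTES_PER_GB = 1073741824; // 1024^3, the divisor used in socket.js

// Hypothetical helper: builds the payload sent with the 'resources' message.
function collectMetrics() {
  const usedBytes = os.totalmem() - os.freemem();
  return {
    cpu: os.loadavg()[0], // 1-minute load average (always 0 on Windows)
    memory: (usedBytes / BYTES_PER_GB).toFixed(4), // used memory in GB, as a string
  };
}

// Hypothetical helper: emits once immediately, then every intervalMs,
// and stops the timer when the socket disconnects.
function startEmitting(socket, intervalMs) {
  socket.emit('resources', collectMetrics());
  const timer = setInterval(() => {
    socket.emit('resources', collectMetrics());
  }, intervalMs);
  socket.on('disconnect', () => clearInterval(timer));
  return timer;
}
```

Inside connect(), this could be wired up as io.on('connection', (socket) => startEmitting(socket, 5000)). Without the disconnect handler, each closed browser tab would leave its interval running on the server.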

The data that is sent to the client has a simple JSON structure:

{
  cpu: 'some value',
  memory: 'some other value'
}

This concludes the backend part of the application, so let's now take a look at how this data is displayed in the browser.

Creating the client

Let's create a very simple index.hbs file that loads the Google Charts library and the socket.io client as well:

<html>
  <head>
    <script src="/socket.io/socket.io.js"></script>
    <script
      type="text/javascript"
      src="https://www.gstatic.com/charts/loader.js"
    >
    </script>
    <script type="text/javascript" src="/js/charting.js"></script>
  </head>
  <body>
    <div class="content">
      <h1>System Information</h1>
      <div id="curve_chart" style="width: 900px; height: 500px"></div>
    </div>
  </body>
</html>

charting.js is the file responsible for collecting the data for the chart and drawing the chart itself. The code in there is pretty simple. If you want to make modifications to the chart, then please take a look at the Google Charts documentation.

(function() {
  'use strict';
  google.charts.load('current', {'packages':['corechart']});
  google.charts.setOnLoadCallback(drawChart);
  var socket = io.connect('http://localhost:8080');

  function drawChart() {
    var options = {
      title: 'System Utilisation',
      curveType: 'function',
      legend: { position: 'bottom' },
      pointSize: 3
    };
    var chart = new google.visualization.LineChart(document.getElementById('curve_chart'));
    var dataArray = [['Time', 'CPU Average (%)', 'Used Memory (GB)'], [new Date(), 0, 0]];
    var data = google.visualization.arrayToDataTable(
      dataArray
    );
    chart.draw(data, options);

    socket.on('resources', function (load) {
      dataArray.push([new Date(), load.cpu, load.memory]);
      data = google.visualization.arrayToDataTable(
        dataArray
      );
      chart.draw(data, options);
    });
  }
})();

Notice that on line 5 a connection is made to the socket that was created in the previous steps (remember, the HTTP server is running on port 8080, hence the connection is made to http://localhost:8080).

Lines 21 to 27 are the most exciting ones. Remember how the Node.js application uses socket.emit() to emit the resources message with the data? On the client, socket.on() is used to listen for the emitted message, and the load argument of the anonymous callback function contains the data sent from the server. The data is simply pushed into an array and the chart is drawn, and it is redrawn every time the server emits the resources message.
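One detail worth keeping in mind: the server formats used memory with toFixed(), which returns a string, so load.memory arrives in the browser as a string such as '3.1416' rather than a number. A minimal sketch of a conversion step is shown below. toChartRow() is a hypothetical helper that is not part of charting.js, and whether the coercion is strictly required depends on how Google Charts handles the column types.

```javascript
// Hypothetical helper: converts a 'resources' payload into a chart row
// of the form [time, cpu, memory] with numeric values.
function toChartRow(load, timestamp) {
  return [
    timestamp,
    Number(load.cpu), // already a number, coerced defensively
    Number(load.memory), // arrives as a string because of toFixed()
  ];
}
```

In the socket.on() callback this would replace the inline array, for example dataArray.push(toChartRow(load, new Date())).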

This covers the basics of the application. At this point the browser displays a chart of the data collected via the Node.js os module.

Persisting data in MarkLogic

But why stop here? The application can do a bit more by displaying more information about the system as well as saving the system utilisation metrics to a database for later retrieval.

Let's go back to the Node.js part of the application, add some extra metrics to be collected and make sure that this data is sent to the Handlebars template:

let dataObject = {
  osType: os.type().toLowerCase() === 'darwin' ? 'Mac OS X' : os.type(),
  osReleaseVersion: os.release(),
  osArch: os.arch(),
  osCPUs: os.cpus(),
  osHostname: os.hostname(),
  osTotalMemory: Number(os.totalmem() / 1073741824).toFixed(0),
};
// etc
let indexRoute = (req, res) => {
  res.render(__dirname + '/index', { data: dataObject });
};
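Note that os.cpus() returns one object per logical CPU (each with model, speed and times fields), so osCPUs is an array rather than a single value. If only a short summary is needed for display, something along these lines could work. This is a sketch under assumptions: friendlyOsName() and summariseCpus() are hypothetical helpers and are not part of the application.

```javascript
const os = require('os');

// Hypothetical helper: maps an os.type() value to a display name,
// using the same 'darwin' check as dataObject.
function friendlyOsName(type) {
  return type.toLowerCase() === 'darwin' ? 'Mac OS X' : type;
}

// Hypothetical helper: reduces the os.cpus() array to a model name
// and a logical core count.
function summariseCpus(cpus) {
  return {
    model: cpus.length > 0 ? cpus[0].model : 'unknown',
    cores: cpus.length,
  };
}

const hostSummary = {
  osType: friendlyOsName(os.type()),
  cpu: summariseCpus(os.cpus()),
  totalMemoryGb: (os.totalmem() / 1073741824).toFixed(0), // string, whole GB
};
console.log(hostSummary);
```

Taking the model from the first entry assumes that all cores report the same model, which is usually but not always the case.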

Let's also persist this data in a database by adding these few lines to the Node.js application:

db.documents
  .write({
    uri: '/data/host.json',
    contentType: 'application/json',
    content: dataObject,
  })
  .result()
  .then((response) => {
    console.log(response.documents[0].uri + ' inserted to the database.');
  })
  .catch((error) => {
    console.log(error);
  });

If you would like to know how to connect to the MarkLogic database and learn more about the Node.js Client API, please read this article.

Of course, the separately collected metrics for CPU utilisation and memory usage should also be persisted, but the question is: how? Ideally, one document should be created in the database per data collection. To achieve this, socket.js needs to be modified, as that is where the data collection is done. The new setInterval() call should look like this:

setInterval(() => {
  load = os.loadavg()[0];
  freeMemory = os.freemem();
  usedMemory = Number((totalMemory - freeMemory) / 1073741824).toFixed(4);
  socket.emit('resources', { cpu: load, memory: usedMemory });
  db.documents
    .write({
      uri: '/data/' + Date.now() + '.json',
      contentType: 'application/json',
      content: { cpu: load, memory: usedMemory },
    })
    .result()
    .then((response) => {
      console.log(response.documents[0].uri + ' inserted to the database');
    })
    .catch((error) => {
      console.log(error);
    });
}, 5000);

The documents are stored in the database with a URI (unique document identifier) of the form /data/EPOCH.json, where EPOCH is the millisecond timestamp returned by Date.now(). (If you're not sure how URIs work in MarkLogic, refer to the article that I linked previously.)
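To make the URI scheme easier to see in isolation, the descriptor passed to db.documents.write() can be built by a small function. This is a sketch only: buildMetricDocument() is a hypothetical helper, while the uri, contentType and content fields mirror the ones used in the setInterval() callback.

```javascript
// Hypothetical helper: builds the document descriptor for one sample.
// timestamp is expected to be a millisecond epoch value such as Date.now().
function buildMetricDocument(metrics, timestamp) {
  return {
    uri: '/data/' + timestamp + '.json',
    contentType: 'application/json',
    content: { cpu: metrics.cpu, memory: metrics.memory },
  };
}

const example = buildMetricDocument({ cpu: 0.5, memory: '1.2500' }, 1700000000000);
console.log(example.uri);
```

Because every connected client starts its own interval, two samples could in principle share the same millisecond timestamp and therefore the same URI. If that matters, the URI would need an extra distinguishing component, which this sketch does not add.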

All that's left is to display the data in the Handlebars template:

<html>
  <head>
    <script src="/socket.io/socket.io.js"></script>
    <script
      type="text/javascript"
      src="https://www.gstatic.com/charts/loader.js"
    >
    </script>
    <script type="text/javascript" src="/js/charting.js"></script>
    <script
      type="text/javascript"
      src="/bower_components/jquery/dist/jquery.js"
    >
    </script>
    <link
      rel="stylesheet"
      href="/bower_components/bootstrap/dist/css/bootstrap.css"
    />

  </head>
  <body>
    <div class="content">
      <div id="error"></div>
      <h1>System Information</h1>
      <div class="panel panel-default">
        <div class="panel-heading">
          <h3 class="panel-title">
            - () ()
          </h3>
        </div>
        <div class="panel-body">
          <p>
            <strong>System CPU</strong>: ,
            <strong>Total memory</strong>: GB
          </p>
          <div id="curve_chart" style="width: 900px; height: 500px"></div>
        </div>
      </div>
    </div>
  </body>
</html>

Conclusion

At this point the application is complete. It collects CPU and memory utilisation every 5 seconds and saves that data to the MarkLogic database. The data structure persisted in the database is the same as discussed before.

In a later article we will take a look at how we can do 'historical' reporting on the collected data based on the documents in the database. Stay tuned!