Bring your C++ Application to the Web with Web Assembly

This post is 4 years old. (Or older!) Code samples may not work, screenshots may be missing and links could be broken. Although some of the content may be relevant please take it with a pinch of salt.

In the past year, I had the chance to deliver a talk at a few conferences/meetups titled "Supercharge your JavaScript with Web Assembly". As part of that talk, I showcased an application where I ported a C++ project to the web by using web assembly.

In a nutshell, web assembly allows us to bring native applications to the web. Generally speaking, there are three main reasons why you'd want to consider web assembly for your next project:

Reuse existing code: you already have something written in a language such as C or Rust, and you wish to bring that to the web.
Performance: most of the time, JavaScript and web assembly perform similarly; however, web assembly has more predictable performance due to how the V8 engine is creating the machine-readable code.
Binary size: A web assembly module can be compressed quite a lot using gzip or brotli, and it can be much smaller than a JavaScript file.

With all these options in mind, the use-case we'll go through in this post is related to the first point.

Back in 2017, my colleague Jon Sneyers created a C++ project called SSIMULACRA - "Structural SIMilarity Unveiling Local And Compression Related Artifacts".

You can learn more about the project, and the implementation details of SSIMULACRA in the article titled Detecting the psychovisual impact of compression related artifacts using SSIMULACRA by Jon.

To give you some context and explain the project in a nutshell, it compares two images to detect quality differences between them. Image compression and optimisation play an important role in today's web ecosystem and form a critical part of any web performance project; however, image optimisation should be done without losing visual fidelity. In other words, you don't want to over-optimise an image and produce a low-quality version. SSIMULACRA compares two versions of the same image and can tell you if one of them is over-optimised. The result of running the tool is a number. A number closer to 1 indicates a really pixelated image, while a number close to 0 indicates an image that was optimised in the right way, with no visible defects.

All that said, determining the "right" quality for an image is a challenging task. As per the linked article, we could say that "If the value is above 0.1 (or so), the distortion is likely to be perceptible/annoying."

Eric Portis, another colleague of mine, has done some exhaustive research around this very subject; you can read his findings published in the article Human-readable image quality scores.

The original project requires you to compile and execute a C++ binary and provide two images as arguments using your CLI: ./ssimulacra img1.jpg img2.jpg.

I find this tool amazing, but I thought that it'd bring more value if I could run this from a web application and potentially also show the images that are being compared. And how do we port a C++ project to the web? By using web assembly!

Getting started

Before we get to the nitty-gritty details, let's talk about the environment that we'll be using. For this particular port, I used a virtualised environment by running Ubuntu (v18.04.2 - you can verify this by running lsb_release -a) on VirtualBox. My recommendation is to configure the OS for SSH and file sharing via Samba for easy access.

SSIMULACRA has a dependency on OpenCV; therefore, this will be the first thing we look at.

OpenCV

"OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilise and modify the code."

There are multiple versions of OpenCV available - at the time of writing this article, v4 is the most recent one, and this is the version that we will install. Since we'll be requiring some customisation down the line, we'll build OpenCV from source.

Follow these steps to update the package repository for Ubuntu, update your packages and install the necessary dependencies for being able to build and install OpenCV from source:

$ sudo apt-get update && sudo apt-get upgrade -y
$ sudo apt-get install build-essential -y
$ sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev -y

(If you get an error that some of the lib* packages are not available, run the command below first, and then rerun the previous few commands.)
$ sudo add-apt-repository main && sudo add-apt-repository universe &&  sudo add-apt-repository restricted && sudo add-apt-repository multiverse

These are optional packages but I recommend that you also install them:

$ sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libdc1394-22-dev -y

At this point, we are ready to start with the OpenCV build and installation process. Navigate to your preferred folder (I will be using /root) and execute the following commands:

$ mkdir ~/src
$ cd ~/src
$ git clone https://github.com/opencv/opencv.git
$ cd opencv
$ mkdir build && cd build

The above will clone the latest OpenCV repository from GitHub and create a build folder. It is customary to keep separate src and build folders when working with CMake - which is the tool that we'll be using to compile OpenCV.

Once inside the build folder, we can start the compilation process. Note that there are specific flags that we need to enable because if we miss some of them, we'll end up having some trouble later on, and we may need to repeat the compilation process. Also, note that the entire compilation process can take anywhere between 15 and 30 minutes after the make step specified later.

$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
  -D CMAKE_INSTALL_PREFIX=/usr/local \
  -D INSTALL_PYTHON_EXAMPLES=ON \
  -D OPENCV_GENERATE_PKGCONFIG=YES \
  -D INSTALL_C_EXAMPLES=ON ..

Some options above are not necessarily required (like the examples), feel free to change those around.

Weird as it may seem, for the OPENCV_GENERATE_PKGCONFIG option, we need to pass in YES and not ON. We are also adding the C and Python examples - these are not required but can come in handy when we are not sure about how some function works.

Provided that CMake reports no errors, we can go ahead and execute the following command:

$ make -j$(nproc)

At this point, the build process should start - you can now go and brew a coffee.

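Hopefully, the make command finished successfully. If that's the case, install the build by running sudo make install (the pkg-config checks below rely on the files installed under /usr/local), and then test whether the installation was successful by executing the following two commands: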

$ pkg-config --cflags opencv4 # get the include path (-I)
$ pkg-config --libs opencv4  # get the libraries path (-L) and all the libraries (-l)

The first one should return -I/usr/local/include/opencv4, while the second one should return -L/usr/local/lib -lopencv_dnn -lopencv_gapi -lopencv_highgui -lopencv_ml -lopencv_objdetect -lopencv_photo -lopencv_stitching -lopencv_video -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_videoio -lopencv_imgcodecs -lopencv_imgproc -lopencv_core. These are the paths for the OpenCV core files and additional libraries.

Now we have OpenCV installed, which means we can continue and set up SSIMULACRA itself. Since this project lives on GitHub, let's go ahead and clone it:

$ cd /root && git clone https://github.com/cloudinary/ssimulacra && cd ssimulacra

If you try to compile the project, it will fail at this point:

$ make
ssimulacra.cpp:81:10: fatal error: cv.hpp: No such file or directory
 #include <cv.hpp>
     ^~~~~~~~

This is expected. There are a few changes that we need to implement in ssimulacra.cpp since it has been created to work with an older version of OpenCV. The changes are the following:

replace #include <cv.hpp> with #include <opencv2/opencv.hpp>
remove #include <highgui.h>

The latest version of OpenCV uses opencv.hpp, and there's also no need to add highgui.h since imread now lives in the imgcodecs module, which opencv2/opencv.hpp already pulls in.

Note that since the writing of this article, the original SSIMULACRA repository has been updated to reflect the changes above.

We also need to edit the Makefile since it's referencing opencv and not opencv4; this is what the updated file should look like:

CFLAGS=`pkg-config --cflags opencv4`
LDFLAGS=`pkg-config --libs opencv4` -lopencv_core -lopencv_imgcodecs -lopencv_imgproc

Now let's execute make again:

$ make

Hopefully, there are no errors present in the CLI, and we should receive an executable file called ssimulacra. Invoking this file without any arguments prints a usage message, but at least we can verify that the binary is, in fact, ready:

$ ./ssimulacra
./ssimulacra
Usage: ./ssimulacra orig_image distorted_image
Returns a value between 0 (images are identical) and 1 (images are very different)
If the value is above 0.1 (or so), the distortion is likely to be perceptible/annoying.
If the value is below 0.01 (or so), the distortion is likely to be imperceptible.

If you have two images available (remember the images need to be the same but have different qualities), feel free to run the binary: ./ssimulacra image1.jpg image2.jpg.

SSIMULACRA to Web Assembly

Now that we have set up OpenCV and SSIMULACRA successfully, we will look at how to port it to web assembly. There will be quite a few steps involved since the application relies on some libraries that we cannot bring to the "wasm" world as they are. We first need to create a web assembly build of OpenCV and use the resulting libraries when building the wasm file via Emscripten.

Setup Emscripten

Emscripten is a toolchain for compiling to asm.js and WebAssembly, built using LLVM, that lets you run C and C++ on the web at near-native speed without plugins.

To set it up we'll be using GitHub again:

$ cd /root && git clone https://github.com/emscripten-core/emsdk.git && cd emsdk
$ ./emsdk install latest
$ ./emsdk activate latest
$ source ./emsdk_env.sh

Note that source ./emsdk_env.sh will need to be invoked whenever a new terminal session is started.

With the above steps, we have Emscripten installed successfully on our system.

Next, we need to create the web assembly OpenCV build. This one will be tricky since we need to enable all the options that SSIMULACRA will require in the OpenCV web assembly build. There are quite a few steps to follow. Navigate and open the build file:

$ cd /root/src/opencv/platforms/js
$ vi build_js.py

And make the following changes in it (leave the rest of the options unchanged):

-DWITH_JPEG=ON
-DWITH_WEBP=ON
-DWITH_PNG=ON
-DWITH_TIFF=ON
-DBUILD_ZLIB=ON,
-DBUILD_opencv_apps=OFF,
-DBUILD_opencv_calib3d=OFF,
-DBUILD_opencv_dnn=ON,
-DBUILD_opencv_features2d=OFF,
-DBUILD_opencv_flann=ON,
-DBUILD_opencv_gapi=OFF,
-DBUILD_opencv_ml=OFF,
-DBUILD_opencv_photo=OFF,
-DBUILD_opencv_imgcodecs=ON,

Finally, execute the build command:

$ python /root/src/opencv/platforms/js/build_js.py build_wasm --build_wasm --emscripten_dir="/root/emsdk/upstream/emscripten"

This is the second time when you can go and brew/grab a coffee since the process will take another 15-30 minutes to complete.

For reasons unknown, there are times when the WebAssembly build process doesn't build some of the selected libraries; therefore, we may need to do some other manual compilation.

To verify this, run the following two commands:

$ find /root/src/opencv/build_wasm/3rdparty/lib -type f -name "*.a" | wc -l; find /root/src/opencv/build_wasm/lib -type f -name "*.a" | wc -l

The above should return 6 and 10 respectively. If you get different values, you need to navigate into the two subfolders (build_wasm/3rdparty/lib and build_wasm/lib) and make sure that you have the following .a files:

$ pwd && l
/root/src/opencv/build_wasm/3rdparty/lib
total 4.9M
drwxr-xr-x 2 root root 4.0K Jun 20 20:53 .
drwxr-xr-x 9 root root 4.0K Jun 20 20:16 ..
-rw-r--r-- 1 root root 398K Jun 20 20:39 liblibjpeg-turbo.a
-rw-r--r-- 1 root root 233K Jun 20 20:40 liblibpng.a
-rw-r--r-- 1 root root 3.3M Jun 20 20:20 liblibprotobuf.a
-rw-r--r-- 1 root root 478K Jun 20 20:41 liblibtiff.a
-rw-r--r-- 1 root root 466K Jun 20 20:42 liblibwebp.a
-rw-r--r-- 1 root root  96K Jun 20 20:16 libzlib.a

$ pwd && l
/root/src/opencv/build_wasm/lib
total 17M
drwxr-xr-x  2 root root 4.0K Jun 20 20:45 .
drwxr-xr-x 14 root root 4.0K Jun 20 20:16 ..
-rw-r--r--  1 root root 2.1M Jun 20 20:27 libopencv_calib3d.a
-rw-r--r--  1 root root 2.6M Jun 20 20:20 libopencv_core.a
-rw-r--r--  1 root root 5.5M Jun 20 20:29 libopencv_dnn.a
-rw-r--r--  1 root root 718K Jun 20 20:25 libopencv_features2d.a
-rw-r--r--  1 root root 555K Jun 20 20:21 libopencv_flann.a
-rw-r--r--  1 root root 488K Jun 20 20:45 libopencv_imgcodecs.a
-rw-r--r--  1 root root 3.6M Jun 20 20:23 libopencv_imgproc.a
-rw-r--r--  1 root root 418K Jun 20 20:29 libopencv_objdetect.a
-rw-r--r--  1 root root 790K Jun 20 20:24 libopencv_photo.a
-rw-r--r--  1 root root 325K Jun 20 20:29 libopencv_video.a

The files you are most likely to be missing are libopencv_imgcodecs.a and the image codec libs such as liblibpng.a.

To remedy this, first open the folder called 3rdparty, where you should see a folder for each image codec (libwebp etc.). Navigate into each one and execute make, for example:

$ cd /root/src/opencv/build_wasm/3rdparty/libwebp && make

If you're missing libopencv_imgcodecs.a make sure you navigate to the modules folder, and invoke make:

$ cd /root/src/opencv/build_wasm/modules/imgcodecs && make

It's imperative that you also verify that the right flags are on in build_js.py as stated earlier.
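
The manual check above can also be scripted. Below is a minimal sketch, assuming you collect the file names of each lib folder yourself; missingArchives and the two constant lists are hypothetical names introduced for illustration, and the lists only cover the archives that the emcc link command later in this post refers to.

```javascript
// Archives referenced by the emcc link command later in this post.
// The names are copied from the directory listings above.
const REQUIRED_3RDPARTY = [
  'liblibjpeg-turbo.a', 'liblibpng.a', 'liblibprotobuf.a',
  'liblibtiff.a', 'liblibwebp.a', 'libzlib.a',
];
const REQUIRED_OPENCV = [
  'libopencv_core.a', 'libopencv_imgcodecs.a', 'libopencv_imgproc.a',
];

// missingArchives is a hypothetical helper: given the file names found in a
// lib folder, it returns the required archives that are absent.
function missingArchives(presentFiles, requiredFiles) {
  const present = new Set(presentFiles);
  return requiredFiles.filter((name) => !present.has(name));
}
```

In Node.js, fs.readdirSync('/root/src/opencv/build_wasm/lib') is one way to obtain the presentFiles array; an empty result means nothing needs to be rebuilt by hand.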

Create the `wasm` file

We'll run emcc to create the wasm file. Before we can do that, however, we also need to edit the source code for SSIMULACRA. Make the following changes:

// Include two more header files:
#include <opencv2/imgcodecs.hpp>
#include <emscripten/emscripten.h>

We will also create a function dedicated to running the SSIMULACRA calculation and rework how the C++ main() function works. Replace int main(int argc, char **argv) with just int calc(). Delete the first couple of printf instructions since those won't be needed, and replace the first few lines with the following:

img1_temp = imread("image1.ext",-1);
img2_temp = imread("image2.ext",-1);
int nChan = img1_temp.channels();
if (nChan != img2_temp.channels()) {
  fprintf(stderr, "Image file image1 has %i channels, while\n",  nChan);
  fprintf(stderr, "image file image2 has %i channels. Can't compare.\n", img2_temp.channels());
  return -1;
}

This makes sure that the suitable file is being read (more on this later) and makes sure that there are no more references to argv since that won't be used.

Prefix int calc() with the EMSCRIPTEN_KEEPALIVE macro so that the compiler does not remove the function as dead code and it ends up exported. We also need to wrap it in extern "C" {}: this gives the function C linkage, so the C++ compiler does not mangle its name and we can call it from JavaScript as calc.

All in all, this is what the calc() function should look like (trimmed):

#ifdef __cplusplus
extern "C"
{
#endif
 EMSCRIPTEN_KEEPALIVE
 int calc()
 {
  Scalar sC1 = {C1, C1, C1, C1}, sC2 = {C2, C2, C2, C2};
  Mat img1, img2, img1_img2, img1_temp, img2_temp, img1_sq, img2_sq, mu1, mu2, mu1_sq, mu2_sq, mu1_mu2, sigma1_sq, sigma2_sq, sigma12, ssim_map;
  // read and validate input images
  img1_temp = imread("image1.ext", -1);
  img2_temp = imread("image2.ext", -1);

  int nChan = img1_temp.channels();
  if (nChan != img2_temp.channels())
  {
   fprintf(stderr, "Image file image1 has %i channels, while\n", nChan);
   fprintf(stderr, "image file image2 has %i channels. Can't compare.\n", img2_temp.channels());
   return -1;
  }
  // ... the rest of the calculation is trimmed; at the end of calc():
 }
#ifdef __cplusplus
}
#endif

Last but not least, we still need to add a new main() function, but let's keep it simple:

int main(int argc, char **argv) {
 return 0;
}

At this point we are ready to compile SSIMULACRA to Web Assembly. Open Makefile and add the following:

ssimulacra: ssimulacra.cpp
	emcc -std=c++11 -O3 -fstrict-aliasing -ffast-math -I/usr/local/include/opencv4 -L/root/src/opencv/build_wasm/lib -lopencv_core -lopencv_imgcodecs -lopencv_imgproc -L/root/src/opencv/build_wasm/3rdparty/lib -llibjpeg-turbo -llibpng -llibwebp -llibtiff -lzlib -llibprotobuf -s LLD_REPORT_UNDEFINED ssimulacra.cpp -s WASM=1 -s MODULARIZE -s EXPORT_NAME="WAModule" -s EXPORTED_RUNTIME_METHODS='["FS", "ccall"]' -s FORCE_FILESYSTEM=1 -s NO_EXIT_RUNTIME=1 -s ALLOW_MEMORY_GROWTH=1 -o ssimulacra.js

Make sure that the paths are correctly set for the generated lib files.

You can now go ahead and execute emmake make. The process will take about 30-60 seconds, and at the end, you should see two new files created: ssimulacra.js and ssimulacra.wasm. These are the two files that you'll need to add to a web application.

Let's talk about the options a little bit. emcc is the Emscripten compiler that builds the Web Assembly file, and all the options above are passed to this CLI utility. FORCE_FILESYSTEM makes the Emscripten file system available, and MODULARIZE wraps the generated JavaScript in a factory function (named WAModule via EXPORT_NAME), which lets us instantiate the Web Assembly module from our own code. NO_EXIT_RUNTIME keeps the runtime alive after main() returns so that we can keep on calling calc() from our web app, and ALLOW_MEMORY_GROWTH lets the module's memory grow as needed, so we don't need to worry about memory allocation.

Using `ssimulacra.wasm`

Copy both the .js and .wasm files to the place where you are building your web application and add some simple HTML (note the inclusion of the copied JavaScript file):

<input
  id="image1"
  placeholder="https://res.cloudinary.com/tamas-demo/image/upload/jam/darthvader.jpg"
/>
<input
  id="image2"
  placeholder="https://res.cloudinary.com/tamas-demo/image/upload/q_auto/jam/darthvader.jpg"
/>
<button id="calculate">SSIMULACRA</button>
<span id="score"></span>

<script src="ssimulacra.js"></script>

We don't need to import the Web Assembly file ourselves: the JavaScript file that Emscripten generated loads it for us automatically.

Pro tip: To use Web Assembly streaming, for faster load times, the HTTP server that you use should send the appropriate Content-Type for wasm files. Here's an example HTTP server created in Python:

# server.py
import http.server
import socketserver

PORT = 8999

Handler = http.server.SimpleHTTPRequestHandler

# Map file extensions to the Content-Type header the server sends;
# '.wasm' must be served as 'application/wasm' for streaming to work.
Handler.extensions_map = {
  '.html': 'text/html',
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.wasm': 'application/wasm',
  '.css': 'text/css',
  '.js': 'text/javascript',
  '': 'application/octet-stream',
}

httpd = socketserver.TCPServer(("", PORT), Handler)

print("serving at port", PORT)
httpd.serve_forever()

Start this server by executing python3 server.py (the http.server module is part of Python 3).
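
To make the Content-Type requirement more concrete, here is a minimal sketch of the streaming-with-fallback pattern. The JavaScript file generated by Emscripten already does something equivalent internally, so this is for illustration only; loadWasm and its fetchImpl parameter are hypothetical names, and fetchImpl exists only so the function can be tried out without a real server.

```javascript
// loadWasm is a hypothetical helper; Emscripten's generated ssimulacra.js
// performs an equivalent step on its own.
async function loadWasm(url, imports = {}, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  const header = response.headers.get('Content-Type') || '';
  const mimeType = header.split(';')[0].trim();

  if (mimeType === 'application/wasm' &&
      typeof WebAssembly.instantiateStreaming === 'function') {
    // Streaming path: the module is compiled while it is still downloading.
    // Browsers reject this call when the MIME type is not application/wasm.
    return WebAssembly.instantiateStreaming(response, imports);
  }

  // Fallback path: download the whole file first, then compile it.
  const bytes = await response.arrayBuffer();
  return WebAssembly.instantiate(bytes, imports);
}
```

Both paths resolve to an object with module and instance properties; the only difference is how early compilation can start.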

Now it's time to use SSIMULACRA! To do so, we'll need to fetch the two images from their Cloudinary URLs, convert each of them into a Uint8Array and save that in memory for Web Assembly. Note we are using the Fetch API and its arrayBuffer() method to get a buffer and convert that to a Uint8Array.

const image1 = document.getElementById('image1');
const image2 = document.getElementById('image2');
const score = document.getElementById('score');

const urlToUint8Array = async (url) => {
  const response = await fetch(url);
  const buffer = await response.arrayBuffer();
  const arr = new Uint8Array(buffer);
  return arr;
};
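
The helper above assumes that every request succeeds. A slightly more defensive variant could look like the sketch below; fetchBytes and its fetchImpl parameter are hypothetical names that are not part of the original demo, and fetchImpl is there only so the function can be exercised without a network connection.

```javascript
// fetchBytes is a hypothetical variant of urlToUint8Array with error handling.
// fetchImpl defaults to the global fetch.
async function fetchBytes(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    // fetch() resolves even for 404 or 500 responses, so check explicitly.
    throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  if (bytes.length === 0) {
    // An empty file would only fail later, inside imread().
    throw new Error(`Empty response body for ${url}`);
  }
  return bytes;
}
```

Failing early here means a mistyped URL shows up as a clear JavaScript error instead of an unreadable image inside the Web Assembly module.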

Let's also instantiate the Web Assembly module. Remember that we have used the MODULARIZE option when running emcc to create the wasm file, plus we also used EXPORT_NAME="WAModule", which is the name of the factory function that we call to create the module:

document.addEventListener('DOMContentLoaded', async () => {
  const waModule = await WAModule();
  // waModule now has access to the Web Assembly file system
  // as well as all the exported functions such as calc()
  // ... the rest of the code goes here
});

Now that we have a helper method and the right module imported, we can write the two files to the Web Assembly memory and call the calc() method:

document.getElementById('calculate').addEventListener('click', async () => {
  const img1 = await urlToUint8Array(image1.value);
  const img2 = await urlToUint8Array(image2.value);
  waModule.FS.writeFile('image1.ext', img1);
  waModule.FS.writeFile('image2.ext', img2);
  const ssimulacraScore = waModule.ccall('calc', 'number', [], null);
  score.innerText = ssimulacraScore;
});

Note that FS.writeFile() writes the file to memory, and by default, it creates a binary file. Also, remember that we have made some changes in the original ssimulacra.cpp file and made sure that these are the two files that we are reading and using for the calculation.
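
The write-then-call sequence can also be pulled out of the click handler, which makes it easier to reuse. The following is a minimal sketch, assuming a module object that exposes FS and ccall as configured earlier; runComparison is a hypothetical name, and 'number' is the return type name that the Emscripten documentation lists for ccall.

```javascript
// runComparison is a hypothetical wrapper around the calls shown above.
function runComparison(wasmModule, original, distorted) {
  // The file names must match the ones hard-coded in the modified ssimulacra.cpp.
  wasmModule.FS.writeFile('image1.ext', original);
  wasmModule.FS.writeFile('image2.ext', distorted);
  try {
    return wasmModule.ccall('calc', 'number', [], []);
  } finally {
    // Free the in-memory copies of the images once calc() has returned.
    wasmModule.FS.unlink('image1.ext');
    wasmModule.FS.unlink('image2.ext');
  }
}
```

Inside the click handler this would be called as runComparison(waModule, img1, img2). Note that removing the files afterwards means they can no longer be read back from the Web Assembly file system, so skip the unlink calls if you still need them.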

Open the web application now, add two images and hit the SSIMULACRA button. If you have done everything correctly, the SSIMULACRA score should appear.
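
A bare number is not very friendly to look at, so the page could translate it into a short label. Here is a minimal sketch that uses the two thresholds from the usage text of the ssimulacra binary (0.1 and 0.01); describeScore and the wording of the labels are assumptions made for illustration and are not part of SSIMULACRA.

```javascript
// Thresholds taken from the usage text printed by the ssimulacra binary.
const PERCEPTIBLE_ABOVE = 0.1;
const IMPERCEPTIBLE_BELOW = 0.01;

// describeScore is a hypothetical helper that turns a score into a label.
function describeScore(score) {
  // calc() returns -1 when the two images have a different number of channels.
  if (!Number.isFinite(score) || score < 0) {
    return 'invalid score';
  }
  if (score > PERCEPTIBLE_ABOVE) {
    return 'distortion likely perceptible';
  }
  if (score < IMPERCEPTIBLE_BELOW) {
    return 'distortion likely imperceptible';
  }
  return 'borderline';
}
```

The label could then be shown next to the number, for example by appending describeScore(ssimulacraScore) to the text of the score element.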

An interesting addition to the application would be to show the two images side by side, which we can do in two ways. One is the obvious one (reading the Cloudinary URL from the input box), and the other is slightly less obvious - we can take the binary file from the Web Assembly memory and rebuild the image from it:

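// just add <div id="img"></div> to the HTML
const array = waModule.FS.readFile('image1.ext');
// convert in chunks: passing a large array to fromCharCode at once can overflow the call stack
let binary = '';
for (let i = 0; i < array.length; i += 0x8000) {
  binary += String.fromCharCode.apply(null, array.subarray(i, i + 0x8000));
}
const base64Data = btoa(binary);
const img = new Image();
img.src = `data:image/jpeg;base64,${base64Data}`;
document.getElementById('img').appendChild(img);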

Conclusion

This project was challenging, but I enjoyed it thoroughly. One typical use-case for Web Assembly is to bring existing applications to the web, and I hope that with the example outlined in this article, you see its value.