Open Source Tooling and Automation

In this post I demonstrate several open source tools and automation services on a GitHub repository.

Create repo to test

Initial commit for new test repository includes:

  • README file
  • .gitignore for Node
  • MIT license

Initialize npm package.json file

Since I have Node.js installed on my machine, I can go ahead and clone the newly created repository to my local machine.

git clone git@github.com:lkisac/OpenSourceToolingAutomation.git

Initialize the package.json file:
npm init

{
  "name": "lab7",
  "version": "1.0.0",
  "description": "Open Source Tooling and Automation",
  "main": "seneca.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/lkisac/OpenSourceToolingAutomation.git"
  },
  "author": "",
  "license": "MIT",
  "bugs": {
    "url": "https://github.com/lkisac/OpenSourceToolingAutomation/issues"
  },
  "homepage": "https://github.com/lkisac/OpenSourceToolingAutomation#readme",
  "bin": {
    "seneca": "./seneca.js"
  },
  "dependencies": {
    "commander": "^2.9.0"
  }
}

Implement JavaScript functions

/**
 * Given a string `email`, return `true` if the string is in the form
 * of a valid Seneca College email address, `false` otherwise.
 */
exports.isValidEmail = function(email) {
    // TODO: needs to be implemented
};

/**
 * Given a string `name`, return a formatted Seneca email address for
 * this person. NOTE: the email doesn't need to be real/valid/active.
 */
exports.formatSenecaEmail = function(name) {
    // TODO: needs to be implemented
};

This is my first attempt at implementing the stub functions (it will be improved later on using ESLint). The implementation also uses npm’s commander library so that command line options can be passed to the script.

The bin field in the package.json file maps a command name to the script that it should execute:

  "bin": {
    "seneca": "./seneca.js"
  },

Install the package globally so that the seneca command is available on the command line:

npm install -g

Run script from command line:

$ seneca -v lkisac@myseneca.ca
email: lkisac@myseneca.ca
valid

$ seneca -v lkisac@gmail.com
email: lkisac@gmail.com
invalid

$ seneca -f lkisac
name: lkisac
lkisac@myseneca.ca

$ seneca -v lkisac@gmail.com -f lkisac
email: lkisac@gmail.com
invalid
name: lkisac
lkisac@myseneca.ca

The code works as expected, although it needs some cleanup. In the next section, I will show how ESLint can assist in the cleanup process.

Clean code w/ ESLint

Install and configure ESLint to validate our coding style:

npm install eslint --save-dev

The --save-dev option records ESLint as a development dependency (devDependencies in package.json), meaning it is needed for developing the code rather than for using it.

For this example, I configured ESLint with the Airbnb style guide, without React, and with a config file in JSON format:
./node_modules/.bin/eslint --init

	Installing eslint-plugin-import, eslint-config-airbnb-base
	lab7@1.0.0 C:\github\OpenSourceToolingAutomation
	+-- eslint-config-airbnb-base@11.1.1
	`-- eslint-plugin-import@2.2.0
	  +-- builtin-modules@1.1.1
	  +-- contains-path@0.1.0
	  +-- doctrine@1.5.0
	  +-- eslint-import-resolver-node@0.2.3
	  +-- eslint-module-utils@2.0.0
	  | +-- debug@2.2.0
	  | | `-- ms@0.7.1
	  | `-- pkg-dir@1.0.0
	  +-- has@1.0.1
	  | `-- function-bind@1.1.0
	  +-- lodash.cond@4.5.2
	  `-- pkg-up@1.0.0
	    `-- find-up@1.1.2
	      `-- path-exists@2.1.0
	
Successfully created .eslintrc.json file in C:\github\OpenSourceToolingAutomation

Now I can run newly configured eslint on the JavaScript file seneca.js:

./node_modules/.bin/eslint seneca.js

Working with warnings/errors

First, there were many linebreak-style issues with the error message: “Expected linebreaks to be ‘LF’ but found ‘CRLF’”. I fixed this by running dos2unix seneca.js to convert the line endings to Unix format.

Other warnings/errors included:

  • Unexpected var, use let or const instead
  • Strings must use singlequote
  • Missing space before function parentheses

To organize these fixes properly, I grouped similar issues together.

For example, for the Unexpected var, use let or const instead error, I ran:

./node_modules/.bin/eslint seneca.js | grep 'Unexpected var, use let or const instead'

Once every line containing that issue was fixed, I committed the fix to GitHub. This keeps each commit clear and specific, instead of cramming all the issues together into one commit.

The commit history shows these fixes (each message is prefixed with “fixed”).

ESLint is extremely useful to get your code to match a specific style. The config file is customizable, so any project can contain its own settings. This can help contributors follow a specific standard for a given project.

Add Travis CI to repository

Following the getting started guide, I set up my Travis account by syncing my existing GitHub account.
You can customize your .travis.yml file for a particular language. List of languages is provided here.

language: node_js
node_js:
  - "6"
install:
  - npm install
script:
  - npm test

You can also validate your yml file here by providing a link to your repository (containing the yml file), or by pasting your yml file into the textbox provided.

Example

To keep track of your repository’s build status you can add a “build badge” to your repository.

Travis CI is used by many open source projects on GitHub. In those projects, a pull request typically has to pass one or more Travis builds before it can be merged.

Makefile rules & recipes

Recipes for the rules defined in your Makefiles require specific indentation. Each line in a recipe must start with a tab character; the output below shows how tabs appear in a Makefile (see “tests”, for example).

You can run:

cat -n -E -T Makefile

Option                   Description
-n                       show line numbers
-e                       equivalent to -vE
-t                       equivalent to -vT
-E, --show-ends          display $ at end of each line
-T, --show-tabs          display TAB characters as ^I
-v, --show-nonprinting   use ^ and M- notation, except for LFD and TAB

Which produces something like:

include ../Makeconfig$
$
headers := time.h sys/time.h sys/timeb.h bits/time.h^I^I^I\$
^I   bits/types/clockid_t.h bits/types/clock_t.h^I^I^I\$
^I   bits/types/struct_itimerspec.h^I^I^I^I\$
^I   bits/types/struct_timespec.h bits/types/struct_timeval.h^I\$
^I   bits/types/struct_tm.h bits/types/timer_t.h^I^I^I\$
^I   bits/types/time_t.h$
$
routines := offtime asctime clock ctime ctime_r difftime \$
^I    gmtime localtime mktime time^I^I \$
^I    gettimeofday settimeofday adjtime tzset^I \$
^I    tzfile getitimer setitimer^I^I^I \$
^I    stime dysize timegm ftime^I^I^I \$
^I    getdate strptime strptime_l^I^I^I \$
^I    strftime wcsftime strftime_l wcsftime_l^I \$
^I    timespec_get$
aux :=^I    era alt_digit lc-time-cleanup$
$
tests := test_time clocktest tst-posixtz tst-strptime tst_wcsftime \$
^I   tst-getdate tst-mktime tst-mktime2 tst-ftime_l tst-strftime \$
^I   tst-mktime3 tst-strptime2 bug-asctime bug-asctime_r bug-mktime1 \$
^I   tst-strptime3 bug-getdate1 tst-strptime-whitespace tst-ftime \$
^I   tst-tzname$
$

Here ^I represents a tab character and $ marks the end of each line. You can use this to check that your recipe lines begin with a real tab in case you run into this error: *** missing separator. Stop.
If you’re using vi, run :set noet (short for noexpandtab) so that the tabs you type are not replaced with spaces.

glibc – proposed approach to optimize difftime/subtract

This is a continuation of my previous post on choosing a glibc function that could potentially be optimized. Now I’ll discuss my proposed approach for potential optimization.

difftime

difftime has fast paths that convert both values to double or long double when that cannot lose precision. For any other type, it subtracts the smaller time value from the larger one and negates the result if needed.

Let’s look at difftime first:

/* Return the difference between TIME1 and TIME0.  */
double
__difftime (time_t time1, time_t time0)
{
  /* Convert to double and then subtract if no double-rounding error could
     result.  */

  if (TYPE_BITS (time_t) <= DBL_MANT_DIG
      || (TYPE_FLOATING (time_t) && sizeof (time_t) < sizeof (long double)))
    return (double) time1 - (double) time0;

  /* Likewise for long double.  */

  if (TYPE_BITS (time_t) <= LDBL_MANT_DIG || TYPE_FLOATING (time_t))
    return (long double) time1 - (long double) time0;

  /* Subtract the smaller integer from the larger, convert the difference to
     double, and then negate if needed.  */

  return time1 < time0 ? - subtract (time0, time1) : subtract (time1, time0);
}

The if condition for doubles does not contain any significantly expensive operations (e.g. multiply, divide), so it may not be necessary to change anything here. Still, we know that if the first condition before the || is met, the second condition does not need to be evaluated, so this could also be written as:

if (TYPE_BITS (time_t) <= DBL_MANT_DIG) {return (double) time1 - (double) time0;}
if (TYPE_FLOATING (time_t) && sizeof (time_t) < sizeof (long double)) {return (double) time1 - (double) time0;}

Since the first condition is the cheaper of the two, we test it first and immediately return our result if the condition is met. If not, we can then check the next, slightly more expensive condition.

We can apply a similar approach for the second condition:

if (TYPE_FLOATING (time_t)) {return (long double) time1 - (long double) time0;}
if (TYPE_BITS (time_t) <= LDBL_MANT_DIG) {return (long double) time1 - (long double) time0;}
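
One thing to keep in mind with this rewrite: C's || operator already stops evaluating as soon as its left operand is true, so the combined and split forms should evaluate exactly the same operands. The sketch below checks this with two hypothetical stand-in conditions (first_cond and second_cond are illustration names, not glibc code) that count how often they run.

```c
#include <stdio.h>

/* Hypothetical counters, only here to observe which conditions run. */
static int first_calls, second_calls;

static int first_cond(int value)  { first_calls++;  return value; }
static int second_cond(int value) { second_calls++; return value; }

/* Combined form: a single if with ||, as in the original __difftime. */
static int combined(int a, int b)
{
    if (first_cond(a) || second_cond(b))
        return 1;
    return 0;
}

/* Split form: two separate ifs, as in the proposed rewrite. */
static int split(int a, int b)
{
    if (first_cond(a))
        return 1;
    if (second_cond(b))
        return 1;
    return 0;
}

/* Run one form and print how many times each condition was evaluated. */
static void report(const char *label, int (*form)(int, int), int a, int b)
{
    first_calls = second_calls = 0;
    int result = form(a, b);
    printf("%s(%d, %d) = %d, first ran %d time(s), second ran %d time(s)\n",
           label, a, b, result, first_calls, second_calls);
}
```

Calling report("combined", combined, 1, 0) and report("split", split, 1, 0) from a main function should show the second condition running zero times in both cases. In __difftime the conditions are built from compile-time constants, so the compiler can likely reduce either form to the same code.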

Something else I noticed inside the __difftime function was that the checks for double and long double always return time1 minus time0, regardless of which is the larger value. On my particular machine (x86_64), the second if condition was true since TYPE_BITS (time_t) did not exceed LDBL_MANT_DIG, so line 11 of the listing below (the long double return) was being executed.

double 
__difftime (time_t time1, time_t time0)
{
  if (TYPE_BITS (time_t) <= DBL_MANT_DIG
      || (TYPE_FLOATING (time_t) && sizeof (time_t) < sizeof (long double))) {
    return (double) time1 - (double) time0;
  }

  if (TYPE_BITS (time_t) <= LDBL_MANT_DIG || TYPE_FLOATING (time_t)) {
    //return time1 < time0 ? (long double) time0 - (long double) time1 : (long double) 
    return (long double) time1 - (long double) time0;
  }

  return time1 < time0 ? - subtract (time0, time1) : subtract (time1, time0);
}

I wrote a small tester for this:

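#include <stdio.h>
#include <time.h>

/* Assumes __difftime from the listing above is defined in the same file. */
int main(void) {

    // test difftime function
    time_t time1 = time(NULL);
    time_t time0 = time(NULL) + 10;
    // time_t is not an int, so cast before printing
    printf("time1 = %ld\ntime0 = %ld\n", (long) time1, (long) time0);
    double result;
    result = __difftime(time1, time0);
    printf("difftime(time1, time0) = %f\n", result);
    result = __difftime(time0, time1);
    printf("difftime(time0, time1) = %f\n", result);

    return 0;
}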

Which outputs:

__difftime
time1 = 1489180977
time0 = 1489180987
difftime(time1, time0) = -10.000000
__difftime
time1 = 1489180987
time0 = 1489180977
difftime(time0, time1) = 10.000000

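I wanted both calls to return 10, but the double and long double branches return time1 - time0 as is, without a time1 < time0 check, so I added ternary operators to both conditions. (Keep in mind that ISO C defines difftime as the signed difference time1 - time0, so this changes the standard behaviour.)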

double
__difftime (time_t time1, time_t time0)
{
  if (TYPE_BITS (time_t) <= DBL_MANT_DIG
      || (TYPE_FLOATING (time_t) && sizeof (time_t) < sizeof (long double))) {
    return time1 < time0 ? (double) time0 - (double) time1 : (double) time1 - (double) time0;
  }

  if (TYPE_BITS (time_t) <= LDBL_MANT_DIG || TYPE_FLOATING (time_t)) {
    return time1 < time0 ? (long double) time0 - (long double) time1 : (long double) time1 - (long double) time0;
  }
  }

  ...
}

New output:

__difftime
time1 = 1489181645
time0 = 1489181655
difftime(time1, time0) = 10.000000
__difftime
time1 = 1489181655
time0 = 1489181645
difftime(time0, time1) = 10.000000
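
For comparison, ISO C specifies difftime as the signed difference time1 - time0, so existing callers may rely on a negative result. If an order-independent value is what is needed, one option is to leave difftime alone and wrap it. This is a minimal sketch that uses the standard library's difftime rather than the modified __difftime; abs_difftime is a hypothetical helper name, not part of glibc.

```c
#include <time.h>

/* Hypothetical helper (not part of glibc): the magnitude of the difference
   between two times, in seconds, regardless of argument order. */
static double abs_difftime(time_t time1, time_t time0)
{
    /* The standard difftime returns time1 - time0, which may be negative. */
    double delta = difftime(time1, time0);
    return delta < 0 ? -delta : delta;
}
```

With the two timestamps from the output above, abs_difftime gives 10 in either argument order, while difftime itself keeps its sign.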

subtract

This function is only reached when time_t cannot safely be converted to double or long double, and it is always called with the larger value first. If time_t is not a signed type, the function simply returns the result of time1 - time0. If time_t is a signed type, the function takes a more careful path with its own optimizations.
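
The signed case is where care is needed: time1 - time0 can overflow a signed type even when both values are valid. A common way around that is to do the subtraction in an unsigned type, where wraparound is well defined. The sketch below illustrates that idea only. subtract_sketch and TIME_T_IS_SIGNED are hypothetical names, glibc's real subtract does more than this, and the sketch assumes time_t is an integer type and time0 <= time1.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a "is time_t signed?" check. */
#define TIME_T_IS_SIGNED ((time_t) -1 < 0)

/* Sketch: return time1 - time0 as a double, assuming time0 <= time1
   and an integer time_t. Not glibc's actual implementation. */
static double subtract_sketch(time_t time1, time_t time0)
{
    if (!TIME_T_IS_SIGNED)
        return (double) (time1 - time0);

    /* Unsigned arithmetic wraps modulo 2^N, so this subtraction cannot
       trigger signed overflow even for values of opposite sign. */
    uintmax_t dt = (uintmax_t) time1 - (uintmax_t) time0;
    return (double) dt;
}
```

For example, subtract_sketch(5, -5) yields 10.0: the two conversions wrap, but their difference is still the true distance between the values.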

Front End Development w/ Visual Studio Code

VS Code

VS Code is a lightweight code editor for front-end development. It is cross platform and includes many extensions. I will go over some of the extensions I found very useful as well as some of VS Code’s neat features.

Test Drive

Open a project

You can browse your project’s directory layout by simply opening the project folder in VS Code.

Here is an example opening the open source brackets project:

[Screenshot: vscode_brackets_folder]

Change indent from tabs to spaces

VS Code by default tries to figure out the formatting based on the file you have open.
If you want to explicitly set this, you can open the settings.json file by going to File > Preferences > Settings, and set it to your preference:

"editor.tabSize": 4,
"editor.insertSpaces": true

Multi-line editing

If you have multiple lines of common code, you can hit Alt + Shift + Left click and highlight multiple lines to either delete or write new code.

You can also place the cursors arbitrarily and edit your text (while holding Alt + Shift):

Extensions

Debugger for Chrome

You can download and install the Chrome Debugger extension here, or install it from within VS Code by hitting Ctrl + Shift + X and searching for ‘Debugger’.

The extension talks to Chrome over the Chrome Debugging Protocol, the same protocol that Chrome Dev Tools uses for debugging.

Here is a guide for getting started using Chrome Debugging with VS Code. You should also read through the extension’s README on GitHub.

Issues

Getting the VS Code Chrome debugger to work with Thimble has been a challenge. After I have brackets and Thimble up and running I open localhost:3500 in my Chrome browser – everything is ok.

Then I attach to Chrome using this configuration in launch.json:

    {
        "name": "Attach",
        "type": "chrome",
        "request": "attach",
        "port": 9222,
        "url": "http://localhost:3500/en-US/",
        "webRoot": "${workspaceRoot}/client/en_US/",
        "sourceMaps": true,
        "diagnosticLogging": true,
        "sourceMapPathOverrides": {
            "scripts/*": "${webRoot}/scripts/*"
        }
    },

After trying various webRoot and sourceMapPathOverride settings, I would still see ‘sourceRoot undefined’ in my diagnostic log output:

Paths.scriptParsed: could not resolve https://www.youtube.com/yts/jsbin/player-en_US-vflppaI6B/remote.js to a file under webRoot: c:\github\thimble.mozilla.org/client/en_US/. It may be external or served directly from the server's memory (and that's OK).
Target userAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
SourceMap: creating for http://localhost:3500/node_modules/jquery/dist/jquery.min.js
SourceMap: sourceRoot: undefined
SourceMap: sources: ["jquery.js"]
SourceMap: webRoot: c:\github\thimble.mozilla.org/client/en_US/
SourceMap: no sourceRoot specified, using webRoot + script path dirname: c:\github\thimble.mozilla.org\client\en_US\node_modules\jquery\dist
SourceMaps.scriptParsed: http://localhost:3500/node_modules/jquery/dist/jquery.min.js was just loaded and has mapped sources: ["c:\\github\\thimble.mozilla.org\\client\\en_US\\node_modules\\jquery\\dist\\jquery.js"]

Debugging example:

I will include some examples of using the remote debugger with the Thimble project, as well as the settings I used to get it working properly.

Terminal

You can access a PowerShell terminal from VS Code which by default starts in your project directory.

Common Key Bindings

Open Project Folder: Ctrl + K, Ctrl + O
Go to Line: Ctrl + G
Move editor left: Ctrl + PgUp
Move editor right: Ctrl + PgDn
Split Editor: Ctrl + \
List Methods: Ctrl + Shift + O
Search for File: Ctrl + E
Search for text (in current file): Ctrl + F
Search for text (all files in project): Ctrl + Shift + F

More key bindings

Color Themes

Change the color theme by clicking File > Preferences > Color Theme, or by pressing Ctrl + K, Ctrl + T.

glibc – choosing potential function to enhance

Following the getting started instructions, I cloned the glibc repository to my Fedora Linux machine.

After cloning the repository, I went through the list of functions looking for one that could potentially be enhanced or optimized in some way. There were several I was interested in (ceil, bsearch, regexec), but I eventually chose the difftime function.

The function is defined inside the ./time/difftime.c file, along with a helper subtract() function that I may also look at for potential optimization.

Fix Oracle Virtual Box Guest Machine causing CPU spikes

If you are experiencing constant CPU spikes when running Oracle VirtualBox machines, try the following:

Disable Nested Paging:

  • Open Oracle VM VirtualBox
  • Right click on the machine that is causing CPU spikes
  • Click on the ‘System’ tab, then ‘Acceleration’
  • Uncheck ‘Enable Nested Paging’

Set Execution Cap:

  • In Oracle VM VirtualBox, right click the machine again
  • Click ‘System’, then ‘Processor’
  • Set the execution cap to 40%

I found out that there are some issues in Windows 10 regarding CPU spikes. For me, they occur whenever an application starts up: the CPU usage spikes to 100%, then goes right back down once it’s done loading. For the VirtualBox machine I was running, it would hit 100% then sit at around 80 – 90%. Setting the execution cap fixes the issue, at least for the VirtualBox machine.

Inline Assembly Lab

Part A – Volume Scale Factor w/ Inline Assembly

The first part of this lab is to use the SQDMULH or SQRDMULH instructions via inline assembly in a previous volume scaling solution and compare the performance of each version on the AArch64 architecture.

Inline assembly call

  • asm(...); or __asm__(...);
  • asm volatile (...); or __asm__ __volatile__ (...);
    volatile is used if you want to explicitly prevent the compiler from optimizing the asm statement away (for example, deleting it when its outputs are unused, or hoisting it out of a loop).

Solution

We’ll take the solution from our previous program that multiplies each sound sample by a volume scale factor, and replace the multiply operation inside the loop with inline assembler code.

// Volume up using multiply by volume scale factor
void naiveVolumeUp(int16_t* sample_, int16_t* newSample_)
{
    int16_t *x = __builtin_assume_aligned(sample_,16);
    int16_t *y = __builtin_assume_aligned(newSample_,16);

    for (int i = 0; i < SAMPLESNUM; i += 8) // 8 samples (one 128-bit vector) per iteration
    {
        __asm__ ("LD1 {v0.8h}, [%0];" // load multiple 1-element structure (loads 8 halfwords from the address in %0 into v0.8h)
                "DUP v1.8h, %0;" // duplicate element (vector) from %0 to vector register 1
                "SQDMULH v0.8h, v0.8h, v1.8h;" // Signed integer saturating doubling multiply high half (vector)
                "ST1 {v0.8h}, [%0], #16;"
                :
                : "r"(y[i]), "r"(y[i]) // inputs
                //: // outputs
                //: // clobbers
                );
    }
}

Note: solution is incomplete – will update once resolved
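
While the vector version is unfinished, a plain C model of the instruction is handy for checking results. The sketch below models what SQDMULH does to a single pair of 16-bit lanes: double the product, keep the high 16 bits, and saturate. sqdmulh16 and scale_samples are hypothetical helper names, and the scale factor is assumed to be a Q15 fixed-point value (so 16384 represents 0.5).

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one 16-bit SQDMULH lane:
   saturate(floor((2 * a * b) / 65536)). */
static int16_t sqdmulh16(int16_t a, int16_t b)
{
    int64_t product = 2 * (int64_t) a * (int64_t) b;

    /* Floor division, written out to avoid right-shifting a negative value. */
    int64_t high = product / 65536;
    if (product % 65536 < 0)
        high -= 1;

    /* Only INT16_MIN * INT16_MIN overflows; clamp it to the maximum. */
    if (high > INT16_MAX)
        high = INT16_MAX;
    return (int16_t) high;
}

/* Scale n samples by a Q15 factor, one sample at a time. */
static void scale_samples(const int16_t *in, int16_t *out, size_t n,
                          int16_t factor_q15)
{
    for (size_t i = 0; i < n; i++)
        out[i] = sqdmulh16(in[i], factor_q15);
}
```

Scaling a sample of 1000 by a factor of 16384 gives 500 here, which is the value the vector code should produce for the same lane.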

Part B – Open source package that uses Assembler

For the second part of the lab, we analyze an open source package that uses inline assembly. For this I chose the CLN package.

Clone CLN repo

I first cloned the repository (on a Fedora Linux machine), so that I could have a look at all the files and look for any assembler code that exists.
git clone git://www.ginac.de/cln.git

Then I searched all files (recursively) within the project folder for the keyword ‘asm’:
grep -rnw './' -e "asm"

Two files were found in the results:

./src/base/digitseq/cl_DS_mul_nuss.h:963
./src/base/digitseq/cl_DS_mul.cc:176

The cl_DS_mul.cc file only contained a comment with “asm”. The header file cl_DS_mul_nuss.h contained many lines of assembler code.

Part of the cl_DS_mul_nuss.h inline assembly:

#if defined(__GNUC__) && defined(__i386__)
    var uintD dummy;
  #ifdef NUSS_ASM_DIRECT
    __asm__ __volatile__ (
        "movl %1,%0" "\n\t"
        "subl %2,%0" "\n\t"
        "movl %0,%3"
        : "=&q" (dummy)
        : "m" (a.ow0), "m" (b.ow0), "m" (r.ow0)
        : "cc"
        );
...
...

The assembler code is written for the i386 platform, as we can see from the defined(__i386__) check in the preprocessor conditionals surrounding each block of assembly.

The author of this file included some comments at the top of the file explaining what these definitions are for:

  • NUSS_IN_EXTERNAL_LOOPS – Define this if you want the external loops instead of inline operation
  • NUSS_ASM_DIRECT – Define this if you want the external loops instead of inline operation
  • DEBUG_NUSS – Define this for (cheap) consistency checks.
  • DEBUG_NUSS_OPERATIONS – Define this for extensive consistency checks.

There were no other files containing any assembler code, so the inline assembly in this project is exclusive to the i386 platform.

Although assembler code is platform specific, it can be beneficial in cases where hand-written code can outperform what the compiler generates on that platform. Inline assembler can address those cases while leaving other platforms to be handled by the compiler’s optimizations.