Slick Node.js file navigation with closures and generators

If you want to understand closures or generators by example, let me show you how both lend themselves to pretty awesome file navigation in Node.js.

Let’s start with the core mechanic.

const path = require('path');

function mount(home) {
    function vantage(...args) {
        return path.resolve(home, ...args);
    }

    vantage.cd = function (...args) {
        return mount(vantage(...args));
    };

    return vantage;
}

The mount() function takes an absolute path and returns a self-replicating interface for navigating relative to that path. The name is admittedly misleading if you think in terms of Linux userspace or Bash, but it seemed contextually appropriate.

The way JavaScript closures work has to do with how JavaScript treats variable scope. Every function “remembers” the variables it could see where it was created. The vantage function, in particular, can see the value of home passed to mount, and it will always see the same value of home for that particular call to mount. Since you can create functions on the fly in JavaScript within other function scopes, you are basically binding functions to the state of other function calls.
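
To see that binding in isolation, here is a minimal sketch. The makeGreeter() function is a hypothetical example that has nothing to do with the navigation code; it only shows that each call produces a function bound to that call's own argument.

```javascript
// Hypothetical example, unrelated to mount(): each call to
// makeGreeter() creates a new scope with its own `greeting`.
function makeGreeter(greeting) {
  // greet() is created inside this call, so it closes over
  // this particular call's value of `greeting`.
  return function greet(name) {
    return `${greeting}, ${name}!`;
  };
}

const hello = makeGreeter('Hello');
const howdy = makeGreeter('Howdy');

console.log(hello('Fred')); // Hello, Fred!
console.log(howdy('Fred')); // Howdy, Fred!
```

The two returned functions share code but not state, which is the same relationship two vantage functions have when they come from different calls to mount().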

With the help of path.resolve() from the Node.js built-ins, you get some flexibility out of the box.

const logs = mount('/var/log');

logs() === '/var/log'
logs('celery') === '/var/log/celery'
logs('..') === '/var'
logs('..', '..', 'tmp', '.') === '/tmp'
logs('../../tmp') === '/tmp'
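
One consequence of leaning on path.resolve() is worth knowing: an absolute argument discards everything to its left, so a vantage function can leave its home directory entirely. The sketch below repeats mount() so that it runs on its own, and it uses path.posix (my choice, not the original's) so the results are the same on any platform.

```javascript
const path = require('path');

// Same mechanic as mount() above, pinned to POSIX path rules so
// the results do not depend on the host platform.
function mount(home) {
  function vantage(...args) {
    return path.posix.resolve(home, ...args);
  }
  vantage.cd = (...args) => mount(vantage(...args));
  return vantage;
}

const logs = mount('/var/log');

// An absolute segment resets the resolution.
console.log(logs('/etc'));                     // /etc
console.log(logs('celery', '/etc', 'passwd')); // /etc/passwd
console.log(logs.cd('/tmp')());                // /tmp
```

If you need a vantage point that can never escape its home, you would have to validate the resolved path yourself; nothing in this design enforces it.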

The “self-replicating” part comes from the cd function, which just calls mount again to bind home to a different vantage point.

const celerylogs = logs.cd('celery');
celerylogs() === '/var/log/celery';
celerylogs.cd('..')() === logs();

I find this handy because it gives me a way to quickly map out a project's directory layout.

const sourcetree = mount(process.env.CODEBASE || '/opt/provisioned');
// Functions are objects, so sub-vantages can hang off them as properties.
sourcetree.apps = sourcetree.cd('src', 'apps');
sourcetree.libs = sourcetree.cd('src', 'libs');
sourcetree.tests = sourcetree.cd('__tests__');

// ...

Of course, manual navigation isn’t enough. We’ll need stats. Let’s head back into mount() and add a generator function that recursively walks the mounted directory from its vantage point. I’ll use synchronous code for simplicity.

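function mount(home) {
  // ...vantage and vantage.cd as before; assume fs = require('fs') is in scope

  // Yields [abspath, basename, stat] for every entry under home.
  // The caller may answer a yielded directory with next(false)
  // to keep walk() from descending into it.
  vantage.walk = function* () {
    for (const basename of fs.readdirSync(home)) {
      const abspath = path.join(home, basename);
      const stat = fs.statSync(abspath);
      const nest = yield [abspath, basename, stat];

      if (stat.isDirectory() && (nest === undefined || nest === true)) {
        yield* vantage.cd(abspath).walk();
      }
    }
  };

  return vantage;
}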

This is a generator function that recursively lists the contents of a directory, making potentially tons of system calls to stat the files for completeness’ sake (you won’t want something so hungry in time-critical systems, but it is a handy addition when running tests or audits). But what is a generator, exactly?

A generator is a function that can pause in the middle of its execution and return control to its caller using the yield keyword. This makes it a kind of coroutine, but generators are treated more like a specialization that returns control to a single caller. A generator can talk to its caller by passing a value through yield, and the caller can talk back through the value that the yield expression evaluates to. In JavaScript, that two-way communication happens through an iterator, which in this scenario you can think of as a way to play, pause and stop execution of a generator.
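
A minimal sketch of that conversation, using a hypothetical interview() generator that is not part of the navigation code. Each next(answer) call resumes the generator, and the paused yield expression evaluates to answer.

```javascript
// Hypothetical example: a generator that asks questions and
// receives the answers through the value of each yield expression.
function* interview() {
  const name = yield 'What is your name?';
  const color = yield `Hi ${name}! Favorite color?`;
  return `${name} likes ${color}.`;
}

const it = interview();

// The first next() starts the generator. Any argument passed to
// it is ignored because no yield is waiting for a value yet.
const q1 = it.next();
const q2 = it.next('Fred');
const result = it.next('teal');

console.log(q1);     // { value: 'What is your name?', done: false }
console.log(q2);     // { value: 'Hi Fred! Favorite color?', done: false }
console.log(result); // { value: 'Fred likes teal.', done: true }
```

Note the off-by-one feel of the protocol: the value you pass to next() answers the yield that produced the previous result, not the one about to run.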

In this case the walk() generator uses the three-state nest value (undefined, true or false) to know whether the caller wants to recurse into a directory. By default, advancing the iterator with no feedback (i.e. for...of) walks the entire tree under the mounted directory, yielding each directory’s entry before descending into its contents.
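
Here is a sketch of that default behavior against a throwaway directory. It repeats mount() and walk() so it can run on its own; the temporary tree and the walk-demo- prefix are made up for the demonstration. The order of siblings depends on what fs.readdirSync() returns, so the only ordering to rely on is that a directory appears before its own contents.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// mount() and walk() repeated from above so this runs standalone.
function mount(home) {
  function vantage(...args) {
    return path.resolve(home, ...args);
  }
  vantage.cd = (...args) => mount(vantage(...args));
  vantage.walk = function* () {
    for (const basename of fs.readdirSync(home)) {
      const abspath = path.join(home, basename);
      const stat = fs.statSync(abspath);
      const nest = yield [abspath, basename, stat];
      if (stat.isDirectory() && (nest === undefined || nest === true)) {
        yield* vantage.cd(abspath).walk();
      }
    }
  };
  return vantage;
}

// Throwaway tree for the demo: <tmp>/a.txt and <tmp>/sub/b.txt
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-demo-'));
fs.writeFileSync(path.join(root, 'a.txt'), '');
fs.mkdirSync(path.join(root, 'sub'));
fs.writeFileSync(path.join(root, 'sub', 'b.txt'), '');

// for...of calls next() with no argument, so nest is always
// undefined and every directory is entered.
const seen = [];
for (const [abspath] of mount(root).walk()) {
  seen.push(path.relative(root, abspath));
}
console.log(seen); // three entries; 'sub' comes before its b.txt

// Clean up the demo tree.
fs.rmSync(root, { recursive: true, force: true });
```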

If you wish to restrict navigation to a subset of the tree, use a pre-test (while) loop and a predicate. This example navigates a Node.js project while skipping node_modules.

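const recursive_iterator = mount('/home/fred/nodeproject').walk();
let { done, value } = recursive_iterator.next();

while (!done) {
  const [abspath, basename, stat] = value;

  // processFile is a placeholder for your own handler, named to
  // avoid clashing with the Node.js global called process.
  processFile(abspath, stat);

  // Answer the pending yield: false keeps walk() out of node_modules.
  ({ done, value } = recursive_iterator.next(
    basename !== 'node_modules'));
}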

ls-like functionality means rejecting all recursion. If you want to be more true to the source material, modify the code below to yield only basenames.

function mount(home) {
  //...

  vantage.ls = function* () {
    const it = vantage.walk();
    let { done, value } = it.next(false);

    while (!done) {
      yield value; // value[1] for basenames
      ({ done, value } = it.next(false));
    }
  };

  //...
}

This is handy for file processing in directories with well-established userspace conventions. Even if we ignore the absolute path provided and restrict ourselves to basenames, the vantage function makes it easy to restore the absolute path.

// Delete rotated logs of basename whatever.log.X where X >= 5
const logdir = mount('/var/log/vendor');

for (const [,basename] of logdir.ls()) {
  // Pass the radix explicitly so the suffix is always read as decimal.
  const rotation = parseInt(basename.split('.').pop(), 10);

  if (!isNaN(rotation) && rotation >= 5) {
    fs.unlinkSync(logdir(basename));
  }
}

Some forms of analysis become trivial.

// Java projects conventionally hold source
// code and tests in mirrored file systems.
const sourcetree = mount('/opt/provisioned/src');
const predicate = ([,basename]) => basename.endsWith('.java');

// Careful: Array.from() on generators keeps all
// yielded content in memory. Make sure garbage
// collection eventually reclaims it.
const countClasses = (dir) =>
  Array.from(sourcetree.cd(dir).walk())
       .filter(predicate)
       .length;

const nClasses = countClasses('main');
const nTestClasses = countClasses('tests');
const clCoverage = `${100 * nTestClasses / nClasses}%`;

console.log(`Class-level coverage: ${clCoverage}`);

You can play with this for a bit and come up with your own cases. If this approach helps you at all, or if you have feedback, tell me! Thanks for reading and happy new year!
