user7898461
user7898461

Reputation:

Reduce array list, to set of common paths

I am trying to reduce a list to a shorter list, of just common filesystem paths. Trying to find all the common grandparents, and put only those in the final list. Here is the goal: The goal is that we must eliminate all directories in the list for which there is a parent directory in the list.

A better way to phrase that might be: The goal is that we must eliminate all paths in the list for which there is a parent directory of that path in the list.

Say I have this input and expected output:

const input = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

const output = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs",
];

const getReducedList = function (input) {

  return input
  .sort((a, b) => (a.length - b.length))
  .reduce((a, b) => {

    // console.log('a:', a, 'b:', b);

    const s = !a.some(v => {
      return b.startsWith(v);
    });

    if (s) {
      a.push(b);
    }

    return a;

  }, []);

};

console.log(getReducedList(input));

that getReducedList function seems to work with our first test case, 5 is reduced to 2. But, if we add a second test case, here is where things get weird:

If I remove an item from the original list and change it to this list of 4:

const input = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

then I would expect to get this output (the same list of 4):

const output = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

the reason why I would expect/desire the same list of 4, is because no item in the list has a parent directory elsewhere in the list. But I actually get this output, a list of 2, which is incorrect:

const output =  [ 
   '/home/oleg/WebstormProjects/oresoftware/r2g',
   '/home/oleg/WebstormProjects/oresoftware/sumanjs/suman' 
];

does anyone know how I can fix this to get the expected result instead? An answer needs to satisfy both test cases.

To make it perfectly clear, if you add "/home/oleg" to the original list, then "/home/oleg" should be the only entry in the output.

Upvotes: 1

Views: 172

Answers (8)

user7898461
user7898461

Reputation:

Looks like we only needed to change one line. Here is the original:

const getReducedList = function (input) {

  return input
  .sort((a, b) => (a.length - b.length))
  .reduce((a, b) => {

    const s = !a.some(v => {
      return b.startsWith(v);
    });

    if (s) {
      a.push(b);
    }

    return a;

  }, []);

};

we need to change the one line to this instead:

return b.startsWith(v + '/');

Upvotes: 1

Melchia
Melchia

Reputation: 24284

const input = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types-blabla",

];

let ouput = input.sort().reduce((a, c)=> {
let d = true;
for (let j =0; j< a.length; j++){
  if(c.startsWith(a[j]+'/')) {
  d=false;
  break;}
}
if (d) a.push(c);
return a;
},[]);

console.log(ouput);

Upvotes: 0

Hikmat G.
Hikmat G.

Reputation: 2621

It works without sort though has other drawbacks.

Edit: Had to add sort to work properly

const input = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

const input2 = [
  "/home/oleg/WebstormProjects/oresoftware",
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

function getParents(input) {
  return input.sort((a, b) => a.length - b.length)
  .reduce((acc, a) => {
    const index = acc.findIndex(b => b.startsWith(a) || a.startsWith(b));
    const slashDiff = index > 0 ? a.split('/').length === acc[index].split('/').length : false;
    if (index === -1 || slashDiff) {
      return [...acc, a];
    }
    acc.splice(index, 1, a.length < acc[index].length ? a : acc[index])

    return acc;
  }, []);

}

console.log(getParents(input));
console.log(getParents(input2));

Upvotes: 0

trincot
trincot

Reputation: 350760

You need to avoid matches when the next value does not have its last forward slash following the previous value.

const input = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

const getReducedList = (set => input => input.filter(b => 
     !set.has(b.substr(0, b.lastIndexOf("/")))))(new Set(input));

console.log(getReducedList(input));

The logic is that it puts all paths in a set, and then checks for each path whether its parent is in the set. A path's parent is the path that has all characters up to, and excluding, the final /.

Removing the second entry from the input will not change the output. suman is treated the same way as suman-types and suman-watch.

Here is a more verbose, ES5 version (replacing the Set with a plain object):

const input = [
  "/home/oleg/WebstormProjects/oresoftware/r2g",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
  "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
];

function getReducedList(input) {
    var hash = {};
    input.forEach(function(path) {
        hash[path] = true;
    });
    return input.filter(function (b) {
        return !hash[b.substr(0, b.lastIndexOf("/"))];
    }, []);
}

console.log(getReducedList(input));

Upvotes: 0

Mark
Mark

Reputation: 92460

I think you can do this with a very simple recursive function. You simply sort by length, then recursively pop the shortest, add it to the results, filter the array with that, recurse:

const input = [
    "/home/oleg/WebstormProjects/oresoftware/r2g",
    "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman",
    "/home/oleg/WebstormProjects/oresoftware/sumanjs",
    "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types",
    "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"
  ];
  
input.sort((a,b) => b.length - a.length)
function getPrefixes(list, res =[]) {
    if (list.length < 1) return res
    let next = list.pop()
    res.push(next)
    return getPrefixes(list.filter(u => !u.startsWith(next + '/')), res)

}
console.log(getPrefixes(input))

Upvotes: 1

Nina Scholz
Nina Scholz

Reputation: 386680

You could add a slash for look up and pop the last value if found in next element.

This proposal uses a sorted list by value, not by lenght of the string, because it needs a sorted list for compairing the next element.

function uniqueFolder(array) {
    return array
        .sort()
        .reduce((a, b) => {
            if (b.startsWith(a[a.length - 1] + '/')) {
                a.pop();
            }
            a.push(b);
            return a;
        }, []);
}

function commonPath(array) {
    return array
        .sort()
        .reduce((a, b) => {
            if (!b.includes(a[a.length - 1])) {
                a.push(b);
            }
            return a;
        }, []);
}

const input = [ "/home/oleg/WebstormProjects/oresoftware/r2g", "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman", "/home/oleg/WebstormProjects/oresoftware/sumanjs", "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types", "/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch"];

console.log(uniqueFolder(input));
console.log(commonPath(input));
.as-console-wrapper { max-height: 100% !important; top: 0; }

Upvotes: 0

quazar
quazar

Reputation: 590

If you can use sets from es6 then you can simply sort with respect to directory length and keep pushing to a set.

Upvotes: 0

baao
baao

Reputation: 73251

You could simply use node's path module

const path = require('path');
const input = [
  '/home/oleg/WebstormProjects/oresoftware/r2g',
  '/home/oleg/WebstormProjects/oresoftware/sumanjs/suman',
  '/home/oleg/WebstormProjects/oresoftware/sumanjs',
  '/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-types',
  '/home/oleg/WebstormProjects/oresoftware/sumanjs/suman-watch'
];

const s = new Set();
const out = [];
input.forEach(e => {
  let p = path.dirname(e);
  if (! s.has(p)) {
    s.add(p);
    out.push(e);
  }
});

console.log(out);

The forEach could obviously replaced by reduce if you want to...

Upvotes: 0

Related Questions