jmls

Reputation: 2969

reduce an array of strings to have no duplicates or substrings

Given an array of

['/a','/a/b/c/d/e','/b','/b/c','/a/b/c/d','/a','/b']

I would like to be able to reduce this array down to:

['/a/b/c/d/e','/b/c'] 

i.e. the longest of each unique path ('/a' is a substring, or more precisely a leading part, of '/a/b/c/d/e').

Perhaps I don't know the correct terminology, but I've been Googling for a couple of hours and have gotten nowhere

I was thinking of sorting by element length and then, for each element, looping through the list and checking indexOf() until I reach the element itself.

Just seems kind of expensive.
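A minimal sketch of that idea, for reference. The name longestPaths is hypothetical, and it uses startsWith() with a trailing '/' instead of a bare indexOf() so that '/abc/def' is not treated as part of '/abc/defghi'. Every path is compared against the paths already kept, so the cost can grow quadratically:

```javascript
// Hypothetical helper: brute-force version of the idea described above.
function longestPaths(paths) {
  // Drop exact duplicates, then visit the longest strings first.
  const sorted = Array.from(new Set(paths)).sort((a, b) => b.length - a.length);
  const kept = [];
  for (const p of sorted) {
    // p is redundant if a path kept earlier continues it with a '/'.
    const covered = kept.some((k) => k.startsWith(p + '/'));
    if (!covered) kept.push(p);
  }
  return kept;
}

console.log(longestPaths(['/a', '/a/b/c/d/e', '/b', '/b/c', '/a/b/c/d', '/a', '/b']));
// [ '/a/b/c/d/e', '/b/c' ]
```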

Edit: I didn't explain that well enough. The strings are paths, and I need to use mkdirp to create the directory structure, but I don't want to call it many times (mkdirp /a/b/c/d/e will create /a/b/c, so if I have another path of /a/b/c I just want to ignore it).

Upvotes: 0

Views: 158

Answers (3)

Nina Scholz

Reputation: 386604

You could check the whole string and collect the longer ones, without sorting in advance.

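var array = ['/abc/def', '/abc/defghi/jkl', '/a', '/a/b/c/d/e', '/b', '/b/c', '/efg', '/a/b/c/d', '/a', '/b', '/a/b'],
    result = array.reduce(function (r, a) {
        // Look for an already collected path on the same branch as a.
        r.some(function (b, i, rr) {
            // The trailing '/' keeps '/abc/def' from matching '/abc/defghi'.
            var aa = a + '/',
                bb = b + '/',
                min = Math.min(aa.length, bb.length);

            if (aa.slice(0, min) === bb.slice(0, min)) {
                // Same branch: keep whichever path is longer.
                if (a.length > b.length) {
                    rr[i] = a;
                }
                return true;
            }
        }) || r.push(a); // No match: a starts a new branch.
        return r;
    }, []);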
    
console.log(result);

Upvotes: 1

jmls

Reputation: 2969

I eventually came up with this and would appreciate comments.

let originalArray = ['/abc/def','/abc/defghi/jkl','/a','/a/b/c/d/e','/b','/b/c','/efg','/a/b/c/d','/a','/b','/a/b'];

let uniqueArray = Array.from(new Set(originalArray));
let newArray = [];

uniqueArray.sort();

uniqueArray.forEach((item,index) => {
  if (index === uniqueArray.length - 1) {
      newArray.push(item);
      return;
  }

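  // Keep item unless the next path continues it. A prefix test is used because
  // indexOf() would also match item + "/" in the middle of an unrelated path.
  if (!uniqueArray[index + 1].startsWith(item + "/")) {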
      newArray.push(item);
  }
});

console.log(newArray)

Upvotes: 0

Anish Goyal

Reputation: 2859

I would sort the array, then iterate through it. For each element, if it is equal to the next element, or a path prefix of it, skip it. Otherwise, add it to a dynamically growing structure. Finally, convert your dynamically growing structure to an array.

You only have to check the next item since anything that's a prefix of something else will come alphabetically before it, directly before the shortest string using it as a prefix. This cuts your algorithm down to O(n log n), dominated by the sort.

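Luckily, '/' comes before letters and numbers in ASCII, which will help you extend this to paths with multiletter folder names. A few characters, such as '-' and '.', sort before '/', so folder names containing them can break the assumption that a path sits directly before its longer versions.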

Let's assume your original array is in arr.

var arr = ['/a','/a/b/c/d/e','/b','/b/c','/a/b/c/d','/a','/b'];
arr.sort();
var output = [];
for(var i = 0; i < arr.length; i++)
{
  if(i < arr.length - 1 && arr[i + 1].indexOf(arr[i]) == 0 &&
    (arr[i].length == arr[i + 1].length || arr[i + 1].charAt(arr[i].length) == "/"))
  {
    continue;
  }
  output.push(arr[i]);
}
console.log(output);

In the for loop, for each string we check if there's another string in the array after it; if not, that string is not a prefix of another path. Then, we check if the current and next strings are equal. If not, we check if the next string starts with the current one, and also if the next character after the current string is a "/" in the next string so that we don't assume "abc/def" is a prefix of "abc/defghi/jkl", which are clearly two separate paths.
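If folder names may contain characters that sort before '/', such as '-' or '.', the sorted approach above can keep a redundant path: with '/a', '/a-b' and '/a/b', the entry '/a' is followed by '/a-b' and is never compared with '/a/b'. A sketch that does not depend on sort order is to collect every ancestor of every path and keep only the paths that are nobody's ancestor. The name leafPaths is hypothetical, and the sketch assumes paths without trailing slashes:

```javascript
// Hypothetical helper, not part of the answer above.
function leafPaths(paths) {
  const unique = Array.from(new Set(paths));
  const ancestors = new Set();
  for (const p of unique) {
    // Record every proper ancestor: '/a/b/c' yields '/a/b' and '/a'.
    let cut = p.lastIndexOf('/');
    while (cut > 0) {
      ancestors.add(p.slice(0, cut));
      cut = p.lastIndexOf('/', cut - 1);
    }
  }
  // Keep only the paths that are not an ancestor of some other path.
  return unique.filter((p) => !ancestors.has(p));
}

console.log(leafPaths(['/a', '/a-b', '/a/b']));
// [ '/a-b', '/a/b' ]
```

The result keeps the order in which paths first appear in the input, and the work is proportional to the total number of path segments rather than to a sort.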

Upvotes: 1
