ardabro

Reputation: 2061

Is there any smart way to combine overlapping paths in Python?

Let's say I have two path names: head and tail. They can overlap by any number of segments. If they don't overlap, I'd like to just join them normally. If they do, I'd like to detect the common part and combine them accordingly. To be more specific: if names are repeated, I'd like to find the longest possible overlapping part. Example:

"/root/d1/d2/d1/d2" + "d2/d1/d2/file.txt" == "/root/d1/d2/d1/d2/file.txt"
and not "/root/d1/d2/d1/d2/d1/d2/file.txt"

Is there a ready-to-use library function for this case, or do I have to implement one myself?

Upvotes: 9

Views: 2458

Answers (4)

James Harrington

Reputation: 3216

I just made it here looking for this answer. Hopefully it can help someone else.
Here is how I did it in Node.js:

const path = require('path')

const path1 = '/root/user/name/code/website/'
const path2 = './website/index.js'

const arrOfDirectories = [...path1.split('/'), ...path2.split('/')]
// ['', 'root', 'user', 'name', 'code', 'website', '', '.', 'website', 'index.js']

// Keep only the first occurrence of each segment
const arrOfUniqueDirectories = arrOfDirectories.filter((value, index, self) => self.indexOf(value) === index)
// ['', 'root', 'user', 'name', 'code', 'website', '.', 'index.js']

const jankyPath = arrOfUniqueDirectories.join('/')
// /root/user/name/code/website/./index.js

const myPath = path.normalize(jankyPath)
// myPath = /root/user/name/code/website/index.js

Upvotes: 0

Shashank

Reputation: 13869

I think this works:

p1 = "/root/d1/d2/d1/d2"
p2 = "d2/d1/d2/file.txt"

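def find_joined_path(p1, p2):
    # The earliest suffix of p1 that is also a prefix of p2
    # is the longest overlap.
    for i in range(len(p1)):
        if p1[i:] == p2[:len(p1) - i]:
            return p1[:i] + p2
    # No overlap: join the two paths normally.
    return p1.rstrip("/") + "/" + p2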

print(find_joined_path(p1, p2))

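Note that this is a general solution that works on any two strings, so it may not be as optimized as a solution that works only with file paths. Because it compares characters rather than path segments, it can also match part of a segment name: "/a/ab" + "b/c" gives "/a/ab/c".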
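For comparison, here is a minimal sketch of the same idea applied to whole path segments instead of characters. It assumes POSIX-style "/" separators and a relative tail; the name join_overlapping is made up for this example and is not a library function.

```python
import posixpath

def join_overlapping(head, tail):
    """Join head and tail, merging the longest run of segments
    that both ends head and starts tail."""
    head_parts = head.rstrip("/").split("/")
    tail_parts = tail.split("/")
    # Try the largest possible overlap first, so repeated names
    # such as d1/d2/d1/d2 resolve to the longest match.
    for size in range(min(len(head_parts), len(tail_parts)), 0, -1):
        if head_parts[-size:] == tail_parts[:size]:
            return "/".join(head_parts + tail_parts[size:])
    # No shared segments: fall back to a normal join.
    return posixpath.join(head, tail)

print(join_overlapping("/root/d1/d2/d1/d2", "d2/d1/d2/file.txt"))
```

Since whole segments are compared, "/root/ab" and "b/file.txt" are treated as non-overlapping and joined normally.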

Upvotes: 1

Kasravnd

Reputation: 107287

You can use a list comprehension inside str.join:

>>> p1="/root/d1/d2/d1/d2"
>>> p2="d2/d1/d2/file.txt"
>>> p1+'/'+'/'.join([i for i in p2.split('/') if i not in p1.split('/')])
'/root/d1/d2/d1/d2/file.txt'

Or, if the only difference is the base name of the second path, you can use os.path.basename to get it and append it to p1:

>>> import os
>>> p1+'/'+os.path.basename(p2)
'/root/d1/d2/d1/d2/file.txt'
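
To make the behaviour of the membership filter explicit, here is a small sketch that wraps it in a function (the name join_dropping_seen is hypothetical). It drops every segment of p2 that occurs anywhere in p1, not only the overlapping ones, so it suits inputs like the one in the question but not arbitrary paths.

```python
def join_dropping_seen(p1, p2):
    # Segments already present anywhere in p1.
    seen = set(p1.split("/"))
    # Keep only the segments of p2 that p1 does not contain.
    rest = [seg for seg in p2.split("/") if seg not in seen]
    return "/".join([p1] + rest)

# The example from the question.
print(join_dropping_seen("/root/d1/d2/d1/d2", "d2/d1/d2/file.txt"))
# Here "d1" in the second path is not part of an overlap, yet it is dropped.
print(join_dropping_seen("/root/d1", "d2/d1/file.txt"))
```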

Upvotes: 3

Abhijit

Reputation: 63707

I would suggest using difflib.SequenceMatcher followed by get_matching_blocks:

>>> import difflib
>>> p1, p2 = "/root/d1/d2/d1/d2","d2/d1/d2/file.txt"
>>> sm = difflib.SequenceMatcher(None,p1, p2)
>>> size = sm.get_matching_blocks()[0].size
>>> path = p1 + p2[size:]
>>> path
'/root/d1/d2/d1/d2/file.txt'

And a general solution:

def join_overlapping_path(p1, p2):
    sm = difflib.SequenceMatcher(None, p1, p2)
    p1i, p2i, size = sm.get_matching_blocks()[0]
    # The match must start at index 0 of one of the paths;
    # otherwise there is no usable overlap.
    if p1i and p2i:
        return None
    # Make p1 the head, i.e. the path whose match does not start at index 0.
    p1, p2 = (p1, p2) if p2i == 0 else (p2, p1)
    return p1 + p2[size:]

Execution

>>> join_overlapping_path(p1, p2)
'/root/d1/d2/d1/d2/file.txt'
>>> join_overlapping_path(p2, p1)
'/root/d1/d2/d1/d2/file.txt'
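
One thing worth checking with this approach is whether the first matching block is really an overlap, that is, whether it ends the head and starts the tail. The sketch below adds that check; overlap_size is a hypothetical helper, and it is deliberately conservative: it only inspects the first block, so it can report 0 even when a shorter genuine overlap exists.

```python
import difflib

def overlap_size(head, tail):
    """Return the size of the first matching block if it is a
    suffix of head and a prefix of tail, otherwise 0."""
    sm = difflib.SequenceMatcher(None, head, tail, autojunk=False)
    a, b, size = sm.get_matching_blocks()[0]
    # A real overlap starts at index 0 of tail and runs to the end of head.
    if size and b == 0 and a + size == len(head):
        return size
    return 0

head, tail = "/root/d1/d2/d1/d2", "d2/d1/d2/file.txt"
print(head + tail[overlap_size(head, tail):])
```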

Upvotes: 3
