pinky

Reputation: 372

Using pysmb to get the directory tree of an SMB share

I managed to connect to an SMB share server using pysmb, and I can read, write, delete and create files and folders on it.

Most of the time I need to read files (JPG, CSV, etc.) from the server based on the SMB device and service name (pysmb terms).

I do not know the file and directory names on the SMB devices in advance; the naming is dynamic.

I am wondering whether it is a good idea to get a filtered directory tree first, before reading the files. The number of files and directories is unknown; about 3 months of data comes to roughly 60 TB.

listShares(timeout=30)
listPath(service_name, path, search=55, pattern='*', timeout=30)

The methods above return only one level of the hierarchy. What I want is output similar to os.walk().

Does anybody have experience with this, or any ideas? Can I get suggestions? Thank you very much.

Upvotes: 4

Views: 14279

Answers (3)

Rohit.M

Reputation: 124

Did you consider using threads? A quick idea is to list all the top-level directories, then run your smbwalk function on each of them in a separate thread. The tree walk does a lookup on every object, so it will still take time, but you should see a performance improvement with threads.
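A rough sketch of that idea in Python 3, in case it helps. To keep it runnable without a server, a generic walker stands in for smbwalk: list_dir is a hypothetical callable that returns (name, is_directory) pairs for one path, and make_list_dir is a hypothetical factory that builds one such callable per worker. I am assuming here that a connection should not be shared between threads, so each worker asks the factory for its own.

```python
import posixpath
from concurrent.futures import ThreadPoolExecutor


def split_entries(entries):
    """Split (name, is_dir) pairs into directory and file name lists."""
    dirs, files = [], []
    for name, is_dir in entries:
        if name in ('.', '..'):
            continue
        (dirs if is_dir else files).append(name)
    return dirs, files


def walk(list_dir, top):
    """Yield (path, dirs, files) for every directory under top, without recursion."""
    pending = [top]
    while pending:
        path = pending.pop()
        dirs, files = split_entries(list_dir(path))
        yield path, dirs, files
        pending.extend(posixpath.join(path, d) for d in dirs)


def parallel_walk(make_list_dir, top='/', max_workers=4):
    """List the top level once, then walk each top-level directory in a worker thread."""
    root_dirs, root_files = split_entries(make_list_dir()(top))
    results = [(top, root_dirs, root_files)]

    def worker(subdir):
        # Assumption: one lister (and so one connection) per worker.
        return list(walk(make_list_dir(), posixpath.join(top, subdir)))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the order of root_dirs in the combined result.
        for subtree in pool.map(worker, root_dirs):
            results.extend(subtree)
    return results
```

With pysmb, the factory could open a new SMBConnection and return something like lambda p: [(f.filename, f.isDirectory) for f in conn.listPath(share, p)]. How much this helps depends on the server; the threads only overlap network waits.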

Upvotes: 1

pinky

Reputation: 372

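import os

from smb.SMBConnection import SMBConnection


def smbwalk(conn, shareddevice, top=u'/'):
    """Walk an SMB share like os.walk, yielding (top, dirs, nondirs)."""
    dirs, nondirs = [], []

    if not isinstance(conn, SMBConnection):
        raise TypeError("SMBConnection required")

    # listPath returns the entries of a single directory level
    names = conn.listPath(shareddevice, top)

    for name in names:
        if name.isDirectory:
            # skip the current-directory and parent-directory entries
            if name.filename not in [u'.', u'..']:
                dirs.append(name.filename)
        else:
            nondirs.append(name.filename)

    yield top, dirs, nondirs

    # recurse into each subdirectory after yielding the current level
    for name in dirs:
        new_path = os.path.join(top, name)
        for x in smbwalk(conn, shareddevice, new_path):
            yield x


# con_str holds the positional SMBConnection arguments (defined elsewhere)
conn = SMBConnection(*con_str, domain='workgroup')
assert conn.connect('10.10.10.10')
ans = smbwalk(conn, 'SHARE_FOLDER', top='/')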

This is what I want, but I found that if the network share is too big, it takes forever to finish.
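One thing that may help with the size problem: because smbwalk yields each level before it descends into dirs, the caller can prune dirs in place (the same trick as with os.walk) and stop early. A minimal sketch in Python 3; filtered and its pattern arguments are hypothetical names of mine, not part of pysmb.

```python
from fnmatch import fnmatchcase


def filtered(walker, dir_pattern='*', file_pattern='*'):
    """Prune an os.walk-style generator by directory and file name patterns."""
    for top, dirs, files in walker:
        # Slice assignment edits the list the walker itself will iterate,
        # so directories that do not match are never listed at all.
        dirs[:] = [d for d in dirs if fnmatchcase(d, dir_pattern)]
        yield top, dirs, [f for f in files if fnmatchcase(f, file_pattern)]


# Hypothetical usage with the smbwalk generator, stopping after 100 directories:
# from itertools import islice
# for top, dirs, files in islice(filtered(smbwalk(conn, 'SHARE_FOLDER'), '2*', '*.csv'), 100):
#     print(top, files)
```

The pruning only works with walkers that iterate the same dirs list after the yield, which smbwalk does.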

Upvotes: 9

Rohit.M

Reputation: 124

Not sure if this is what you want, but I'm working on something similar, so here you go.

I use Impacket, which actually uses some base classes from pysmb. https://github.com/CoreSecurity/impacket

I hope your listPath method returns its output as plain values and not as SharedFile instances.

What I mean is, store the values below while listing them:

get_longname, is_directory, get_filesize
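Storing those three values could look roughly like this (Python 3 sketch). The accessor names are the ones listed above; Entry and to_entries are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Plain record holding just the values needed later
Entry = namedtuple('Entry', 'name is_directory size')


def to_entries(shared_files):
    """Copy name, directory flag and size out of SharedFile-like objects."""
    return [
        Entry(f.get_longname(), bool(f.is_directory()), f.get_filesize())
        for f in shared_files
        if f.get_longname() not in ('.', '..')
    ]
```

Plain tuples like these are cheap to keep in a list and can be sorted or filtered without touching the connection again.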

I have a tree method which traverses the share/path, checks whether each SharedFile instance is a directory, and calls itself recursively.

import ntpath
import os

def tree(self, path):
    # Indent by depth, i.e. the number of backslashes in the path
    depth = path.count('\\')
    print('|   ' * depth + os.path.basename(path.replace('\\', '/')))

    self.do_ls('%s\\*' % path, pretty=False)  # Stores file data in self.listdata

    # Iterate over a copy, in case the recursive call refills self.listdata
    for file, is_directory, size in list(self.listdata):
        if file in ('.', '..'):
            continue
        if is_directory > 0:
            self.tree(ntpath.join(path, file))
        else:
            print('|   ' * depth + '|-- %s (%d bytes)' % (file, size))


>>>d.tree('test')
.snapshot
|   hourly.0
|   |   dir0
|   |   |   Test051-89
|   |   |   Test051_perf3100-test_43
|   |   |   |   Test051_perf3100-test_52
|   |-- a.txt (8 bytes)
|   |-- dir0 - Shortcut.lnk (1834 bytes)
|   |-- Thumbs.db (46080 bytes)
|   |   20743
|   |   |-- file.txt (82 bytes)
|   |   |-- link.txt (82 bytes)
|   |   |   targetdir
|   |   |   |-- file2.txt (39 bytes)
|   |-- target.txt (6394368 bytes)
|   |   linkdir
|   |   |-- file2.txt (39 bytes)

Upvotes: 5
