Aukta

Reputation: 403

How can I create an "in-memory" copy of a directory, with the same content as an existing directory, in Python?

I want to create an in-memory copy of an existing directory, including its files and subfolders, perform some operations on the copy, and then discard it without making any change to the existing directory.

Upvotes: 2

Views: 2189

Answers (2)

Todd

Reputation: 5395

You have a few options for temporarily copying a directory tree so that you can work on it. One option is the tempfile module. Using os.walk, you can create a TemporaryDirectory for each source directory as you walk and a NamedTemporaryFile for each file, then copy the contents across in a loop. The resulting files live in the system's temporary folder on the OS's filesystem, so they can also be accessed from outside Python. They are removed when the objects are closed or garbage-collected, or when the process exits normally. This copy may not be in memory, however, but that may not be your primary goal.
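As a rough sketch of the tempfile idea, the following uses a single TemporaryDirectory together with shutil.copytree instead of the per-directory os.walk loop described above; that is a simplification on my part, not the only way to do it. The function name work_on_copy and the "copy" subfolder name are hypothetical, chosen only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def work_on_copy(src_dir, operation):
    """Copy src_dir into a temporary directory, call operation(copy_path),
    and return its result. The copy is deleted afterwards.

    work_on_copy is a hypothetical helper name, not part of any library.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # copytree requires that the destination does not exist yet,
        # so copy into a fresh subfolder of the temporary directory.
        copy_path = Path(tmp) / "copy"
        shutil.copytree(src_dir, copy_path)
        return operation(copy_path)
    # Leaving the with block removes the temporary directory and its contents.
```

Anything the operation callable does to copy_path affects only the temporary copy. The copy still lives on disk (in the system's temporary folder), not in memory.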

Another alternative is the fs module (PyFilesystem), which can be installed with pip (pip install fs). With it, you can copy a source folder into memory like so:

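import fs.copy
from fs.memoryfs import MemoryFS
from fs.osfs import OSFS

mem_fs = MemoryFS()               # empty filesystem held entirely in memory
drv_fs = OSFS("~/projects/misc")  # the existing directory on disk

# Recursively copy everything from the on-disk tree into memory
fs.copy.copy_fs(drv_fs, mem_fs)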

Pretty easy! This assumes that you only want to access the new in-memory filesystem from Python code, and it works well for that. The memory filesystem isn't mapped to the OS's filesystem, however, so you can't run a utility from the bash shell against these files, for instance. But as long as you only need to access the copies from your Python application, you're fine.

You can access this new filesystem through the fs API (https://docs.pyfilesystem.org/en/latest/). Through the mem_fs object, you can get other file or directory objects and perform operations on them.

A quick example of getting a directory listing, opening a file for reading, then printing some of its text:

>>> mem_fs.listdir('.')
['chloropleth', '__pycache__', ...<lots more files>... 'genetic2.py']
>>> gen_file = mem_fs.open("genetic2.py", "r")
>>> text = gen_file.read()
>>> text[:10]
'import io\n'

The API is very close to what you'd expect from the file objects returned by open(...). Functions from other modules that take a file object should work the same way as they do with regular file objects, since fs file objects have the same, or nearly the same, interface.

Upvotes: 3

Antwane

Reputation: 22628

You can use os.walk() to iterate recursively over every element of a directory. You can then save the file information you need (and possibly the content) into a dict or some other data structure that suits your needs (note: the BytesIO and StringIO classes may help). After that, you can do whatever you want with the in-memory representation of the files.
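A minimal sketch of this approach, assuming a plain dict keyed by path relative to the root is enough for your needs. The names load_tree and open_in_memory are hypothetical helpers, and empty directories are not recorded.

```python
import io
import os


def load_tree(root):
    """Return a dict mapping each file's path (relative to root) to its bytes.

    load_tree is a hypothetical helper; empty directories are not recorded.
    """
    tree = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root)
            with open(full_path, "rb") as f:
                tree[rel_path] = f.read()
    return tree


def open_in_memory(tree, rel_path):
    """Wrap one stored file in a BytesIO so it can be handed to code
    that expects a binary file object."""
    return io.BytesIO(tree[rel_path])
```

Changing or deleting entries in the returned dict never touches the files on disk, and the whole copy disappears once the dict is garbage-collected. Every file's content is held in memory at once, so this is best suited to trees of modest size.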

Obviously, if you only need to act on individual files, you can do that directly inside the loop without copying everything into memory.

Upvotes: 0
