Reputation: 131
When I download data from Instagram with Instaloader, it scans every post of the account even when I provide a time window (it skips posts older than the given date but still walks through the whole history). I use the following command:
instaloader --login=username \
    --password=password \
    --post-metadata-txt="{likes} likes, {comments} comments." \
    --post-filter="date_utc >= datetime(2019, 12, 31) and not is_video"
This is very inefficient. Is there a more efficient way to download the data?
Upvotes: 0
Views: 3619
Reputation: 637
This is not directly supported by Instaloader's command-line interface, so you have to write a small Python script to achieve it. The Instaloader documentation contains an example for downloading posts in a specific period. It differs from what you want in only a few points:
Use a custom post_metadata_txt_pattern. To do so, instantiate Instaloader with
L = instaloader.Instaloader(post_metadata_txt_pattern="{likes} likes, {comments} comments.")
Log in:
L.load_session_from_file('username')
Load up to the latest post (that is, without a specific SINCE date). This allows for an even simpler loop. Also filter by not is_video:
for post in takewhile(lambda p: p.date_utc > datetime(2019, 12, 31), posts):
    if not post.is_video:
        L.download_post(post, 'target_directory')
The key is the takewhile function, which ends the download loop when a post is encountered that does not match the given condition. Considering that posts come newest-first, the download loop terminates as soon as all new-enough posts have been downloaded.
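To make the difference concrete, here is a small offline sketch that compares takewhile with a plain filter on a newest-first iterator. It does not touch Instagram at all: make_feed is a hypothetical stand-in for the post iterator, the dates are made up, and the log list only exists to count how many posts each approach pulls from the feed.

```python
from datetime import datetime
from itertools import takewhile
from types import SimpleNamespace

CUTOFF = datetime(2019, 12, 31)

def make_feed(log):
    """Hypothetical stand-in for a post iterator: yields posts newest-first
    and records every post that is actually requested in `log`."""
    dates = [
        datetime(2020, 3, 1),
        datetime(2020, 1, 15),
        datetime(2019, 12, 1),  # first post older than the cutoff
        datetime(2019, 6, 1),
        datetime(2018, 1, 1),
    ]
    for d in dates:
        log.append(d)
        yield SimpleNamespace(date_utc=d, is_video=False)

takewhile_log, filter_log = [], []

# takewhile stops at the first post that fails the condition.
recent = list(takewhile(lambda p: p.date_utc > CUTOFF, make_feed(takewhile_log)))

# A plain filter keeps the same posts but has to walk the whole feed.
filtered = [p for p in make_feed(filter_log) if p.date_utc > CUTOFF]

print(len(recent), len(takewhile_log))   # 2 3
print(len(filtered), len(filter_log))    # 2 5
```

Both approaches select the same two posts, but takewhile requests only three posts from the feed (the two matches plus the first one that is too old), while the filter requests all five. The filter variant only imitates the behaviour described in the question and is not Instaloader's actual code.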
Putting it all together, we get:
from datetime import datetime
from itertools import takewhile

import instaloader

# Write "<likes> likes, <comments> comments." into each post's metadata text file
L = instaloader.Instaloader(post_metadata_txt_pattern="{likes} likes, {comments} comments.")

# Reuse a previously saved session instead of logging in with a password
L.load_session_from_file('username')

posts = L.get_hashtag_posts('hashtag')
# or
# posts = instaloader.Profile.from_username(L.context, 'profile').get_posts()

# Posts arrive newest-first, so stop at the first post that is too old
for post in takewhile(lambda p: p.date_utc > datetime(2019, 12, 31), posts):
    if not post.is_video:
        L.download_post(post, 'target_directory')
Write that to a file, e.g. downloader.py, and execute it with python downloader.py. The call to load_session_from_file assumes that a saved Instaloader session already exists. To create one, run instaloader -l username before executing the script.
Upvotes: 2