user2979630

Reputation: 131

Download one week of data with Instaloader

When I download data from Instagram with Instaloader, it scans every post of the account even when I provide a time window (it skips posts older than the given date but still walks through the whole history). This is the command I use:

instaloader --login=username \
--password=password \
--post-metadata-txt="{likes} likes, {comments} comments." \
--post-filter="date_utc >= datetime(2019, 12, 31) and not is_video"

This is very inefficient. Is there a more efficient way to download only the posts in the time window?

Upvotes: 0

Views: 3619

Answers (1)

aandergr

Reputation: 637

This is not directly supported by Instaloader's command line interface, so you have to write a small Python script to achieve it. There is an example in the Instaloader documentation for downloading posts in a specific period. It differs from what you want in only a few points:

  • Use a custom post_metadata_txt_pattern. To do so, instantiate Instaloader with

    L = instaloader.Instaloader(post_metadata_txt_pattern="{likes} likes, {comments} comments.")
    
  • Log in:

    L.load_session_from_file('username')
    
  • Download everything up to the latest post (there is no specific SINCE date). This allows for an even simpler loop. Also filter out videos with not is_video:

    for post in takewhile(lambda p: p.date_utc > datetime(2019, 12, 31), posts):
        if not post.is_video:
            L.download_post(post, 'target_directory')
    

The key is the takewhile function, which ends the download loop when a post is encountered that does not match the given condition. Considering that posts come newest-first, the download loop terminates as soon as all new-enough posts have been downloaded.
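To see the early termination in isolation, here is a small offline sketch that does not contact Instagram. The FakePost class, the newest_first generator and the sample dates are made up for illustration; only itertools.takewhile is the real function used in the answer.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import takewhile


@dataclass
class FakePost:
    # Hypothetical stand-in for instaloader.Post; only date_utc is modelled.
    date_utc: datetime


visited = []  # records which posts the iterator actually produced


def newest_first():
    # Yields posts newest-first, like a profile feed, and logs each one.
    for day in (datetime(2020, 1, 5), datetime(2020, 1, 2),
                datetime(2019, 12, 30), datetime(2019, 6, 1)):
        visited.append(day)
        yield FakePost(day)


cutoff = datetime(2019, 12, 31)
kept = [p.date_utc
        for p in takewhile(lambda p: p.date_utc > cutoff, newest_first())]

print(kept)          # the two January 2020 dates
print(len(visited))  # 3: the first too-old post is fetched, then iteration stops
```

The post from June 2019 is never requested, which is the saving compared with a filter that has to look at every post.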

Putting it all together, we get:

from datetime import datetime
from itertools import takewhile

import instaloader

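# Write "<likes> likes, <comments> comments." next to every downloaded post.
L = instaloader.Instaloader(post_metadata_txt_pattern="{likes} likes, {comments} comments.")

# Reuse a session saved earlier with `instaloader -l username`.
L.load_session_from_file('username')

posts = L.get_hashtag_posts('hashtag')
# or
# posts = instaloader.Profile.from_username(L.context, 'profile').get_posts()

# Posts arrive newest-first, so stop at the first post that is too old.
for post in takewhile(lambda p: p.date_utc > datetime(2019, 12, 31), posts):
    if not post.is_video:
        L.download_post(post, 'target_directory')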

Save that in a file, e.g. downloader.py, and execute it with python downloader.py. The call to load_session_from_file assumes that a saved Instaloader session already exists. To create one, run instaloader -l username before executing the script.
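Since the question asks for one week of data, the fixed date in the script could be replaced by a cutoff computed at run time. This is only a sketch: one_week_cutoff and SINCE_CUTOFF are hypothetical names, and it assumes date_utc is a naive UTC datetime, as the comparison with datetime(2019, 12, 31) above suggests.

```python
from datetime import datetime, timedelta, timezone


def one_week_cutoff(now=None):
    # Hypothetical helper: returns a naive UTC datetime seven days before `now`.
    if now is None:
        # Take the current time in UTC, then drop tzinfo so it can be
        # compared with naive datetimes.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    return now - timedelta(days=7)


SINCE_CUTOFF = one_week_cutoff()
print(SINCE_CUTOFF)
```

In the script, the loop condition would then read lambda p: p.date_utc > SINCE_CUTOFF.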

Upvotes: 2
