[PYTHON] Crawling the followers of an Instagram account

Overview

This program crawls all follower information for an account on Instagram. The official Instagram API has an endpoint for getting your own follower list, but none for getting someone else's. So I wrote a program that crawls the follower information of any account.

Prerequisites

Prerequisites for using this program

  1. Have an Instagram account
  2. Know the User ID of the account you want to look up
  3. Know your Query ID

How to find 2 and 3 is described below.

Program

crawl_followers.py



import requests
import json

# Your Account info
def get_user_info():
    return {
            "username": "your_account",
            "password": "your_password"
            }

# HTTP Headers to login
def login_http_headers():
    ua = "".join(["Mozilla/5.0 (Windows NT 6.1; WOW64) ",
                  "AppleWebKit/537.36 (KHTML, like Gecko) ",
                  "Chrome/56.0.2924.87 Safari/537.36"])
    return {
            "user-agent": ua,
            "referer":"https://www.instagram.com/",
            "x-csrftoken":"null",
            "cookie":"sessionid=null; csrftoken=null"
            }

# Log in and return the authenticated session
def logined_session():
    session = requests.Session()
    login_headers = login_http_headers()
    user_info = get_user_info()
    login_url = "https://www.instagram.com/accounts/login/ajax/"
    response = session.post(login_url, data=user_info, headers=login_headers)
    response.raise_for_status()  # stop early if the login request failed at the HTTP level
    return session

# Fetch one page of followers (asks for up to 3000 per request)
def fetch_followers(session, user_id, query_id, after=None):
    variables = {
        "id": user_id,
        "first": 3000,
    }
    if after:
        variables["after"] = after

    # Pass the query string as params so that requests URL-encodes the JSON
    params = {"query_id": query_id, "variables": json.dumps(variables)}
    # HTTP Request
    followers = session.get("https://www.instagram.com/graphql/query/", params=params)
    dic = followers.json()
    edge_followed_by = dic["data"]["user"]["edge_followed_by"]

    count = edge_followed_by["count"] # number of followers
    after = edge_followed_by["page_info"]["end_cursor"] # next pagination
    has_next = edge_followed_by["page_info"]["has_next_page"]
    return {
            "count": count,
            "after": after,
            "has_next":  has_next,
            "followers": edge_followed_by["edges"]
            }

def fetch_all_followers(session, user_id, query_id):
    after = None  # pagination cursor
    followers = []

    while True:
        fetched_followers = fetch_followers(session, user_id, query_id, after)
        followers += fetched_followers["followers"]

        if fetched_followers["has_next"]:
            after = fetched_followers["after"]
        else:
            return {
                    "count": fetched_followers["count"],
                    "followers": followers
                    }

def main(user_id, query_id):
    session = logined_session()
    return fetch_all_followers(session, user_id, query_id)

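if __name__ == '__main__':
    user_id = "3425874156"  # user id to search
    query_id = ""  # your query id
    result = main(user_id, query_id)
    # Show the reported follower count and how many were actually fetched
    print(result["count"], len(result["followers"]))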
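
The dictionary returned by main() holds the raw "edges" list, so it is usually convenient to flatten it before saving. The following is only a minimal sketch: the helper names summarize_followers and save_followers are hypothetical, and the assumption that each edge looks like {"node": {"id": ..., "username": ...}} is mine, not something the program above guarantees. Check the actual response before relying on it.

```python
import json


def summarize_followers(result):
    """Flatten the crawl result into a list of small dicts.

    Assumption: every edge has the form {"node": {"id": ..., "username": ...}}.
    Missing keys become None instead of raising an error.
    """
    rows = []
    for edge in result["followers"]:
        node = edge.get("node", {})
        rows.append({"id": node.get("id"), "username": node.get("username")})
    return rows


def save_followers(result, path):
    """Write the flattened followers to a JSON file and return how many were written."""
    rows = summarize_followers(result)
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"count": result["count"], "followers": rows},
                  f, ensure_ascii=False, indent=2)
    return len(rows)


# Hand-made sample in the assumed shape (not real Instagram data)
sample_result = {
    "count": 2,
    "followers": [
        {"node": {"id": "1", "username": "alice"}},
        {"node": {"id": "2", "username": "bob"}},
    ],
}
print(summarize_followers(sample_result))
```

With the real program, you would pass the return value of main(user_id, query_id) to save_followers instead of the sample.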

How to find the User ID of the account you want to look up

It's fairly easy to find the user id of an account. You can usually get it from the Instagram web page itself. Example: finding the user id of taylorswift.

Step 1: Open Chrome
Step 2: Open the developer tools
Step 3: Go to the Network tab
Step 4: Go to https://www.instagram.com/taylorswift/
Step 5: Click the /query request in the list of requests
Step 6: Go to the Response tab and look for the key "owner"
Step 7: The string next to the key "id" in "owner" is the User ID
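
If you copy the response body out of the developer tools, steps 6 and 7 can also be done in code. This is only a sketch: find_owner_id is a hypothetical helper, and the sample JSON is a made-up layout used for illustration. The only things taken from the steps above are the key names "owner" and "id".

```python
import json


def find_owner_id(obj):
    """Depth-first search for the first "owner" dict that contains an "id"."""
    if isinstance(obj, dict):
        owner = obj.get("owner")
        if isinstance(owner, dict) and "id" in owner:
            return owner["id"]
        for value in obj.values():
            found = find_owner_id(value)
            if found is not None:
                return found
    elif isinstance(obj, list):
        for item in obj:
            found = find_owner_id(item)
            if found is not None:
                return found
    return None


# Made-up response body; a real response has many more keys
sample_response = '{"data": {"items": [{"caption": "x", "owner": {"id": "3425874156"}}]}}'
user_id = find_owner_id(json.loads(sample_response))
print(user_id)
```

Searching recursively avoids hard-coding the path to "owner", which I have not verified and which may differ between responses.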

How to find your Query ID

Step 1: Open Chrome
Step 2: Log in at https://www.instagram.com/
Step 3: Go to https://www.instagram.com/taylorswift/
Step 4: Open the developer tools
Step 5: Open the Network tab
Step 6: Click Followers on taylorswift's profile
Step 7: The value of the ?query_id= parameter in the request that appears is your query id (the actual value is hidden in the original screenshot)
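
Instead of reading the value off the screen, you can copy the request URL from the Network tab and pull the parameter out with the standard library. This is a sketch under assumptions: extract_query_id is a hypothetical helper, and the URL below is a made-up example with a dummy query_id, built to match the URL format used in crawl_followers.py.

```python
import json
from urllib.parse import urlparse, parse_qs


def extract_query_id(request_url):
    """Return the query_id parameter of a copied request URL, or None if absent."""
    params = parse_qs(urlparse(request_url).query)
    values = params.get("query_id")
    return values[0] if values else None


def extract_variables(request_url):
    """Return the decoded "variables" JSON of the request URL as a dict, or None."""
    params = parse_qs(urlparse(request_url).query)
    values = params.get("variables")
    return json.loads(values[0]) if values else None


# Dummy URL for illustration only; 1234567890 is not a real query id
sample_url = ("https://www.instagram.com/graphql/query/"
              "?query_id=1234567890"
              "&variables=%7B%22id%22%3A%223425874156%22%2C%22first%22%3A20%7D")
print(extract_query_id(sample_url))
print(extract_variables(sample_url))
```

The extracted string can then be assigned to query_id in crawl_followers.py.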

What I learned about Query ID

・ It stays the same even if you log in again.
・ It changes if you switch to a different account.
・ It cannot be obtained from the official Instagram API.
・ Conclusion: I don't know exactly what it is, but it appears to be some ID that has a one-to-one relationship with the account.

That's it
