Use Python's pixivpy to download all the works of the users you follow on pixiv at once (including ugoira) (Part 2)

Introduction

In the previous article, Download all works of a specific user from pixiv at once using Python's pixivpy (including ugoira), I showed how to grab one user's works. This article extends that so you can download all the works of every user you follow on pixiv. See the previous article for the basics it covers.

What I did

  1. Preparation
  2. Get the IDs of all the users you follow
  3. Tweak the previous program a little and feed it the acquired IDs one after another with a for loop

Note

pixivpy appears to be an unofficial library made by Chinese volunteers. It is therefore recommended to log in with a throwaway account, so that it does not matter even if the user ID and password leak. Abuse is strictly prohibited. There is an unwritten rule of scraping that you may send at most one request per second; ignoring it amounts to a DoS attack and is legally out of bounds. Comment out the sleep calls at your own risk. If any problems arise, they are your own responsibility.
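In code, the rule just means waiting before every request. The scripts below simply inline sleep(1) before each request; as a minimal sketch of the pattern (the wrapper name is made up for illustration):

from time import sleep

def polite_call(fetch):
    #Wait a full second first, so requests can never go out faster than one per second
    sleep(1)
    return fetch()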

Operating environment

This program runs on Windows 10. I don't know whether it works on Linux or macOS. It works with the latest version of Anaconda as of December 30, 2020.

Directory

Prepare your files like this:

.
├── pixiv_follow_id_getter.py
├── pixiv_all_follow_downloader.py
├── img
└── client.json


Change to this directory before running Python.
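Note that the downloader only creates per-user subfolders inside img, so img itself has to exist first. If you would rather create it from Python, a one-liner (a sketch, not part of the scripts below):

import os

os.makedirs("img", exist_ok=True)  #create ./img/ if it does not exist yet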

Installing the library

pip install pixivpy

If you cannot install it, open a command prompt as an administrator and try again. http://y-okamoto-psy1949.la.coocan.jp/Python/Install35win/

Library used: pixivpy
Chinese (original): https://github.com/upbit/pixivpy
Japanese: https://github.com/tsubasa/pixiv.py

Create client.json

Write in the information you use to log in. Copy the contents below into Notepad or a similar editor, fill in your own information, and save the file as client.json (change the extension from .txt to .json). The program from the previous article, Download all works of a specific user from pixiv at once using Python's pixivpy (including ugoira), also works with this client.json. However, this article's programs only work with this client.json format.

client.json



{
  "version": "20210101",
  "pixiv_id": "pixiv_id",
  "password": "password",
  "user_id": "user_id",
  "ids": [
  ]
}

"pixiv_id" and "password" are the strings you normally enter on the login screen. Enclose each in double quotes, like "pixiv_id": "12345".

"user_id" is your user id.


Go to your own user page; the user ID is the number at the end of the URL.

For example, if the URL is https://www.pixiv.net/users/11, the user ID is 11.
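If you want to pull the number out programmatically, here is a throwaway snippet (the URL is just the example above; this helper is not used by the scripts):

#Extract the user ID from a pixiv profile URL
url = "https://www.pixiv.net/users/11"
user_id = url.rstrip("/").rsplit("/", 1)[-1]
print(user_id)  #-> 11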

"version" only indicates the date of update, so you can use any number you like.

Get the IDs of the users you follow

pixiv_follow_id_getter.py


from pixivpy3 import *
import json
from time import sleep
import datetime




#Read the file with the account information created in advance
with open("client.json", "r") as f:
    client_info = json.load(f)



#pixivpy login process
api = PixivAPI()
api.login(client_info["pixiv_id"], client_info["password"])
aapi = AppPixivAPI()
aapi.login(client_info["pixiv_id"], client_info["password"])


#Get the IDs of the users currently followed
ids_now = []
a = aapi.user_following(client_info["user_id"])
while True:
    #This API returns at most 30 users per page
    for preview in a.user_previews:
        ids_now.append(preview.user.id)
        print(preview.user.id)
    if a.next_url is None:
        break
    next_qs = aapi.parse_qs(a.next_url)
    a = aapi.user_following(**next_qs)
    #Sleep before going to the next page
    sleep(1)






#Write the collected IDs back to client.json
client_info["ids"] = ids_now
client_info["version"] = datetime.datetime.now().strftime('%Y%m%d')

with open("client.json", "w") as n:
    json.dump(client_info, n, indent=2, ensure_ascii=False)
    


#Display the total
print("Current number of followed users:")
print(len(ids_now))


Run this script. The IDs of the users you follow are written to "ids" in client.json.
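After a successful run, client.json should look something like this (the IDs here are made up for illustration):

{
  "version": "20201230",
  "pixiv_id": "pixiv_id",
  "password": "password",
  "user_id": "user_id",
  "ids": [
    11,
    12345678
  ]
}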

New downloader

pixiv_all_follow_downloader.py


from pixivpy3 import *
import json
import os
from PIL import Image
import glob
from time import sleep


#Settings you should adjust yourself

#Maximum number of works to download per user; to download everything, set this larger than the author's total work count
#Works are downloaded newest first
works = 10
#Filter by bookmark count: set the minimum, 0 for all
score = 0
#Filter by view count: set the minimum, 0 for all
view = 0

#Filter by tag, e.g. target_tag = ["Fate/GrandOrder","FGO","FateGO","Fate/staynight"]
target_tag = []   #If you list multiple tags in target_tag, works with at least one of them are downloaded
target_tag2 = []  #If you also fill in target_tag2, only works matching both target_tag and target_tag2 are downloaded
extag = ["R-18"]  #Works containing even one tag from extag are skipped

#Directory to save images
main_saving_direcory_path = "./img/"




#Pre-processing

#Read the file with the account information created in advance
with open("client.json", "r") as f:
    client_info = json.load(f)

#pixivpy login process
api = PixivAPI()
api.login(client_info["pixiv_id"], client_info["password"])
aapi = AppPixivAPI()
aapi.login(client_info["pixiv_id"], client_info["password"])




#Download logic starts here
def downloader(id_search):
    #Sleep before fetching user data
    sleep(1)
    
    #Get the works of the user with the given ID
    illustrator_id = api.users_works(id_search, per_page=works)

    
    #Skip accounts that have not posted any works
    if illustrator_id.count != 0:
        total_works = illustrator_id.pagination.total
        if works < total_works:
            total_works = works
    
        #Get the data of the first work to get the user's information
        illust = illustrator_id.response[0]
    
        #The folder is named after the user, so replace characters that Windows does not allow in folder names with full-width or safe look-alikes
        username = illust.user.name.translate(str.maketrans({'/': '_', ':': '：', ',': '_', ';': '；', '*': '_', '?': '？', '"': "'", '>': ')', '<': '(', '|': '｜'}))
        username = username.rstrip(".")
        username = username.lstrip(".")
        username = username.rstrip(" ")
        username = username.rstrip("　")  #full-width space
    
        #Create a folder in the form of user name (id)
        
        saving_direcory_path = main_saving_direcory_path + username + ("(") +str(illust.user.id) + (")") + "/"
    
        #If the user has changed their name, rename the existing folder to the latest name
        saving_direcory_name = saving_direcory_path[:-1]
        present_folder_list = glob.glob(main_saving_direcory_path + "*")
    
        for present_dir in present_folder_list:
            num = present_dir.rsplit("(", 1)[-1][:-1]
            #print(num)
            name = present_dir.rsplit("\\", 1)[-1]
            name = name.rsplit("(", 1)[0]
            #print(name)
            #print("--------------------------------------")
            if num == str(illust.user.id) and username != name:
                print(present_dir + " will be renamed to " + saving_direcory_name)
                print("--------------------------------------")
                os.rename(present_dir, saving_direcory_name)
    
    
    
    
    
        #Create if the folder does not exist
        if not os.path.exists(saving_direcory_path):
            os.mkdir(saving_direcory_path)
        separator = "------------------------------------------------------------"
    
        #Display information of illustrator and the number of illustrations
        print("Illustrator: {}".format(illust.user.name))
        print("Works number: {}".format(illustrator_id.pagination.total))
        print(separator)
    
    
    
    
        #Download
        for work_no in range(0, total_works):
            illust = illustrator_id.response[work_no]
    
            #filter
    
            #Filter by tag
            if len(list(set(target_tag)&set(illust.tags))) == 0 and target_tag != []:
                continue
            if len(list(set(target_tag2)&set(illust.tags))) == 0 and target_tag2 != []:
                continue
            #Skip if even one ex tag is included
            if len(list(set(extag)&set(illust.tags))) > 0 :
                continue
            #Download only works above score
            if illust.stats.favorited_count.private + illust.stats.favorited_count.public < score :
                continue
            #Download only works above view
            if illust.stats.views_count < view :
                continue
    
    
    
            #Skip the download if the illustration has already been downloaded
            #Works posted to pixiv can be png, jpg, gif, or ugoira
            #Only the first page is checked here, so it cannot detect cases where the connection dropped mid-download and later pages are missing
    
            if os.path.exists(saving_direcory_path+str(illust.id)+"_p0.png") or os.path.exists(saving_direcory_path+str(illust.id)+"_p0.jpg") or os.path.exists(saving_direcory_path+str(illust.id)+'_ugoira') or os.path.exists(saving_direcory_path+str(illust.id)+"_p0.gif"):
                print("Title: "+str(illust.title)+" has already been downloaded.")
                print(separator)
                continue
    
            #Sleep before download
            sleep(1)
            print("Now: {0}/{1}".format(work_no + 1, total_works))
            print("Title: {}".format(illust.title))
    
    
            #Ugoira (animated illustration)
            if illust.type == "ugoira":
                #Fetch the ugoira metadata and build a URL template for the frames
                illust_id = illust.id
                ugoira_url = aapi.illust_detail(illust_id).illust.meta_single_page.original_image_url.rsplit('0', 1)
                ugoira = aapi.ugoira_metadata(illust_id)
                ugoira_frames = len(ugoira.ugoira_metadata.frames)
                ugoira_delay = ugoira.ugoira_metadata.frames[0].delay
                dir_name = saving_direcory_path + str(illust_id)+'_ugoira'
    
    
                #Create a folder to save the movement
                if not os.path.isdir(dir_name):
                    os.mkdir(dir_name)
    
                #Download all images used in Ugoira
                for frame in range(ugoira_frames):
                    #Sleep during download
                    sleep(1)
                    frame_url = ugoira_url[0] + str(frame) + ugoira_url[1]
                    aapi.download(frame_url, path=dir_name)
    
    
                #Create a gif from the saved frames
                #The image quality degrades considerably when building the gif; I could not do better
                #I looked for a Python gif library that lets you set the compression level, but could not find one
                frames = glob.glob(f'{dir_name}/*')
                frames.sort(key=os.path.getmtime, reverse=False)
                ims = []
                for frame in frames:
                    ims.append(Image.open(frame))
                ims[0].save(f'{dir_name}/{illust_id}.gif', save_all=True, append_images=ims[1:], optimize=False, duration=ugoira_delay, loop=0)
    
            # illustrations with more than one picture
            elif illust.is_manga:
                work_info = api.works(illust.id)
                for page_no in range(0, work_info.response[0].page_count):
                    #Sleep during download
                    sleep(1)
                    page_info = work_info.response[0].metadata.pages[page_no]
                    aapi.download(page_info.image_urls.large, saving_direcory_path)
    
            # illustrations with only one picture
            else:
                aapi.download(illust.image_urls.large, saving_direcory_path)
    
            print(separator)
    
    
        print("Download complete! Thanks to {0}{1}!!".format(illust.user.id, illust.user.name))


followids = client_info["ids"]

for id_search in followids:
    try:
        downloader(id_search)
    except Exception as e:
        print("error:", id_search, e)

For the initial setup, see the previous article, Download all works of a specific user from pixiv at once using Python's pixivpy (including ugoira). By default, the 10 newest works not tagged R-18 are downloaded for each user.

Near the beginning of pixiv_all_follow_downloader.py


#Settings you should adjust yourself

#Maximum number of works to download per user; to download everything, set this larger than the author's total work count
#Works are downloaded newest first
works = 10
#Filter by bookmark count: set the minimum, 0 for all
score = 0
#Filter by view count: set the minimum, 0 for all
view = 0
#Filter by tag, e.g. target_tag = ["Fate/GrandOrder","FGO","FateGO","Fate/staynight"]
target_tag = []   #If you list multiple tags in target_tag, works with at least one of them are downloaded
target_tag2 = []  #If you also fill in target_tag2, only works matching both target_tag and target_tag2 are downloaded
extag = ["R-18"]  #Works containing even one tag from extag are skipped

How to use

  1. Fill in client.json (first time only)
  2. Run pixiv_follow_id_getter.py (run it again whenever you follow more users)
  3. Run pixiv_all_follow_downloader.py (whenever you want to download), as shown in the commands below
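
From a terminal, the whole flow looks like this (run from the directory prepared earlier; the path is yours to fill in):

cd <your-directory>
python pixiv_follow_id_getter.py
python pixiv_all_follow_downloader.py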

The first download takes a long time. Also, the connection may be dropped on the pixiv side, and "error" may be printed. If that happens, wait a while and try again.
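If you would rather have the loop retry automatically instead of rerunning it by hand, a minimal sketch (not part of the released scripts; the wait times are guesses):

from time import sleep

def download_with_retry(id_search, attempts=3):
    #Try a few times, waiting longer after each failure
    for attempt in range(attempts):
        try:
            downloader(id_search)
            return
        except Exception as e:
            print("error:", id_search, e)
            sleep(30 * (attempt + 1))  #back off before the next attempt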

Summary

This time, I explained how to use Python to filter and download the works of the users you follow. I started making this because, a few years ago, an artist I liked deleted all of their works. I managed to download everything at the time, but realized I would never have enough time to do that manually, so I wrote this program. You can use it to back up works. I have been using it myself for two years, fixing bugs and adding gif downloads along the way. In the early days, if a user changed their name, all of their works might be downloaded again, which was painful. I recently judged it complete and released it. Please comment if there are features you would like added or bugs you find. Thank you very much.

From now on

With this method, the pixiv server takes a heavy load on every run, so I want to modify the program to download from the new-arrivals feed of followed users instead.
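pixivpy's AppPixivAPI appears to expose the new-works-from-followed-users feed as illust_follow; a rough, untested sketch of that idea (assumes aapi is already logged in as in the scripts above):

from time import sleep

result = aapi.illust_follow(restrict='public')
while True:
    for illust in result.illusts:
        print(illust.id, illust.title)  #filter and download here instead of printing
    if result.next_url is None:
        break
    sleep(1)  #one request per second, as always
    result = aapi.illust_follow(**aapi.parse_qs(result.next_url))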
