- Introduction
- What is 2Captcha
- Preparing to use 2Captcha
- Python + Selenium + 2Captcha: getting past reCAPTCHA v2
- In closing
- Reference
I think the biggest obstacle when automating scraping or browser operations is getting past the various CAPTCHAs. A CAPTCHA is there precisely to keep robots out, so it is fair to ask whether one should try to get around it at all, but there are still times when you need to do something about it. For those cases, there is a service called **2Captcha**.
I recently learned about this service and tried it, and getting past the CAPTCHA was so easy that I decided to introduce it here.
2Captcha is a CAPTCHA-solving service provided by a Russian company. Its API lets you automate the CAPTCHA-solving step. It is a paid service, but one API request costs only about 0.3 yen, which I think is cheap enough.
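To get a feel for what that price means in practice, here is a rough sketch of the arithmetic. It assumes the figure of about 0.3 yen per request quoted above; the actual price depends on the CAPTCHA type and may change, and the function name is only illustrative.

```python
from decimal import Decimal

# Assumption: the per-request price quoted in this article (about 0.3 yen).
# The real price depends on the CAPTCHA type and may change over time.
YEN_PER_REQUEST = Decimal("0.3")


def estimate_cost_yen(num_requests: int) -> Decimal:
    """Return the approximate cost in yen for the given number of requests."""
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    return YEN_PER_REQUEST * num_requests


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} requests: about {estimate_cost_yen(n)} yen")
```

At this rate, even a thousand requests come to roughly 300 yen.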
2Captcha gets past difficult CAPTCHAs through sheer manpower. When you send the details of the CAPTCHA you want solved through the 2Captcha API, one of a large pool of human workers somewhere solves it, and the information you need is returned.
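The flow described above (submit a task, wait while someone solves it, then fetch the answer) can be sketched with a small simulation. This is not the real 2Captcha API: `FakeSolvingService`, `solve`, and the token format are hypothetical stand-ins that only illustrate the submit-then-poll pattern.

```python
import itertools
import time


class FakeSolvingService:
    """In-memory stand-in for a human-powered solving service (hypothetical)."""

    def __init__(self, polls_until_ready: int = 3):
        self._polls_until_ready = polls_until_ready
        self._ids = itertools.count(1)
        self._polls = {}

    def submit(self, sitekey: str, url: str) -> int:
        """Register a task and return its ID; a worker would pick it up here."""
        task_id = next(self._ids)
        self._polls[task_id] = 0
        return task_id

    def result(self, task_id: int):
        """Return the token once 'solved', or None while work is in progress."""
        self._polls[task_id] += 1
        if self._polls[task_id] < self._polls_until_ready:
            return None
        return f"token-for-task-{task_id}"


def solve(service, sitekey, url, interval=0.01, max_polls=10):
    """Submit a task, then poll until a token comes back or polls run out."""
    task_id = service.submit(sitekey, url)
    for _ in range(max_polls):
        token = service.result(task_id)
        if token is not None:
            return token
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} was not solved in time")


if __name__ == "__main__":
    print(solve(FakeSolvingService(), "demo-sitekey", "https://example.com/"))
```

Because a human is doing the work, the caller has to be prepared to wait, which is why the sketch polls with a delay and gives up after a fixed number of attempts.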
To make the API easier to use, 2Captcha provides libraries for multiple programming languages.
Go to https://2captcha.com/
Register for an account using the "Registration" button at the upper right.
Enter your e-mail address and password to complete the registration.
When you log in, you will see the following page.
Unfortunately, 2Captcha is not free. After logging in, make a deposit from "Add funds" at the top of the screen.
Select an available payment service and set the amount. I paid with PayPal.
For now, let's deposit the minimum amount of $3.
When the payment is completed, the balance shown on the original screen should change to $3. (Depending on the payment method, this may take some time.)
After logging in, the API key is displayed in the center of the screen. Copy it, since it is required to use 2Captcha.
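Since the API key is tied to your paid balance, you may prefer not to paste it directly into a script. One possible approach is to read it from an environment variable. The variable name `TWOCAPTCHA_API_KEY` and the helper `load_api_key` below are assumptions made for this sketch, not something 2Captcha defines.

```python
import os


def load_api_key(env_var: str = "TWOCAPTCHA_API_KEY") -> str:
    """Read the API key from an environment variable, failing loudly if unset.

    The default variable name is a hypothetical choice for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The returned string can then be passed wherever the script expects the key, for example in place of the `'YOUR_API_KEY'` placeholder.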
Let's break through reCAPTCHA v2 using Python.
A package for Python is available, so install it first.
pip install 2captcha-python
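Note that the package is installed as `2captcha-python` but imported as `twocaptcha`. As a quick sanity check, a small sketch like the following reports whether the modules used in this article can be found. The helper name `is_installed` is just for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Import names (not pip package names) used by the script in this article
    for name in ("twocaptcha", "selenium"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```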
The test below also drives headless Chrome with Selenium. For how to set up Selenium, please refer to this article.
This time I would like to test 2Captcha using this demo page. https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php
import traceback

import chromedriver_binary  # noqa: F401  (importing it adds chromedriver to PATH)
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from twocaptcha import TwoCaptcha

solver = TwoCaptcha('YOUR_API_KEY')  # Set your own API key here
url = 'https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php'


def main():
    # Launch headless Chrome
    options = Options()
    options.add_argument('--headless')
    driver = webdriver.Chrome(options=options)
    try:
        # Open the demo page
        driver.get(url)
        # Read the value of the data-sitekey attribute
        data_sitekey = driver.find_element(By.CSS_SELECTOR, '[data-sitekey]').get_attribute('data-sitekey')
        # Ask 2Captcha to solve the CAPTCHA and get the response token
        response = solver.recaptcha(sitekey=data_sitekey, url=url)
        code = response['code']
        # Put the token into the g-recaptcha-response textarea
        textarea = driver.find_element(By.ID, 'g-recaptcha-response')
        driver.execute_script('arguments[0].value = arguments[1];', textarea, code)
        # Submit the form
        driver.find_element(By.CSS_SELECTOR, 'button[type="submit"]').click()
        # Show the result (success: "Success!", failure: "Something went wrong")
        result = driver.find_element(By.CSS_SELECTOR, 'body>main>h2:nth-child(3)').text
        print(result)
    except Exception:
        print(traceback.format_exc())
    finally:
        # Always close the browser, even if something failed
        driver.quit()


if __name__ == '__main__':
    main()
Execution result: Success!
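The one page-specific input the script needs is the value of the `data-sitekey` attribute. To see what that lookup does without launching a browser, here is a minimal offline sketch using only the standard library. The HTML snippet and the key `EXAMPLE_SITE_KEY` are made up for illustration and are not taken from the demo page.

```python
from html.parser import HTMLParser

# Hypothetical markup resembling a reCAPTCHA v2 widget; the key is made up.
SAMPLE_HTML = """
<form method="post">
  <div class="g-recaptcha" data-sitekey="EXAMPLE_SITE_KEY"></div>
  <textarea id="g-recaptcha-response" name="g-recaptcha-response"></textarea>
  <button type="submit">Submit</button>
</form>
"""


class SiteKeyFinder(HTMLParser):
    """Remember the first data-sitekey attribute seen in the document."""

    def __init__(self):
        super().__init__()
        self.sitekey = None

    def handle_starttag(self, tag, attrs):
        # Keep only the first match; later tags are ignored once a key is found
        if self.sitekey is None:
            self.sitekey = dict(attrs).get("data-sitekey")


def find_sitekey(html: str):
    """Return the first data-sitekey value in the HTML, or None if absent."""
    finder = SiteKeyFinder()
    finder.feed(html)
    return finder.sitekey


if __name__ == "__main__":
    print(find_sitekey(SAMPLE_HTML))
```

In the Selenium script the same value comes from the live page through the `[data-sitekey]` CSS selector.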
2Captcha took about 5 to 20 seconds to respond, but I was able to get past reCAPTCHA.
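If you want to measure the response time yourself, a small wrapper like the one below is one way to do it. `timed_call` is a hypothetical helper, and `fake_solver` merely stands in for the real solver call so the sketch runs without an API key.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_solver(sitekey, url):
    """Hypothetical stand-in for the real solver call; sleeps briefly."""
    time.sleep(0.05)
    return {"code": "dummy-token"}


if __name__ == "__main__":
    response, elapsed = timed_call(
        fake_solver, sitekey="demo", url="https://example.com/"
    )
    print(f"solved in {elapsed:.2f}s")
```

In the script above, the same wrapper could be applied as `timed_call(solver.recaptcha, sitekey=data_sitekey, url=url)`.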
What did you think? This time I solved Google's reCAPTCHA v2, but 2Captcha apparently also supports reCAPTCHA v3 and CAPTCHAs from providers other than Google. The drawback is that it costs a little money, but it seems worth keeping as an option for when you absolutely need it.