In the following, we will use Qiita as an example.
The source of the Qiita login form is as follows:
<form class="landingLoginForm" autocomplete="off" data-event_name="Login with password" action="/login" accept-charset="UTF-8" method="post">
<input name="utf8" type="hidden" value="✓">
<input type="hidden" name="authenticity_token" value="rYIMTVoDlb4eCzh6wZIRgiPQHmrr5ts9DyykDCE9FkHM7zQxX7WAhmUhW8y0BPIA3MvzH31KCFEyiPTVk4GBzQ==">
<input type="text" name="identity" id="identity" placeholder="Username or email" autofocus="autofocus" class="form-control landingLoginForm_identity">
<div class="row">
<div class="col-sm-9 landingLoginForm_passwordColumn">
<input type="password" name="password" id="password" placeholder="Password" class="form-control">
</div>
<div class="col-sm-3 landingLoginForm_submitColumn">
<input type="submit" name="commit" value="Login" class="btn btn-primary btn-block" data-disable-with="Login">
</div>
</div>
<div class="landingLoginForm_forgotPassword">
<a href="https://qiita.com/sessions/forgot_password">Forgot Password?</a>
</div>
<div class="help-block js-email-invalid-message" style="display: none"></div>
</form>
From this, we can see that pressing the login button sends the following data to qiita.com/login:
utf8
authenticity_token
identity
password
Of these, utf8 has a fixed default value, so use it as is. authenticity_token is given a new value on each access, so it must be fetched beforehand. The remaining identity and password are filled in by hand.
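The field list can also be read off programmatically rather than by eye. The following is a minimal sketch, assuming only the standard library's html.parser; the names FormFieldParser, form_fields and SAMPLE_FORM are hypothetical, and the sample form is a shortened stand-in for the real page with a dummy token value.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the name/value pairs of named <input> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # The script in this article does not send the submit button,
        # so it is skipped here as well.
        if attr.get("type") == "submit":
            return
        name = attr.get("name")
        if name:
            # Inputs without a value attribute are recorded as empty strings.
            self.fields[name] = attr.get("value") or ""


def form_fields(html):
    """Return a dict of input names to their default values."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


# Shortened stand-in for the login form (the token value is a dummy).
SAMPLE_FORM = """
<form action="/login" method="post">
<input name="utf8" type="hidden" value="✓">
<input type="hidden" name="authenticity_token" value="dummy-token">
<input type="text" name="identity" id="identity">
<input type="password" name="password" id="password">
<input type="submit" name="commit" value="Login">
</form>
"""

fields = form_fields(SAMPLE_FORM)
for name, value in fields.items():
    print(name, repr(value))
```

Fields that come back with an empty value (identity and password here) are the ones that have to be supplied by hand.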
Python has a convenient library for HTTP communication called requests, so we use it here. The session information must be kept after logging in, so all communication goes through a Session object.
Also, as mentioned above, authenticity_token has to be obtained before logging in, so first access the top page and extract the value with BeautifulSoup.
Below is a sample script to log in to Qiita:
from bs4 import BeautifulSoup
import requests
payload = {
'utf8': '✓',
'identity': 'username or email',
'password': 'secret'
}
# Get authenticity_token
s = requests.Session()
r = s.get('https://qiita.com')
# Name the parser explicitly to avoid BeautifulSoup's "no parser specified" warning
soup = BeautifulSoup(r.text, 'html.parser')
auth_token = soup.find(attrs={'name': 'authenticity_token'}).get('value')
payload['authenticity_token'] = auth_token
# Log in
s.post('https://qiita.com/login', data=payload)
# Perform the processing that requires a login below
...
In some cases this simple method does not work (for example, logging in to Google requires two-step authentication), so the HTTP headers and other details have to be checked closely for each site.
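As a starting point for that kind of checking, the sketch below pulls out the parts of a response that tend to matter when a login misbehaves: the status code, the redirect target, and whether a cookie was set. The helper name summarize_exchange is hypothetical, and which values indicate a successful login on a given site is an assumption you would have to verify yourself.

```python
def summarize_exchange(status_code, headers):
    """Summarize a response given its status code and a header mapping."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    return {
        "status": status_code,
        "redirect_to": lowered.get("location"),
        "sets_cookie": "set-cookie" in lowered,
        "content_type": lowered.get("content-type"),
    }


# Hand-written example headers, not taken from a real server.
example = summarize_exchange(
    302,
    {"Location": "/", "Set-Cookie": "sid=abc", "Content-Type": "text/html"},
)
print(example)
```

With requests, the same helper could be applied to every hop of a login, for instance by looping over r.history + [r] and passing each response's status_code and headers.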