I am trying to use PHP's cURL functions to log in to a specific page. Please check the code below. I log in to banggood.com with my email and password and then want to request a private page, but it does not work as expected. I get no errors. Instead, I am redirected to this page ( https://www.banggood.com/index.php?com=account ). After I log in, I want to access the private page that lists my orders. Any help is appreciated.
//The username or email address of the account.
define('EMAIL', 'aaa@gmail.com');

//The password of the account.
define('PASSWORD', 'mypassword');

//Set a user agent. This basically tells the server that we are using Chrome ;)
define('USER_AGENT', 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2309.372 Safari/537.36');

//Where our cookie information will be stored (needed for authentication).
define('COOKIE_FILE', 'cookie.txt');

//URL of the login form.
define('LOGIN_FORM_URL', 'https://www.banggood.com/login.html');

//Login action URL. Sometimes, this is the same URL as the login form.
define('LOGIN_ACTION_URL', 'https://www.banggood.com/login.html');

//An associative array that represents the required form fields.
//You will need to change the keys / index names to match the name of the form
//fields.
$postValues = array(
    'email' => EMAIL,
    'password' => PASSWORD
);

//Initiate cURL.
$curl = curl_init();

//Set the URL that we want to send our POST request to. In this
//case, it's the action URL of the login form.
curl_setopt($curl, CURLOPT_URL, LOGIN_ACTION_URL);

//Tell cURL that we want to carry out a POST request.
curl_setopt($curl, CURLOPT_POST, true);

//Set our post fields / data (from the array above).
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($postValues));

//We don't want any HTTPS errors.
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

//Where our cookie details are saved. This is typically required
//for authentication, as the session ID is usually saved in the cookie file.
curl_setopt($curl, CURLOPT_COOKIEJAR, COOKIE_FILE);

//Sets the user agent. Some websites will attempt to block bot user agents.
//Hence the reason I gave it a Chrome user agent.
curl_setopt($curl, CURLOPT_USERAGENT, USER_AGENT);

//Tells cURL to return the output once the request has been executed.
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

//Allows us to set the referer header. In this particular case, we are
//fooling the server into thinking that we were referred by the login form.
curl_setopt($curl, CURLOPT_REFERER, LOGIN_FORM_URL);

//Do we want to follow any redirects?
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, false);

//Execute the login request.
curl_exec($curl);

//Check for errors!
if(curl_errno($curl)){
    throw new Exception(curl_error($curl));
}

//We should be logged in by now. Let's attempt to access a password protected page
curl_setopt($curl, CURLOPT_URL, 'https://www.banggood.com/index.php?com=account&t=ordersList');

//Use the same cookie file.
curl_setopt($curl, CURLOPT_COOKIEJAR, COOKIE_FILE);

//Use the same user agent, just in case it is used by the server for session validation.
curl_setopt($curl, CURLOPT_USERAGENT, USER_AGENT);

//We don't want any HTTPS / SSL errors.
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

//Execute the GET request and print out the result.
curl_exec($curl);
3 Answers
Answers 1
You're doing several things wrong:
- You're trying to log in before you have a cookie session, but the site requires you to have a cookie session before sending the login request.
- There's a CSRF token tied to your cookie session, here called at, that you need to parse out of the login page HTML and provide with your login request. Your code doesn't fetch it.
- Most importantly, there is a captcha image tied to your cookie session that you need to fetch and solve, and whose text you need to append to your login request. Your code completely ignores it.
- Your login request needs the header x-requested-with: XMLHttpRequest, but your code isn't adding that header.
- Your login request needs the fields com=account and t=submitLogin in the POST data, but your code isn't adding either of them (you try to add them to your URL, but they're not supposed to be in the URL; they're supposed to be in the POST data, aka your $postValues array).
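The last two points can be sketched as a small helper that puts com and t into the POST body and prepares the extra header. This is only an illustration: buildLoginRequest() is a hypothetical name, and the field names pwd, at and login_image_code, as well as the lowercase value submitlogin, are taken from the example implementation further down in this answer rather than verified against the site.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: assembles the POST body and the extra headers
 * for the login request. Field names follow the example implementation
 * in this answer and are assumptions about the site's form.
 */
function buildLoginRequest(string $email, string $password, string $csrfToken, string $captchaAnswer): array
{
    $fields = [
        'com' => 'account',       // belongs in the POST data, not in the URL
        't' => 'submitlogin',     // belongs in the POST data, not in the URL
        'email' => $email,
        'pwd' => $password,
        'at' => $csrfToken,
        'login_image_code' => $captchaAnswer,
    ];

    return [
        'body' => http_build_query($fields),
        'headers' => ['x-requested-with: XMLHttpRequest'],
    ];
}
```

The result would then be passed to cURL with CURLOPT_POSTFIELDS (the body) and CURLOPT_HTTPHEADER (the headers).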
Here's what you need to do:
- First do a normal GET request to the login page. This will give you a session cookie id, the CSRF token, and the url to your captcha image.
- Store the cookie id and make sure to provide it with all further requests, then parse out the CSRF token (it's in the HTML, looking like <input type="hidden" name="at" value="5aabxxx5dcac0" />) and the URL of the captcha image (it's different for each cookie session, so don't hardcode it).
- Then fetch the captcha image and solve it. Put the username, password, captcha answer, CSRF token, com and t in your login request's POST data, add the HTTP header x-requested-with: XMLHttpRequest to the login request too, and send it to https://www.banggood.com/login.html. Then you should be logged in!
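The parsing part of these steps can be sketched with DOMDocument and DOMXPath alone. parseLoginPage() is a hypothetical helper; the input name at and the element id get_login_image come from this answer, and the sketch assumes the image's src attribute is a relative URL.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: extracts the CSRF token and the captcha image URL
 * from the login page HTML. Assumes the captcha src is a relative URL.
 */
function parseLoginPage(string $html, string $baseUrl = 'https://www.banggood.com/'): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $tokenNode = $xpath->query('//input[@name="at"]')->item(0);
    $imageNode = $xpath->query('//*[@id="get_login_image"]')->item(0);
    if (!$tokenNode instanceof DOMElement || !$imageNode instanceof DOMElement) {
        throw new RuntimeException('Login page did not contain the expected token or captcha image.');
    }

    return [
        'csrf_token' => $tokenNode->getAttribute('value'),
        'captcha_url' => $baseUrl . ltrim($imageNode->getAttribute('src'), '/'),
    ];
}
```

The HTML passed in must come from a GET request that uses the same cookie session as the later login request, since both the token and the captcha are tied to that session.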
Here's an example implementation using hhb_curl for the web requests (it's a curl_ wrapper taking care of cookies, turning silent curl_ errors into RuntimeExceptions, etc), DOMDocument for parsing out the CSRF token, and deathbycaptcha.com's api for breaking the captcha.
PS: the example code won't work until you provide a real, credited deathbycaptcha.com API username/password on lines 6 and 7. The captcha also looks so simple that I think breaking it could be automated if you're sufficiently motivated; I'm not. Also, the banggood account is just a temporary testing account, so no harm comes of it being compromised (which obviously happens when I post the username/password here).
<?php
declare(strict_types = 1);
require_once ('hhb_.inc.php');
$banggood_username = 'igcpilojhkfhtdz@my10minutemail.com';
$banggood_password = 'igcpilojhkfhtdz@my10minutemail.com';
$deathbycaptcha_username = '?';
$deathbycaptcha_password = '?';

$hc = new hhb_curl ( '', true );
// GET the login page first: this starts the cookie session.
$html = $hc->exec ( 'https://www.banggood.com/login.html' )->getStdOut ();
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$domd = new DOMDocument ();
@$domd->loadHTML ( $html );
$xp = new DOMXPath ( $domd );
$csrf_token = $xp->query ( '//input[@name="at"]' )->item ( 0 )->getAttribute ( "value" );
$captcha_image_url = 'https://www.banggood.com/' . $domd->getElementById ( "get_login_image" )->getAttribute ( "src" );
$captcha_image = $hc->exec ( $captcha_image_url )->getStdOut ();
$captcha_answer = deathbycaptcha ( $captcha_image, $deathbycaptcha_username, $deathbycaptcha_password );
$html = $hc->setopt_array ( array (
        // set the login URL explicitly; the previous request went to the captcha image
        CURLOPT_URL => 'https://www.banggood.com/login.html',
        CURLOPT_POST => 1,
        CURLOPT_POSTFIELDS => http_build_query ( array (
                'com' => 'account',
                't' => 'submitlogin',
                'email' => $banggood_username,
                'pwd' => $banggood_password,
                'at' => $csrf_token,
                'login_image_code' => $captcha_answer
        ) ),
        CURLOPT_HTTPHEADER => array (
                'x-requested-with: XMLHttpRequest'
        )
) )->exec ()->getStdOut ();
var_dump ( // $hc->getStdErr (),
$html );

function deathbycaptcha(string $imageBinary, string $apiUsername, string $apiPassword): string {
    $hc = new hhb_curl ( '', true );
    $response = $hc->setopt_array ( array (
            CURLOPT_URL => 'http://api.dbcapi.me/api/captcha',
            CURLOPT_POST => 1,
            CURLOPT_HTTPHEADER => array (
                    'Accept: application/json'
            ),
            CURLOPT_POSTFIELDS => array (
                    'username' => $apiUsername,
                    'password' => $apiPassword,
                    'captchafile' => 'base64:' . base64_encode ( $imageBinary ) // use base64 because CURLFile requires a file, and i cba with tmpfile() .. but it would save bandwidth.
            ),
            CURLOPT_FOLLOWLOCATION => 0
    ) )->exec ()->getStdOut ();
    $response_code = $hc->getinfo ( CURLINFO_HTTP_CODE );
    if ($response_code !== 303) {
        // some error
        $err = "DeathByCaptcha api returned \"$response_code\", expected 303, ";
        switch ($response_code) {
            case 403 :
                $err .= " the api username/password was rejected";
                break;
            case 400 :
                $err .= " we sent an invalid request to the api (maybe the API specs has been updated?)";
                break;
            case 500 :
                $err .= " the api had an internal server error";
                break;
            case 503 :
                $err .= " api is temporarily unreachable, try again later";
                break;
            default :
                {
                    $err .= " unknown error";
                    break;
                }
        }
        $err .= ' - ' . $response;
        throw new \RuntimeException ( $err );
    }
    $response = json_decode ( $response, true );
    if (! empty ( $response ['text'] ) && $response ['text'] !== '?') {
        return $response ['text']; // sometimes the answer might be available right away.
    }
    $id = $response ['captcha'];
    $url = 'http://api.dbcapi.me/api/captcha/' . urlencode ( $id );
    while ( true ) {
        sleep ( 10 ); // check every 10 seconds
        $response = $hc->setopt ( CURLOPT_HTTPHEADER, array (
                'Accept: application/json'
        ) )->exec ( $url )->getStdOut ();
        $response = json_decode ( $response, true );
        if (! empty ( $response ['text'] ) && $response ['text'] !== '?') {
            return $response ['text'];
        }
    }
}
Answers 2
You should set CURLOPT_FOLLOWLOCATION to 1.
From https://curl.haxx.se/libcurl/c/CURLOPT_FOLLOWLOCATION.html
A long parameter set to 1 tells the library to follow any Location: header that the server sends as part of a HTTP header in a 3xx response. The Location: header can specify a relative or an absolute URL to follow.
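As a minimal sketch of that setting with the plain curl_* functions, assuming the curl extension is loaded: createRedirectFollowingHandle() is a hypothetical helper name, and the redirect limit and the cookie options are additions for illustration, not part of the quoted documentation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: creates a cURL handle that follows Location: headers
 * and keeps cookies across the redirected requests.
 */
function createRedirectFollowingHandle(string $url, string $cookieFile)
{
    $curl = curl_init($url);
    curl_setopt_array($curl, [
        CURLOPT_FOLLOWLOCATION => true,   // follow 3xx Location: headers
        CURLOPT_MAXREDIRS => 10,          // assumption: cap redirects to avoid loops
        CURLOPT_COOKIEJAR => $cookieFile, // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE => $cookieFile, // cookies are read from here
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
    ]);

    return $curl;
}

// Usage (performs a network request, so it is left commented out):
// $curl = createRedirectFollowingHandle('https://www.banggood.com/login.html', 'cookie.txt');
// $html = curl_exec($curl);
// echo curl_getinfo($curl, CURLINFO_EFFECTIVE_URL); // final URL after redirects
```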
Answers 3
Set CURLOPT_FOLLOWLOCATION to 1 or true. You may also need CURLOPT_AUTOREFERER instead of the static REFERER.
Do you get any cookies in your COOKIEJAR (cookie.txt)? Remember that PHP needs write permission for that file, or for its directory if the file does not exist yet.
If PHP is running on localhost, a network sniffer could help debug the problem; try Wireshark or equivalent software. The request may still be missing some important HTTP headers, such as Host.
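A quick way to rule out the cookie file as the culprit is to check it before making any request. The helper below is a hypothetical sketch, not part of cURL; it assumes that cURL creates the jar itself when the containing directory is writable.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reports whether PHP can write the cookie jar at $path.
 */
function cookieJarIsUsable(string $path): bool
{
    if (file_exists($path)) {
        // An existing jar must be a regular, writable file.
        return is_file($path) && is_writable($path);
    }

    // No file yet: the directory has to be writable so the jar can be created.
    return is_writable(dirname($path));
}

// Usage, with the options suggested in this answer (left commented out):
// if (!cookieJarIsUsable('cookie.txt')) {
//     throw new RuntimeException('cookie.txt is not writable');
// }
// curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
// curl_setopt($curl, CURLOPT_AUTOREFERER, true); // set Referer automatically on redirects
```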