Monday, February 27, 2017

Creating a connection to a subscription site in python


I am looking to open a connection with Python to http://www.horseandcountry.tv, which takes my login parameters via the POST method. I would like to open a connection to this website in order to scrape it for all video links (I don't know how to do this yet either, but I am using the project to learn).

My question is: how do I pass my credentials to the individual pages of the website? For example, if all I wanted to do was use Python code to open a browser window pointing to http://play.horseandcountry.tv/live/ and have it open with me already logged in, how would I go about this?

3 Answers

Answer 1

As far as I know you have two options depending how you want to crawl and what you need to crawl:

1) Use urllib. You can send the POST request with your login credentials yourself. This is the low-level solution: it is fast, but it does not execute JavaScript or handle other browser-level behaviour.

2) Use Selenium. With it you can drive a real browser (Chrome, Firefox, and others) from your Python code. It is much slower, but it copes with more "sophisticated" websites.
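For option 1, a rough sketch using only the standard library might look like the following. The login URL and the form field names are assumptions taken from the requests answer further down this page, and the helper names build_login_request and build_opener_with_cookies are made up for illustration. The cookie jar is what lets later page requests stay logged in.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Assumed login endpoint (taken from the requests-based answer on this page).
LOGIN_URL = 'https://play.horseandcountry.tv/login/'


def build_login_request(email, password, url=LOGIN_URL):
    """Return a POST request carrying the URL-encoded login form."""
    # Assumed field names; confirm them against the real login form.
    form = {
        'account_email': email,
        'account_password': password,
        'submit': 'Sign In',
    }
    body = urllib.parse.urlencode(form).encode('utf-8')
    # Passing data= makes urllib send a POST instead of a GET.
    return urllib.request.Request(url, data=body)


def build_opener_with_cookies():
    """Return (opener, jar); the opener stores cookies in the jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


# Usage (performs real network requests, so it is left commented out):
# opener, jar = build_opener_with_cookies()
# opener.open(build_login_request('your_email', 'your_password'))
# page = opener.open('http://play.horseandcountry.tv/live/').read()
```

Every request made through the same opener sends back whatever cookies the login response set, which is how the credentials get "passed" to the individual pages.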

What I usually do: I try the first option, and if I encounter a problem such as a JavaScript security layer on the website, I go for option 2. Moreover, Selenium can open a real web browser on your desktop, so you can watch the scraping as it happens.
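For the "open a browser window already logged in" part of the question, a Selenium sketch could look roughly like this. It assumes Selenium 4 with Firefox and its driver installed, that the form inputs are named account_email and account_password (borrowed from the requests answer on this page, not verified), and that the site navigates to a different URL after a successful login. The function name open_logged_in_browser is just a placeholder.

```python
# Assumed URLs: the login page from the requests answer, the live page
# from the question.
LOGIN_URL = 'https://play.horseandcountry.tv/login/'
LIVE_URL = 'http://play.horseandcountry.tv/live/'


def open_logged_in_browser(email, password, timeout=10):
    """Log in through a real Firefox window and leave it on the live page."""
    # Imported here so this module still loads where Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    driver.get(LOGIN_URL)
    start_url = driver.current_url

    # Assumed field names; check them against the page's HTML.
    driver.find_element(By.NAME, 'account_email').send_keys(email)
    password_box = driver.find_element(By.NAME, 'account_password')
    password_box.send_keys(password)
    password_box.submit()  # submits the form that contains this input

    # Assumption: a successful login moves the browser to another URL.
    WebDriverWait(driver, timeout).until(EC.url_changes(start_url))

    driver.get(LIVE_URL)
    return driver  # the caller should call driver.quit() when finished
```

Because the login happens inside the browser itself, the window keeps its session cookie and the live page opens with you signed in.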

In any case, just google "urllib/selenium login to website" and you'll find what you need.

Answer 2

If you want to avoid Selenium (and opening web browsers), you can use requests. It can log in to the website and grab anything you need in the background.

Here is how you can log in to that website with requests.

    import requests
    from bs4 import BeautifulSoup

    # Login form data
    payload = {
        'account_email': 'your_email',
        'account_password': 'your_password',
        'submit': 'Sign In'
    }

    with requests.Session() as s:
        # Log in to the website. The session keeps the cookies it receives.
        response = s.post('https://play.horseandcountry.tv/login/', data=payload)

        # Check whether the login succeeded.
        soup = BeautifulSoup(response.text, 'lxml')
        logged_in = soup.find('p', attrs={'class': 'navbar-text pull-right'})
        print(s.cookies)
        print(response.status_code)
        # find() returns None when the element is missing, so guard against it.
        if logged_in is not None and logged_in.text.startswith('Logged in as'):
            print('Logged In Successfully!')

If you need an explanation of this, you can check this answer or the requests documentation.
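Once the session is logged in, any further s.get(...) call inside the with block sends the same cookies, so the remaining work is pulling links out of each page. Here is a small sketch of that step. It uses the standard library's html.parser instead of BeautifulSoup purely to stay dependency-free, and the names LinkCollector and collect_links are made up for illustration. Which of the collected links are actually videos depends on the site's markup, which I have not inspected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if href:
            # Resolve relative links such as "/live/" against the page URL.
            self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    """Return all anchor targets found in html, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Usage inside the logged-in session from the example above:
# page_url = 'http://play.horseandcountry.tv/live/'
# links = collect_links(s.get(page_url).text, page_url)
```

From there you would filter the list down to whatever URL pattern the video pages turn out to use.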

Answer 3

You could also use the requests module. It is one of the most popular HTTP libraries for Python. Here are some questions that relate to what you would like to do.

Log in to website using Python Requests module

logging in to website using requests
