Most automated queries are made as HTTP requests; HTTP is the dominant protocol for web queries.
Python's Requests Library:
import requests
response = requests.get("http://www.datasciencecourse.org")
# Check status and content
response.status_code
response.content # raw bytes; response.text gives the decoded string
response.headers
response.headers['Content-Type']
Reference: Requests Documentation
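The status code is usually the first thing to check. As a small sketch using only the standard library, the helper below turns a numeric code such as response.status_code into a readable label. The name describe_status is hypothetical and not part of Requests.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short human-readable label for an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase  # e.g. "OK", "Not Found"
    except ValueError:
        phrase = "Unknown"  # code not registered in the standard library
    if 100 <= code < 200:
        category = "informational"
    elif 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirect"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {phrase} ({category})"


print(describe_status(200))
print(describe_status(404))
```

With a live response this would be called as describe_status(response.status_code). Requests also provides response.raise_for_status(), which raises an exception for 4xx and 5xx codes.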
When a URL carries query parameters (the key=value pairs after the "?", as in Google result links), you can pass them as a dictionary with the params
argument:
params = {"sa": "t", "rct": "j", "q": "", "esrc": "s", "source": "web", "cd": "9", "cad": "rja", "uact": "8"}
response = requests.get("http://www.google.com/url", params=params)
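To see what a params dictionary turns into, the sketch below builds the query string with the standard library's urllib.parse.urlencode. It only illustrates the encoding: it assumes simple string values like the ones above and does not send any request.

```python
from urllib.parse import urlencode

base_url = "http://www.google.com/url"
params = {"sa": "t", "rct": "j", "q": "", "esrc": "s", "source": "web",
          "cd": "9", "cad": "rja", "uact": "8"}

# Join the key=value pairs with "&" and append them after "?"
full_url = f"{base_url}?{urlencode(params)}"
print(full_url)
```

After a real call, response.url shows the URL that Requests actually used, which is a quick way to confirm the parameters were encoded as expected.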
HTTP GET is most common; methods like PUT, POST, and DELETE also exist.
Key Points:
Example Query with REST API:
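token = "" # Your access token here
# GitHub no longer accepts tokens in the query string; send the token in the Authorization header instead.
response = requests.get("https://api.github.com/user", headers={"Authorization": f"token {token}"})
print(response.content)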
Basic authentication, where a username and password are sent with each request, was traditionally used, but many APIs now use OAuth.
Example:
response = requests.get("https://api.github.com/user", auth=('username', 'passwd'))
Note: Basic auth is largely being replaced by OAuth.
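Under the hood, HTTP Basic authentication joins the username and password with a colon, Base64-encodes the result, and sends it in an Authorization header. A minimal standard-library sketch follows; basic_auth_header is a hypothetical helper name, and the credentials are the placeholders from the example.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("username", "passwd")
print(headers)
```

Base64 is an encoding, not encryption, so anyone who can read the request can recover the password. Basic auth should therefore only be sent over HTTPS.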