Python Docs

If this documentation is inaccurate or incomplete, please let us know your questions or suggested improvements.

Using Python, you can take full advantage of PhantomJs Cloud by making HTTP Post Requests to our HTTP Endpoint. Below are some examples outlining how to invoke the HTTP Endpoint from Python.

General Info

Avoiding 502 (Bad Gateway) Errors IMPORTANT

If you use Curl (pycurl) with Python, you need to set unset the Accept Header. You can do so like:

pycurl_connect.setopt(pycurl.HTTPHEADER, ['Accept:'])
If you do not, you will get 502 Bad Gateway errors with your curl requests. The errors won't occur when your POST payload is small, but medium and large POST payloads will always get this 502 error.

If you follow the examples below you can ignore this suggestion, as these examples use Python without 3rd party libraries (not using Curl!).

Post request content: (request.json)

In the following examples, we use Python to submit POST requests, sending a request.json file as the content. This file follows either the UserRequest or PageRequest specification, which can be viewed from our HTTP Endpoint docs. The default parameters (for pageRequest parameters you do not explicitly set) can be viewed here.

Basic Examples

Minimal Example: Capture a Page as PlainText
This Python example captures the text from a page and prints to the command-line.

request.json
(in pageRequest format)
{
	"url":"http://example.com",
	"renderType":"plainText"
}

python code
import urllib2

with open('request.json', 'r') as myfile:
    data=myfile.read()

url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()

print(results)

Minimal Example: Verbose output JSON
This Python example Captures the text from the page and PhantomJs Cloud metadata about the request and response, then saves to a .json file.

request.json
(in pageRequest format)
{
	"url":"http://example.com",
	"renderType":"plainText",
	"outputAsJson":true
}

python code
import urllib2

with open('request.json', 'r') as requestFile:
    data=requestFile.read()

url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()

with open('response.json', 'w') as responseFile:
	responseFile.write(results)

Minimal Example: Capture as JPEG
This Python example will generate a screenshot of a webpage (including rendering AJAX), then saves to a .jpg file.

request.json
(in pageRequest format)
{
	"url":"http://www.highcharts.com/stock/demo/intraday-area",
	"renderType":"jpeg"
}

python code
import urllib2
with open('request.json', 'r') as requestFile:
    data=requestFile.read()
url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()
print('\nresponse status code')
print(response.code)
print('\nresponse headers (pay attention to pjsc-* headers)')
print(response.headers)
with open('content.jpg', 'wb') as responseFile:
	responseFile.write(results)

Minimal Example: Capture as PDF
This Python example will generate a PDF of Amazon.com's main page (the live, current contents), then saves to a .pdf file.

request.json
(in pageRequest format)
{
	"url":"https://amazon.com",
	"renderType":"pdf"
}

python code
import urllib2
with open('request.json', 'r') as requestFile:
    data=requestFile.read()
url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()
print('\nresponse status code')
print(response.code)
print('\nresponse headers (pay attention to pjsc-* headers)')
print(response.headers)
with open('content.pdf', 'wb') as responseFile:
	responseFile.write(results)

Minimal Example: All parameters
This Python example shows using all PhantomJs Cloud parameters in a request, capturing the page as a .jpg image, returning metadata (json) and the image (base64 encoded), and then saves the image to a file. Most the parameters used are the defaults, but you can see a list of the most up-to-date default values here.

request.json
(in userRequest format)
{
	"pages":[
		{
			"url": "http://example.com",
			"content": null,
			"urlSettings": {
				"operation": "GET",
				"encoding": "utf8",
				"headers": {},
				"data": null
			},
			"renderType": "jpg",
			"outputAsJson": true,
			"requestSettings": {
				"ignoreImages": false,
				"disableJavascript": false,
				"userAgent": "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/2.0.0 (PhantomJsCloud.com/2.0.1)",
				"authentication": {
					"userName": "guest",
					"password": "guest"
				},
				"xssAuditingEnabled": false,
				"webSecurityEnabled": false,
				"resourceWait": 15000,
				"resourceTimeout": 35000,
				"maxWait": 35000,
				"waitInterval": 1000,
				"stopOnError": false,
				"resourceModifier": [],
				"customHeaders": {},
				"clearCache": false,
				"clearCookies": false,
				"cookies": [],
				"deleteCookies": []
			},
			"suppressJson": [
				"events.value.resourceRequest.headers",
				"events.value.resourceResponse.headers",
				"frameData.content",
				"frameData.childFrames"
			],
			"renderSettings": {
				"quality": 70,
				"pdfOptions": {
					"border": null,
					"footer": {
						"firstPage": null,
						"height": "1cm",
						"lastPage": null,
						"onePage": null,
						"repeating": "%pageNum%/%numPages%"
					},
					"format": "letter",
					"header": null,
					"height": null,
					"orientation": "portrait",
					"width": null
				},
				"clipRectangle": null,
				"renderIFrame": null,
				"viewport": {
					"height": 1280,
					"width": 1280
				},
				"zoomFactor": 1,
				"passThroughHeaders": false
			},
			"scripts": {
				"domReady": [],
				"loadFinished": []
			}
		}
	],
	"proxy":false
}

python code
import urllib2
import json
import pprint
pp = pprint.PrettyPrinter(indent=4)
with open('request.json', 'r') as requestFile:
    data=requestFile.read()
url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()
obj = json.loads(results)
print('\nresponse status code')
print(response.code)
print('\nrphantomjs cloud metadata')
pp.pprint(obj['meta'])
with open('content.jpg', 'wb') as responseFile:
	responseFile.write(obj['content']['data'].decode('base64'))

Need more advanced examples?

Check out the Advanced Scenario Samples section of the Docs Index Page for Page Automation, Auto-Login, and more.