Python Docs

If this documentation is inaccurate or incomplete, please let us know your questions or suggested improvements.

Using Python, you can take full advantage of PhantomJsCloud by making HTTP Post Requests to our HTTP Endpoint. Below are some examples outlining how to invoke the HTTP Endpoint from Python.

General Info

Avoiding `502 (Bad Gateway)` Errors IMPORTANT

If you use Curl (pycurl) with Python, you need to set unset the Accept Header. You can do so like:

pycurl_connect.setopt(pycurl.HTTPHEADER, ['Accept:'])

If you do not, you will get 502 Bad Gateway errors with your curl requests. The errors won't occur when your POST payload is small, but medium and large POST payloads will always get this 502 error.

If you follow the examples below you can ignore this suggestion, as these examples use Python without 3rd party libraries (not using Curl!).

Post request content: (`request.json`)

In the following examples, we use Python to submit POST requests, sending a request.json file as the content. This file follows either the UserRequest or PageRequest specification, which can be viewed from our HTTP Endpoint docs. The default parameters (for pageRequest parameters you do not explicitly set) can be viewed here.

Basic Examples

Minimal Example: Capture a Page as PlainText

This Python example captures the text from a page and prints to the command-line.

`request.json`

(in pageRequest format)

{
	"url":"http://example.com",
	"renderType":"plainText"
}

`python code using builtin requests code`

import requests
import json

data ={
"url":"http://example.com",
"renderType":"plainText"
}

url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/';
req = requests.post(url, data=json.dumps(data))
results = req.content

print(results)

`python code using urllib2`

import urllib2

with open('request.json', 'r') as myfile:
    data=myfile.read()

url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()

print(results)

Minimal Example: Verbose output JSON

This Python example Captures the text from the page and PhantomJsCloud metadata about the request and response, then saves to a .json file.

`request.json`

(in pageRequest format)

{
	"url":"http://example.com",
	"renderType":"plainText",
	"outputAsJson":true
}

`python code`

import urllib2

with open('request.json', 'r') as requestFile:
    data=requestFile.read()

url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()

with open('response.json', 'w') as responseFile:
	responseFile.write(results)

Minimal Example: Capture as JPEG

This Python example will generate a screenshot of a webpage (including rendering AJAX), then saves to a .jpg file.

`request.json`

(in pageRequest format)

{
	"url":"http://www.highcharts.com/stock/demo/intraday-area",
	"renderType":"jpeg"
}

`python code`

import urllib2
with open('request.json', 'r') as requestFile:
    data=requestFile.read()
url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()
print('\nresponse status code')
print(response.code)
print('\nresponse headers (pay attention to pjsc-* headers)')
print(response.headers)
with open('content.jpg', 'wb') as responseFile:
	responseFile.write(results)

Minimal Example: Capture as PDF

This Python example will generate a PDF of Amazon.com's main page (the live, current contents), then saves to a .pdf file.

`request.json`

(in pageRequest format)

{
	"url":"https://amazon.com",
	"renderType":"pdf"
}

`python code`

import urllib2
with open('request.json', 'r') as requestFile:
    data=requestFile.read()
url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()
print('\nresponse status code')
print(response.code)
print('\nresponse headers (pay attention to pjsc-* headers)')
print(response.headers)
with open('content.pdf', 'wb') as responseFile:
	responseFile.write(results)

Minimal Example: All parameters

This Python example shows using all PhantomJsCloud parameters in a request, capturing the page as a .jpg image, returning metadata (json) and the image (base64 encoded), and then saves the image to a file. Most the parameters used are the defaults, but you can see a list of the most up-to-date default values here.

`request.json`

(in userRequest format)

{
	"pages":[
		{
			"url": "http://example.com",
			"content": null,
			"urlSettings": {
				"operation": "GET",
				"encoding": "utf8",
				"headers": {},
				"data": null
			},
			"renderType": "jpg",
			"outputAsJson": true,
			"requestSettings": {
				"ignoreImages": false,
				"disableJavascript": false,
				"userAgent": "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/2.0.0 (PhantomJsCloud.com/2.0.1)",
				"authentication": {
					"userName": "guest",
					"password": "guest"
				},
				"xssAuditingEnabled": false,
				"webSecurityEnabled": false,
				"resourceWait": 15000,
				"resourceTimeout": 35000,
				"maxWait": 35000,
				"waitInterval": 1000,
				"stopOnError": false,
				"resourceModifier": [],
				"customHeaders": {},
				"clearCache": false,
				"clearCookies": false,
				"cookies": [],
				"deleteCookies": []
			},
			"suppressJson": [
				"events.value.resourceRequest.headers",
				"events.value.resourceResponse.headers",
				"frameData.content",
				"frameData.childFrames"
			],
			"renderSettings": {
				"quality": 70,
				"pdfOptions": {
					"border": null,
					"footer": {
						"firstPage": null,
						"height": "1cm",
						"lastPage": null,
						"onePage": null,
						"repeating": "%pageNum%/%numPages%"
					},
					"format": "letter",
					"header": null,
					"height": null,
					"orientation": "portrait",
					"width": null
				},
				"clipRectangle": null,
				"renderIFrame": null,
				"viewport": {
					"height": 1280,
					"width": 1280
				},
				"zoomFactor": 1,
				"passThroughHeaders": false
			},
			"scripts": {
				"domReady": [],
				"loadFinished": []
			}
		}
	],
	"proxy":false
}

`python code`

import urllib2
import json
import pprint
pp = pprint.PrettyPrinter(indent=4)
with open('request.json', 'r') as requestFile:
    data=requestFile.read()
url = 'http://PhantomJScloud.com/api/browser/v2/a-demo-key-with-low-quota-per-ip-address/'
headers = {'content-type':'application/json'}
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
results = response.read()
obj = json.loads(results)
print('\nresponse status code')
print(response.code)
print('\nrPhantomJsCloud metadata')
pp.pprint(obj['meta'])
with open('content.jpg', 'wb') as responseFile:
	responseFile.write(obj['content']['data'].decode('base64'))

Need more advanced examples?

Check out the Advanced Scenario Samples section of the Docs Index Page for Page Automation, Auto-Login, and more.

Python Docs

If this documentation is inaccurate or incomplete, please let us know your questions or suggested improvements.

General Info

Avoiding 502 (Bad Gateway) Errors IMPORTANT

Post request content: (request.json)

Basic Examples

Minimal Example: Capture a Page as PlainText

request.json

python code using builtin requests code

python code using urllib2

Minimal Example: Verbose output JSON

request.json

python code

Minimal Example: Capture as JPEG

request.json

python code

Minimal Example: Capture as PDF

request.json

python code

Minimal Example: All parameters

request.json

python code

Need more advanced examples?

Avoiding `502 (Bad Gateway)` Errors IMPORTANT

Post request content: (`request.json`)

`request.json`

`python code using builtin requests code`

`python code using urllib2`

`request.json`

`python code`

`request.json`

`python code`

`request.json`

`python code`

`request.json`

`python code`