The urllib2 module defines functions and classes that help with
opening URLs (mostly HTTP) in a complex world: basic and digest
authentication, redirections, and more.
The urllib2 module defines the following functions:
urlopen(url[, data])
Open the URL url, which can be either a string or a Request
object (currently the code checks that it really is a Request
instance, or an instance of a subclass of Request).
data should be a string specifying additional data to send to the
server. Only HTTP requests support data; for them, it should be a
buffer in application/x-www-form-urlencoded format, such as the
string returned by urllib.urlencode().
This function returns a file-like object with two additional methods:
geturl() -- return the URL of the resource retrieved
info() -- return the meta-information of the page, as
a dictionary-like object
Raises URLError on errors.
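As an illustration of the returned object, the following minimal sketch opens a local file: URL so that it runs without network access. The temporary file and the import fallback are assumptions made for this example: on Python 3 the same names live in urllib.request, and the fallback lets the sketch run on either interpreter.

```python
import os
import tempfile

try:
    import urllib2 as urlreq                  # Python 2
    from urllib import pathname2url
except ImportError:
    import urllib.request as urlreq           # Python 3 home of the same names
    from urllib.request import pathname2url

# Create a throwaway local file so the example needs no network access.
fd, path = tempfile.mkstemp(suffix=".txt")
os.write(fd, b"hello")
os.close(fd)

file_url = "file:" + pathname2url(path)

response = urlreq.urlopen(file_url)
try:
    body = response.read()                    # ordinary file-like interface
    final_url = response.geturl()             # URL of the resource retrieved
    content_length = response.info()["Content-length"]
finally:
    response.close()
    os.remove(path)
```

The same three calls (read(), geturl(), info()) apply to an HTTP response; only the URL changes.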
install_opener(opener)
Install an OpenerDirector instance as the default opener.
The code does not check for a real OpenerDirector, and any
class with the appropriate interface will work.
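Because the interface is not checked, a stand-in object can be installed, which is sometimes handy in tests. The sketch below assumes that urlopen() calls the installed opener's open() method with the URL, the data and a timeout; RecordingOpener is a hypothetical class written for this example, not part of the module.

```python
try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names


class RecordingOpener(object):
    """Hypothetical stand-in that records URLs instead of fetching them."""

    def __init__(self):
        self.seen = []

    def open(self, url, data=None, timeout=None):
        self.seen.append(url)
        return None


recorder = RecordingOpener()
urlreq.install_opener(recorder)
urlreq.urlopen("http://www.example.com/")      # routed to recorder.open()

# Restore a regular default opener so later calls behave normally.
urlreq.install_opener(urlreq.build_opener())
```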
build_opener([handler, ...])
Return an OpenerDirector instance, which chains the
handlers in the order given. handlers can be either instances
of BaseHandler, or subclasses of BaseHandler (in
which case it must be possible to call the constructor without
any parameters). Instances of the following classes are placed in
front of the handlers, unless the handlers already include one of
these classes, an instance of it, or a subclass of it:
ProxyHandler, UnknownHandler, HTTPHandler,
HTTPDefaultErrorHandler, HTTPRedirectHandler,
FTPHandler, FileHandler
If the Python installation has SSL support (socket.ssl()
exists), HTTPSHandler will also be added.
Beginning in Python 2.3, a BaseHandler subclass may also
change its handler_order member variable to modify its
position in the handlers list. Except for ProxyHandler, which has a
handler_order of 100, all handlers currently have it
set to 500.
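The replacement and ordering rules above can be sketched as follows. EarlyRedirectHandler is a hypothetical subclass, and the example reads opener.handlers, an internal list, purely for illustration.

```python
try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names


class EarlyRedirectHandler(urlreq.HTTPRedirectHandler):
    """Hypothetical subclass; it only changes where the handler is sorted."""
    handler_order = 50    # lower than ProxyHandler's 100, so it sorts first


# Passing the class itself is allowed because its constructor
# can be called without any parameters.
opener = urlreq.build_opener(EarlyRedirectHandler)

# Inspecting opener.handlers relies on an implementation detail.
handler_names = [h.__class__.__name__ for h in opener.handlers]
```

Since EarlyRedirectHandler is a subclass of HTTPRedirectHandler, the default HTTPRedirectHandler is left out of the chain, while the other defaults such as HTTPHandler are still added.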
The following exceptions are raised as appropriate:
exception URLError
The handlers raise this exception (or derived exceptions) when they
run into a problem. It is a subclass of IOError.
exception HTTPError
A subclass of URLError, it can also function as a
non-exceptional file-like return value (the same thing that
urlopen() returns). This is useful when handling exotic
HTTP errors, such as requests for authentication.
exception GopherError
A subclass of URLError, this is the error raised by the
Gopher handler.
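A minimal offline sketch of the first two exceptions follows. The made-up URL scheme and the hand-built HTTPError are assumptions used to avoid network access; in real use the HTTPError would be raised by the handlers in response to a server reply.

```python
import io

try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names

# 1. A URL scheme that no handler supports makes the chain raise URLError.
opener = urlreq.build_opener()
try:
    opener.open("nosuchscheme://www.example.com/")
    caught = None
except urlreq.URLError as exc:
    caught = exc

# 2. HTTPError is a URLError that can also be read like a response.
#    It is constructed by hand here only for illustration.
error = urlreq.HTTPError(
    "http://www.example.com/private", 401, "Unauthorized",
    {}, io.BytesIO(b"login required"))
try:
    raise error
except urlreq.URLError as exc:        # also catches the HTTPError subclass
    status = exc.code
    error_url = exc.geturl()
    error_body = exc.read()
    exc.close()
```

Catching URLError is enough to handle both cases; code that needs the status code or the error page should catch HTTPError first.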
The following classes are provided:
class Request(url[, data[, headers]])
This class is an abstraction of a URL request.
url should be a string which is a valid URL. For a description
of data see the add_data() description.
headers should be a dictionary, and will be treated as if
add_header() was called with each key and value as arguments.
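The following sketch builds a Request with form data and one header without sending it. The URL, the form fields and the header value are placeholders; encoding the data to bytes is an assumption needed on Python 3 and has no effect on this ASCII string under Python 2.

```python
try:
    import urllib2 as urlreq
    from urllib import urlencode
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names
    from urllib.parse import urlencode

# A list of pairs keeps the field order predictable.
form_data = urlencode([("q", "python"), ("page", "2")])

request = urlreq.Request(
    "http://www.example.com/search",
    form_data.encode("ascii"),
    {"User-Agent": "urllib2-doc-example"})

# Supplying data turns the request into a POST.
method = request.get_method()
```

Passing request to urlopen() would then send the encoded form as the request body.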
class OpenerDirector()
The OpenerDirector class opens URLs via BaseHandlers
chained together. It manages the chaining of handlers, and recovery
from errors.
class BaseHandler()
This is the base class for all registered handlers -- and handles only
the simple mechanics of registration.
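To show how an OpenerDirector and a BaseHandler subclass fit together, here is a minimal sketch using a made-up "greeting" URL scheme. GreetingHandler is hypothetical, and it relies on the convention that a method named <protocol>_open is called for URLs of that protocol. A real handler would normally return an object that also offers geturl() and info(); a plain in-memory buffer is used here for brevity.

```python
import io

try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names


class GreetingHandler(urlreq.BaseHandler):
    """Hypothetical handler for a made-up 'greeting' URL scheme."""

    def greeting_open(self, req):
        # Everything after "greeting:" is treated as a name.
        name = req.get_full_url().split(":", 1)[1]
        return io.BytesIO(("hello, " + name).encode("ascii"))


director = urlreq.OpenerDirector()
director.add_handler(GreetingHandler())      # register the handler

body = director.open("greeting:world").read()
```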
class HTTPDefaultErrorHandler()
A class which defines a default handler for HTTP error responses; all
responses are turned into HTTPError exceptions.
class HTTPRedirectHandler()
A class to handle redirections.
class ProxyHandler([proxies])
Cause requests to go through a proxy.
If proxies is given, it must be a dictionary mapping
protocol names to URLs of proxies.
The default is to read the list of proxies from the environment
variables named <protocol>_proxy (for example, http_proxy).
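A short sketch of passing an explicit proxies dictionary follows; the proxy address is a placeholder, and opener.handlers is an internal list read here only to show the result.

```python
try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names

# Map each protocol name to the URL of the proxy that should serve it.
proxy_handler = urlreq.ProxyHandler(
    {"http": "http://proxy.example.com:3128/"})

# The instance replaces the default ProxyHandler in the chain.
opener = urlreq.build_opener(proxy_handler)

proxy_handlers = [h for h in opener.handlers
                  if isinstance(h, urlreq.ProxyHandler)]
```

Requests for http URLs opened through this opener would be sent via the given proxy; other protocols are unaffected.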
class HTTPPasswordMgr()
Keep a database of (realm, uri) -> (user, password) mappings.
class HTTPPasswordMgrWithDefaultRealm()
Keep a database of
(realm, uri) -> (user, password) mappings.
A realm of None is considered a catch-all realm, which is searched
if no other realm fits.
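The difference between the two password managers can be sketched as follows; the realm names, URIs and credentials are placeholders.

```python
try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names

base_uri = "http://www.example.com/private/"
page = "http://www.example.com/private/report.html"

# Plain manager: credentials are tied to one named realm.
plain = urlreq.HTTPPasswordMgr()
plain.add_password("Members", base_uri, "alice", "s3cret")

# Default-realm manager: a realm of None acts as a catch-all.
catch_all = urlreq.HTTPPasswordMgrWithDefaultRealm()
catch_all.add_password(None, base_uri, "alice", "s3cret")

exact_match = plain.find_user_password("Members", page)
no_match = plain.find_user_password("Other Realm", page)
fallback = catch_all.find_user_password("Other Realm", page)
```

The plain manager finds nothing for an unknown realm, whereas the default-realm manager falls back to the entry stored under None.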
class AbstractBasicAuthHandler([password_mgr])
This is a mixin class that helps with HTTP authentication, both
to the remote host and to a proxy.
password_mgr, if given, should be something that is compatible
with HTTPPasswordMgr; refer to section 11.5.6
for information on the interface that must be supported.
class HTTPBasicAuthHandler([password_mgr])
Handle authentication with the remote host.
password_mgr, if given, should be something that is compatible
with HTTPPasswordMgr; refer to section 11.5.6
for information on the interface that must be supported.
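A typical way to wire a password manager into an opener might look like the sketch below. The URI and credentials are placeholders, and nothing is fetched.

```python
try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names

# Store credentials for everything below the given URI, in any realm.
password_mgr = urlreq.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(
    None, "http://www.example.com/private/", "alice", "s3cret")

# The handler consults password_mgr when the server asks for
# basic authentication.
auth_handler = urlreq.HTTPBasicAuthHandler(password_mgr)
opener = urlreq.build_opener(auth_handler)
```

Calling opener.open() on a protected URL under that URI would then retry the request with the stored credentials.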
class ProxyBasicAuthHandler([password_mgr])
Handle authentication with the proxy.
password_mgr, if given, should be something that is compatible
with HTTPPasswordMgr; refer to section 11.5.6
for information on the interface that must be supported.
class AbstractDigestAuthHandler([password_mgr])
This is a mixin class that helps with HTTP authentication, both
to the remote host and to a proxy.
password_mgr, if given, should be something that is compatible
with HTTPPasswordMgr; refer to section 11.5.6
for information on the interface that must be supported.
class HTTPDigestAuthHandler([password_mgr])
Handle authentication with the remote host.
password_mgr, if given, should be something that is compatible
with HTTPPasswordMgr; refer to section 11.5.6
for information on the interface that must be supported.
class ProxyDigestAuthHandler([password_mgr])
Handle authentication with the proxy.
password_mgr, if given, should be something that is compatible
with HTTPPasswordMgr; refer to section 11.5.6
for information on the interface that must be supported.
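The digest handlers are constructed the same way as the basic ones, and a host handler and a proxy handler can be combined in one opener. The sketch below uses one password manager per handler; all URIs and credentials are placeholders, and opener.handlers is an internal list read only for illustration.

```python
try:
    import urllib2 as urlreq
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same names

# Credentials for the remote host.
host_passwords = urlreq.HTTPPasswordMgrWithDefaultRealm()
host_passwords.add_password(
    None, "http://www.example.com/", "alice", "s3cret")

# Separate credentials for the proxy.
proxy_passwords = urlreq.HTTPPasswordMgrWithDefaultRealm()
proxy_passwords.add_password(
    None, "http://proxy.example.com:3128/", "proxyuser", "proxypass")

opener = urlreq.build_opener(
    urlreq.HTTPDigestAuthHandler(host_passwords),
    urlreq.ProxyDigestAuthHandler(proxy_passwords))

handler_names = [h.__class__.__name__ for h in opener.handlers]
```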
class HTTPHandler()
A class to handle opening of HTTP URLs.
class HTTPSHandler()
A class to handle opening of HTTPS URLs.
class FileHandler()
Open local files.
class FTPHandler()
Open FTP URLs.
class CacheFTPHandler()
Open FTP URLs, keeping a cache of open FTP connections to minimize
delays.