libearth.parser.autodiscovery — Autodiscovery

This module provides functions to autodiscover feed URLs in a document.

libearth.parser.autodiscovery.ATOM_TYPE = 'application/atom+xml'

(str) The MIME type of Atom format.

libearth.parser.autodiscovery.RSS_TYPE = 'application/rss+xml'

(str) The MIME type of RSS 2.0 format.

libearth.parser.autodiscovery.TYPE_TABLE = {<function parse_rss at 0x42c4b90>: 'application/rss+xml', <function parse_atom at 0x40eded8>: 'application/atom+xml'}

(collections.Mapping) The mapping table from parser functions to feed MIME types.

class libearth.parser.autodiscovery.AutoDiscovery

Parses the given HTML and tries to find the actual feed URLs in it.

Namedtuple which is a pair of type and url.

type

Alias for field number 0

url

Alias for field number 1
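As a rough illustration of the pair described above, the sketch below builds a namedtuple with the two fields and reads them by name and by position. It assumes the namedtuple is the FeedLink type mentioned in the return value of autodiscovery(); the definition here is a stand-in written for this example, not the library's own code, and the URL is a placeholder.

```python
import collections

# Hypothetical stand-in: only the field names (type, url) and their
# order come from the documentation above.
FeedLink = collections.namedtuple('FeedLink', 'type url')

link = FeedLink(type='application/atom+xml',
                url='http://example.com/atom.xml')

# Each field is an alias for a tuple position.
assert link.type == link[0]
assert link.url == link[1]

# A namedtuple also unpacks like an ordinary pair.
feed_type, feed_url = link
```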

exception libearth.parser.autodiscovery.FeedUrlNotFoundError(msg)

Exception raised when a feed URL cannot be found in the HTML.

libearth.parser.autodiscovery.autodiscovery(document, url)

If the given URL refers to an actual feed, it returns the given URL unchanged.

If the given URL is that of an ordinary web page (i.e. text/html), it finds the URLs of the corresponding feeds. The feed URLs are returned sorted lexicographically by feed type.

If autodiscovery fails, it raises FeedUrlNotFoundError.
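The HTML case described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the names LinkCollector and discover are hypothetical, FeedLink and FeedUrlNotFoundError are redefined locally as stand-ins, and the sketch assumes feeds are advertised through link elements with rel="alternate". It does not cover the case where the URL already points at a feed.

```python
import collections
from html.parser import HTMLParser
from urllib.parse import urljoin

ATOM_TYPE = 'application/atom+xml'
RSS_TYPE = 'application/rss+xml'

# Hypothetical stand-ins, defined here so the sketch is self-contained.
FeedLink = collections.namedtuple('FeedLink', 'type url')


class FeedUrlNotFoundError(Exception):
    """Raised by this sketch when no feed link is present."""


class LinkCollector(HTMLParser):
    """Collect <link rel="alternate"> elements that advertise a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'link':
            return
        attrs = dict(attrs)
        rel = (attrs.get('rel') or '').lower().split()
        if 'alternate' not in rel:
            return
        if attrs.get('type') in (ATOM_TYPE, RSS_TYPE) and attrs.get('href'):
            # Relative hrefs are rebuilt on top of the page URL.
            url = urljoin(self.base_url, attrs['href'])
            self.links.append(FeedLink(attrs['type'], url))


def discover(document, url):
    """Return the feed links in an HTML document, sorted by feed type."""
    collector = LinkCollector(url)
    collector.feed(document)
    if not collector.links:
        raise FeedUrlNotFoundError('no feed url found in ' + url)
    # sorted() is stable, so links of the same type keep document order.
    return sorted(collector.links, key=lambda link: link.type)


html = '''<html><head>
<link rel="alternate" type="application/rss+xml" href="/rss.xml">
<link rel="alternate" type="application/atom+xml" href="atom.xml">
</head><body></body></html>'''

links = discover(html, 'http://example.com/blog/')
```

Because 'application/atom+xml' sorts before 'application/rss+xml', the Atom link comes first in the result even though the RSS link appears first in the page.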

Parameters:
  • document (str) – an HTML or XML string
  • url (str) – the URL used to retrieve the document; if a feed URL in the HTML is relative, it is resolved against this URL

Returns: list of FeedLink objects

Return type:



Guess the syndication format of an arbitrary document.
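One plausible way to make such a guess is to inspect the root element of the document. The sketch below is an assumption about the approach, not the library's actual heuristic: guess_format, parse_atom and parse_rss are hypothetical placeholders, and the local TYPE_TABLE only mirrors the shape of the mapping documented above (parser function to MIME type).

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'


def parse_atom(document):
    """Placeholder for an Atom parser (hypothetical)."""


def parse_rss(document):
    """Placeholder for an RSS 2.0 parser (hypothetical)."""


# Same shape as TYPE_TABLE: parser function -> MIME type.
TYPE_TABLE = {
    parse_atom: 'application/atom+xml',
    parse_rss: 'application/rss+xml',
}


def guess_format(document):
    """Return a parser function for the document, or None if unrecognized."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # Not well-formed XML (for example, ordinary HTML).
        return None
    if root.tag == '{%s}feed' % ATOM_NS:
        return parse_atom
    if root.tag == 'rss':
        return parse_rss
    return None


atom_doc = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'
rss_doc = b'<rss version="2.0"><channel><title>t</title></channel></rss>'

parser = guess_format(atom_doc)
mime_type = TYPE_TABLE[parser]
```

Returning the parser function itself, rather than a format name, lets the caller both parse the document and look up its MIME type through the mapping.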

Parameters: document (str, bytes) – document string to guess
Returns: the function that can parse the given document
Return type: collections.Callable

Changed in version 0.2.0: Before 0.2.0 this function lived in the libearth.parser.heuristic module (now removed); it has since moved to libearth.parser.autodiscovery.
