libearth.session — Isolate data from other installations

This module provides merging facilities to avoid conflict between concurrent updates of the same document/entity from different devices (installations). There are several concepts here.

Session abstracts installations on devices. For example, if you have a laptop, a tablet, and a mobile phone, and two apps are installed on the laptop, then there have to be four sessions: laptop-1, laptop-2, tablet-1, and phone-1. You can think of it as a branch if you are familiar with DVCS.

Revision abstracts the time at which a document was updated. Importantly, it also records the session that made the update.

Base revisions (MergeableDocumentElement.__base_revisions__) show which revisions the current revision is built on top of, in other words, which revisions were merged into the current revision. RevisionSet is a dictionary-like data structure that represents them.

libearth.session.SESSION_XMLNS = ''

(str) The XML namespace name used for session metadata.

class libearth.session.MergeableDocumentElement(_parent=None, **kwargs)

Document element which is mergeable using Session.

class libearth.session.Revision

The named tuple type of (Session, datetime.datetime) pair.

session

Alias for field number 0

updated_at

Alias for field number 1

class libearth.session.RevisionCodec

Codec to encode/decode Revision pairs.

>>> import datetime
>>> from libearth.session import Revision, RevisionCodec, Session
>>> from libearth.tz import utc
>>> session = Session('test-identifier')
>>> updated_at = datetime.datetime(2013, 9, 22, 3, 43, 40, tzinfo=utc)
>>> rev = Revision(session, updated_at)
>>> RevisionCodec().encode(rev)
'test-identifier 2013-09-22T03:43:40Z'

RFC3339_CODEC = <libearth.codecs.Rfc3339 object at 0x7f303048f990>

(Rfc3339) The internally used codec to encode Revision.updated_at time to RFC 3339 format.
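
The encoded form shown above can be approximated with the standard library alone. The following is a minimal sketch, not libearth's implementation: encode_revision and decode_revision are hypothetical helper names, a plain string stands in for a Session, and it assumes that identifiers contain no spaces and that sub-second precision is not needed.

```python
import datetime

UTC = datetime.timezone.utc
STAMP_FORMAT = '%Y-%m-%dT%H:%M:%SZ'  # RFC 3339 in UTC, whole seconds only


def encode_revision(identifier, updated_at):
    """Hypothetical helper: join identifier and UTC timestamp with a space."""
    stamp = updated_at.astimezone(UTC).strftime(STAMP_FORMAT)
    return '{0} {1}'.format(identifier, stamp)


def decode_revision(text):
    """Hypothetical helper: split on the last space, then parse the time."""
    identifier, stamp = text.rsplit(' ', 1)
    updated_at = datetime.datetime.strptime(stamp, STAMP_FORMAT)
    return identifier, updated_at.replace(tzinfo=UTC)


updated_at = datetime.datetime(2013, 9, 22, 3, 43, 40, tzinfo=UTC)
encoded = encode_revision('test-identifier', updated_at)
decoded = decode_revision(encoded)
```

Decoding the encoded string gives back the same identifier and the same timezone-aware time, which is the round trip a codec of this kind is expected to provide.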

class libearth.session.RevisionParserHandler

SAX content handler that picks session metadata (__revision__ and __base_revisions__) from the given document element.

The parsed result is stored in revision and base_revisions.

Used by parse_revision().

done = None

(bool) Represents whether the parsing is complete.

revision = None

(Revision) The parsed __revision__. It might be None.

class libearth.session.RevisionSet(revisions=[])

Set of Revision pairs. It provides dictionary-like mapping protocol.

Parameters: revisions (collections.Iterable) – the iterable of (Session, datetime.datetime) pairs

contains(revision)

Find whether the given revision is already merged into the revision set. In other words, return True if the revision no longer has to be merged into the revision set.

Parameters: revision (Revision) – the revision to check for whether it still has to be merged
Returns: True if the revision is included in the revision set, or False
Return type: bool
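
The check described above can be pictured with a plain dictionary. This is only a sketch under an assumption that this page does not state: a revision counts as already merged when the set holds the same or a later time for the same session. The function name is_merged and the dict stand-in for RevisionSet are hypothetical.

```python
import datetime


def is_merged(revision_set, revision):
    """Return True if `revision` is already covered by `revision_set`.

    `revision_set` is a plain dict mapping a session identifier to the
    latest merged time of that session (a stand-in for RevisionSet).
    """
    session, updated_at = revision
    merged_at = revision_set.get(session)
    # Assumption: the same or a later time for the session means "merged".
    return merged_at is not None and merged_at >= updated_at


bases = {
    'laptop-1': datetime.datetime(2013, 9, 22, 17, 0),
    'phone-1': datetime.datetime(2013, 9, 22, 16, 0),
}
older = ('laptop-1', datetime.datetime(2013, 9, 22, 16, 30))
newer = ('phone-1', datetime.datetime(2013, 9, 22, 18, 0))
unknown = ('tablet-1', datetime.datetime(2013, 9, 22, 15, 0))
```

Here only older is covered by bases; newer is more recent than what the set records for phone-1, and tablet-1 is not in the set at all.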

copy()

Make a copy of the set.

Returns: a new equivalent set
Return type: RevisionSet

items()

The list of (Session, datetime.datetime) pairs.

Returns: the list of Revision instances
Return type: collections.ItemsView

merge(*sets)

Merge two or more RevisionSets. For each session, the latest time is kept.

Parameters: *sets – one or more RevisionSet objects to merge
Returns: the merged set
Return type: RevisionSet

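
The rule that the latest time remains for the same session can be sketched as follows. Plain dicts keyed by session identifier stand in for RevisionSet, and merge_revision_sets is a hypothetical name used only for this illustration.

```python
import datetime


def merge_revision_sets(*sets):
    """Merge dicts of session -> time, keeping the latest time per session."""
    merged = {}
    for revisions in sets:
        for session, updated_at in revisions.items():
            if session not in merged or merged[session] < updated_at:
                merged[session] = updated_at
    return merged


first = {
    'a': datetime.datetime(2013, 9, 22, 16, 58, 57),
    'b': datetime.datetime(2013, 9, 22, 16, 59, 30),
}
second = {
    'b': datetime.datetime(2013, 9, 22, 17, 5, 0),
    'c': datetime.datetime(2013, 9, 22, 17, 0, 30),
}
merged = merge_revision_sets(first, second)
```

The inputs are left untouched; the result holds all three sessions, with the later of the two times for b.
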
class libearth.session.RevisionSetCodec

Codec to encode/decode multiple Revision pairs.

>>> from datetime import datetime
>>> from libearth.session import RevisionSet, RevisionSetCodec, Session
>>> from libearth.tz import utc
>>> revs = RevisionSet([
...     (Session('a'), datetime(2013, 9, 22, 16, 58, 57, tzinfo=utc)),
...     (Session('b'), datetime(2013, 9, 22, 16, 59, 30, tzinfo=utc)),
...     (Session('c'), datetime(2013, 9, 22, 17, 0, 30, tzinfo=utc))
... ])
>>> encoded = RevisionSetCodec().encode(revs)
>>> encoded
'c 2013-09-22T17:00:30Z,\nb 2013-09-22T16:59:30Z,\na 2013-09-22T16:58:57Z'
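>>> RevisionSetCodec().decode(encoded)
libearth.session.RevisionSet([
    Revision(session=libearth.session.Session('b'),
             updated_at=datetime.datetime(2013, 9, 22, 16, 59, 30,
                                          tzinfo=libearth.tz.Utc())),
    Revision(session=libearth.session.Session('c'),
             updated_at=datetime.datetime(2013, 9, 22, 17, 0, 30,
                                          tzinfo=libearth.tz.Utc())),
    Revision(session=libearth.session.Session('a'),
             updated_at=datetime.datetime(2013, 9, 22, 16, 58, 57,
                                          tzinfo=libearth.tz.Utc()))
])

SEPARATOR_PATTERN = <_sre.SRE_Pattern object at 0x7f3030547738>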

(re.RegexObject) The regular expression pattern that matches separator substrings between revision pairs.
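
To illustrate how a separator pattern splits the encoded form shown above, here is a minimal sketch using only the standard library. The pattern below (a comma with optional surrounding whitespace) is an assumption, since the actual SEPARATOR_PATTERN is not shown on this page; decode_revision_set is a hypothetical helper, and a dict of identifier to time stands in for RevisionSet.

```python
import datetime
import re

UTC = datetime.timezone.utc
# Assumed separator: a comma with optional whitespace (including newlines).
SEPARATOR = re.compile(r'\s*,\s*')


def decode_revision_set(text):
    """Hypothetical helper: parse 'id time, id time, ...' into a dict."""
    revisions = {}
    for pair in SEPARATOR.split(text.strip()):
        if not pair:
            continue  # tolerate an empty input or a trailing separator
        identifier, stamp = pair.rsplit(' ', 1)
        updated_at = datetime.datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%SZ')
        revisions[identifier] = updated_at.replace(tzinfo=UTC)
    return revisions


encoded = ('c 2013-09-22T17:00:30Z,\n'
           'b 2013-09-22T16:59:30Z,\n'
           'a 2013-09-22T16:58:57Z')
decoded = decode_revision_set(encoded)
```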

class libearth.session.Session

The unit of device (more abstractly, installation) that updates the same document (e.g. Feed). Every session must have its own unique identifier to avoid conflict between concurrent updates from different sessions.

Parameters: identifier (str) – the unique identifier. It is generated automatically using uuid if not given

IDENTIFIER_PATTERN = <_sre.SRE_Pattern object at 0x7f3030547670>

(re.RegexObject) The regular expression pattern that matches allowed identifiers.

identifier = None

(str) The session identifier. It has to be distinguishable from other devices/apps, but consistent for the same device/app.

interns = {}

(collections.MutableMapping) The pool of interned sessions. It ensures that only a single session object exists for each identifier.
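
Interning of this kind is commonly done by consulting a class-level pool before creating a new object. The sketch below shows the general pattern only; InternedSession is a hypothetical class, and whether libearth implements it this way is an assumption.

```python
import uuid


class InternedSession(object):
    """Hypothetical stand-in showing one object per identifier."""

    interns = {}  # pool of already created sessions, keyed by identifier

    def __new__(cls, identifier=None):
        if identifier is None:
            # Generate a fresh identifier when none is given.
            identifier = str(uuid.uuid4())
        try:
            return cls.interns[identifier]
        except KeyError:
            session = super().__new__(cls)
            session.identifier = identifier
            cls.interns[identifier] = session
            return session


first = InternedSession('laptop-1')
second = InternedSession('laptop-1')
other = InternedSession()
```

Because both calls with 'laptop-1' return the very same object, sessions can be compared by identity and used as dictionary keys without extra care.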

merge(a, b, force=False)

Merge the two given documents and return a new merged document. The given documents are not modified in place. Both documents must have the same type.

Parameters:

  • a (MergeableDocumentElement) – the first document to be merged
  • b (MergeableDocumentElement) – the second document to be merged
  • force – by default (False) it doesn’t merge but simply pulls a or b if one already contains the other. If force is True, it always merges the two. It assumes b is newer than a
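
The effect of force can be pictured as a small decision function. This is a sketch under stated assumptions, not the library's code: each document is reduced to a (revision, base_revisions) pair, "one already contains the other" is read as "its base revisions cover the other's revision", and choose_strategy and covers are hypothetical names.

```python
import datetime


def covers(bases, revision):
    """True if `bases` (dict of session -> time) already includes `revision`."""
    session, updated_at = revision
    return session in bases and bases[session] >= updated_at


def choose_strategy(a, b, force=False):
    """Decide between pulling one document and merging both.

    `a` and `b` are (revision, base_revisions) pairs.
    """
    a_revision, a_bases = a
    b_revision, b_bases = b
    if not force:
        if covers(b_bases, a_revision):
            return 'pull b'  # b is built on top of a
        if covers(a_bases, b_revision):
            return 'pull a'  # a is built on top of b
    return 'merge'


t1 = datetime.datetime(2013, 9, 22, 16, 0)
t2 = datetime.datetime(2013, 9, 22, 17, 0)
doc_a = (('laptop-1', t1), {'laptop-1': t1})
doc_b = (('phone-1', t2), {'laptop-1': t1, 'phone-1': t2})
doc_c = (('tablet-1', t2), {'tablet-1': t2})
```

With these stand-ins, doc_b already includes doc_a, so it is simply pulled unless force is set; doc_a and doc_c diverged, so they need a real merge.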

pull(document)

Pull the document (possibly of another session) to the current session.

Parameters: document (MergeableDocumentElement) – the document to pull from the possibly other session to the current session
Returns: the clone of the given document with the replaced __revision__. Note that the Revision.updated_at value won’t be revised. It could be the same object as the given document if the session is the same
Return type: MergeableDocumentElement
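
The behaviour described above (replace the session of the revision, keep the time, possibly return the same object) can be sketched with a named tuple. Everything here is a hypothetical stand-in: Rev mimics the (session, updated_at) pair, a dict with a 'revision' key stands in for a mergeable document, and pull_document is not a libearth function.

```python
import collections
import datetime

Rev = collections.namedtuple('Rev', 'session updated_at')


def pull_document(current_session, document):
    """Return `document` as seen from `current_session`."""
    revision = document['revision']
    if revision.session == current_session:
        return document  # same session: nothing to replace
    clone = dict(document)  # shallow copy; the original stays untouched
    # Only the session changes; updated_at is deliberately kept as it is.
    clone['revision'] = revision._replace(session=current_session)
    return clone


stamp = datetime.datetime(2013, 9, 22, 16, 58, 57)
original = {'title': 'Feed', 'revision': Rev('phone-1', stamp)}
pulled = pull_document('laptop-1', original)
same = pull_document('phone-1', original)
```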

revise(document)

Mark the given document as the latest revision of the current session.

Parameters: document (MergeableDocumentElement) – the mergeable document to mark

libearth.session.ensure_revision_pair(pair, force_cast=False)

Check the type of the given pair and raise an error unless it’s a valid revision pair (Session, datetime.datetime).

Parameters:

  • pair (collections.Sequence) – a value to check
  • force_cast (bool) – whether to return the value cast to the Revision named tuple type

Returns: the revision pair
Return type: Revision, collections.Sequence


libearth.session.parse_revision(iterable)

Efficiently parse only __revision__ and __base_revisions__ from the given iterable, which contains chunks of XML. It reads only the head of the given document, so the iterable will not be completely consumed in most cases.

Note that it doesn’t validate the document.

Parameters: iterable (collections.Iterable) – chunks of bytes which contain a MergeableDocumentElement element
Returns: a pair of (__revision__, __base_revisions__). It might be None if the document is not stamped
Return type: collections.Sequence
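
Reading only the head of a chunked document can be done with an incremental SAX parser that stops as soon as the root element has been seen. The sketch below uses only the standard library xml.sax module; HeadHandler, parse_head, and the revision attribute name in the sample data are hypothetical, and how libearth actually stores the metadata in the XML is not shown on this page.

```python
import xml.sax


class HeadHandler(xml.sax.ContentHandler):
    """Records the attributes of the first (root) element only."""

    def __init__(self):
        super().__init__()
        self.done = False
        self.root_attributes = None

    def startElement(self, name, attrs):
        if not self.done:
            self.root_attributes = dict(attrs.items())
            self.done = True


def parse_head(iterable):
    """Feed chunks to the parser until the root element has been seen."""
    parser = xml.sax.make_parser()
    handler = HeadHandler()
    parser.setContentHandler(handler)
    for chunk in iterable:
        parser.feed(chunk)
        if handler.done:
            break  # leave the rest of the iterable unread
    return handler.root_attributes


def chunks(log):
    """Yield sample chunks and record which ones were actually read."""
    for chunk in [b'<doc revision="a 2013-09-22T16:58:57Z">',
                  b'<entry/>',
                  b'</doc>']:
        log.append(chunk)
        yield chunk


read = []
head = parse_head(chunks(read))
```

Because the loop stops once the handler reports that it is done, the chunks after the head are never requested from the iterable, which is what keeps this kind of parsing cheap for large documents.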