New features in the upcoming Psycopg release
Posted by Daniele Varrazzo on 2010-11-02
Tagged as
news,
development
There are several new features being prepared in the upcoming Psycopg version. The documentation refers to it as release 2.2.3 but it may be ultimately released as 2.3. The main target of the release is to expose some of the new features introduced in PostgreSQL 9.0.
Update The new features have been released in Psycopg 2.3.0-beta1.
dict to hstore mapping
hstore is a very interesting data type available in PostgreSQL: each hstore value is a set of key/value pairs . In the release 9.0 hstore received a lot of improvements that made it still more useful. Building an index on an hstore column will allow fast search for values containing one or more provided keys/values. The new Psycopg release will allow to map Python dictionaries into hstore and back. String and Unicode keys and values are supported: you can find here documentation and implementation in the adapt-dict git branch.
Notify payload
PostgreSQL includes an asynchronous notification mechanism allowing clients to subscribe to a certain notification channel and receive events from the database, allowing efficient communication between processes connected to the same database. Previously only the channel name could be notified: in PostgreSQL 9.0 a payload string can be attached to an event, allowing more flexible communication: if previously a listening client could have been notified about data changed in a specific value, it is now possible, for instance, to report which record has been modified. The mechanism is completely generic and it's up to the client applications to find a good use for it. Here you can find documentation and implementation for the notification payload in Psycopg.
Two-phase commit
Two-phase commit is actually supported by PostgreSQL since release 8.1: it enables the database to participate in distributed operations with transactional characteristics. You can find here documentation and implementation in the git tpc branch.
Python DBAPI 2.0 suggests a specific interface for 2pc-related methods that maps fine enough on PostgreSQL commands. What maps slightly worse is the format chosen for the distributed transactions identifiers: the DBAPI is inspired to the Open Group XA specification according to which every transaction is identifies by a triple (format id, global transaction id, branch qualifier). PostgreSQL doesn't follow this standard and accepts any string (up to a certain length) as a transaction identifier: this creates the need to map from triples to strings and back. We have currently implemented the same algorithm used in the JDBC driver: it works fine but has the drawback of producing unreadable strings as transaction ids. We may use something different, such as concatenating the elements of a xid with a specific character (e.g. :) and escaping colons provided in the xid components. Would it be more valuable to have readable transaction identifiers or interoperation with Java tools? I don't have much experience about the matter, so any suggestion would be appreciated.
Please feel free to check the new features details in the documentation, to try the code under development and to leave your feedback!