Books from other libraries
What if the librarians of separate digital community libraries could establish links between those libraries they take care of and those of their friends?
To find out, we developed a very early proof of concept of library federation with OPDS in calibre-web. This work builds on earlier experiments with the Critical Technical Practice project at DARC.
For this PoC we leveraged existing functionality within the software that represents the collection of books as an OPDS (Open Publication Distribution System) feed. OPDS is based on Atom, a syndication format similar to RSS, and represents the collection in a machine-readable form. Unlike a plain RSS feed, it offers not only a chronological listing but also search and other types of listings. It is widely used by e-readers to navigate collections and acquire books.
All calibre-web libraries that enable guest access, meaning that they make parts of the collection visible without needing an account, can also present those books in OPDS format.
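To give a sense of what reading such a feed involves, here is a minimal sketch of an OPDS parser using only the Python standard library. It is not the prototype's parser. The sample feed, the download path in it and the function name parse_opds are made up for illustration. The Atom namespace and the acquisition link relation come from the Atom and OPDS specifications.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Link relations starting with this prefix point at downloadable files.
ACQUISITION_REL = "http://opds-spec.org/acquisition"

# Hand-written sample feed (hypothetical content); real feeds carry more metadata.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example library</title>
  <entry>
    <title>An Example Book</title>
    <author><name>Jane Doe</name></author>
    <updated>2023-01-01T00:00:00+00:00</updated>
    <link rel="http://opds-spec.org/acquisition"
          type="application/epub+zip"
          href="/opds/download/1/epub"/>
  </entry>
</feed>"""


def parse_opds(xml_text):
    """Return one dict per feed entry that has at least one acquisition link."""
    root = ET.fromstring(xml_text)
    books = []
    for entry in root.iter(ATOM + "entry"):
        downloads = [
            link.get("href")
            for link in entry.findall(ATOM + "link")
            if (link.get("rel") or "").startswith(ACQUISITION_REL)
        ]
        if not downloads:
            continue  # navigation entries have no acquisition link
        books.append({
            "title": entry.findtext(ATOM + "title", default=""),
            "authors": [
                author.findtext(ATOM + "name", default="")
                for author in entry.findall(ATOM + "author")
            ],
            "updated": entry.findtext(ATOM + "updated", default=""),
            "downloads": downloads,
        })
    return books


books = parse_opds(SAMPLE_FEED)
```

A parser like this only handles the simplest acquisition feeds. Pagination, relative URLs, covers and navigation feeds would all need extra handling.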
A walkthrough of the prototype
The question at the basis of our prototype was whether calibre-web could not only write OPDS feeds, but also read them. Hence, our prototype could be considered an OPDS client built into calibre-web. The prototype adds a new series of routes and functionality to calibre-web that deal with:
- adding remote libraries
- viewing part of their collections
- importing works from those collections
- presenting those works as coming from remote collections
The landing point is the route /federation/, which shows some configurable settings and a way of adding other libraries.
Once these are added, there is a list view that displays the latest n books from those remote libraries. While this view looks like any other calibre-web view that shows books, all the links (covers, descriptions etc.) take you to the remote libraries. This is meant as a way to make links between the local and the remote, and to allow for more exploration of the remote.
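The list view could be assembled roughly as follows. This is a sketch under assumptions: it takes "latest n" to mean the n most recent entries across all remote libraries combined, and the function name latest_books, the entry fields and the example URLs are hypothetical. Each entry keeps the URL of its library of origin so that links can point back to the remote.

```python
from datetime import datetime


def latest_books(feeds, n):
    """Merge entries from several libraries and keep the n most recent.

    feeds maps a remote library URL to a list of entries, each a dict
    with a 'title' and an ISO 8601 'updated' timestamp (assumed shape).
    """
    merged = []
    for library_url, entries in feeds.items():
        for entry in entries:
            # Remember where each entry came from so the view can link there.
            merged.append({**entry, "library": library_url})
    merged.sort(key=lambda e: datetime.fromisoformat(e["updated"]), reverse=True)
    return merged[:n]


feeds = {
    "https://a.example/opds": [
        {"title": "Old", "updated": "2022-01-01T00:00:00+00:00"},
        {"title": "Newest", "updated": "2023-06-01T00:00:00+00:00"},
    ],
    "https://b.example/opds": [
        {"title": "Middle", "updated": "2023-01-01T00:00:00+00:00"},
    ],
}
result = latest_books(feeds, 2)
```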
Aside from the list view, there is also an import view that shows slightly more metadata and that allows one to import the remote books into the local collection.
Once a publication is imported, it is automatically assigned the tag “from the network” and added to a specific shelf.
All the imported books are displayed in the “federated shelves”, grouped by their library of origin. In addition, shelves named after each remote library are created, and imported publications are automatically added to them.
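The bookkeeping that happens on import can be sketched as below. This does not use calibre-web's actual data model: the classes Book and LocalLibrary and the method import_remote are hypothetical stand-ins. Only the tag text comes from the prototype as described above.

```python
from dataclasses import dataclass, field

NETWORK_TAG = "from the network"


@dataclass
class Book:
    title: str
    tags: set = field(default_factory=set)


@dataclass
class LocalLibrary:
    books: list = field(default_factory=list)
    # Shelf name -> books on that shelf; one shelf per library of origin.
    shelves: dict = field(default_factory=dict)

    def import_remote(self, title, remote_name):
        """Add a remote publication locally, tag it and shelve it by origin."""
        book = Book(title=title, tags={NETWORK_TAG})
        self.books.append(book)
        # The shelf is created on first import from a given remote library.
        self.shelves.setdefault(remote_name, []).append(book)
        return book


library = LocalLibrary()
library.import_remote("An Example Book", "Example library")
library.import_remote("Another Book", "Example library")
```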
One of the nice side effects of relying on an open standard like OPDS is that there is also support for libraries apart from calibre-web, as long as they support OPDS. In the list view you can see theanarchistlibrary.org, for example. Our proof of concept implementation, however, has a very naive parser and other examples will inevitably fail.
on questions of content moderation and safety in federated environments
Drawing on my PhD work, the questions of what the federation is and how the data are supposed to circulate, as well as the downstream impacts of those questions, have been central considerations for this early prototype. It is both technically and conceptually simple to “just” interconnect distinct systems, but taking second-order effects into account requires more thought. How does the system impact issues such as consent, mistakes, abuse, regret, safety, generosity and risk, for instance? The following paragraphs examine some design choices of how this prototype federates.
Notably, unlike much of the fediverse, the prototype has no notion of a “global” federation. Instead, federating here means that links to specific remote libraries need to be established explicitly. This does make discovery harder, but it means the interconnection of systems follows social connections, rather than the other way around. There is no sense within this prototype of federation offering a view of “the network”. Instead, every library constructs its own network of relations.
After establishing a connection, which entails copying a remote library URL to your federation list, publications do not automatically propagate between libraries either. Instead, publications need to be imported manually and explicitly. Similarly, there is no unmoderated feed of remote publications that reaches the public view. The federation features are visible only to users with librarian access, who can then explicitly import remote publications into the local collection. This prevents irrelevant or unwanted remote content from automatically appearing in and populating the local library, minimizing the propagation of mistakes and abuse.
Just as books do not automatically propagate, the current prototype has no sense of automatic discovery of “neighbors”: other libraries unknown to me, but known to a library I already connect to, do not propagate.
In the current design, which relies on unauthenticated public feeds, remote libraries cannot approve or reject being added to another library’s network. They can only disable the public feeds altogether. Future designs could require authentication to access the remote data, giving libraries the possibility to accept connections to some libraries, and to reject connections to others. The current proof of concept is all or nothing. Thus, currently, the issue of consent to connect is ambiguous. On the one hand, disabling automatic propagation means a local librarian controls how their own network is shaped. On the other hand, they cannot prevent other librarians from explicitly connecting to their library without disabling public feeds altogether.
on reconsidering designs
Prototypes, or even git commit logs, tend to only reflect what ideas and designs were kept, rather than those discarded. However, the actual reasoning behind the choice to not pursue a particular design can be highly instructive. As such, it is interesting to reflect on two ideas that did not make it:
There were early ideas and designs for showing the provenance of imported works by embedding provenance in a publication’s metadata. Initially considered as a way to credit the remote library, or as a way to demonstrate how a particular collection is made up of many collections, these ideas were dropped as the downsides can be considerable. Showing that a specific publication came from a specific library is very risky in the context of shadow libraries, especially when there are no explicit ways to reject interconnection and when metadata is embedded in the files themselves. Currently, provenance appears only in the form of particular shelves in the library named after the origin library (which can be disabled or made invisible to the public), and the tag “from the network” attached to particular publications.
Similarly, there was an early prototype that allowed the assignment of names to remote libraries. In effect, this became a publicly visible pet name that dangerously conflated pet names and nicknames [1]. This could have allowed inadvertently linking a specific person to a particular library; for instance, “Bob’s library”. One only needs to consider the context of shadow libraries to see the issue with this. Instead, the system was made such that only nicknames set by the remote end are used.
The UI of the federation settings still reflects that design and its reconsideration.
future work
We stress again that this is a very rough proof of concept. Before continuing, further work will need to address some conceptual issues:
- how to properly work with consent and its revocation between libraries?
- downstream of that: how to deal with deletions of remote content?
In addition, there is a plethora of technical bugs and shortcuts:
- not all metadata is imported
- all the network requests are synchronous, making the UI hang at points
- because of that, loading times increase linearly with more books or servers in the network
- there is not enough caching of results, slowing everything down even more
- a lot of custom UI and not enough reuse of existing elements means things look off and are hard to maintain
- the OPDS parser is very naive and will not work with a wide diversity of feeds
- no thorough sanitization of input nor sanity checks (there are hard-coded paths etc. everywhere)
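Two of the items above, synchronous requests and missing caching, could be approached along the following lines. This is a sketch, not part of the prototype: the names FeedCache and fetch_all, the five-minute default lifetime and the worker count are assumptions. The function that actually downloads a feed is passed in, so the sketch itself makes no network requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class FeedCache:
    """Keep fetched feed bodies for ttl_seconds to avoid refetching them."""

    def __init__(self, fetch, ttl_seconds=300):
        self._fetch = fetch  # callable taking a URL and returning the feed body
        self._ttl = ttl_seconds
        self._store = {}  # url -> (time fetched, body)

    def get(self, url):
        cached = self._store.get(url)
        if cached is not None and time.monotonic() - cached[0] < self._ttl:
            return cached[1]
        body = self._fetch(url)
        self._store[url] = (time.monotonic(), body)
        return body


def fetch_all(urls, cache, max_workers=8):
    """Fetch several feeds concurrently so total time is not the sum of all requests."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(cache.get, urls)))


# Stand-in for a real HTTP request, recording which URLs were actually fetched.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "feed for " + url


cache = FeedCache(fake_fetch, ttl_seconds=60)
urls = ["https://a.example/opds", "https://b.example/opds"]
first = fetch_all(urls, cache)
second = fetch_all(urls, cache)  # served from the cache, no new fetches
```

In this sketch a failing fetch raises out of fetch_all. A real implementation would probably want per-library error handling and timeouts, so that one unreachable library does not break the whole view.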
[1] See for instance: Ferdous, Md. Sadek, Jøsang, Audun, Singh, Kuldeep & Borgaonkar, Ravishankar (2009). Security Usability of Petname Systems, pp. 44-59. https://link.springer.com/chapter/10.1007/978-3-642-04766-4_4