
Introducing XOBIS to the FRBR Working Group (2003).
Dick R. Miller
August 4, 2003, Berlin, Germany
| I. Introclusion | (Fig.0 & 1) |
| II. Themes | |
| 1. Relationships and Functionality | (Fig.2) |
| 2. Delineation (Principal Elements) | (Fig.3) |
| 3. Derivation and Versions | (Fig.4) |
| 4. Identification | (Fig.5) |
| 5. Disambiguation and Qualification | (Fig.5) |
| 6. Substitution | |
| 7. Variation (Equivalence and Subsumption) | |
| 8. Description | |
| 9. Instantiation | (Fig.6) |
| 10. Recursion | |
| 11. Validation and Control | |
| III. Basic Structure | (Fig.7) |
| IV. FRBR Comparison and Pending Issues | (Fig.8 & 9) |
| V. Concluduction | |
I. Introclusion
[Wearing purple antennae]
"I bear greetings from the [sign < >] planet Xobis."
Lest you think that I arrived on a UFO, I'd better disambiguate by switching my XML markup from <Planet> to <Schema>. These antennae were inspired by the kind words about XOBIS recently posted on the FRBR Working Group's website. Thank you from all of us at Lane who have worked on XOBIS, which remains a work in progress. My colleague, Kevin Clarke, was especially pleased to see his thumbnail description of XOBIS selected for quotation. The UFO allusion is appropriate, as one of my goals today is to convince you that XOBIS, the schema, is not so "alien" after all.
This "introclusion" is inspired by Patrick [le Boeuf], as we share his sentiment that our work builds on that of our predecessors; if we have accomplished anything at all deserving of the glowing remarks, it is because we "have stood on the shoulders of giants," so to speak. Indeed, it is a bit eerie to consider that many folks working on information models have identified the same, or similar, problems and have found creative solutions that resonate with one another. Our independent attempt to synthesize bibliographic and authority information, and to fit these into a single structure with museum and archival information, bears many striking similarities to FRBR, CIDOC's Conceptual Reference Model, the Semantic Network of the National Library of Medicine's UMLS, and other efforts. We believe that the variety of efforts is wise at this stage and that they contribute to making future solutions even better. XOBIS was undertaken in this spirit.
First, I will introduce various themes that emerged in the development of XOBIS, followed by a look at how its tightly integrated core structure supports these ideas. Then, I will venture a few comparisons with FRBR and touch on a few open issues. Because understanding much of XOBIS relies on the structural context, it's probably better to save questions for after this condensed overview. We will start with Figure 0, but when I'm not discussing a particular figure, you may want to glance at Figure 7 to keep oriented.
Figure 0 shows two fragmentary bibliographic "records" with an authority in the middle. These pseudo-XOBIS records provide a hint as to the general elements, attributes, and structure of XOBIS. Elements are in bold and attributes in italic. Due to the integrated structure, Relationships between records recapitulate the Entry of related elements. For example, in the first record Title anchors the Entry for a Work; it has three Relationships, the third referencing another Work--at the bottom of the page. These should include the related record's ID as well; the authority record illustrates an ID; it also shows that authorities use the same structure for Relationships as do bibliographic records. The Record at the bottom has a role attribute indicating that it represents both an "authority" and an "instance" of a bibliographic record.
On the next page, Figure 1 provides a hint of the questions we asked in deciding what was fundamental for each component of the evolving structure. Author, Title, and Subject are perennial favorites. Recognizing that Author and Subject are Relationships confirmed our inclination to focus on their centrality. Title, on the other hand, serves as the "name" for a Work, and suggested the importance of an identity for each Principal Element--a title entry in the case of Work; this would also serve for a cleaner title index, which brings us to functionality.
II. Themes
1. Relationships and Functionality
In an iterative process, we found that Relationships merited the most emphasis and were essential to a crisp structure--one simple and reliable enough to improve current functionalities, but intended to support future ones. This is particularly important, in view of the many competing information resources found on the Web. At the same time we were seeking solutions to the many long-standing and seemingly intractable bibliographic problems, which are coming into focus with the rise of digital materials, and threatening to segregate these from traditional formats.
Figure 2 shows how a Work, "The Fatal Shore," has a "vital" Relationship, with the value of Author, to a Being, Robert Hughes; an "organizational" one to a Publisher; a "geographic" one to a Place, and so forth. The 10 types of source-target Relationships shown allow 100 categories for grouping specific Relationships by target entity, or Principal Element. XOBIS also supports "navigational" Relationships, which identify the direction of a Relationship relative to a particular source or focal record. Specific cases may be indicated by attribute values of "superordinate," "subordinate," "preordinate," and "postordinate," as well as two coordinate ones, "associative" and "dissociative." The value "unspecified" supports mapping of older data. Specific cases of all of these Relationships, as well as general relationships, can be controlled as Relationship authorities just like those for any other Concept. The 10 types of source-target Relationships exactly parallel XOBIS' 10 fundamental categories of information called Principal Elements. Each of these was delineated in an attempt to optimize handling of a "homogeneous" class of information to support improved functionality.
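The arithmetic behind the 100 categories can be sketched in a few lines of Python. This is only an illustration, not part of XOBIS: the list of ten Principal Elements is assembled from the names used throughout this talk, and the constant and function names are hypothetical.

```python
from itertools import product

# The ten Principal Elements, assembled from names used in this talk
# (the ordering here is arbitrary and is an assumption of this sketch).
PRINCIPAL_ELEMENTS = [
    "Concept", "String", "Language", "Organization", "Event",
    "Time", "Place", "Being", "Object", "Work",
]

# Navigational attribute values quoted in the talk.
NAVIGATIONAL_VALUES = {
    "superordinate", "subordinate", "preordinate", "postordinate",
    "associative", "dissociative", "unspecified",
}

# Each (source, target) pairing of Principal Elements is one grouping category.
CATEGORIES = list(product(PRINCIPAL_ELEMENTS, repeat=2))


def category(source, target):
    """Return the (source, target) category, rejecting unknown element names."""
    if source not in PRINCIPAL_ELEMENTS or target not in PRINCIPAL_ELEMENTS:
        raise ValueError("not a Principal Element: %r -> %r" % (source, target))
    return (source, target)


# "The Fatal Shore" (a Work) related to Robert Hughes (a Being).
author_category = category("Work", "Being")
```

Ten source elements times ten target elements gives the 100 categories; the navigational values are a separate, orthogonal dimension.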
2. Delineation (Principal Elements)
Figure 3, part A, provides working definitions for these 10 categories of information fundamental to so-called memory institutions. All are structurally equal. String, covering words and phrases, is worth pointing out as the most novel, having potential for improvement of popular keyword searching. Part B shows two attributes of special interest. XOBIS uses attributes sparingly to avoid being prescriptive--in an effort to clearly demarcate a "fixed" structure from "changing" content. Trying to establish precise cusps between these mutually exclusive groups raises interesting questions. For example, in the right column of Part B, an Organization may have an Entry with a class attribute of "individual" for a solo professional practice, or for a single performer who has incorporated, whereas a Being may have an Entry with a class "familial" to indicate an indeterminate number of family members, or "collective" for broader named groups--not organized formally enough to constitute an Organization. The value "referential" allows inclusion of informational records.
In the middle column of Part B, the 'type' attribute illustrates the need for frugality in determining broad groupings, as XML attributes are not repeatable. Note that there are only three for Being; human was fairly obvious, but we chose specimen to encompass collected or named human remains, animals, plants, microscopic specimens, etc., distinct from the idea of any of these (a Concept); the deliberately vague special allows fictional examples of all of these, as well as all manner of gods, monsters, angels, spirits, etc. To avoid arguments, other repeatable, overlapping, or difficult-to-define values that might otherwise be attribute values are accommodated by Relationships to a Concept. This essentially extends the idea of form/genre, or category, to all the Principal Elements; for example, Berlin is a City. Relationship is repeatable, allowing assignment to as many categories as apply. For Work, this resolved the "format" problem, because one value may be designated "primary" via a degree attribute of the Relationship.
In Part C, the 'role' attribute allows records for certain Principal Elements to represent an instance ("bibliographic" referring to an instance of a Work) or just an authority, as well as permitting the same record to serve both purposes. This distinction is limited to "substantive" Principal Elements, discussed next.
3. Derivation and Versions
Figure 4 shows how 9 of the Principal Elements are derived from Concept, when its type attribute has the value "collective." A Concept can also be "abstract," Cataloging, or "specific," Heimlich Maneuver. (Thesaurus construction might be easier if potentially infinite "specific" values were de-emphasized to focus on structural relationships between abstract and collective concepts.) Five of the 9 Principal Elements are "notional" in that they represent the idea of something, are solely intangible, and inherently represent an authority. The remaining 4, Place, Being, Object, and Work, are "substantive" in that they may, in a sense, have a physical existence. Place includes topographic features and buildings, and Work is included due to its carrier or digital existence (as binary arrangements of electrons, atoms, or even pebbles on a beach). As such, they may represent an instance, an authority, or both.
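The notional/substantive split and its effect on the role attribute can be sketched as follows. This is a hedged illustration only: the talk names the four substantive elements, while the five notional ones are inferred here by elimination from the element names used elsewhere in the talk. Treating Concept as authority-only, the way a record carrying both roles is represented, and the function name are all assumptions of this sketch.

```python
# Four "substantive" elements are named in the talk; the five "notional"
# ones are inferred by elimination (an assumption of this sketch).
NOTIONAL = frozenset({"String", "Language", "Organization", "Event", "Time"})
SUBSTANTIVE = frozenset({"Place", "Being", "Object", "Work"})

AUTHORITY = frozenset({"authority"})
INSTANCE = frozenset({"instance"})
BOTH = AUTHORITY | INSTANCE


def allowed_roles(element):
    """List the role combinations a record for this element may carry."""
    if element in SUBSTANTIVE:
        # May represent an instance, an authority, or both.
        return [AUTHORITY, INSTANCE, BOTH]
    if element in NOTIONAL or element == "Concept":
        # Intangible elements inherently represent an authority
        # (extending this to Concept is this sketch's reading).
        return [AUTHORITY]
    raise ValueError("not a Principal Element: %r" % (element,))
```

For example, `allowed_roles("Work")` offers three choices, while `allowed_roles("Time")` offers only the authority role.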
Optionally, Object and Work may have Versions for "substitutable" content. Versions may potentially cover Place and Being. For example, this would allow one record for a person to include selected discrete identities as Versions. Substantive Principal Elements may link to Holdings or Items, envisioned as separate schemas of a larger suite.
4. Identification
The core of XOBIS consists of Relationships that can link any of the 10 Principal Elements to one another. To do this effectively, Principal Elements need a crisp identity. This also supports their consistent reuse in a variety of contexts. Figure 5 provides an approximation of how the integrated structure extends to naming, where we also sought to identify patterns. Many Principal Elements simply have a distinct Name. Special, optional structures cover "personal" names and "chronological" ones. Thus, a Being can have a Name, such as The Rock, or a personal name may consist solely of a Surname. Expansion accommodates spelled-out forms of abbreviated personal names. The chronological structure follows the ISO standard, although a simple Name, such as Autumn, is permissible for Time, and we have added a calendar attribute to indicate the Code for a specific calendar, considered a Work authority. Title serves as the name of a Work, and alternatively may consist of two or more TitleSegment elements. This mechanism allows selected subtitles to be part of the Entry, similarly to section titles; however, many subtitles would be better handled via a Description element. A NameSegment permits the same for an Object.
5. Disambiguation and Qualification (Fig. 5 cont'd.)
This arrangement seems to accommodate naming--except for ambiguity. The Principal Elements themselves automatically provide a first level of disambiguation, although this apparently only suffices completely for Time. To avoid duplication, values representing "names" of the other 9 Principal Elements may be qualified as necessary--with the values of Principal Elements serving double duty in this capacity! Each Entry may be added as a Qualifier element to another Principal Element's name. While trying to keep rules separate from structure, we do believe that routine disambiguation of titles by edition (String) and date (Time) would reduce the need for "mediated" qualification considerably.
One defining moment in the development of XOBIS was recognition that the name of an Organization often involves pre-qualification. For example, a Preservation Dept. is the actual Organization represented by its authority. Including the parent body as part of an Entry serves to disambiguate such non-distinct names. That it is subordinate represents a discrete Relationship. The simplicity of XOBIS is that the only difference from post-qualification is the placement of the Qualifier in front of the Name (in this case, one Organization pre-qualified by another).
XOBIS' structure allows each Qualifier to be under authority control, providing suitable software is available. Ad hoc values may be used and are apparent due to their lack of an ID attribute identifying the associated Record. The Principal Element String, which can accommodate any word or phrase, adds unlimited flexibility to this elegant technique. A title such as Annual Report, qualified by the National Library of Medicine, qualified by the United States is not problematic.
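A rough sketch of pre- and post-qualification, using Python's standard xml.etree.ElementTree, may help. The element names Entry, Name, Title, and Qualifier come from this talk; the flat child layout, the lowercase id attribute, the record IDs, and the parent body "Example University Library" are all hypothetical and are not taken from the XOBIS schema.

```python
import xml.etree.ElementTree as ET


def make_entry(parts):
    """Build an Entry from (tag, text, record_id or None) parts in order."""
    entry = ET.Element("Entry")
    for tag, text, record_id in parts:
        child = ET.SubElement(entry, tag)
        child.text = text
        if record_id is not None:
            # An id marks a value under authority control (attribute name assumed).
            child.set("id", record_id)
    return entry


def is_prequalified(entry):
    """True when a Qualifier is placed in front of the Name or Title."""
    for child in entry:
        if child.tag == "Qualifier":
            return True
        if child.tag in ("Name", "Title"):
            return False
    return False


def ad_hoc_qualifiers(entry):
    """Qualifiers lacking an id, i.e. not tied to an authority Record."""
    return [q.text for q in entry.findall("Qualifier") if q.get("id") is None]


# An Organization pre-qualified by its (hypothetical) parent body.
dept = make_entry([
    ("Qualifier", "Example University Library", "org-1"),
    ("Name", "Preservation Dept.", None),
])

# A Work title post-qualified twice; the second Qualifier is ad hoc here.
report = make_entry([
    ("Title", "Annual Report", None),
    ("Qualifier", "National Library of Medicine", "org-2"),
    ("Qualifier", "United States", None),
])
```

In this sketch the only difference between the two techniques is document order, and ad hoc values are detected purely by the absence of an id.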
6. Substitution
Neither is abbreviating United States to U.S., as several Substitutes are available for use in places where a substitute attribute has been defined. Currently, substitution permits an Abbreviation, a Code, a brief Citation, or a Singular form to be used, as required, instead of the full Entry, while maintaining authority control.
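As a minimal sketch of substitution, assume a record fragment with an Entry and an optional Substitutes element. The element names Substitutes and Abbreviation follow this talk; the nesting, the function name, and the fall-back to the full Entry when no substitute exists are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the layout is illustrative, not the XOBIS schema.
PLACE_XML = """
<Place>
  <Entry><Name>United States</Name></Entry>
  <Substitutes>
    <Abbreviation>U.S.</Abbreviation>
  </Substitutes>
</Place>
"""


def display_form(element, substitute=None):
    """Return the requested substitute if present, else the full Entry name."""
    if substitute is not None:
        found = element.find("Substitutes/" + substitute)
        if found is not None:
            return found.text
    return element.findtext("Entry/Name")


place = ET.fromstring(PLACE_XML)
short_form = display_form(place, "Abbreviation")
full_form = display_form(place)
```

Because the substitute lives on the same Record as the Entry, either form can be displayed without losing the link to the authority.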
7. Variation (Equivalence and Subsumption)
XOBIS deals with variation, including see references and variant titles, as a Varia element, containing a repeatable Variant. Each represents an intra-record equivalence to the Entry. However, the technique also allows for subsumption of different, but closely related entries, on the same Record. Each Variant functions as an alternative Entry, using the same "name" structures, including Qualifiers. Additionally, a Variant may have a Type and a Duration. This works well for identifying acronyms, pseudonyms, nicknames, kinds of added titles, etc. and for indicating the specified Time when they apply. Type is a good example of a generic element in XOBIS. It is defined to allow individual values to be under authority control--like any Concept. For example, the collective Concept "Variant Titles" can serve as an umbrella (via a Relationship) for each specific Concept: "Spine Title," "Half Title," "Supplied Title," etc. Adding a new Type value is as simple as creating a new authority record, which could include the definition and application instructions. (And, there's no need to worry about running out of indicators or fixed field values). As in many parts of XOBIS, records for new and revised values could be supplied by authoritative agencies for other systems to load, much as we load authorities today. This should work better than announcing new and changed values in external documentation, and then waiting for proprietary systems to propagate them in varying ways. I like to think of XOBIS as self-documenting.
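The idea that Type values are themselves authority records can be sketched with a small in-memory table. The Concept names come from the example above; the record IDs, the dictionary layout, and the function names are hypothetical.

```python
# Hypothetical authority table: each Type value is a Concept record, and
# "broader" points at the umbrella Concept via a Relationship.
TYPE_AUTHORITIES = {
    "c-001": {"entry": "Variant Titles", "broader": None},
    "c-002": {"entry": "Spine Title", "broader": "c-001"},
    "c-003": {"entry": "Half Title", "broader": "c-001"},
}


def add_type(record_id, entry, broader=None):
    """Adding a new Type value is just adding a new authority record."""
    if broader is not None and broader not in TYPE_AUTHORITIES:
        raise KeyError("unknown umbrella Concept: %r" % (broader,))
    TYPE_AUTHORITIES[record_id] = {"entry": entry, "broader": broader}


def is_controlled(type_id):
    """True when a Type value used on a Variant has an authority record."""
    return type_id in TYPE_AUTHORITIES


def types_under(umbrella_id):
    """Entries of all Type values grouped under one umbrella Concept."""
    return sorted(
        record["entry"]
        for record in TYPE_AUTHORITIES.values()
        if record["broader"] == umbrella_id
    )


add_type("c-004", "Supplied Title", broader="c-001")
```

No schema change is needed to introduce "Supplied Title": the new record is the documentation.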
8. Description
XOBIS emphasizes simplicity without being simplistic. We tried to place descriptive aspects of cataloging in perspective by providing a Description element, containing repeatable Annotation elements distinguished by Type. Many notations actually refer to a Relationship; for example, a statement of responsibility may apply to a single Author Relationship. However, Description may be recorded at the Record level, especially to support mapping of retrospective data. The 100% transcription represented in fixed digital documents and XOBIS' treatment of Relationships reduce the need for notes. Such information is carried as part of the Relationship, rather than mingling it with information uniquely applying to the source or target Record being related.
9. Instantiation
Figure 6 provides another view of derivation and how Relationships between Principal Elements define the basic structure, even when not explicitly recorded. The "is a" relationship functions as a generic form/genre mechanism, which we refer to as a categorical Relationship, as opposed to a topical one. Specific cases of each Concept shown here are instantiated as, or represent one instance of, the value of a particular Principal Element. Fictional Relationships (as well as imaginary, mythical, legendary, etc. ones) function similarly, with these cases belonging to the same Principal Element as their real counterparts. This brings a great deal of homogeneity to the structure and underpins its integration. We expect that defining Relationships, including Relationships between Relationships, will be a "growth industry." This area holds potential for knowledge management, new opportunities for catalogers, and is a lot of fun. For example, Shangri-La is an imaginary Place, but it is also related to the Concept Utopias (which includes imaginary places) and to the Work by Sir Thomas More. And then there's the real Place, Utopia, Texas, and others. Dystopias is a good example of a "dissociative" Relationship. There are many reference tools containing such information, but I like to imagine what the world of scholarship would be like, if such tools shared an infrastructure. Clarifying such information should pay better than being a scribe.
10. Recursion
Another defining moment was when we recognized that we were dealing with a case of tangled hierarchies, made famous by Douglas Hofstadter. Polyhierarchical and other Relationships between Principal Elements, the Qualifiers technique, and the Type mechanism (using the database to control itself, so to speak) are all recursive or self-referencing. An additional recursive feature expresses Duration of Relationship, birth dates as Qualifiers, date Qualifier to a Title, the publication date Relationship, and Record creation date--all using the same Time structure. Such complexities make it difficult to talk about any particular aspect of XOBIS in isolation, but we think the degree of integration will be easier to understand once software can delineate where you are in a Record, and what options are available to you at that point. Essentially, the complexity in the schema simplifies that of the data.
11. Validation and Control
Validation represents functionality. In addition to the access functionality mentioned earlier, XOBIS assumes a minimal amount of software support to enforce constraints on content beyond those of routine XML editors, particularly in regard to authority control. However, the design should make such development easier. For example, a changed Entry can be propagated throughout the structure and a dataset, based on the ID attribute found wherever an Entry is referenced. References to a Record in external Xobian implementations could periodically be checked for Entry changes. Control of content-related Type values anywhere in a Record would rely on straightforward software as well.
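Propagating a changed Entry by ID might look roughly like the following, using Python's standard xml.etree.ElementTree. The element names RecordList, Record, Entry, Name, Title, Relationships, and Relationship appear in this talk; the nesting, the lowercase id attribute, the record IDs, and the organization names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset: one authority Record and one Work that references it.
DATA = """
<RecordList>
  <Record id="org-1">
    <Organization><Entry><Name>Old Name Library</Name></Entry></Organization>
  </Record>
  <Record id="work-1">
    <Work><Entry><Title>Annual Report</Title></Entry></Work>
    <Relationships>
      <Relationship>
        <Organization id="org-1"><Name>Old Name Library</Name></Organization>
      </Relationship>
    </Relationships>
  </Record>
</RecordList>
"""


def propagate_entry(root, record_id, new_name):
    """Rewrite the Name wherever record_id is carried; return the count."""
    updated = 0
    for element in root.iter():
        if element.get("id") != record_id:
            continue
        if element.tag == "Record":
            # The authority Record itself: Name sits under the element's Entry.
            name = element.find("*/Entry/Name")
        else:
            # A reference elsewhere: Name is a direct child in this sketch.
            name = element.find("Name")
        if name is not None:
            name.text = new_name
            updated += 1
    return updated
```

Calling `propagate_entry(ET.fromstring(DATA), "org-1", "New Name Library")` touches both the authority Record and the reference inside the Work's Relationship, since both carry the same ID.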
A second area relates to control of the Record itself, as opposed to its content. The ControlData element contains IDs with Varia for Record identity and matching, Actions for record transaction history, and Types to relate functionality to subsets, for example, Suppressed or Pending records, or other designations. This flexibility should support improved record management.
III. Basic Structure
Figure 7, initially developed by Kevin Clarke, provides a succinct recap of XOBIS' basic structure. It shows the root element, RecordList, with the possibility of "Collection" for combining data from various implementations. Each Record has a tripartite structure: ControlData, any one of the Principal Elements, and Relationships. Each Relationship to another Principal Element may have a Name, Duration, and Description. Relationship attributes provide for degree, for example a primary vs. secondary topical Concept; class for identifying the kind of target; and type for categorizing the relative navigational direction or type of ordinate involved.
On the right, there is Entry, with an indication that various values may name a Principal Element, and its potential Qualifiers. Note that Qualifiers and Relationships both reference a Principal Element's Entry (shown by the double-dotted line), whether it is "established" or not. Below are optional Substitutes, Varia, and Description. Last, we have optional Versions, which segue nicely into FRBR. Notice that they have their own IDs and Relationships. They were initially included to support the "single-record" approach for similar versions of serials; we have used the term "substitutable" to suggest that a user, aware of differences in format, would find the content generally acceptable, whereas content in different translations or derivative works would not automatically satisfy the same request.
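The tripartite Record structure can be checked with a few lines of Python. This is a sketch under stated assumptions: the list of Principal Elements is assembled from names used in this talk, the fragment's nesting and the record ID are invented, and treating Relationships as optional (at most one) is this sketch's choice rather than a rule taken from the schema.

```python
import xml.etree.ElementTree as ET

PRINCIPAL_ELEMENTS = {
    "Concept", "String", "Language", "Organization", "Event",
    "Time", "Place", "Being", "Object", "Work",
}

# Hypothetical fragment based on the Figure 2 example.
RECORD_XML = """
<Record>
  <ControlData><ID>work-1</ID></ControlData>
  <Work><Entry><Title>The Fatal Shore</Title></Entry></Work>
  <Relationships>
    <Relationship>
      <Name>Author</Name>
      <Being><Name>Robert Hughes</Name></Being>
    </Relationship>
  </Relationships>
</Record>
"""


def check_record(record):
    """Return a list of structural problems; an empty list means none found."""
    tags = [child.tag for child in record]
    problems = []
    if tags.count("ControlData") != 1:
        problems.append("expected exactly one ControlData")
    principal = [tag for tag in tags if tag in PRINCIPAL_ELEMENTS]
    if len(principal) != 1:
        problems.append("expected exactly one Principal Element")
    if tags.count("Relationships") > 1:
        problems.append("expected at most one Relationships")
    return problems


record = ET.fromstring(RECORD_XML)
problems = check_record(record)
```

A Record holding two Principal Elements, or none, would be reported rather than silently accepted.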
IV. FRBR Comparison and Pending Issues
[Figure 9a] [Figure 9b] [Figure 9c]
Figure 8 is an initial attempt to compare XOBIS with FRBR. I think that XOBIS' Versions may approximate some FRBR Expressions, and that we have a lot to learn from areas in which the FRBR Working Group has concentrated. XOBIS' Work element, with a role attribute value of "authority," will allow any number of hierarchical and chronological Relationships. Some FRBR-related research at OCLC indicates that a relatively small number of cases really need this sort of "controlled" order.
FRBR's Entities and XOBIS' Principal Elements are remarkably similar in name, although looking closely at the composition of each reveals significant differences in interpretation. For example, often a conference name represents an Event, although it may have been conducted by an organizing committee or secretariat, an Organization. XOBIS interprets Work very broadly, providing Substitutes for identifying subject heading schemes, classification schemes, transliteration tables, calendars, etc. to integrate such sources into the framework. Likewise, each alphabet, academic course, markup language, computer font, etc. would constitute a Work in XOBIS, while these are usually only treated as topical Subjects.
There are some striking similarities with CIDOC's CRM, but it too is more prescriptive and focuses on museums, much as FRBR has a library focus. It appears that FRBR has moved in the same direction as XOBIS, but perhaps not as far, and with a more conservative reliance on tradition. Those working on the FRBR, CRM, and XOBIS models all recognize the importance of interoperability--especially for cultural heritage materials.
Although the XOBIS core appears stable, there are a number of pending issues. Taking an idealistic approach has made mapping from MARC more difficult. Our initial concentration on a subset of common fields is supported by William Moen's recent research finding that less than 5% of available content designation in MARC accounts for 80% of the occurrences. Some features of XOBIS are not supported in MARC and will necessarily have to be mapped as inaccurate defaults; we treat others as local MARC extensions, but it is not practicable to do this in some basic areas, such as identifying the kind of Qualifier--when many are not subfielded in MARC in the first place.
A flawed clustering mechanism to identify preferred Entry by Language was removed before releasing the alpha version of XOBIS. We hope to resolve this deficit generically in a beta version for added internationalization, especially of authorities--although it is not a priority area for a medical library. Similarly, we lack motivation for testing non-topical subdivision support, as we only use topical MeSH subheadings. This is also the case in mapping highly specialized MARC data, which could be handled by separate coordinated schemas (to cover cartographic details, for example). Smaller issues involve cusp clarifications, mixed material treatment, etc.
Lastly, Figure 9 illustrates the range of Relationships supported in XOBIS; these need a great deal of further investigation. While the structure supports a broad range of possibilities, delving into specifics without rules to guide decisions in this new territory makes it difficult to proceed with assurance.
V. Concluduction
Now, my concluduction, again with thanks to Patrick, as I do hope that all our good work is just a prelude to better things to come. There is a truism that nothing is constant except change. Despite this, XOBIS attempts to separate relatively stable, simple structural elements from the notorious and changing complexities of metadata. This effort focuses on addressing metadata as the critical bridge between content and sophisticated access--all three increasingly focused on XML in a digital environment.
XOBIS emphasizes flexibility, generality, and pragmatism, but deliberately avoids specifying details of how content should populate its structure. However, the integrated structure strongly suggests the need for a coordinated, companion cataloging code. After all, XOBIS was designed to help solve problems with MARC and AACR. (These are covered extensively in Putting XML to Work in the Library, due in November from ALA Editions.)
FRBR, CRM, XOBIS, and other efforts strive toward supporting cooperative systems with improved resilience and increased interoperability. I venture that there might be avenues for further cooperation, or perhaps even synthesis of our many ideas to achieve a more comprehensive and generally acceptable solution. Is an International Cataloging Code, based on a common data structure, such an alien idea for libraries and museums? It is humbling to have been invited here today, especially since XOBIS is experimental, incomplete, and untested. Thank you very much for this opportunity.