[preprint from forthcoming article in Cataloging & Classification Quarterly copyright Haworth Press]
Dick R. Miller
SUMMARY. XOBIS is an XML schema which reorganizes bibliographic and authority data elements into a single, integrated structure. It explores balancing valuable traditions with new technologies to create a potential foundation for future access to information in a distributed digital environment. It also attempts to determine a middle path between the complexity of MARC and the oversimplification of the Dublin Core. XOBIS represents an experimental effort focused on addressing metadata as the critical bridge between content and sophisticated access–all three increasingly focused on XML in a digital environment.
KEYWORDS. Integration, authorities, cataloging, XML, XOBIS, schema, entities, relationships, delineation, instantiation, identification, disambiguation, variation, equivalence, subsumption, description, validation, recursion, tangled hierarchies, interoperability.
[Author Information:] Dick R. Miller, MLS, is Head of Technical Services, Lane Medical Library, Stanford University Medical Center, Stanford, CA 94305 (E-mail: dick@stanford.edu). The author gratefully acknowledges the collective effort that XOBIS represents. This article is based on "Introducing XOBIS to the FRBR Working Group," presented August 4, 2003, at its meeting held during the Annual Conference of the International Federation of Library Associations and Institutions in Berlin, Germany.
The challenges of information management have grown immensely. Long discrete assemblages of data and documents that functioned well in their time are now found starkly juxtaposed in today's boundless and chaotic digital environment. Mechanisms to control the scholarly subset of this information now appear antiquated when compared to resources designed with the Web in mind. Traditional bibliographic structures in particular are at a disadvantage. These and many other resources manage to maintain their coherence only in isolation–silos of organization containing fodder to frustrate federated searching.
Fortunately, the Web is young. Libraries, archives, and museums must find new ways to accomplish their similar goals in this new open medium with its endless possibilities. A new medium calls for looking at problems in new ways. Mindful that collective effort is more likely to succeed in this competitive environment and to offer local benefits not possible in isolation, our choices today are strategic for success tomorrow. XOBIS is one attempt to explore whether it is possible to balance valuable traditions with new technologies to create a foundation for future access to information in a distributed digital environment.
Incite or Insight?
XOBIS is primarily about information objects and relationships. From the vantage of any given record representing an information object, relationships provide a perspective on the surrounding informational landscape. Current systems provide limited fragments of what would be possible if a more comprehensive and integrated approach were used. For example, systems can retrieve works published at a particular time, although this is more often available only as a limit to another query. However, a researcher interested in events occurring at the same time, concepts that emerged at that time, names of people, places or organizations that were contemporaneous, objects made at the time, etc. would face a myriad of disparate resources not likely to ease the process of understanding the whole. It may be too much to dream that bibliographic and authority records will all be created according to the same rules, but it is within grasp to imagine that a sparse set of data elements can be adhered to universally in the academic realm in order to organize and share information effectively without relying on a single database. It is only a small additional step to imagine coordination of equivalent names of instances of such information objects around the world. Small changes have potential for large impacts when it comes to virtual information integration. Diverse efforts point in this direction.
The current bibliographic apparatus lacks the agility to adapt efficiently to ongoing changes in the digital environment, thus hampering librarians' ability to maintain and enhance their relevance to information management. This also makes interdisciplinary collaborative efforts more difficult. Consult the recent analysis of problems with MARC and AACR correlated with XML features for details (Miller and Clarke 2003, 102-136). Greater interoperability is more likely from exertion of control at a superstructural level than in attempting to achieve absolute fidelity to a myriad of details unlikely to address local needs of a broad variety of institutions.
Some confrontation is inevitable as old ideas and competing new ones clash. Instead of strident rhetoric, it is more useful to focus on how various efforts are striving to grapple with the same problems and to seek complementarities. Although XOBIS represents an independent effort, it holds many striking similarities with FRBR, CIDOC's Conceptual Reference Model (CRM), the Semantic Network of the National Library of Medicine's Unified Medical Language System, and other efforts. There is potential for greater resonance.
XOBIS–The XML Organic Bibliographic Information Schema
Note: In the following text, names of XML elements appear in capitals, attribute names in italics, and values of either in quotes. The schema, developed using the RELAX NG schema language, is available on our website, along with extensive documentation (Miller and Clarke 2002).
One of the initial steps in developing a schema is to identify fundamental elements and determine their inter-relationships. Author, title, and subject might appear to be immediate choices for fundamental building blocks for bibliographic data. However, a systematic review of such data elements for XOBIS revealed more challenges than anticipated. The closer the inspection, the less obvious choices became. Pragmatism can make oversimplification tempting, while comprehensive theory can invite undue complexity. Balancing these tensions to determine a middle path, somewhere between Dublin Core and MARC, seemed a daunting, but desirable goal.

Some typical bibliographic facets are arranged in Table 1 to illustrate how they correspond with XOBIS' basic structure, represented by a few of its chief entities (called Principal Elements) and illustrative Relationships. In this model, "author" represents one specific kind of responsibility Relationship between a Work and a Being, or between a Work and an Organization. Similarly, a "subject" represents one kind of Relationship between a Work and a topical Concept, or between a Work and any of several entities delineated from Concept, such as Being for a person as the "subject" of a biography. A "form/genre" term represents a different relationship to a categorical Concept. However, the same Concept is involved whether a Work is about Dictionaries ("subject") or is a member of the category Dictionaries ("form/genre"). A "publisher" represents one kind of Relationship between a Work and often an Organization, really no different structurally from an Organization serving as a corporate "author." Lastly, a "series" represents a Relationship between two Works, the parent one conventionally being a serial when numbered, or a uniform title authority when unnumbered.
Quite differently from an author or a subject, a title represents the name of a given instance of a Work–an entity identity. It was clear that each instance of a Principal Element needed a Name (or Title) for identification and to serve as an anchor for Relationships. For Work, this was tantamount to title main entry. In Table 1, each of the entities (in the first and third columns) has a Name or Title. Just as importantly, Relationships need an identity. The middle column shows actual values for the Name of each Relationship between the named/titled entities. Several optional naming structures for Principal Elements accommodate segmented names/titles, personal name substructure, and hierarchical temporal names. Any Name or Title may be combined with other elements, such as Qualifiers, to create an Entry. An Entry builds on a Name or Title to uniquely represent a given instance of any Principal Element, and is replicated in Relationships to it.
While XOBIS has many similarities with conventional bibliographic interpretation, the preceding examples illustrate how somewhat subtle distinctions can result in significant structural differences. The fundamental importance of Relationships helped shape a tightly integrated structure. The schema evolved from insights gained by systematically questioning traditional approaches, while balancing valuable ideas distilled over time. The following sections attempt to explain the schema by exploring themes that emerged during the nine-month development process and that were important to resolve dilemmas that arose repeatedly. These build to the core structure presented in overview. Lastly, preliminary delving into FRBR compatibilities suggests promise in future collaboration.
1. Relationships and Functionality
In our iterative process, it became clear that relationships merited the most emphasis and were essential to a well-defined structure–one simple and reliable enough to improve current information retrieval and processing functionalities, yet intended to support future ones. This is particularly important, in view of the many competing information resources found on the Web. Could separate libraries, museums, and archives build compatible informational records that could function as a distributed database in this digital environment, the sum being greater than the individual parts? A new high-level structure could test whether a single standard based on relationships was feasible. An umbrella structure that would allow local companion data to be incorporated via XSLT transformations would hold potential for memory institutions to meet local needs and adhere to an over-arching, universal structure. At least, XOBIS could test an approach designed for the new environment. Perhaps there were solutions to the many long-standing and seemingly intractable bibliographic problems, coming into focus with the rise of digital materials and threatening to segregate these from materials in traditional formats.

Figure 1 shows how one Work, Behold Man, has Relationships to other entities comprising the 10 Principal Elements of XOBIS. For example, it has a "vital" Relationship (with the value of "Author") to a Being (with the value of "Nilsson, Lennart …"), an "organizational" Relationship (with value "Publisher") to an Organization (with value "Little, Brown …"), etc. The 10 types of these source-target Relationships parallel the 10 Principal Elements, allowing 10 categories for grouping specific Relationships by target entity. These are identified by a class attribute on the Relationship element using a matching adjectival value (shown in quotes above the corresponding Principal Element). This structure is envisioned to support categorized and/or conditional display of related information objects.
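As a minimal sketch of the categorized display envisioned here, the following Python groups Relationships by their class attribute. The element names, attribute name, and values are taken from the description of Figure 1; the nesting of the target Principal Element inside Relationship, and the use of Entry and Name beneath it, are assumptions made for illustration and are not drawn from the published schema.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hypothetical fragment: the nesting shown here is assumed, not taken from the schema.
FRAGMENT = """
<Relationships>
  <Relationship class="vital">
    <Name>Author</Name>
    <Being><Entry><Name>Nilsson, Lennart …</Name></Entry></Being>
  </Relationship>
  <Relationship class="organizational">
    <Name>Publisher</Name>
    <Organization><Entry><Name>Little, Brown …</Name></Entry></Organization>
  </Relationship>
</Relationships>
"""

def group_by_class(relationships_xml):
    """Return {class value: [(relationship name, target entry name), ...]}."""
    root = ET.fromstring(relationships_xml)
    groups = defaultdict(list)
    for rel in root.findall("Relationship"):
        name = rel.findtext("Name")
        # The target is whichever child element carries an Entry.
        target = rel.findtext("./*/Entry/Name")
        groups[rel.get("class")].append((name, target))
    return dict(groups)

grouped = group_by_class(FRAGMENT)
for class_value, pairs in grouped.items():
    print(class_value, pairs)
```

An interface could use such a grouping to show, for instance, all "vital" Relationships under one heading and all "organizational" ones under another.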
XOBIS also supports navigational Relationships, which identify the direction of a Relationship relative to a particular source or focal Record. These may be indicated for a specific Relationship via a type attribute with values of "superordinate," "subordinate," "preordinate," and "postordinate," as well as two coordinate ones, "associative" and "dissociative." The value "unspecified" supports mapping of older data. While making many Relationships explicit in XOBIS improves navigation and accessibility, it is not practicable or desirable to do this in all cases. A special group of "one to many" Relationships is envisioned. In such cases a single Relationship could trigger interface software to display multiple virtual Relationships calculated "on the fly" to give the appearance of being explicit. Examples of this are links to volumes in a series, chapters of a book, subordinate levels of hierarchical topics (Concept) with many exemplars, etc.
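The "one to many" case might be sketched as follows, under stated assumptions. Each volume record is assumed to carry an explicit Relationship whose type is "superordinate" (pointing up to its series), and software computes the reverse, virtual Relationships for the series record on demand. The record IDs, titles, and dictionary layout are hypothetical, and whether XOBIS would assign the type value in exactly this direction is also an assumption.

```python
# Hypothetical in-memory records: IDs, titles, and field names are illustrative only.
RECORDS = {
    "w1": {"title": "Example Series", "relationships": []},
    "w2": {"title": "Example Series, v. 1",
           "relationships": [{"name": "Series", "type": "superordinate", "target": "w1"}]},
    "w3": {"title": "Example Series, v. 2",
           "relationships": [{"name": "Series", "type": "superordinate", "target": "w1"}]},
    "w4": {"title": "Unrelated Work", "relationships": []},
}

def virtual_subordinates(records, focal_id):
    """Compute, on the fly, the records that point up to the focal record."""
    found = []
    for rec_id, rec in records.items():
        for rel in rec["relationships"]:
            if rel["target"] == focal_id and rel["type"] == "superordinate":
                found.append((rec_id, rec["title"]))
    return sorted(found)

members = virtual_subordinates(RECORDS, "w1")
for rec_id, title in members:
    print(rec_id, title)
```

The series record itself stores nothing about its volumes here; the list is derived at display time, which is the appearance of explicitness described above.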
Additional types of Relationships fall under the rubric general. Source-target, navigational, and general Relationships may be defined within XOBIS by creating Concept records for each case because a Relationship is just a special kind of Concept entity. A Type element with a set attribute provides a mechanism to allow each case to be assigned to a category (another Concept) to allow ad hoc grouping as defined. Thus, the Relationship "mother" might be one member of the category Concept "Genealogical Relationships." XOBIS also makes provision for Duration and Description of Relationships, as well as a degree attribute to specify the strength of the association, e.g. "primary" versus "secondary" topics, or to identify a primary author.
Analyzing bibliographic information using such a structured approach raises issues beyond the scope of this overview. Omitted in the foregoing example, the same Being is author of both Works shown; "Swedish" is the Language of the related Work; "Boston" is the Place where the publishing Organization was then located; a Relationship to another Place, "Sweden," could identify Swedish authors; and using the author's birth year, "1922-" (Time), could subset 20th century authors. It would be more efficient to record such indirect relationships as direct Relationships on the related records, leaving conditional retrieval and presentation of these to software.
2. Delineation, Homogeneity, and Optimization (Principal Elements)
The Principal Elements were delineated in an attempt to optimize handling of homogeneous classes of information to support improved functionality. The reasoning considers that entities and relationships are more likely to be processed efficiently due to their structural similarities. For example, names of Beings, whether they be authors or subjects, real or imaginary, animal or human, would be more homogeneous than subjects, which might be topical, personal, corporate, subdivided, etc. Are structural differences in coding names like Hilary Clinton, Bayard Sartoris, or J. Fred Muggs justified?
In addition, users may be missing information that is artificially separated in different indexes or databases, such as works by or about a Being, or those citing, translating, superseding or being contents of a work. Separation by form, especially digital versus print, introduces further fragmentation. Language restrictions may cause an ideal result to be omitted from a retrieval, whereas including a prime foreign language Work and recommending a translation service could compensate. Temporal discrimination may bypass classics or allow reprints to appear newer. Software could easily retrieve relationship-based information, such as author or subject, or offer options based on form when the magnitude of retrieval suggests the need for stratification of results. Software balancing precision and recall, as well as balancing characteristics of both user and information, to yield the best results is more likely achievable with simple, flexible, and more universal information structures. Even when information is segregated, it could behave consistently in federated searches when adhering to a web-oriented architecture. Of course, this assumes open accessibility of metadata.

Table 2 provides working definitions for the 10 fundamental categories of information that resulted from reviewing a wide range of data collected by archives, libraries, and museums. These Principal Elements are all structurally equal. String, covering words and phrases, is the most novel; it has potential for improvement of popular keyword searching. "Uncontrolled" vocabulary could benefit from selective authority control of words/phrases, integrating features of dictionaries and thesauri within a database. Similarly, the uniform title of a Work could stand on its own, not conditionally dependent upon a main entry; this would improve browsing in a title index. Optimal delineation of other Principal Elements relies on their unique characteristics.
XOBIS uses attributes sparingly to avoid being prescriptive–in an effort to clearly demarcate a fixed structure from changing content. Once an attribute is defined as part of a schema, it becomes difficult to change. Trying to establish precise cusps between these mutually exclusive groups of information raises interesting questions. For example, an Organization may have an Entry with a class attribute of "individual" for a solo professional practice, or for a single performer who has incorporated, whereas a Being may have an Entry with a class "familial" to indicate an indeterminate number of family members, or "collective" for broader named groups–not organized as formally as to constitute an Organization. The value "referential" allows inclusion of informational records.
Another attribute, type, illustrates the need for frugality in determining broad groupings, as XML attributes are not repeatable. There are only three for Being; "human" was fairly obvious, but "specimen" encompasses collected or named human remains, animals, plants, microscopic specimens, etc. distinct from the abstract idea of any of these (a Concept); the deliberately vague "special" allows fictional examples of all of these, as well as all manner of gods, monsters, angels, spirits, etc. To avoid arguments, other repeatable, over-lapping, or difficult to define values, which might be chosen as an attribute value, are accommodated by Relationships to a Concept. This essentially extends the idea of form/genre, or category, to all the Principal Elements; for example, Paris is a City, French is a Romance Language, and the Bibliothèque nationale de France is a Library. More than one Relationship is allowable, permitting assignment to as many categories as applicable. For Work, this resolved the "format" problem, because one value may be designated "primary" via a degree attribute of the Relationship.
Certain Principal Elements have a role attribute to allow records to represent an instance ("bibliographic" referring to an instance of a Work) or just an authority, as well as permitting the same record to serve both purposes. This distinction is limited to "substantive" Principal Elements, discussed next.
3. Derivation, Instantiation, and Versions
The integrated structure of XOBIS attempts to be comprehensive within the broad realm of memory institutions. While the Principal Elements may seem reasonably obvious, their working definitions and the cusps between them only emerged with difficulty. New cases continue to inspire debate, and it is likely that precise boundaries between them will only emerge in actual application of the schema.

Relationships between Principal Elements define the basic structure of XOBIS, even when not explicitly recorded. Table 3 provides an example of each Principal Element, illustrating how this extended to the distillation of the Principal Elements. Specific cases of each Concept are instantiated as, or represent one instance of, the value of another particular Principal Element. As in the examples in the previous section, this "isness" functions as a generic form/genre mechanism, or a categorical Relationship, as opposed to a topical one, to a Concept. Specific values not otherwise instantiated refer to another Concept. The value of systematically recording such relationships would be immense.
To elaborate, nine of the Principal Elements are derived from Concept–in selected cases where their type attribute would be "collective." A Concept can also be "abstract" (e.g. Cataloging) or "specific" (e.g. Heimlich Maneuver). Thesaurus construction might be easier if potentially infinite "specific" values were de-emphasized to focus on structural relationships between abstract and collective concepts. Five of the nine Principal Elements are considered notional in that they represent the idea of something, are solely intangible, and solely represent an authority. The remaining four, Place, Being, Object, and Work, are substantive in that they may, in a sense, have a physical existence. Place includes topographic features and buildings as they occupy space, and Work is included due to its carrier or digital existence (as binary arrangements of electrons, atoms, or even pebbles on a beach). As such, they may represent an instance, an authority, or authority/instance as the situation merits.
Fictional relationships are analogous to categorical and topical relationships. "Minnie Mouse" is a member of the category "Fictional Characters" (Concept), whereas she is a fictional exemplar of the topic "Mice" (Concept), the "fictional" Relationship conveying an important distinction to avoid her adulterating the retrieval of specimens of real mice. This is closely akin to the "depictional" Relationship, made famous by Magritte's noting "Ceci n'est pas une Pipe." on a painting of a pipe, and recently covered by a MARC relator code. XOBIS provides a flexible structure, but does not dictate how content populates it. Concepts with a temporal component, e.g. "Extinct Cities," raise similar issues. Resolving such issues is left to an assumed companion set of rules.
Defining Relationships, including Relationships between Relationships, is predictably a growth industry. This area holds potential for knowledge management, offers new opportunities for catalogers, and can be intellectually stimulating. Much of this kind of information has been gleaned with substantial effort in compiling various reference tools. It is interesting to consider what the world of scholarship would be like if such tools were integrated with bibliographic records and shared an infrastructure. Developing isolated products is counterintuitive in the integrated digital environment of the World Wide Web. Clarifying definitions and relationships presents opportunities for catalogers to add value; such work should pay better than being a scribe and is more likely to attract bright students to the profession.
The substantive Principal Elements Object and Work may have optional Versions for "substitutable" content differing in form. This avoids repeating information shared by Versions as each separate Version would inherit common values, while retaining unique ones. Versions potentially apply to Place and Being, as they are also substantive. For example, this could allow one record for a person to include selected discrete identities as Versions. Versions have their own IDs and may have discrete Relationships. Substantive Principal Elements may link to Holdings or Items, envisioned as separate schemas of a larger suite.
4. Identification
The core of XOBIS consists of Relationships that can link any of the 10 entities, called Principal Elements, to any other one. For this to work effectively, each case of a Principal Element needs a distinct and relatively stable identity, or Entry. Although appellation is complex, a careful analysis of patterns occurring in representative data permitted application of an integrated structure to names of all the Principal Elements with considerable economy. This approach also supports reuse of resulting consistent identities in a variety of contexts. The Entry for many Principal Elements may consist solely of a distinct Name, although NameSegment is available to allow substructure in more complex cases. Qualifiers, discussed in the next section, handle ambiguity in either case. Since titles serve as the conventional names of works, the elements Title and TitleSegment are used instead in naming a Work. Works in particular would benefit from having unique identities as each Entry would be based on its context among other co-filing titles.
Two other Principal Elements, Being and Time, required additional structures to cover their unique complexities. Personal names may have Surname, Forename, and Expansion, to accommodate spelled-out forms where usage involves abbreviations. Other parts of the personal name Entry are covered by Qualifiers. Thus, a Being can have a Name, such as Pink, or a personal name may consist solely of a Forename. A chronological structure for Time follows the ISO standard (Year, Month, Day, …), although a plain Name may apply, e.g. Spring. XOBIS differs from ISO, in providing a calendar attribute to indicate the Code for each specific calendar, considered a Work authority (cf. discussion of substitution below).
5. Disambiguation (Qualification and Substitution)
The Xobian naming structure contains an additional element, Qualifiers, to accommodate an otherwise ambiguous Entry. The Principal Elements themselves automatically provide a first level of disambiguation, although this apparently only suffices routinely for Time. To further avoid duplication of Entry, values representing names of instances of the other nine Principal Elements may be qualified as necessary–with the values of Principal Elements serving double duty in this capacity! Each Entry may be added as a Qualifier element to another Principal Element's name. While trying to keep rules separate from structure, it appears that routine disambiguation of titles (Work) by edition (String) and year (Time) would reduce the need for "mediated" qualification dramatically.
One defining moment in the development of XOBIS was recognition that the Entry for an Organization often involves pre-qualification. For example, an Education Dept. is the actual Organization represented by its authority. Inclusion of the parent body as part of its Entry serves to disambiguate such non-distinct names. That it is subordinate represents a discrete Relationship to its parent. The simplicity of XOBIS is that the only difference from post-qualification is the placement of the Qualifier in front of the Name (in this case, one Organization pre-qualified by another).
Each Qualifier is actually an instance of a Principal Element. Thus, any Qualifier may be under authority control, providing suitable software is available to enforce adherence. Ad hoc values may be used and are apparent due to their lack of an ID attribute identifying the Record of an associated Entry. The Principal Element String, which can accommodate any word or phrase, adds unlimited flexibility to this elegant technique. XOBIS's recursive structure allows any Qualifier to be qualified. For example, a generic title such as "Annual Report" may be qualified by an Organization, which in turn is qualified by a Place, qualified by another Place:
Annual Report (Lane Hospital (San Francisco (California)))
Such an Entry clearly identifies a Work and can stand alone in title indexes, brief displays, etc. Qualification in XOBIS may be thought of as subordinate precoordination, which keeps the Principal Element in the primary position for consistency in listing and referencing an Entry.
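The recursive qualification just described can be illustrated with a short Python sketch. The Entry class and its field names are hypothetical stand-ins for the XML structure rather than part of the schema, and only post-qualification is modeled; the data are those of the Annual Report example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entry:
    """A name plus optional Qualifiers, each of which is itself an Entry."""
    element: str   # Principal Element, e.g. "Work", "Organization", "Place"
    name: str
    qualifiers: List["Entry"] = field(default_factory=list)

def render(entry: Entry) -> str:
    """Render an Entry, recursing into each Qualifier (post-qualification only)."""
    text = entry.name
    for qualifier in entry.qualifiers:
        text += f" ({render(qualifier)})"
    return text

california = Entry("Place", "California")
san_francisco = Entry("Place", "San Francisco", [california])
lane_hospital = Entry("Organization", "Lane Hospital", [san_francisco])
annual_report = Entry("Work", "Annual Report", [lane_hospital])

heading = render(annual_report)
print(heading)
```

Because a Qualifier has the same shape as the Entry it qualifies, a single function handles any depth of nesting.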
XOBIS contains a substitution feature to handle cases where using a Principal Element's full Entry as a Qualifier may be awkward, excessive, or different from cataloging policy. Several Substitutes are available for use in places where a substitute attribute has been defined. Currently, these equivalent elements include Abbrev, Code, Citation, and Singular. These forms may be used, as required, instead of the full Entry, while maintaining authority control. Because style sheets control display, punctuation may vary depending on the situation, e.g.:
Annual Report (Lane Hospital (San Francisco, Calif.))
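A minimal sketch of substitution at display time follows, assuming that each authority record can carry an Abbrev form alongside its full name and that a Qualifier refers to the record by ID. The IDs, the dictionary layout, and the display function are hypothetical; the comma between the two Places stands in for a style-sheet decision.

```python
# Hypothetical authority data: full Entry names keyed by ID, with optional Substitutes.
AUTHORITIES = {
    "p1": {"name": "California", "Abbrev": "Calif."},
    "p2": {"name": "San Francisco"},
    "o1": {"name": "Lane Hospital"},
}

def display(authority_id, substitute=None):
    """Return the requested Substitute form if one exists, else the full name."""
    record = AUTHORITIES[authority_id]
    if substitute and substitute in record:
        return record[substitute]
    return record["name"]

# The reference is still made by ID, so authority control is retained
# even though the displayed form is abbreviated.
place = f'{display("p2")}, {display("p1", substitute="Abbrev")}'
heading = f'Annual Report ({display("o1")} ({place}))'
print(heading)
```

Requesting a Substitute that a record lacks simply falls back to the full name in this sketch; a real implementation might instead treat that as a validation error.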
6. Variation (Equivalence and Subsumption)
In XOBIS, a Varia element, containing repeatable Variant elements, handles variation homogeneously. This covers both see references for authorities and variant titles for works using a single mechanism. Each represents an intra-record equivalence relationship to the Entry. However, the technique also allows for subsumption of different, but closely related entries, on the same Record. Each Variant functions as an alternative Entry, using the same name structures, including Qualifiers discussed above. Additionally, a Variant may have a Type and Duration. This works well for identifying acronyms, pseudonyms, nicknames, kinds of added titles, and other equivalent or subsumptive variants and for indicating the specified Time period when they apply.
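To illustrate how a single mechanism might serve both see references and variant titles, the sketch below builds a lookup index in which the Entry and every Variant lead to the same preferred form. The element names Entry, Varia, Variant, Type, and Name come from the text; the placement of Varia beside Entry, and the sample names, are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the names and the position of Varia are illustrative only.
RECORD = """
<Record>
  <Being>
    <Entry><Name>Example, Author</Name></Entry>
    <Varia>
      <Variant><Type>Pseudonym</Type><Name>Pen Name</Name></Variant>
      <Variant><Type>Nickname</Type><Name>Ex</Name></Variant>
    </Varia>
  </Being>
</Record>
"""

def build_index(record_xml):
    """Map the Entry and every Variant (case-folded) to the preferred Entry."""
    root = ET.fromstring(record_xml)
    preferred = root.findtext("./*/Entry/Name")
    index = {preferred.casefold(): preferred}
    for variant in root.iterfind("./*/Varia/Variant"):
        index[variant.findtext("Name").casefold()] = preferred
    return index

index = build_index(RECORD)
print(index["pen name"])
```

The same function would work unchanged for a Work whose Variants are added titles, provided the naming elements match.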
Type permits categorization of Variants. It is a good example of a generic element in XOBIS. It is structured to allow individual values to be under authority control, as any Concept may be. For example, the collective Concept "Variant Titles" could serve as an umbrella (via a Relationship) for each specific Concept: "Spine Title," "Half Title," "Supplied Title," etc. Adding a new Type value is as simple as creating a new authority record, which could include the definition and application instructions. Such clarity and extensibility contrast with limitations inherent in MARC's coded indicators and fixed fields, where values and definitions are defined separately. As in many parts of XOBIS, records for new and revised values could be supplied by authoritative agencies for other systems to load, much as we load authorities today. Such an automatic feed holds more promise than announcing new and changed values in external documentation, and then waiting for proprietary systems to propagate them in varying ways. Relegating such values from the structure to the content, or data, makes XOBIS self-documenting to the extent that definitions, etc. are included in authority records.
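The idea that allowable Type values live in the data rather than in the schema might be sketched as follows. The Concept names are those given above; the record IDs and the member_of field, standing in for a Relationship to the umbrella Concept, are hypothetical.

```python
# Hypothetical Concept authority records: IDs and the "member_of" field are illustrative.
CONCEPTS = {
    "c1": {"name": "Variant Titles", "member_of": None},
    "c2": {"name": "Spine Title", "member_of": "c1"},
    "c3": {"name": "Half Title", "member_of": "c1"},
}

def allowed_types(concepts, set_name):
    """List the Concept names that belong to the named umbrella Concept."""
    set_ids = {cid for cid, c in concepts.items() if c["name"] == set_name}
    return sorted(c["name"] for c in concepts.values() if c["member_of"] in set_ids)

before = allowed_types(CONCEPTS, "Variant Titles")

# Adding a new Type value is just adding a record; no schema change is involved.
CONCEPTS["c4"] = {"name": "Supplied Title", "member_of": "c1"}
after = allowed_types(CONCEPTS, "Variant Titles")

print(before)
print(after)
```

Loading a batch of such records from an authoritative agency would extend the list of valid values in the same way.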
7. Description (Transcription and Annotation)
XOBIS emphasizes simplicity without being simplistic. Descriptive elements have been redistributed to permit the direct association of such values with an Entry, or with a specific Relationship, depending on which they support. This helps focus emphasis on the relationships between Principal Elements. For example, the transcription of the author statement "by one of its members" and the annotation "Author identified as N.S. Davis" both involve the same Relationship between a Work and its author (Being). Typically, such closely related information is dispersed in three different fields. In XOBIS, many notes can be accommodated solely as Relationships, e.g. "Translation of …" becoming the Relationship to another Work identified by its Entry. Carrying descriptive information as part of a Relationship provides clarity of context and avoids repetition and intermingling of information uniquely applicable to either the source or the target Record being related. This divide and conquer approach works well in many cases, but does not preclude defaulting Description at the Record level, especially to support mapping of retrospective data and to describe complex situations.
The Description element, whether associated with either an Entry (for any one of the 10 Principal Elements) or a Relationship (to any of the same 10), has the same structure. It contains repeatable Notation elements which may have a class attribute, providing broad categories with potential to control display (e.g. transcription, annotation, documentation), and may be distinguished by Type, with a set attribute, to establish groups of named notes. The generic Type mechanism permits new types of notes to be defined without changing the schema.
Description is less important and more challenging in a volatile digital environment. Completed digital documents represent 100% transcription, often available for fulltext searching. Adding structured metadata to embellish an Entry and to identify and clarify Relationships enriches access, while transcriptive notes may replicate fulltext. XOBIS places emphasis on access rather than description, while supporting both.
8. Recursion
The XOBIS structure emerged erratically, as simplicity augured reuse of structural components. Another defining moment occurred when we recognized that we were dealing with a case of tangled hierarchies, made famous by Douglas Hofstadter (Hofstadter 1980). Polyhierarchical and other Relationships between Principal Elements, the Qualifiers technique with or without substitution, and the Type mechanism (using the database to define allowable values) are all recursive or self-referencing. An additional recursive feature expresses Duration of Relationship, birth dates as Qualifiers, date Qualifier to a Title, the publication date Relationship, and Record creation date–using the same Time structure. Such complexities make it difficult to talk about any particular aspect of XOBIS in isolation, but the degree of integration will be more evident once software is available to delineate where you are in a Record, and what options are available to you at that point. Essentially, the complexity of the schema simplifies that of the data.
9. Validation and Control
Validation represents functionality. In addition to the access functionality mentioned earlier, XOBIS assumes a minimal amount of software support to enforce constraints on content beyond those of routine XML editors, particularly in regard to authority control. However, the design should make such development easier. For example, a changed Entry can be propagated throughout the structure and a dataset, based on the ID attribute found wherever an Entry is referenced. References to a Record in external Xobian implementations could periodically be checked for Entry changes. Control of content-related Type values anywhere in a Record would rely on straightforward software as well. Using this approach, "lists" of valid values for languages, relators (Relationships), form/genre terms, etc. could be issued by authoritative agencies for loading into local systems as authority records.
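The propagation of a changed Entry by way of the ID attribute might be sketched as follows in Python. This is only an illustration under stated assumptions: the sample records are drastically simplified, the placement of the ID attribute on the Principal Element and of the display form in a Name child is assumed, and the function propagate_entry is hypothetical rather than part of any XOBIS software.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified records; the real XOBIS layout may differ.
SAMPLE = """
<RecordList>
  <Record>
    <Being ID="b1"><Entry><Name>Miller, Dick</Name></Entry></Being>
  </Record>
  <Record>
    <Work ID="w1"><Entry><Title>Putting XML to Work in the Library</Title></Entry></Work>
    <Relationships>
      <Relationship>
        <Name>Author</Name>
        <Being ID="b1"><Name>Miller, Dick</Name></Being>
      </Relationship>
    </Relationships>
  </Record>
</RecordList>
"""


def propagate_entry(root, entry_id, new_value, value_tag="Name"):
    """Rewrite the value under every element whose ID matches entry_id.

    Returns the number of values that were actually changed.
    """
    updated = 0
    for element in root.iter():
        if element.get("ID") != entry_id:
            continue
        value = element.find(".//" + value_tag)
        if value is not None and value.text != new_value:
            value.text = new_value
            updated += 1
    return updated


root = ET.fromstring(SAMPLE.strip())
updated = propagate_entry(root, "b1", "Miller, Dick R.")
print(updated)
```

Matching on the identifier rather than on the text of the Entry means the established record and the reference inside the Relationship are updated together, while the Relationship's own Name is left alone.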
A second area relates to control of the Record itself, as opposed to its content. The ControlData element contains IDs with Varia for Record identity and matching, Actions for record transaction history, and Types to relate functionality to subsets, for example, Suppressed or Pending records, or other designations. This flexibility should support improved record management.
Basic Structure of XOBIS
Figure 2, initially developed by Kevin Clarke, provides a succinct overview of XOBIS' basic structure. It shows the root element, RecordList with a projected Collection for combining data from various implementations. Each Record has a tripartite structure: ControlData, any one of the Principal Elements, and Relationships. Each Relationship to another Principal Element may have a Name, Duration, and Description. Three attributes are available for a Relationship: 1) degree covers the strength of the association, for example a "primary" versus "secondary" topical Concept; 2) class has 10 values, parallel to the Principal Elements, to identify the kind of target involved, e.g. "compositional" for Work, "vital" for Being, etc.; and, 3) type for categorizing the relative navigational direction or type of ordinate involved (subordinate, superordinate, preordinate, postordinate, associative, and dissociative).
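To make the tripartite Record and the three Relationship attributes more concrete, the Python sketch below builds one Record and checks the type attribute against the six values listed above. The element names and the attributes degree, class, and type follow the text. The helper make_relationship, the Name child used to hold the target's value, and all sample values are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# The six values of the type attribute named in the text.
RELATIONSHIP_TYPES = {
    "subordinate", "superordinate", "preordinate",
    "postordinate", "associative", "dissociative",
}


def make_relationship(name, target_tag, target_value, rel_type,
                      degree=None, rel_class=None):
    """Build a Relationship pointing at one Principal Element."""
    if rel_type not in RELATIONSHIP_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type!r}")
    attributes = {"type": rel_type}
    if degree is not None:
        attributes["degree"] = degree
    if rel_class is not None:
        attributes["class"] = rel_class
    relationship = ET.Element("Relationship", attributes)
    ET.SubElement(relationship, "Name").text = name
    target = ET.SubElement(relationship, target_tag)
    # Hypothetical: the target's value is held in a Name child.
    ET.SubElement(target, "Name").text = target_value
    return relationship


# A Record with its three parts: ControlData, a Principal Element, Relationships.
record = ET.Element("Record")
ET.SubElement(record, "ControlData")
work = ET.SubElement(record, "Work")
entry = ET.SubElement(work, "Entry")
ET.SubElement(entry, "Title").text = "Putting XML to Work in the Library"
relationships = ET.SubElement(record, "Relationships")
relationships.append(make_relationship(
    "Subject", "Concept", "Cataloging", "associative", degree="primary"))
relationships.append(make_relationship(
    "Member of series", "Work", "Sample series", "superordinate",
    rel_class="compositional"))
print(ET.tostring(record, encoding="unicode"))
```

Only the type attribute is validated here because the text enumerates its values. The ten class values are not all listed, so the sketch passes class through unchecked.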
Each Principal Element has an Entry forming the nucleus of a Record. "Value" indicates that one of the various name elements applies; depending on the case, this may be Title, TitleSegment, Name, Year, etc. In most cases, Qualifiers may apply. The heavy dotted line indicates that Qualifiers and Relationships both reference a Principal Element, whether it is "established" or not. This brings authority control to Qualifiers. Various Substitutes make this authority mechanism more flexible by recording controlled alternative forms of the Entry which may be used in Qualifiers, where an attribute designates the kind of substituted value. Description completes this core.
Lastly, optional Versions are shown by dotted lines. Each Version has its own IDs and Relationships. This structure was initially included to support the "single-record" approach for similar versions of serials building on the idea of "substitutability." This idea posits that a user, aware of differences in format, would find the substitute content generally acceptable, whereas content in different translations or derivative works would not automatically satisfy the same request. This is another example of structure underpinning functionality. Whether Versions are integrated in the same Record, as in the current structure, or a special kind of linked Record is defined, it does not make sense to duplicate data shared by all the Versions as the difficulty of maintaining one Record is well known.
FRBR Comparison
FRBR provides a framework delineating key bibliographic relationships, and FRANAR extends this to related authorities. XOBIS also provides a framework, albeit a data structure, covering the same areas and extending them. Despite differences, XOBIS and FRBR+FRANAR appear generally complementary. Both defer to external agencies to determine rules necessary to implement the designs.
FRBR's Entities and XOBIS' Principal Elements are remarkably similar in name, although looking closely at the composition of each reveals significant differences in interpretation. For example, often a conference name represents an Event, although it may have been conducted by an organizing committee or secretariat, an Organization. Notably, XOBIS treats series as a Work rather than as a separate entity. XOBIS interprets Work very broadly, providing Substitutes for succinctly identifying subject heading schemes, classification schemes, transliteration tables, calendars, etc. to integrate such sources into the framework. Likewise, each alphabet, academic course, markup language, computer font, etc. would constitute a Work in XOBIS, while these are usually only treated as topical subjects.
Versions in XOBIS may approximate some FRBR Expressions, and XOBIS' Work element, with a role attribute value of "authority," will allow any number of hierarchical Relationships encompassing those in FRBR. It is heartening that FRBR-related research at OCLC indicates that a relatively small number of cases really need this sort of controlled order to gain significant benefits (Bennett et al. 2003). For XOBIS, we were aware of such relationships, but considered them a pragmatic issue since work/expression distinctions are not necessary in most cases.
There are some striking similarities with CIDOC's CRM, but it too is more prescriptive and focuses on museums, much as FRBR has a library focus. It appears that FRBR has moved in the same direction as XOBIS, but perhaps not as far, and with a more conservative reliance on tradition. Those working on the FRBR, CRM, and XOBIS models recognize the need for improved resilience and increased interoperability of systems for cultural heritage materials.
Projection
The XOBIS core appears stable, although there are a number of pending issues. Taking an idealistic approach has made mapping from MARC more difficult, although we are making progress. Our initial concentration on a subset of common fields is supported by William Moen's recent research finding that less than 5% of available content designation in MARC accounts for 80% of the occurrences (Moen and Benardino 2003). Some features of XOBIS are not supported in MARC and will necessarily have to be mapped as defaults; local MARC extensions cover other details, but extensions are not practicable in some basic areas, such as identifying the kind of Qualifier.
A flawed clustering mechanism to identify preferred Entry by Language was removed before releasing the alpha version of XOBIS. We hope to resolve this deficit generically in a beta version for improved internationalization. Accommodating non-topical subheadings with a subdivision attribute has not been tested. Mapping highly specialized MARC data, which could be handled by separate coordinated schemas (e.g. for cartographic details), remains unaddressed. Lesser issues involve cusp clarifications, mixed material treatment, etc. Relationships need further investigation because of their importance.
While XOBIS supports many possibilities, delving into specifics without rules to guide decisions makes it difficult to proceed with assurance. Our intent was not to develop in isolation, but to demonstrate feasibility of a new structure and seek collaboration to further explore the possibilities that might ensue. To be effective, such a data structure would require development of correlating rules, ideally an International Cataloging Code. As these are such large tasks, we hope that XOBIS contributes toward encouraging the needed transitions.
There is a truism that nothing is constant except change. The XOBIS framework emphasizes generality and pragmatism, but deliberately avoids specifying details of how unpredictable data values should populate its structure, leaving this to implied rules governing content. Likewise, the structure articulates discretely with implicit software functionality. Separation of content, structure, and function, a uniform system for entries, preeminence of relationships, and other recursive features aim to balance rigor with flexibility to maximize information access and reuse. In sum, XOBIS represents an experimental effort focused on addressing metadata as the critical bridge between content and sophisticated access–all three increasingly focused on XML in a digital environment.
Works cited
Bennett, Rick, Brian F. Lavoie, and Edward T. O'Neill. 2003. "The Concept of a Work in WorldCat: An Application of FRBR". In Library Collections, Acquisitions, and Technical Services 27(1): 45-59.
Hofstadter, Douglas R. 1980. Gödel, Escher, Bach: an Eternal Golden Braid. Vintage Books.
Miller, Dick R., and Kevin S. Clarke. 2002. XOBIS: The XML Organic Bibliographic Information Schema. Available online at http://elane.stanford.edu/laneauth/XOBIS.pdf (accessed 2 January 2004). Schema website available online at http://xobis.stanford.edu.
Miller, Dick R., and Kevin S. Clarke. 2003. Putting XML to Work in the Library. American Library Association.
Moen, William E., and Penelope Benardino. 2003. "Assessing Metadata Utilization: An Analysis of MARC Content Designation Use". In Proceedings of the 2003 Dublin Core Conference: Supporting Communities of Discourse and Practice – Metadata Research & Applications. Available online at http://www.siderean.com/dc2003/502_Paper58.pdf (accessed 5 January 2004).