Content scheme banishes browser plug-ins

By Ted Smalley Bowen, Technology Research News

There’s not much guesswork involved in pulling a book off the shelf. Little has changed since Johann Gutenberg came up with movable type -- provided you read the language, you can just crack the cover and you're on your way.

By contrast, the babble of data formats represented on the present-day Internet considerably confuses the process of accessing digital information. Your basic Web browser can only handle so many data types, and the prospect of searching for and adding the right plug-in can be laborious even when successful. And as with all things digital, nothing stays the same for long.

In order to display what you want, your browser must be able to make sense of the relationships within groupings of digital files so it can, for instance, show the correct graphic with a block of text, accommodate both the thumbnails and larger views of a set of pictures, or synchronize a video with lecture slides. The problem involves finding and coordinating the right programs to display the various types of text, image, or sound files scattered throughout the Internet.

A Cornell University researcher has found a way to identify key characteristics of digital content in order to match content with programs that can display it in a browser.

The scheme modifies existing software for storing and displaying digital content files so that the two processes are kept separate, according to the researcher, Naomi Dushay.

The scheme is especially useful for displaying content from the Internet because neither the content nor the program that activates it needs to be present on the system that displays the content.

The context broker software at the heart of the scheme is a set of Java programs that generates Web pages. The context broker acts as a go-between for the repositories that store digital content, the programs that act on the content, and the browsers that display the results.

By separating the storage and maintenance of digital content from its presentation, the scheme could foster more specialization throughout the digital community, said Dushay. “Digital content providers might choose to specialize in content only, or also put up context brokers and go after both presentation as well as content,” she said.

The scheme also opens up the possibility of more individualized presentation of data, or for augmenting the presentation of data created by others, said Dushay. "For example, the Cornell University Library might have some whizzy presentation, rendering mechanisms targeted for members of the Cornell University community. These might be made available via... context brokers."

The method could also allow "searching, categorizing sites such as Google or Yahoo [to] provide a context broker so users want to access resources via their sites," she said.

The context broker gains information about the content from the metadata contained within content files. Metadata is data about data, and can include, for example, descriptions of the contents of a file or groups of files, or administrative details related to the data.

The scheme uses this structural metadata to identify the appropriate program for presenting the data. The programs are listed in a behavior registry, which also includes information about how a playback program can be accessed, which data structures the program can handle, and what effects each program can produce.
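As a rough illustration of the behavior registry described above, here is a minimal Python sketch. It is not Dushay's implementation (the article says the context broker is written in Java); the class name, field names, registry entries and example.org URLs are all invented for illustration. Each entry records the effect a program produces, how the program can be reached, and which structural parts it can handle.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BehaviorEntry:
    name: str                  # effect the playback program produces
    access_url: str            # how the program can be reached (hypothetical URL)
    required_parts: frozenset  # structural parts the program can handle

# Hypothetical registry contents; every entry here is invented.
REGISTRY = [
    BehaviorEntry("thumbnail_gallery", "http://example.org/gallery",
                  frozenset({"thumbnail", "full_image"})),
    BehaviorEntry("synced_lecture", "http://example.org/lecture",
                  frozenset({"video", "slides"})),
]

def behaviors_for(structure):
    """Return the names of behaviors whose required parts are all present."""
    parts = frozenset(structure)
    return [entry.name for entry in REGISTRY if entry.required_parts <= parts]
```

Under these assumptions, an object whose structural metadata lists a thumbnail and a full image would be offered the gallery behavior, while an object with only a video would be offered nothing from this registry.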

The context broker ties digital content to the behaviors these playback programs can produce. By changing the behaviors listed in the behavior registry, content behaviors can be changed without modifying the content itself.

Dushay tested the scheme using the Cornell Digital Library research group’s Fedora, a repository that stores agglomerations of different types of data drawn from multiple locations.

To use the scheme, a user looks through a list of playback effects available for a given piece of content in the repository and requests that a certain program present the content in question.

The software matches the content's structural metadata, including the access points through which behaviors are assigned to the content, against the appropriate playback program. It then loads the program and uses it to access the content and display it in a Web browser.
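The request flow described above (list the available effects, pick one, have the broker run the program and return a Web page) might look something like the following Python sketch. Everything in it is an assumption made for illustration: the content record, the two playback functions, the `PROGRAMS` table and the HTML produced are invented, and the real context broker is a set of Java programs.

```python
from html import escape

# Hypothetical structural metadata for one repository object (invented values).
CONTENT = {
    "id": "lecture-42",
    "parts": {"video": "http://example.org/v.mpg",
              "slides": "http://example.org/s.pdf"},
}

def synced_lecture(parts):
    # Stand-in for a playback program that pairs a video with slides.
    return f"<p>Video {escape(parts['video'])} with slides {escape(parts['slides'])}</p>"

def plain_listing(parts):
    # Stand-in for a fallback program that simply lists every part.
    items = "".join(f"<li>{escape(k)}: {escape(v)}</li>"
                    for k, v in sorted(parts.items()))
    return f"<ul>{items}</ul>"

# Behavior name -> (structural parts required, playback program).
PROGRAMS = {
    "synced_lecture": ({"video", "slides"}, synced_lecture),
    "plain_listing": (set(), plain_listing),
}

def available(content):
    """List the behaviors whose required parts the content provides."""
    have = set(content["parts"])
    return sorted(name for name, (required, _) in PROGRAMS.items()
                  if required <= have)

def present(content, behavior):
    """Run the requested playback program and wrap its output in a page."""
    if behavior not in available(content):
        raise ValueError(f"{behavior!r} is not available for {content['id']!r}")
    _, program = PROGRAMS[behavior]
    body = program(content["parts"])
    return f"<html><body><h1>{escape(content['id'])}</h1>{body}</body></html>"
```

In this sketch a user would first call `available(CONTENT)` to see the list of effects, then `present(CONTENT, "synced_lecture")` to get the generated page.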

A lot of content is now created with the kind of explicit structural metadata the scheme calls for, said Dushay. In addition, objects lacking it could be assigned metadata by inferring the information or by “using some sort of fuzzy pattern matching on structural access points required by behavior mechanisms,” she said.

Although the context broker model does not require control over the content or playback programs, end-users will need direct or indirect authorization to access the content. Metadata access could be made separate from access to the data itself, said Dushay. “It's possible to expose structural metadata without exposing the content itself. It's possible to determine the potential for [playback] behaviors with only the structural metadata, though eventually that content will be required to actually perform those behaviors,” she said.

Dushay also plans to make the scheme work with other content repositories. The scheme will eventually use more sophisticated pattern matching as a means of sorting through structural metadata, and there may be ways to add more detailed descriptive information to that metadata, she said.

She also has plans to tailor the context broker’s playback for individual users, to allow differences in spoken language or language proficiency to condition how each user receives the data.

As the scheme moves beyond the proof-of-concept stage, the amount of computing and network resources needed to pull the various pieces together could become an issue, Dushay noted.

The network resources needed to provide access to structural metadata “could get costly, but perhaps this could be alleviated with caching or mirroring of frequently used data and mechanisms at context broker sites,” she said.

The format of metadata and the behavior mechanism input requirements will also impact performance, she said. “If it's possible to index the input requirements [and] structural metadata for fast look up, great. But if they're 'fuzzy' matches, then this may become a performance issue.”
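The difference between an indexed lookup and a fuzzy match can be illustrated with a small Python sketch. The behavior names, part labels and similarity threshold below are assumptions chosen for the example, and string similarity from the standard library's `difflib` is swapped in as one possible meaning of "fuzzy"; the article does not say how the fuzzy matching would actually work. The exact lookup is a single dictionary probe on an identical set of part labels, while the fuzzy lookup has to compare the content's labels against every registered behavior.

```python
from difflib import SequenceMatcher

# Hypothetical input requirements: behavior name -> part labels it needs.
REQUIREMENTS = {
    "synced_lecture": ("video", "slides"),
    "thumbnail_gallery": ("thumbnail", "full_image"),
}

# Exact index: the set of required labels maps straight to behavior names.
INDEX = {}
for name, parts in REQUIREMENTS.items():
    INDEX.setdefault(frozenset(parts), []).append(name)

def exact_lookup(parts):
    """One dictionary probe; labels must match a requirement set exactly."""
    return INDEX.get(frozenset(parts), [])

def fuzzy_lookup(parts, threshold=0.8):
    """Scan every behavior; each required label needs a similar enough match."""
    def similar(a, b):
        return SequenceMatcher(None, a, b).ratio() >= threshold
    return [name for name, required in REQUIREMENTS.items()
            if all(any(similar(r, p) for p in parts) for r in required)]
```

With these assumptions, content labeled "videos" and "slide" misses the exact index but is still picked up by the fuzzy scan, at the cost of work that grows with the number of registered behaviors.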

Dushay's work was funded by the National Science Foundation (NSF).

Timeline:   Now
Funding:   Government
TRN Categories:   Databases and Information Retrieval; Internet
Story Type:   News
Related Elements:  Technical paper, “Using Structural Metadata to Localize Experience of Digital Content”,


April 17/24, 2002

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.