Web searches tap databases

By Kimberly Patch, Technology Research News

Although the computer has made it possible to quickly search through documents and databases, sifting through a series of sources -- like a local database, a bunch of text documents, and the Web -- still means using different programs and different searches.

Researchers from Birkbeck, University of London in England have written software designed to allow users to search for something without having to know where it might reside.

The search method makes it possible to search different types of sources at the same time, said Richard Wheeldon, a researcher at Birkbeck, University of London. "Think of how difficult it is to search a company's intranet, file system and databases at the same time," he said. "With a few alterations to our technology, it could be made incredibly simple."

The key to the method is software, dubbed DbSurfer, that permits free-text searches on the contents of relational databases. Data stored in relational databases is ordinarily accessed using queries structured to match the organization of the database.
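To illustrate the contrast, here is a minimal Python sketch using the standard sqlite3 module. It is not DbSurfer's code; the tables, column names and the free_text_search helper are hypothetical, invented for illustration. The structured query only works if the user already knows which table and column to ask about, while the free-text search simply scans every row of every table for a keyword.

```python
import sqlite3


def free_text_search(conn, keyword):
    """Return (table, row) pairs where any column value contains the keyword."""
    hits = []
    table_names = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in table_names:
        for row in conn.execute(f'SELECT * FROM "{table}"'):
            # Case-insensitive substring match over every cell in the row.
            if any(keyword.lower() in str(value).lower() for value in row):
                hits.append((table, row))
    return hits


# Hypothetical two-table database, built in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO authors VALUES (1, 'Vannevar Bush');
    INSERT INTO papers VALUES (10, 'As We May Think', 1);
""")

# Structured query: the user must know the schema (table and column names).
structured = conn.execute(
    "SELECT title FROM papers WHERE author_id = 1").fetchall()

# Free-text query: no knowledge of the schema is needed.
free = free_text_search(conn, "bush")
```

A full scan like this is only practical for tiny databases; a real system would presumably build an index first.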

DbSurfer enables free-text searches of relational databases through a modified version of the trails method used to organize the links contained in hypertext, said Wheeldon.

A trail is a sequence of connected pages. As far back as 1945, computer pioneer Vannevar Bush wrote about the concept of a web of trails. Hypertext and the World Wide Web take advantage of this concept, but databases do not, according to Wheeldon. "Trails have often been used in hypertext systems, but never in relational database systems," he said.

Relational databases organize information in tables made up of rows and columns. Each row is a record and each column is a field, and individual pieces of information reside in the cells where rows and columns meet. Relationships between records are determined by fields that the records have in common.

The researchers' software automatically constructs trails across tables in relational databases, according to Wheeldon. The software treats each database row as a virtual Web page, and builds links according to database settings, he said.
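The idea of rows as pages can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the researchers' implementation: the toy tables are hypothetical, every row is assumed to have an "id" primary key, the "database settings" are assumed to be foreign-key declarations, and links are assumed to run in both directions.

```python
from collections import defaultdict

# Hypothetical toy database: each table is a list of rows, and every row
# has an "id" primary key (an assumption made for this sketch).
TABLES = {
    "authors": [{"id": 1, "name": "Vannevar Bush"}],
    "papers": [
        {"id": 10, "title": "As We May Think", "author_id": 1},
        {"id": 11, "title": "Science, the Endless Frontier", "author_id": 1},
    ],
}

# Foreign keys as (child table, child column, parent table, parent column).
FOREIGN_KEYS = [("papers", "author_id", "authors", "id")]


def build_links(tables, foreign_keys):
    """Map each virtual page (table, id) to the set of pages it links to."""
    links = defaultdict(set)
    for child_t, child_c, parent_t, parent_c in foreign_keys:
        # Look up the parent row's id from the value the child refers to.
        parent_ids = {row[parent_c]: row["id"] for row in tables[parent_t]}
        for row in tables[child_t]:
            if row[child_c] in parent_ids:
                child_page = (child_t, row["id"])
                parent_page = (parent_t, parent_ids[row[child_c]])
                # Assumption: a reference can be followed in both directions.
                links[child_page].add(parent_page)
                links[parent_page].add(child_page)
    return links


def is_trail(links, pages):
    """A trail is a sequence of pages in which each page links to the next."""
    return all(b in links.get(a, ()) for a, b in zip(pages, pages[1:]))


links = build_links(TABLES, FOREIGN_KEYS)
trail = [("papers", 10), ("authors", 1), ("papers", 11)]
```

Here is_trail(links, trail) holds because each paper row links to its author row, so a reader can move from one paper to another by the same author through the author's page.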

When presented with a free-text database query, DbSurfer's navigation engine calculates scores for each database row, and the best scores are used to construct trails. The scheme uses a probabilistic best-first algorithm to select the most relevant trails. A probabilistic best-first algorithm assigns more promising alternatives higher probabilities. The researchers' Best Trail algorithm does this in two ways -- proportionally according to the score assigned to the trail, and decreasing exponentially according to rank. The program presents the results to the user as a navigation search interface.
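The two weighting schemes can be sketched as follows. This is a hedged illustration, not the Best Trail algorithm itself: the function names, the decay parameter and its default of 0.5 are assumptions, and the article does not say how the two schemes are combined, so they are shown separately.

```python
import random


def proportional_probs(scores):
    """Probability of each candidate in proportion to its score."""
    total = sum(scores)
    if total == 0:
        # Fall back to a uniform choice when every score is zero.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def rank_decay_probs(scores, decay=0.5):
    """Probability falls off exponentially with rank (best score = rank 0).

    The decay factor is a hypothetical parameter chosen for this sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    weights = [0.0] * len(scores)
    for rank, i in enumerate(order):
        weights[i] = decay ** rank
    total = sum(weights)
    return [w / total for w in weights]


def pick(candidates, probs, rng):
    """Draw one candidate at random according to the given probabilities."""
    return rng.choices(candidates, weights=probs, k=1)[0]


# Three hypothetical candidate trails with made-up relevance scores.
candidates = ["trail A", "trail B", "trail C"]
scores = [1.0, 4.0, 2.0]

by_score = proportional_probs(scores)   # trail B gets 4/7 of the probability
by_rank = rank_decay_probs(scores)      # trail B is rank 0, C rank 1, A rank 2
chosen = pick(candidates, by_rank, random.Random(0))
```

In both schemes the most promising candidate is the most likely to be expanded next, but weaker candidates still have a chance, which is what distinguishes a probabilistic best-first search from a strictly greedy one.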

The researchers have also used the same basic system to search the Web, a group of Java documents, program code, and Usenet newsgroups, said Wheeldon. "Theoretically, it could also be used in virtual environments or as a search application at the operating system level," he said.

The method uses standard keyword searches of data sources, and is easily customized, said Wheeldon. Data is represented in the Internet's extensible markup language (XML), and this means "the look of the pages can be changed in many different ways," Wheeldon said.

The technical challenge to building the software was being able to construct trails efficiently, said Wheeldon. "Trail construction is now typically performed in a few hundredths of a second," he said.

The current prototype won't scale to very large databases, but this is not a fundamental limitation, said Wheeldon. "Anything more than a few tens of millions of rows and the system will choke [but] this is easily fixed in theory," he said.

The software does have a downside -- it is not secure enough for highly sensitive data, Wheeldon said.

The next steps in developing the system are linking the DbSurfer indexer to a Web robot, optimizing the indexer for common databases, and adding software that will enable the entire index and trail structure to be accessed from within the database interface, according to Wheeldon.

The software could be ready for deployment in less than a year, said Wheeldon.

Wheeldon's research colleagues were Mark Levene and Kevin Keenoy. The research was funded by the UK Engineering and Physical Sciences Research Council (EPSRC).

Timeline:   < 1 year
Funding:   Government
TRN Categories:  Databases and Information Retrieval; Internet
Story Type:   News
Related Elements:  Technical paper, "Search and Navigation in Relational Databases," posted in the Computing Research Repository (CoRR) database at arxiv.org/abs/cs.DB/0307073.


September 24/October 1, 2003

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.