Web searches tap databases
By Kimberly Patch, Technology Research News
Although the computer has made it possible
to quickly search through documents and databases, sifting through a series
of sources -- like a local database, a bunch of text documents, and the
Web -- still means using different programs and different searches.
Researchers from Birkbeck, University of London in England have
written software designed to let users search for something without
having to know where it might reside.
The search method makes it possible to search different types
of sources at the same time, said Richard Wheeldon, a researcher at
Birkbeck. "Think of how difficult it is to search
a company's intranet, file system and databases at the same time," he
said. "With a few alterations to our technology, it could be made incredibly
simple."
The key to the method is software, dubbed DbSurfer, that permits
free-text searches on the contents of relational databases. Data stored
in relational databases is ordinarily accessed using queries structured
to match the organization of the database.
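
The contrast between a structured query and a free-text search can be sketched in a few lines of Python. This is only an illustration, not DbSurfer's code: the tables, the sample rows and the `free_text_search` helper are made up for the example, and the search is a naive scan of every row.

```python
import sqlite3

# Hypothetical toy database; the tables and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, office TEXT);
CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT, lead TEXT);
INSERT INTO staff VALUES (1, 'Ada', 'London');
INSERT INTO projects VALUES (1, 'Trail search prototype', 'Ada');
""")

# Structured access: the query must name the table and the column.
structured = conn.execute(
    "SELECT title FROM projects WHERE lead = ?", ("Ada",)
).fetchall()


def free_text_search(conn, keywords):
    """Return (table, row) pairs whose combined text contains every keyword."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    hits = []
    for table in tables:
        for row in conn.execute(f'SELECT * FROM "{table}"'):
            # Flatten the row into one lowercase string and match substrings.
            text = " ".join(str(v) for v in row if v is not None).lower()
            if all(k.lower() in text for k in keywords):
                hits.append((table, row))
    return hits


# Free-text access: the user supplies keywords only, with no schema knowledge.
print(free_text_search(conn, ["ada"]))
```

The free-text call finds matching rows in both tables, while the structured query only works for a user who already knows that project leads are stored in the `lead` column of `projects`.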
DbSurfer enables free-text searches of relational databases through
a modified version of the trails method used to organize the links contained
in hypertext, said Wheeldon.
A trail is a sequence of connected pages. As far back as 1945,
computer pioneer Vannevar Bush wrote about the concept of a web of trails.
Hypertext and the World Wide Web take advantage of this concept, but databases
do not, according to Wheeldon. "Trails have often been used in hypertext
systems, but never in relational database systems," he said.
Relational databases organize information in tables made up of
rows and columns. Each row is a record and each column is a field, and
individual pieces of information reside in the cells where they meet.
Relationships between records are determined by fields that the records
have in common.
The researchers' software automatically constructs trails across
tables in relational databases, according to Wheeldon. The software treats
each database row as a virtual Web page, and builds links according to
database settings, he said.
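
One way to picture rows as linked pages is the Python sketch below. It assumes the "database settings" in question are foreign key constraints, which the article does not spell out. The tables, the sample rows and the `build_row_graph` helper are hypothetical, and the nested loops are written for clarity rather than speed.

```python
import sqlite3

# Hypothetical toy schema; the foreign key stands in for the "database settings".
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES departments(id)
);
INSERT INTO departments VALUES (1, 'Research'), (2, 'Sales');
INSERT INTO employees VALUES (1, 'Ada', 1), (2, 'Grace', 1);
""")


def build_row_graph(conn):
    """Treat each row as a page keyed by (table, rowid) and link related rows."""
    tables = [r["name"] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    pages = {}
    for t in tables:
        for row in conn.execute(f"SELECT rowid AS _rowid, * FROM {t}"):
            pages[(t, row["_rowid"])] = dict(row)

    links = {page: [] for page in pages}
    for t in tables:
        # Each foreign key names a target table plus the two columns to compare.
        for fk in conn.execute(f"PRAGMA foreign_key_list({t})").fetchall():
            for src, row in pages.items():
                if src[0] != t:
                    continue
                for dst, other in pages.items():
                    if dst[0] == fk["table"] and other[fk["to"]] == row[fk["from"]]:
                        links[src].append(dst)  # link from referencing row
                        links[dst].append(src)  # and back from referenced row
    return pages, links


pages, links = build_row_graph(conn)
for page, neighbours in links.items():
    print(page, "->", neighbours)
```

In this toy graph a trail is any sequence of pages that follows the links, such as an employee row, then its department row, then another employee in the same department.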
When presented with a free-text database query, DbSurfer's navigation
engine calculates scores for each database row, and the best scores are
used to construct trails. The scheme uses a probabilistic best-first algorithm
to select the most relevant trails. A probabilistic best-first algorithm
assigns more promising alternatives higher probabilities. The researchers'
Best Trail algorithm does this in two ways -- proportionally according
to the score assigned to the trail, and decreasing exponentially according
to rank. The program presents the results to the user as a navigation search
interface.
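
The two weighting schemes can be illustrated with a short Python sketch. It is not the researchers' Best Trail algorithm, only a picture of what "proportional to score" and "decreasing exponentially with rank" mean when choosing which candidate trail to expand next. The function names and the `decay` parameter are assumptions made for the example, and scores are assumed to be positive.

```python
import random


def score_proportional_weights(scores):
    """Probability of each candidate is its share of the total score."""
    total = sum(scores)
    return [s / total for s in scores]


def rank_decay_weights(scores, decay=0.5):
    """Probability falls off as decay**rank, where rank 0 is the best score.

    `decay` is a hypothetical tuning parameter between 0 and 1.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    raw = [0.0] * len(scores)
    for rank, i in enumerate(order):
        raw[i] = decay ** rank
    total = sum(raw)
    return [w / total for w in raw]


def pick_trail(trails, scores, weights_fn, rng=random):
    """Randomly pick one candidate trail, favouring the more promising ones."""
    weights = weights_fn(scores)
    return rng.choices(trails, weights=weights, k=1)[0]


# Made-up candidate trails and scores, for illustration only.
trails = ["trail A", "trail B", "trail C"]
scores = [1.0, 4.0, 2.0]
print(score_proportional_weights(scores))
print(rank_decay_weights(scores))
print(pick_trail(trails, scores, rank_decay_weights))
```

Both schemes leave lower-scoring trails with some chance of being explored, which is what separates a probabilistic best-first search from one that always expands the single top-scoring candidate.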
The researchers have also used the same basic system to search
the Web, a group of Java documents, program code, and Usenet newsgroups,
said Wheeldon. "Theoretically, it could also be used in virtual environments
or as a search application at the operating system level," he said.
The method uses standard keyword searches of data sources, and
is easily customized, said Wheeldon. Data is represented in the Internet's
extensible markup language (XML), and this means "the look of the pages
can be changed in many different ways," he said.
The technical challenge to building the software was being able
to construct trails efficiently, said Wheeldon. "Trail construction is
now typically performed in a few hundredths of a second," he said.
The current prototype won't scale to very large databases, but
this is not a fundamental limitation, said Wheeldon. "Anything more than
a few tens of millions of rows and the system will choke [but] this is
easily fixed in theory," he said.
The software does have a downside -- it is not secure enough for
highly sensitive data, Wheeldon said.
The next steps in developing the system are linking the DbSurfer
indexer to a Web robot, optimizing the indexer for common databases, and
adding software that will enable the entire index and trail structure
to be accessed from within the database interface, according to Wheeldon.
The software could be ready for deployment in less than a year,
said Wheeldon.
Wheeldon's research colleagues were Mark Levene and Kevin Keenoy.
The research was funded by the UK Engineering and Physical Sciences Research
Council (EPSRC).
Timeline: < 1 year
Funding: Government
TRN Categories: Databases and Information Retrieval; Internet
Story Type: News
Related Elements: Technical paper, "Search and Navigation
in Relational Databases," posted in the Computing Research Repository
(CoRR) database at arxiv.org/abs/cs.DB/0307073.
September 24/October 1, 2003