Surf's up: Computer wavelet tool filters information
December 02, 1998
RICHLAND, Wash. –
The amount of information available to businesses, governments and scientists today is unprecedented. Businesses must pay close attention to marketing plans, strategy reports and government regulations. Governments must analyze satellite data, news and intelligence reports quickly and thoroughly.
The pressure to keep one step ahead of the competition can create information overload.
But mathematicians and computer scientists at the Department of Energy's Pacific Northwest National Laboratory are developing an escape from information anxiety: TOPIC ISLANDS™. This new interactive software program transforms data from large documents into visualizations and excerpted summaries. It recognizes themes and the evolution of topics within a document, then breaks the document into easily understandable sections.
"This technology could help people who are overloaded with information, such as teachers, researchers and lawyers," said Rik Littlefield, senior research scientist at Pacific Northwest. "They could find out what topics are discussed in a document and to what degree without having to spend 10 hours reading."
TOPIC ISLANDS™ creates visual outlines of major themes, much as a student would outline an essay with Roman numerals marking major themes and letters marking subtopics. However, this new computer program uses algorithms to categorize the document into themes and can process many pages simultaneously.
For example, Pacific Northwest researchers tested the technology on speeches given by Fidel Castro over the past 30 years. The test detected the main theme of each speech while sifting through tangential topics. TOPIC ISLANDS™ was able to quickly identify the main theme of each speech and the order in which Castro visited various topics.
TOPIC ISLANDS™ is applicable to the daily needs of individuals and organizations. The program could be used by businesses wanting to better manage document storage, legal aides searching for case law, intelligence agencies needing improved information analysis and business owners trying to keep up on the latest trends. Scientists and technical editors also could use the technology to better manage their workload, information needs and research requirements.
The underlying technology is called TOPIC-O-GRAPHY™. Here's how it works:
A computer program creates a digital signal from the words within a document. The signal is fed into a wavelet engine, a tool that mathematically filters the signal to varying degrees. TOPIC ISLANDS™ can be tailored to a person's needs by further filtering the text to reveal theme changes in more detail. The resulting thematic structure can be visualized in many different ways or formatted into a table of contents with emphasis on the most prominent themes.
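The release does not spell out how the signal is built or which wavelet is used, so the following is only an illustrative sketch of the general idea, not the laboratory's method. It assumes the simplest possible choices: the signal is a 0/1 indicator marking where one term occurs, and the filter is repeated Haar-style pairwise averaging. The function names `word_signal` and `haar_smooth` are hypothetical.

```python
import re


def word_signal(text, term):
    """Return a 0/1 signal with 1.0 at each word position where `term` occurs.

    Assumption: a single-term indicator stands in for whatever signal
    the real system derives from a document's words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return [1.0 if w == term else 0.0 for w in words]


def haar_smooth(signal, levels):
    """Filter a signal by `levels` rounds of pairwise averaging.

    Each round halves the resolution, like the approximation branch of
    a Haar wavelet transform (scaled by 1/2 rather than 1/sqrt(2) so
    values stay between 0 and 1). More rounds give a coarser view.
    """
    approx = list(signal)
    for _ in range(levels):
        if len(approx) % 2:
            approx.append(0.0)  # zero-pad odd lengths (a simplification)
        approx = [(approx[i] + approx[i + 1]) / 2
                  for i in range(0, len(approx), 2)]
    return approx


# Toy eight-word "document": the first half is about tax, the second about sugar.
text = "tax tax budget tax sugar harvest sugar sugar"
print(haar_smooth(word_signal(text, "tax"), 2))    # [0.75, 0.0]
print(haar_smooth(word_signal(text, "sugar"), 2))  # [0.0, 0.75]
```

In this toy case, two rounds of filtering reduce the text to two blocks, and each term peaks in the half of the document where it dominates. Filtering fewer times keeps more blocks and so shows finer theme changes, which loosely mirrors the "varying degrees" of filtering described above.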
"Our goal is to reduce the time a person needs to spend reading long articles," Littlefield said. "This technology allows a person to determine if that document possesses pertinent information and deserves further attention. A person can understand what information is in a document, what themes it covers and whether it requires complete reading."
The U.S. intelligence community paid about $200,000 for development of TOPIC ISLANDS™ over the past year. The technology is being advanced further this year through the Defense Advanced Research Projects Agency for about $120,000.
This developing technology is not yet available for licensing. Business inquiries should be directed to Dennis McQuerry, business development coordinator for PNNL Information Visualization, at firstname.lastname@example.org or 509-375-2953. Also, information about TOPIC ISLANDS™ is available at Pacific Northwest's information visualization web site, http://multimedia.pnl.gov:2080/infoviz/index.html.