United States Patent | 5,819,260 |
Lu , et al. | October 6, 1998 |
A phrase recognition method breaks streams of text into text "chunks" and selects certain chunks as "phrases" useful for automated full-text searching. The phrase recognition method uses a carefully assembled list of partition elements to partition the text into the chunks, and selects phrases from the chunks according to a small number of frequency-based definitions. The method can also incorporate additional processes such as categorization of proper names to enhance phrase recognition. The method selects phrases quickly and efficiently, referring simply to the phrases themselves and the frequency with which they are encountered, rather than relying on complex, time-consuming, resource-consuming grammatical analysis, or on collocation schemes of limited applicability, or on heuristic text analysis of limited reliability or utility.
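The two steps named in the abstract, partitioning text into chunks and then selecting phrases by frequency, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the small `PARTITION_WORDS` set stands in for the patent's carefully assembled partition list, which is not reproduced in this record, and the `min_words` and `min_count` thresholds are hypothetical stand-ins for the frequency-based definitions. The function names `chunk_text` and `select_phrases` are likewise invented.

```python
import re
from collections import Counter

# Hypothetical partition list; the patent's actual list is not given here.
PARTITION_WORDS = {
    "a", "an", "the", "of", "in", "on", "for", "and", "or",
    "to", "is", "was", "by", "with",
}


def chunk_text(text):
    """Split text into chunks at punctuation and at partition words."""
    chunks = []
    # Punctuation (anything other than word characters, whitespace,
    # apostrophes, and hyphens) always ends a chunk.
    for segment in re.split(r"[^\w\s'-]+", text.lower()):
        current = []
        for word in segment.split():
            if word in PARTITION_WORDS:
                # A partition word closes the chunk collected so far.
                if current:
                    chunks.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            chunks.append(" ".join(current))
    return chunks


def select_phrases(chunks, min_words=2, min_count=2):
    """Keep multi-word chunks seen at least min_count times (assumed rule)."""
    counts = Counter(chunks)
    return {
        chunk: count
        for chunk, count in counts.items()
        if len(chunk.split()) >= min_words and count >= min_count
    }


if __name__ == "__main__":
    sample = (
        "The full text search of the case law was slow. "
        "A full text search is useful for case law."
    )
    print(select_phrases(chunk_text(sample)))
```

The sketch relies only on the chunks and their counts, with no part-of-speech tagging or parsing, which is the trade-off the abstract describes. It omits the optional proper-name categorization step.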
Inventors: | Lu; Xin Allan (Springboro, OH), Miller; David James (Dayton, OH), Wassum; John Richard (Springboro, OH) |
Assignee: | Lexis-Nexis (Miamisburg, OH) |
Appl. No.: | 08/589,468 |
Filed: | January 22, 1996 |
Current U.S. Class: | 707/700 ; 707/770; 707/917; 707/968; 707/999.001; 707/999.003; 707/999.004; 707/999.005; 707/E17.058 |
Current International Class: | G06F 17/30 (20060101); G06F 017/30 () |
Field of Search: | 395/604,603,605,600,602 707/3,1,4,5 |
4864502 | September 1989 | Kucera et al. |
4868750 | September 1989 | Kucera et al. |
4914590 | April 1990 | Loatman et al. |
4931936 | June 1990 | Kugiyama et al. |
4994966 | February 1991 | Hutchins |
5123103 | June 1992 | Othaki et al. |
5146405 | September 1992 | Church |
5161105 | November 1992 | Kugimiya et al. |
5225981 | July 1993 | Yokogawa |
5251316 | October 1993 | Anick et al. |
5265065 | November 1993 | Turtle |
5287278 | February 1994 | Rau |
5289375 | February 1994 | Fukumochi et al. |
5297042 | March 1994 | Morita |
5299124 | March 1994 | Fukumochi et al. |
5410475 | April 1995 | Lu et al. |
5418948 | May 1995 | Turtle |
5481742 | January 1996 | Worley et al. |
5488725 | January 1996 | Turtle |
Salton et al., "A Simple Syntactic Approach for the Generation of Indexing Phrases", Technical Report, Department of Computer Science, Cornell University, Ithaca, New York, pp. 1-8 (note this reference is cited on form PTO 1449), Jul. 1990. |
Salton, Gerard, et al., "A Simple Syntactic Approach for the Generation of Indexing Phrases", Technical Report, Department of Computer Science, Cornell University, Ithaca, New York, Jul. 1990. |
Coates-Stephens, Sam, "The Analysis and Acquisition of Proper Names for the Understanding of Free Text", Computers and the Humanities, Kluwer Academic Publishers, the Netherlands, vol. 26, pp. 441-456, 1993. |
Evans, David A., et al., "Automatic Indexing Using Selective NLP and First Order Thesauri", Intelligent Text and Image Handling, Proceedings of a Conference on Intelligent Text and Image Handling 'RIAO 91', Barcelona, Spain, Apr. 1991. |
Church, Kenneth Ward, et al. (Bell Laboratories), "A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text", Proceedings of 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No. 89CH2673-2), Glasgow, Scotland, UK, May 1989, pp. 695-698. |
Ahlswede, Thomas, et al., "Automatic Construction of a Phrasal Thesaurus for an Information Retrieval System from a Machine Readable Dictionary", Proceedings of RIAO '88, Cambridge, Massachusetts, Mar. 1988, pp. 597-608. |