A New Filtering Algorithm for Duplicate Document Based on Concept Analysis

Journal of Computer Science, May 2006, 2(5)

Publisher Description

Abstract: Databases and web pages currently contain a huge number of duplicate documents. It is therefore essential to have a filter that can be embedded, for instance, within an information retrieval system such as a search engine, to prevent references to redundant documents from appearing on screen in reply to the user's query. Such a filter saves the user time and increases user satisfaction. In this study, we propose a new algorithm, based on the principle of concept analysis, that can act as a filter for duplicate documents. It can be applied to a collection of documents or to databases, reducing their storage space by eliminating redundant documents without losing knowledge. Our experiments show that this algorithm increases the precision of the information retrieval system and improves its performance.

Key words: Duplicate document, concept analysis, information retrieval, information filtering

GENRE
Computers & Internet
RELEASED
May 1, 2006
LANGUAGE
EN
English
LENGTH
21 pages
PUBLISHER
Science Publications
SELLER
The Gale Group, Inc., a Delaware corporation and an affiliate of Cengage Learning, Inc.
SIZE
193.7 kB