International Conference on Communication, Media, Technology and Design


BUILDING ARABIC AUTOMATIC THESAURUS USING CO-OCCURRENCE TECHNIQUE

Famagusta – North Cyprus

2013.05.02

Abstract
One of the major problems of modern Information Retrieval (IR) systems is word mismatch: the discrepancy between the terms used to describe documents and the terms researchers use to describe their information needs.
One way of handling word mismatch is to use a thesaurus, which shows the (usually semantic) relationships between terms. The main goal of this study is to design and build an automatic Arabic thesaurus using the co-occurrence technique, one that can be used in any specialized field or domain to improve the query expansion process and retrieve more relevant documents for the user's query. Results from this study were compared with those of a traditional information retrieval system.
Two hundred and forty-two Arabic documents and 59 Arabic queries were used to build the components the thesaurus requires, such as the inverted file, the index, and the term-term co-occurrence matrix. All of these documents use computer science and information systems vocabulary.
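The components named above can be illustrated with a minimal Python sketch. It is not the Oracle implementation used in the study. The toy documents, the whitespace tokenization (no Arabic stemming or stop-word removal), the function names, the Dice coefficient as the similarity measure, and the 0.6 threshold are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():  # assumption: plain whitespace tokenization
            index[term].add(doc_id)
    return index


def cooccurrence_matrix(index):
    """For each pair of terms, count the documents in which both appear."""
    matrix = defaultdict(dict)
    for a, b in combinations(sorted(index), 2):
        shared = len(index[a] & index[b])
        if shared:
            matrix[a][b] = shared
            matrix[b][a] = shared
    return matrix


def expand_query(query_terms, index, matrix, threshold=0.6):
    """Append terms whose Dice similarity to a query term meets the threshold.

    Dice is an assumed measure; the abstract does not name the one used.
    """
    expanded = list(query_terms)
    for term in query_terms:
        for other, shared in matrix.get(term, {}).items():
            dice = 2 * shared / (len(index[term]) + len(index[other]))
            if dice >= threshold and other not in expanded:
                expanded.append(other)
    return expanded


# Hypothetical toy collection (not the study's 242 documents).
docs = {
    1: "حاسوب شبكة بيانات",  # computer network data
    2: "حاسوب شبكة",         # computer network
    3: "قاعدة بيانات",        # base data (database)
}
index = build_inverted_index(docs)
matrix = cooccurrence_matrix(index)
expanded = expand_query(["حاسوب"], index, matrix)
print(expanded)
```

In this toy collection, "حاسوب" and "شبكة" always appear together, so the query is expanded with "شبكة", while "بيانات" shares only one document with the query term and falls below the assumed threshold.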
The system was implemented in an Oracle 10g environment and run on a Pentium 4 laptop with a 2.13GHz processor, 2.86MB of RAM, and a 500GB hard disk.
We conclude that the co-occurrence thesaurus improved recall. However, it has many limitations compared with the traditional information retrieval system in terms of recall and precision levels.
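For reference, recall and precision for a single query can be computed as in the sketch below. The document ids are hypothetical and are not results from the study. They only show how an expanded query can raise recall while lowering precision.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments and result lists (illustration only).
relevant = {1, 3, 4}
baseline = precision_recall([1, 2], relevant)            # original query
with_expansion = precision_recall([1, 2, 3, 5, 6], relevant)  # expanded query
print(baseline, with_expansion)
```

Here the expanded query retrieves two of the three relevant documents instead of one, but also brings in more non-relevant ones.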
