The Use of the Arabic Language in Information Retrieval Systems (استخدام اللغة العربية في نظم استرجاع المعلومات)
Date
2015-06-13
Authors
ناديه مصطفى العيدورس أحمد
Publisher
University of Khartoum
Abstract
Dissertation Title: The use of Arabic Language in Information Retrieval
Systems
Prepared by: Nadia Mustafa Elidrous Ahmed
Supervisor: Professor Radia Adam Mohammed
University: University of Khartoum, Faculty of Arts, Department
of Library and Information Sciences
Pages: 309 p
Year: 2007
The aim of this research is to study the characteristics of the Arabic
language that affect the information retrieval process, to identify the
problems arising from the use of Arabic in different information systems,
and to put forward recommendations for solving them. It also aims to
compare the tools and search strategies employed in information retrieval
systems with regard to their influence on the precision ratio of the
retrieved documents, and to study the implications of applying or not
applying Arabic language processing technologies (spelling and grammar
checking, morphological analysis, etc.) for the handling of Arabic
language characteristics (synonymy, homonymy, antonymy, etc.).
The research adopted descriptive and experimental methodologies. A series
of experiments was run on a selected sample of Arabic and Arabic-supporting
retrieval systems (integrated, non-integrated, open source, and the search
engines available on the internet) using different search-term levels (a
word, a word with its prefix and suffix, two words, a sentence, and the
root of the word) and different search methods (natural language, Boolean,
search in specified fields, and search using quotation marks). The purpose
was to measure the effect of Arabic language characteristics (synonymy,
homonymy, etc.), as well as spelling mistakes and miswritten Arabic words,
and to compare the precision ratio of the systems (the percentage of
retrieved records that concern the search topic), taking into consideration
each system's Arabic language processing technologies, the Arabic definite
article ( أل ), and the different search levels mentioned above.
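The precision ratio referred to above can be illustrated with a minimal sketch. It assumes the standard definition of precision (relevant retrieved records as a percentage of all retrieved records); the function name `precision_ratio` and the record identifiers are hypothetical and are not taken from the dissertation.

```python
def precision_ratio(retrieved, relevant):
    """Return the percentage of retrieved records that are relevant.

    Assumption: standard precision, i.e. |retrieved AND relevant| / |retrieved|.
    """
    retrieved = set(retrieved)
    if not retrieved:
        # No records retrieved: report 0.0 rather than dividing by zero.
        return 0.0
    hits = retrieved & set(relevant)
    return 100.0 * len(hits) / len(retrieved)


# Hypothetical search result: four records retrieved, two of them relevant.
retrieved_ids = ["r1", "r2", "r3", "r4"]
relevant_ids = ["r1", "r3", "r9"]
print(precision_ratio(retrieved_ids, relevant_ids))  # 50.0
```

Running the same query at each search level (word, two words, root, and so on) and comparing the resulting percentages is one way the comparison across systems could be tabulated.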
The study drew on a sample of the following information retrieval
systems: the UNESCO computerized documentation system (CDS/ISIS), the
Horizon system, the Greenstone system, the Arabic search engines (Alidrisi
and Ayna), and the Arabic-supporting search engine (Google).
The study report comprises five chapters. The first chapter covers the
Arabic language: its characteristics, properties, systems and sciences. The
second chapter handles the use of Arabic in computers: the Arabization of
computers, the standardization of Arabic in computers, the automated
processing of Arabic, the techniques used for that processing, and the
problems of information retrieval in Arabic. The third chapter concentrates
on the commercial and non-commercial information retrieval systems employed
in information institutions, with emphasis on the services provided, the
languages used, the technologies affecting the systems, and the steps
followed and tools used in searches; samples of these systems are given.
Chapter four deals with the Arabic information retrieval systems and the
Arabic-supporting retrieval systems available on the internet. It also
covers advantages and disadvantages, the services of the internet, and the
technologies influencing it, and concentrates on search tools (search
engines, directories and search portals), with examples. Chapter five
contains an analytical study investigating the effect of the
characteristics of the Arabic language on information retrieval systems.
The study reached many results. The most important is that, despite the
growth of Arabic content on the internet, the ability to handle this
content properly and accurately and to make the fullest possible use of it
is still limited. This is due to the lack of an efficient Arabic search
mechanism that takes into consideration all the characteristics of the
Arabic language. Several conclusions are drawn, e.g. the need for a
compulsory standard for Arabic character encoding when publishing
information on the internet.
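To illustrate why a common character encoding standard matters for retrieval, the following minimal sketch shows the same Arabic word stored with two different sets of code points failing to match until both are normalized. Using Unicode and NFKC normalization here is an assumption made for illustration; the abstract does not name a specific standard, and the helper `normalize_arabic` is hypothetical.

```python
import unicodedata


def normalize_arabic(text):
    """Fold compatibility (presentation-form) characters to standard letters.

    Assumption: Unicode NFKC normalization is used as the common form.
    """
    return unicodedata.normalize("NFKC", text)


# The word "كتاب" (book) written with standard Arabic code points.
standard = "\u0643\u062A\u0627\u0628"
# The same word written with Arabic presentation-form code points,
# as some legacy documents and converters produce.
presentation = "\uFEDB\uFE98\uFE8E\uFE8F"

# A naive string comparison treats them as different words.
assert standard != presentation
# After normalization the indexed text and the query match.
assert normalize_arabic(presentation) == normalize_arabic(standard)
```

Applying the same normalization to both the indexed documents and the user's query is the design choice that makes the comparison meaningful.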
Description
340 Pages
Keywords
Arabic language; information retrieval systems; Arabic letters; linguistic system; morphological system