Iranian Researchers Launch Text Mining System for Persian Literary Texts
“One of the new fields of computer science is text analysis, which enjoys a precise and solid connection with the Persian language and literature in its method and purpose,” said Ehsan Rayeesi, a faculty member of Isfahan University who holds a PhD in Persian language and literature (mystical literature) and a bachelor's degree in software engineering.
He explained that text mining, which has become very popular in the past two decades, focuses on extracting valuable information and patterns from unstructured data.
“Designing and developing a web-based system for using artificial intelligence and text mining to analyze and understand text in various fields, from the specialized needs of people studying Persian language and literature to the non-scientific needs of different classes of society, producing technical knowledge, using the methods and techniques of literary sciences for the promotion and development of methods and techniques of text mining and artificial intelligence were among the goals of this project that was fortunately realized,” Rayeesi said.
Text mining, also called text data mining (TDM) or text analytics, is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources."
Written resources may include websites, books, emails, reviews, and articles. High-quality information is typically obtained by devising patterns and trends by means such as statistical pattern learning.
Text mining usually involves structuring the input text (usually parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. 'High quality' in text mining usually refers to some combination of relevance, novelty, and interest.
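The three stages described above (structuring, pattern derivation, interpretation) can be illustrated with a minimal Python sketch. It is not the system built by the Isfahan researchers; the stop-word list, the sample sentences, and the function names are all hypothetical, and document frequency stands in for the more elaborate statistical methods a real system would likely use.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to"}

def structure(text):
    """Stage 1: structure raw text by lowercasing, tokenizing, and dropping stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def derive_patterns(docs):
    """Stage 2: count in how many documents each term appears (document frequency)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df

def interpret(df, n_docs, min_share=0.5):
    """Stage 3: keep only terms that appear in at least min_share of the documents."""
    return sorted(t for t, c in df.items() if c / n_docs >= min_share)

# Made-up sample documents (assumption, not from any real corpus).
documents = [
    "The rose is the flower of love",
    "Love and wine in the garden",
    "A nightingale sings to the rose",
]

structured = [structure(d) for d in documents]
frequencies = derive_patterns(structured)
recurring = interpret(frequencies, len(documents))
print(recurring)  # ['love', 'rose']
```

Here "love" and "rose" each occur in two of the three sample sentences, so they pass the 50 percent threshold, while every other term occurs only once and is filtered out.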
Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities).