Models and Methods of Automatic Classification of Text Documents

Authors: Andreev A.M., Berezkin D.V., Syuzev V.V., Shabanov V.I. Published: 06.05.2014
Published in issue: #4(53)/2003  
Category: Informatics & Computing Technology  

The problem of determining term weights is considered for text-processing programs (search, classification, quasi-abstract generation, clustering). Possible solutions to the problem are analyzed, and algorithms and software implementing each of them are developed. The methods are compared experimentally on an automatic classification task. The best results are obtained with the method of approximate extraction of word combinations based on statistical data.