Document Classification Using Weighted Ontology


This paper presents a document comparison and classification model for Lithuanian-language texts based on a weighted ontology. Tests were performed to evaluate three aspects: i) the quality of document comparison; ii) the optimal size of the ontology; iii) which parts of speech should be used to build the ontology. The final results show 96% correct classification and suggest that terms from all the main parts of speech in the text should be used. The proposed model can be used to classify texts more efficiently than keyword-based systems.