The invention discloses an integrated automatic
lexical analysis method for ancient Chinese texts. The method includes the following steps: pre-training ancient Chinese word vectors that carry semantic features by using the Word2Vec model; adding the
information appearing in historical documents to the ancient name
database to form a number of
proper noun entries; adjusting each parameter of the Bi-LSTM-CRF neural
network model; preprocessing the final training corpus into a model-readable form, loading it into the neural
network model, learning iteratively, and automatically evaluating the labeling result on the test corpus. According to the method, a
tagging method that integrates sentence segmentation, word segmentation and part-of-speech tagging is adopted, so that the repeated tagging passes required when
lexical analysis is split into multiple sub-tasks are omitted, and the multi-stage
propagation of tagging errors is also avoided. According to the method, a
deep learning model is adopted, rich language features can be learned automatically, and the work of manually customizing a feature template in traditional
machine learning is omitted. The labeling model is accelerated with GPU hardware, so that the model
training time can be greatly shortened, and the efficiency is much higher than that of a traditional
machine learning model.
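
The integrated tagging described above can be illustrated with a minimal sketch. The label format used here is an assumption for illustration only and is not taken from the invention: each character receives one label made of a word-boundary code (B, I, E or S), a part-of-speech tag, and a hypothetical END suffix on the last character of a sentence. The function names encode and decode and the part-of-speech tags are likewise hypothetical. With such a scheme, a single sequence labeler can predict sentence breaks, word breaks and parts of speech in one pass.

```python
from typing import List, Tuple

# A sentence is a list of (word, part-of-speech tag) pairs.
Sentence = List[Tuple[str, str]]

# Hypothetical suffix marking the last character of a sentence.
END = "END"


def encode(sentences: List[Sentence]) -> Tuple[List[str], List[str]]:
    """Flatten tagged sentences into characters and one joint label per character.

    Assumes part-of-speech tags contain no "-" character.
    """
    chars: List[str] = []
    labels: List[str] = []
    for sentence in sentences:
        start = len(labels)
        for word, pos in sentence:
            if len(word) == 1:
                bounds = ["S"]  # single-character word
            else:
                bounds = ["B"] + ["I"] * (len(word) - 2) + ["E"]
            for ch, bound in zip(word, bounds):
                chars.append(ch)
                labels.append(f"{bound}-{pos}")
        if len(labels) > start:
            labels[-1] += f"-{END}"  # mark the sentence boundary
    return chars, labels


def decode(chars: List[str], labels: List[str]) -> List[Sentence]:
    """Recover sentences, words and tags from a well-formed joint label sequence."""
    sentences: List[Sentence] = []
    sentence: Sentence = []
    word = ""
    for ch, label in zip(chars, labels):
        parts = label.split("-")
        bound, pos = parts[0], parts[1]
        word += ch
        if bound in ("E", "S"):  # the current word is complete
            sentence.append((word, pos))
            word = ""
            if len(parts) == 3 and parts[2] == END:
                sentences.append(sentence)
                sentence = []
    if sentence:  # keep a trailing sentence that lacks an END marker
        sentences.append(sentence)
    return sentences


if __name__ == "__main__":
    # Illustrative input; the tags n, v and d are placeholders.
    example = [[("子", "n"), ("曰", "v")], [("君子", "n"), ("不", "d"), ("器", "v")]]
    chars, labels = encode(example)
    print(list(zip(chars, labels)))
    print(decode(chars, labels) == example)
```

In a sketch like this, the character and label sequences produced by encode would form the model-readable training corpus, and decode would turn predicted labels back into segmented, tagged sentences for evaluation.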