DOU Luyao, WANG Zehao, ZHOU Zhigang, MIAO Junzhong
LIBRARY TRIBUNE. 2026, 46(5): 73-84.
Identifying named entities in classical texts, a vital repository of Chinese civilization, holds great significance for advancing digital humanities, digitizing ancient manuscripts, and building knowledge graphs of ancient books. To address challenges in model architecture and transfer-learning capability for named entity recognition (NER) in classical texts, this paper proposes AT-LSFT, a novel framework based on the collaboration of large and small models together with systematic fine-tuning strategies. Specifically, the large language model LLaMA3-8B provides semantic guidance and entity pre-localization, while the smaller BiLSTM-CRF model handles feature decoupling and fine-grained entity recognition. Through three phases of systematic fine-tuning, involving prompt engineering, knowledge distillation, and dynamic feedback, the two models effectively balance semantic comprehension and reasoning efficiency. Extensive experiments against 19 baseline models, spanning traditional, deep-learning, pre-trained, and large language models, show that AT-LSFT achieves superior performance with an F1 score of 84.8%. The framework consistently outperforms existing mainstream methods in recognizing various types of classical-text entities, such as persons, locations, official titles, and organizations, demonstrating stronger semantic parsing capability and generalization performance. It also excels at recognizing low-frequency entities and resolving semantically ambiguous samples.
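The large-small collaboration described in the abstract can be sketched as a two-stage pipeline. The sketch below is purely illustrative: the function names, the gazetteer heuristic standing in for LLaMA3-8B's entity pre-localization, and the lookup table standing in for BiLSTM-CRF classification are all hypothetical placeholders, not the paper's implementation.

```python
# Illustrative two-stage AT-LSFT-style pipeline: a "large model" stub
# pre-localizes candidate entity spans, then a "small model" stub assigns
# fine-grained labels (PER = person, LOC = location, OFF = official title).
# Both stubs are hypothetical stand-ins for LLaMA3-8B and BiLSTM-CRF.

def large_model_prelocalize(tokens):
    """Stand-in for LLM-guided entity pre-localization: returns
    candidate (start, end) spans over the token list."""
    # Toy heuristic: any token found in a small gazetteer is a candidate.
    gazetteer = {"诸葛亮", "蜀", "丞相"}
    return [(i, i + 1) for i, tok in enumerate(tokens) if tok in gazetteer]

def small_model_classify(tokens, spans):
    """Stand-in for BiLSTM-CRF fine-grained recognition: maps each
    candidate span to an entity type."""
    type_map = {"诸葛亮": "PER", "蜀": "LOC", "丞相": "OFF"}
    return [(s, e, type_map.get(tokens[s], "O")) for s, e in spans]

def two_stage_pipeline(tokens):
    spans = large_model_prelocalize(tokens)     # stage 1: semantic guidance
    return small_model_classify(tokens, spans)  # stage 2: fine-grained NER
```

For example, `two_stage_pipeline(["诸葛亮", "为", "蜀", "丞相"])` returns `[(0, 1, "PER"), (2, 3, "LOC"), (3, 4, "OFF")]`; in the actual framework, stage 1 would be a prompted LLaMA3-8B and stage 2 a distilled BiLSTM-CRF refined via dynamic feedback.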