A topic-focused web crawler is a web crawler that collects information within a specific domain. This paper proposes applying topic-focused crawlers to information collection for digital archives. Starting from the design goals of an archival collection system, it describes the design of an archival information collection system based on a topic-focused crawler and the technologies involved in implementing that system.