Abstract Most existing peer-to-peer (P2P) systems support only search by file identifier, such as a title, which greatly limits the range of P2P applications. The broadcast search used in pure P2P networks is blind and inefficient, and it wastes network bandwidth. This paper proposes a content-based intelligent search algorithm for P2P environments. The vector space model (VSM) is used to perform similarity-based queries. Each peer clusters the queries it has handled in the past; for a new query, it uses these query clusters to select the peers most likely to hold the answers and sends the query to them, which improves search efficiency. As more queries are processed, the query clusters adjust automatically at low maintenance cost, so the scheme is adaptive. Experiments show that content-based intelligent search greatly improves search efficiency while preserving query effectiveness.
Key Words P2P; similarity; clustering; intelligent search
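The abstract does not give the algorithm's details, so the following is only a minimal sketch of the idea it describes: represent queries as VSM vectors, group past queries into clusters by cosine similarity, and route a new query to the peers that answered the closest cluster. The names `Peer`, `QueryCluster`, `record`, `select_peers`, the raw term-frequency weighting, the single-pass clustering rule, and the `threshold` value are all assumptions made for illustration, not the paper's actual method.

```python
import math
from collections import Counter


def to_vector(text):
    """Term-frequency vector of a whitespace-tokenized query (assumed weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class QueryCluster:
    """Hypothetical query cluster: summed member vectors plus answering peers."""

    def __init__(self, query_vec, peers):
        # The sum of member vectors has the same direction as their mean,
        # so it can stand in for the centroid under cosine similarity.
        self.centroid = Counter(query_vec)
        self.peers = Counter(peers)  # peer id -> number of times it answered

    def absorb(self, query_vec, peers):
        self.centroid.update(query_vec)
        self.peers.update(peers)


class Peer:
    """Hypothetical peer that learns from past queries (not the paper's API)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed similarity cutoff
        self.clusters = []

    def _closest(self, vec):
        """Return the most similar cluster, or None if none passes the cutoff."""
        best = max(self.clusters, key=lambda c: cosine(vec, c.centroid), default=None)
        if best is None or cosine(vec, best.centroid) < self.threshold:
            return None
        return best

    def record(self, query, answering_peers):
        """Update clusters after a query completes (single-pass clustering)."""
        vec = to_vector(query)
        cluster = self._closest(vec)
        if cluster is None:
            self.clusters.append(QueryCluster(vec, answering_peers))
        else:
            cluster.absorb(vec, answering_peers)

    def select_peers(self, query, k=2):
        """Pick up to k peers that most often answered similar past queries."""
        cluster = self._closest(to_vector(query))
        if cluster is None:
            return []  # no similar history; the caller would fall back to broadcast
        return [p for p, _ in cluster.peers.most_common(k)]


peer = Peer(threshold=0.5)
peer.record("jazz piano trio", ["B", "C"])
peer.record("jazz piano solo", ["B"])
peer.record("linux kernel scheduler", ["D"])
print(peer.select_peers("piano jazz", k=1))
```

In this sketch the two jazz queries fall into one cluster and the kernel query into another, so a later jazz-related query is routed to the peer that answered most often in that cluster. Because clusters are updated on every `record` call, they shift as the query workload changes, which is one simple way to read the adaptive behaviour the abstract mentions.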