Article Abstract
ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie. Similarity Search over Print-on-demand Platform[J]. Packaging Engineering, 2015, 36(23): 135-139.
Similarity Search over Print-on-demand Platform
Received: 2015-05-23  Revised: 2015-12-10
DOI:
Keywords: print-on-demand  P-Rank  similarity search  "user-product" relation graph
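Funding: Scientific Research Innovation Project of the Shanghai Municipal Education Commission (15ZZ074); Training Program for Young Teachers in Shanghai Universities (ZZSLG14021); Tendered Research Project of the Shanghai Publishing and Media Institute (SAYB1410); Doctoral Start-up Fund of the University of Shanghai for Science and Technology (1D-14-309-001)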
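Author  Affiliation
ZHANG Ming-xi  University of Shanghai for Science and Technology, Shanghai 200093
ZHANG Lei-hong  University of Shanghai for Science and Technology, Shanghai 200093
LYU Wei  University of Shanghai for Science and Technology, Shanghai 200093
SUN Liu-jie  University of Shanghai for Science and Technology, Shanghai 200093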
Abstract:
      Objective: To study the efficiency of similarity search over print-on-demand (POD) platforms. Methods: A "user-product" relation graph was built from the purchase relationships between users and products, and the similarity between products was measured according to the structure of this graph. Based on P-Rank, an efficient similarity search method, POD-Rank, was proposed that divides the computation into two steps. In the first step, the similarity between users is computed offline from the "user-product" relation graph; in the second step, the similarity between the query product and each candidate product is computed online from the user similarities. To further reduce the response time of online query processing, an optimized query processing algorithm was proposed that skips unnecessary accumulation operations on zero values. Results: The pre-computation time cost and the storage cost of POD-Rank were significantly lower than those of P-Rank, and POD-Rank responded quickly to query requests. Conclusion: The similarity computation cost of POD-Rank was 0.03% of that of P-Rank, and its storage cost (the size of the similarity matrix) was 0.06% of that of P-Rank, while its effectiveness was close to that of P-Rank. The method can therefore meet the needs of processing large-scale product data on POD platforms.
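The online step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the POD-Rank formulas, so the scoring rule below is an assumption: a SimRank-style average of user similarities over all buyer pairs of two products, scaled by a decay factor. The names `query_similar_products`, `buyers`, `user_sim`, and `decay`, as well as the sample data, are hypothetical. The user similarities are taken as given, standing in for the offline step, and are stored sparsely so that zero-valued pairs are never visited during accumulation.

```python
from collections import defaultdict


def query_similar_products(query, buyers, user_sim, decay=0.8, top_k=3):
    """Rank products by an assumed SimRank-style similarity to `query`.

    buyers:   dict mapping product -> set of users who bought it
    user_sim: dict mapping user -> {other_user: similarity}; only nonzero
              entries are stored (assumed to be precomputed offline)
    """
    # Inverted index: user -> products bought by that user.
    purchased = defaultdict(set)
    for product, users in buyers.items():
        for user in users:
            purchased[user].add(product)

    # Accumulate user similarities over buyer pairs (u, v), where u bought
    # the query product and v bought the candidate. Because only nonzero
    # entries of user_sim are iterated, zero-valued pairs are skipped.
    totals = defaultdict(float)
    for u in buyers[query]:
        for v, sim in user_sim.get(u, {}).items():
            for candidate in purchased.get(v, ()):
                if candidate != query:
                    totals[candidate] += sim

    # Normalize by the number of buyer pairs (assumed scoring rule).
    n_query = len(buyers[query])
    scores = {
        p: decay * total / (n_query * len(buyers[p]))
        for p, total in totals.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical sample data: purchases and precomputed user similarities.
buyers = {
    "mug": {"ann", "bob"},
    "tshirt": {"ann", "cat"},
    "poster": {"dan"},
}
user_sim = {
    "ann": {"ann": 1.0, "bob": 0.4},
    "bob": {"bob": 1.0, "ann": 0.4},
    "cat": {"cat": 1.0},
    "dan": {"dan": 1.0},
}

result = query_similar_products("mug", buyers, user_sim)
print(result)
```

In this sample, only "tshirt" is returned for the query "mug": "poster" has no buyer with a nonzero similarity to any buyer of "mug", so it is never touched during accumulation. Iterating over stored nonzero entries is one way to realize the zero-skipping idea; the paper's actual algorithm may differ.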
