Analysis and Research on User Interest Models in Personalized Search Engines.doc
About 41 pages, DOC format
Description
Posted by member jiji888
Length: about 22,400 characters
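Abstract (Chinese original, translated): The rapid development of modern Internet technology has linked information exchange across the world and has caused explosive growth in Internet information resources. An unavoidable side effect is that users struggle to obtain useful information quickly. The retrieval service offered by a traditional search engine can only accept a user's request passively and return information of some relevance; it cannot sense the user's needs by itself. A likely cause is that modern search engines match on full text, so users often receive a very large number of result pages, yet generally visit only the pages that interest them or that fit what they meant to find. When the query terms are vague or incomplete, users with different personal needs who type the same terms usually get the same results, and often the same page ranking. Traditional information retrieval of this kind clearly cannot satisfy users' increasingly complex and differentiated needs on an Internet where information keeps expanding, so a personalized retrieval service that accommodates the differences between users is urgently needed. This thesis concentrates on the key to providing personalized retrieval in a search engine: the user interest model.
First, the research background of search engines is described, and the development and architecture of personalized information services are introduced. The domestic state of search engine development and its basic theory are summarized. The definition of a search engine and the functions of its components are outlined, and the defects and shortcomings of existing search engines are described.
User interests represented with the vector space model have problems of accuracy and comprehensiveness, so a hierarchical vector space model is proposed to represent user interests, under the premise that the diversity of user interests is not considered.
User interests are stored in XML, and a "user--xml" mapping is established so that the search engine can locate a user's interest file from the user name, which enables personalized search.
An improved user interest model scheme is proposed, a Nutch search engine is set up, and the user interest model is applied to that search engine in experiments.
Keywords: personalized information search; hierarchical vector space model; user interest modeling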
Abstract: The rapid development of modern Internet technology has connected information exchange around the world and has led to explosive growth in Internet information resources, which inevitably makes it hard for users to find useful information quickly. The information retrieval services of traditional search engines can only accept user requests passively and return results of a certain relevance; they cannot perceive user needs on their own. One likely reason is that modern search engines rely on full-text matching, so a query returns a large number of result pages, while users usually visit only the pages that interest them or that match their search intent. When a query is vague or incomplete, users with different individual needs who enter the same terms usually receive the same results, often in the same order. Traditional information retrieval therefore cannot meet users' increasingly complex and differentiated demands as the information on the Internet expands, and a personalized retrieval service that accommodates differences between users is urgently needed. This thesis focuses on the key to personalized information retrieval in search engines: the user interest model.
First, the thesis describes the research background of search engines and introduces the development and architecture of personalized information services. It summarizes the domestic state of search engine development and the underlying theory, outlines the definition of a search engine and the function of each of its components, and describes the shortcomings of existing search engines.
Because user interests represented with a plain vector space model lack accuracy and comprehensiveness, a hierarchical vector space model is proposed to represent user interests, under the premise that the diversity of user interests is not considered.
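The abstract does not describe how the hierarchical vector space model is laid out, so the following is only a minimal sketch under stated assumptions: a two-level profile in which each interest category has a weight and its own term-weight vector, and a page is scored by the category-weighted sum of cosine similarities. The names `cosine`, `profile`, and `profile_score`, and all the weights, are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical two-level profile: category -> (category weight, term weights).
profile = {
    "sports": (0.7, {"football": 0.8, "league": 0.6}),
    "programming": (0.3, {"python": 0.9, "search": 0.4}),
}


def profile_score(profile, doc_terms):
    """Score a page (given as a list of terms) against the whole profile."""
    doc_vec = dict(Counter(doc_terms))  # raw term counts as the page vector
    return sum(weight * cosine(terms, doc_vec)
               for weight, terms in profile.values())


sports_page = profile_score(profile, ["football", "league"])
code_page = profile_score(profile, ["python"])
```

With these made-up weights the sports page scores higher than the programming page, because the "sports" category carries more weight in the profile. A flat vector space model would instead keep one vector for all terms and lose the per-category weighting.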
Building on the basic approach of estimating a user's interest in a web page from the dwell time on the page and the number of repeat visits, and also taking the size of the page into account, an algorithm is proposed that computes the user's degree of interest in a page from browsing speed.
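The formula itself is not given in the abstract. As an illustration only, the sketch below assumes that browsing speed is page size divided by dwell time, that a lower speed signals closer reading, and that repeat visits raise the score with diminishing returns. The function name `interest_degree`, the parameter `ref_speed`, and both formulas are assumptions, not the author's algorithm.

```python
def interest_degree(dwell_seconds, page_bytes, visits, ref_speed=2000.0):
    """Hypothetical interest score in [0, 1).

    Slower browsing (fewer bytes per second) and more visits score higher.
    ref_speed is an assumed reference speed in bytes per second.
    """
    if dwell_seconds <= 0 or page_bytes <= 0 or visits <= 0:
        return 0.0
    speed = page_bytes / dwell_seconds           # browsing speed, bytes/second
    attention = ref_speed / (ref_speed + speed)  # falls toward 0 as speed rises
    revisit = 1.0 - 0.5 ** visits                # 0.5, 0.75, 0.875, ...
    return attention * revisit


slow_read = interest_degree(dwell_seconds=60, page_bytes=20000, visits=1)
quick_skim = interest_degree(dwell_seconds=5, page_bytes=20000, visits=1)
```

Dividing by page size is what separates this from a pure dwell-time measure: a minute spent on a short page counts for more than a minute spent on a very long one.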
User interests are stored in XML, and a "user--xml" mapping is established so that the search engine can find a user's interest file by user name, which makes personalized search possible.
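A minimal sketch of this storage scheme, using Python's standard `xml.etree.ElementTree` module, might look as follows. The element and attribute names (`user`, `interest`, `term`, `weight`), the one-file-per-user layout, and the in-memory dictionary `user_to_xml` are assumptions made for illustration; the abstract does not specify the XML schema or how the mapping is kept.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_profile(directory, user_name, interests):
    """Write {term: weight} to <directory>/<user_name>.xml and return the path."""
    root = ET.Element("user", name=user_name)
    for term, weight in interests.items():
        ET.SubElement(root, "interest", term=term, weight=str(weight))
    path = os.path.join(directory, user_name + ".xml")
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return path


def load_profile(path):
    """Read an interest file back into a {term: weight} dict."""
    root = ET.parse(path).getroot()
    return {e.get("term"): float(e.get("weight"))
            for e in root.findall("interest")}


# Hypothetical "user--xml" mapping: user name -> path of the interest file.
user_to_xml = {}
workdir = tempfile.mkdtemp()
user_to_xml["alice"] = save_profile(workdir, "alice",
                                    {"python": 0.9, "search": 0.4})
restored = load_profile(user_to_xml["alice"])
```

At query time the engine would look up the logged-in user's name in the mapping, load that file, and use the weights to rerank results.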
An improved user interest model is proposed and applied experimentally to a search engine built on Nutch.
Keywords: personalized information search, hierarchical vector space model, user interest modeling
Contents
Foreword
Chapter 1 Introduction
1.1 Search Engines
1.2 Interest Models
1.3 Research Status and Significance of the Topic
1.4 Research Content
1.5 Chapter Summary
Chapter 2 Search Engine Theory ..