Design and Implementation of a Subspace Outlier Data Mining System [Exclusive Original].doc
About 33 pages, DOC format
Content overview
Posted by member 淘寶大夢
Design and Implementation of a Subspace Outlier Data Mining System
About 14,700 characters
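Abstract: Outlier mining is one of the main research topics in data mining. It can uncover knowledge that is real yet unexpected, reveal rare events and phenomena, and discover interesting patterns. In recent years it has become an active branch of information science and has attracted wide attention in databases, data mining, machine learning, and statistics.
As data acquisition techniques advance, the data describing the real world grow ever more complex, and the problem of being "data rich but knowledge poor" becomes more pronounced. Much useful information and knowledge lies hidden in these data, and the question of how to extract it has driven extensive research on data mining. Such data are usually very high-dimensional, however, and high dimensionality is the hardest obstacle for existing outlier mining algorithms. To address this, the thesis adopts a subspace-based approach: high-dimensional data are first projected onto low-dimensional subspaces and examined there, and particle swarm optimization (PSO) is used to search for sparse subspaces and an optimal partition, from which the outliers are determined. The work focuses on outlier mining in high-dimensional data sets and covers the following:
1. A distance-based outlier mining algorithm over relevant subspaces is presented. Relevant-subspace methods fall into two classes. The first searches for all relevant subspaces and then mines outliers within them, as in HiCS. The second first determines the set of relevant subspaces for a given data point and then computes the corresponding outlier degree, as in OUTRES; this is usually more meaningful because it better explains why the point is an outlier.
2. An outlier mining algorithm based on PSO and subspaces is presented. Its core idea is that, in practice, anomalous behavior in high-dimensional data usually occurs only on a subset of the attributes and has little to do with the remaining dimensions. The algorithm first projects the high-dimensional data onto low-dimensional subspaces and computes the sparsity coefficient of each subspace, which serves as the measure of how anomalous the subspace is. A PSO algorithm with a mutation operator is used to search the subspaces.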
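The sparsity coefficient in item 2 is not defined in this excerpt. The sketch below assumes the commonly used definition from projection-based outlier detection: each attribute is split into phi equi-depth ranges, and a k-dimensional cell D holding n(D) of the N points scores S(D) = (n(D) - N f^k) / sqrt(N f^k (1 - f^k)) with f = 1/phi. Strongly negative values mark cells that are emptier than expected. It is written in Python for brevity (the thesis system itself is built in Java), and the function names `equi_depth_bins` and `sparsity_coefficients` are hypothetical.

```python
import math
from collections import Counter


def equi_depth_bins(values, phi):
    """Assign each value to one of phi equi-depth ranges, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * phi // len(values)
    return bins


def sparsity_coefficients(data, dims, phi):
    """Return {cell: S(cell)} for every occupied cell of the subspace `dims`.

    Assumed definition (not given in the excerpt):
    S(D) = (n(D) - N*f^k) / sqrt(N*f^k*(1 - f^k)), with f = 1/phi.
    """
    if phi < 2:
        raise ValueError("phi must be at least 2")
    n = len(data)
    k = len(dims)
    # Discretize each selected attribute independently.
    binned = [equi_depth_bins([row[d] for row in data], phi) for d in dims]
    # A cell is the tuple of range indices, one per selected attribute.
    counts = Counter(zip(*binned))
    f_k = (1.0 / phi) ** k
    expected = n * f_k
    std = math.sqrt(n * f_k * (1.0 - f_k))
    return {cell: (c - expected) / std for cell, c in counts.items()}


if __name__ == "__main__":
    sample = [(1, 1), (2, 2), (3, 3), (5, 5), (6, 6), (7, 7), (8, 8), (4, 9)]
    for cell, s in sorted(sparsity_coefficients(sample, (0, 1), 2).items()):
        print(cell, round(s, 3))
```

A search procedure such as the mutation-augmented PSO described above would call a function like this as its fitness measure, preferring subspaces whose cells have the most negative coefficients.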
Building on this research, an outlier mining system was designed and implemented with Eclipse as the development tool, and its module functions and key techniques are described in detail.
Keywords: outlier data; subspace; data mining
Design and Implementation of a Management System for Subspace-Based Outlier Mining Algorithms
Abstract: Outlier mining is one of the most important topics in data mining. It helps people discover true but unexpected information and has attracted the interest of many researchers. Most traditional outlier mining methods view outliers from a global perspective, which makes it difficult to find deviating data or outliers that appear only in subspaces. This thesis studies outlier mining in subspaces by partitioning the high-dimensional space into low-dimensional subspaces. The main work is as follows:
(1) HiCS searches for high-contrast subspaces as a preprocessing step for subspace outlier mining, then combines the outlier scores obtained in the individual high-contrast subspaces to produce the final outlier ranking. HiCS searches for subspaces globally and does not determine the relevant subspaces of each individual data point.
(2) An outlier mining algorithm based on particle swarm optimization (PSO) and subspaces is presented. The algorithm treats candidate outlier subspaces as the particles of the swarm and searches for outlier subspaces with a mutation-augmented PSO guided by the sparsity coefficient of each subspace. Data in an outlier subspace are regarded as outliers. Finally, experiments on stellar spectra from the LAMOST project validate the algorithm.
(3) A local outlier mining algorithm based on subspace partitioning is presented. First, the data set is divided into disjoint subspaces; the quality of a partition is measured by its skew, and PSO is used to search for the best partition. Second, the local outlier degree of a point is measured by its SPLOF value. Finally, experiments on spectral data show that the PSO-LOF algorithm does not depend on user-specified parameters and is scalable and efficient.
(4) On this basis, a subspace-based outlier mining system is designed and implemented using Eclipse as the development tool, and its functional modules and key techniques are described.
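Item (3) measures local outliers with an SPLOF value, whose definition is not given in this excerpt. As background only, the sketch below computes the standard local outlier factor (LOF) by brute force, which is the measure that LOF-style variants build on; it is not the thesis's SPLOF. The function name `lof_scores` and the sample points are hypothetical, and duplicate points (which give infinite local density) are not handled specially.

```python
import math


def lof_scores(points, k):
    """Return the standard LOF of every point (brute force, O(n^2) distances)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []  # indices in the k-distance neighborhood of each point
    k_dist = []     # distance from each point to its k-th nearest neighbor
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][others[k - 1]]
        k_dist.append(kd)
        # Ties at the k-distance are all included in the neighborhood.
        neighbors.append([j for j in others if dist[i][j] <= kd])

    def reach_dist(i, j):
        # Reachability distance of point i with respect to point j.
        return max(k_dist[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        total = sum(reach_dist(i, j) for j in neighbors[i])
        lrd.append(len(neighbors[i]) / total if total > 0 else math.inf)

    # LOF: mean density of the neighbors divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
    for p, score in zip(pts, lof_scores(pts, 2)):
        print(p, round(score, 3))
```

Scores close to 1 indicate a point whose density matches that of its neighbors; scores well above 1 indicate a local outlier. Note that this plain version still needs the neighborhood size k from the user.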
Key words: outlier; subspace; data mining
Contents
Chapter 1 Introduction
1.1 Research Background
1.2 State of Research
1.3 Research Content
1.4 Thesis Structure
Chapter 2 Related Technologies
2.1 Data Mining Technology
2.2 Java Technology
2.3 Eclipse Development Tool
Chapter 3 Distance-Based Relevant Subspace Outlier Algorithms
3.1 HiCS Algorithm
3.2 OUTRES Algorithm
3.3 LOF Algorithm
Chapter 4 Outlier Mining Algorithm Based on PSO and Subspaces
4.1 Introduction
4.2 PSO Algorithm
Chapter 5 Implementation of the Subspace-Based Outlier Mining System
5.1 System Function Modules
5.2 Main Interface
5.3 Analysis of Run Results
Chapter 6 Conclusions and Outlook
6.1 Conclusions
6.2 Outlook
Acknowledgments
References