Web Personalized Index based N-GRAM Extraction
Ahmed Mudassar Ali1, M. Ramakrishnan2

1Ahmed Mudassar Ali, Research Scholar, Bharath University, Chennai, India.
2M. Ramakrishnan, Professor, School of Information Technology, Madurai Kamaraj University, Madurai, India.
Manuscript received on February 01, 2015. | Revised Manuscript Received on February 09, 2015. | Manuscript published on February 20, 2015. | PP: 6-9 | Volume-3 Issue-3, February 2015. | Retrieval Number: C0589023315/2015©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Web mining is the analysis step of the “Knowledge Discovery in Web Databases” process, which lies at the intersection of computer science and statistics. In this process, results are produced by pattern matching and clustering, and they may not be relevant to the user’s actual search. For example, a search for “tree” may return results about apple trees or mango trees when the user is looking for binary trees. N-grams are used in applications such as text document search, where one must work with phrases, for example in plagiarism detection. Relevance is therefore a major part of searching, and it can be improved using an n-gram algorithm. We present an index-based method for discovering patterns in large data sets, which draws on methods at the intersection of artificial intelligence, machine learning, and statistics. We also introduce a personalization method in which the search engine is used for indexing in addition to the existing n-gram techniques. A collaborative web search method personalizes the search engine for each user so that the required data is extracted accurately.
Keywords: Web Mining, Knowledge Discovery, N-Gram, Stemming.
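
To make the abstract's "tree" example concrete, the following is a minimal sketch of word-level n-gram extraction feeding a small inverted index. It is an illustration only, not the authors' method: the function names `word_ngrams` and `build_ngram_index`, the whitespace tokenization, and the two sample documents are all assumptions introduced here.

```python
from collections import defaultdict


def word_ngrams(text, n):
    """Return the word-level n-grams of text as a list of tuples."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_ngram_index(docs, n):
    """Map each n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in word_ngrams(text, n):
            index[gram].add(doc_id)
    return index


# Hypothetical sample documents, chosen to mirror the "tree" ambiguity.
docs = {
    "d1": "the apple tree grows in the garden",
    "d2": "a binary tree stores keys in sorted order",
}

index = build_ngram_index(docs, 2)

# Looking up the phrase (bigram) instead of the single word "tree"
# separates the two senses.
print(sorted(index[("binary", "tree")]))  # ['d2']
print(sorted(index[("apple", "tree")]))   # ['d1']
```

Indexing on bigrams rather than single words is what lets a query for "binary tree" skip the document about apple trees. A fuller system would presumably add stemming and per-user ranking, which this sketch leaves out.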