Search Engine

Score: 1st Place (Distinction+), 114%
Course: Advanced Programming Techniques
Year: 2012

Description:
The aim of this project is to develop a simple web-based search engine that demonstrates the main components of a search engine (web crawling, indexing, query processing, and ranking) and how they interact.
Web Crawler: a software agent that collects documents from the web. The crawler starts with a seed list of URLs, downloads the documents they identify, and extracts the hyperlinks they contain. The extracted URLs are added to the list of URLs to be downloaded, so crawling is a recursive process.
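The recursive crawling step described above can be sketched as follows. This is a minimal illustration, not the project's actual code; the page fetcher is injected as a function so the logic is shown without real HTTP handling, and the `href` regex is a deliberate simplification of proper HTML parsing.

```java
import java.util.*;
import java.util.function.Function;
import java.util.regex.*;

// Minimal sketch of recursive crawling: download a page, store it,
// extract links, and recurse on every URL not seen before.
class Crawler {
    private static final Pattern HREF = Pattern.compile("href=\"(http[^\"]+)\"");
    private final Function<String, String> fetch; // URL -> HTML (injected; stands in for an HTTP download)
    private final Set<String> visited = new HashSet<>();
    private final Map<String, String> pages = new HashMap<>();

    Crawler(Function<String, String> fetch) { this.fetch = fetch; }

    // Depth-limited recursion keeps the sketch from crawling forever.
    void crawl(String url, int depth) {
        if (depth < 0 || !visited.add(url)) return;   // skip revisits
        String html = fetch.apply(url);
        if (html == null) return;                      // unreachable page
        pages.put(url, html);
        Matcher m = HREF.matcher(html);
        while (m.find()) crawl(m.group(1), depth - 1); // recurse on each link
    }

    Map<String, String> pages() { return pages; }
}
```

Injecting the fetcher also makes the crawl logic testable against an in-memory "web" of fake pages.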
Indexer: the output of the crawling process is a set of downloaded HTML documents. To answer user queries quickly, the contents of these documents must be indexed in a data structure that records which words each document contains and how important they are.
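A common data structure for this is an inverted index, mapping each word to the documents that contain it. The sketch below (an illustration, not the project's code) uses the term frequency as a simple importance signal:

```java
import java.util.*;

// Inverted index sketch: word -> (document id -> term frequency).
class InvertedIndex {
    private final Map<String, Map<String, Integer>> index = new HashMap<>();

    // Tokenize the document text and count each word's occurrences.
    void add(String docId, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (word.isEmpty()) continue;
            index.computeIfAbsent(word, k -> new HashMap<>())
                 .merge(docId, 1, Integer::sum);
        }
    }

    // Returns the documents containing the word, with their frequencies.
    Map<String, Integer> lookup(String word) {
        return index.getOrDefault(word.toLowerCase(), Map.of());
    }
}
```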
Query Processor: this module receives search queries, performs the necessary pre-processing, and queries the index for relevant documents.
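A minimal sketch of that step, assuming the pre-processing consists of lowercasing and stop-word removal (the stop-word list here is illustrative): the processor normalizes the query, then intersects the posting lists of the remaining terms so only documents containing every term survive.

```java
import java.util.*;

// Query processing sketch: normalize, drop stop words, intersect postings.
class QueryProcessor {
    private static final Set<String> STOP = Set.of("the", "a", "an", "of");

    static Set<String> process(String query, Map<String, Set<String>> postings) {
        Set<String> result = null;
        for (String word : query.toLowerCase().split("\\W+")) {
            if (word.isEmpty() || STOP.contains(word)) continue;
            Set<String> docs = postings.getOrDefault(word, Set.of());
            if (result == null) result = new HashSet<>(docs); // first term
            else result.retainAll(docs);                      // AND semantics
        }
        return result == null ? Set.of() : result;
    }
}
```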
Phrase Searching: when a query is enclosed in quotation marks, the engine searches for the words as an exact phrase rather than as independent terms.
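Phrase search needs word positions, not just word-to-document mappings. A common approach, sketched below as an illustration rather than the project's implementation, is a positional index: a document matches "web search" only if "search" occurs at the position right after "web".

```java
import java.util.*;

// Phrase search sketch over a positional index:
// word -> (document id -> positions of the word in that document).
class PhraseSearcher {
    private final Map<String, Map<String, List<Integer>>> postings = new HashMap<>();

    void add(String docId, String text) {
        String[] words = text.toLowerCase().split("\\W+");
        for (int i = 0; i < words.length; i++) {
            if (words[i].isEmpty()) continue;
            postings.computeIfAbsent(words[i], k -> new HashMap<>())
                    .computeIfAbsent(docId, k -> new ArrayList<>()).add(i);
        }
    }

    // A document matches if each phrase word occurs at consecutive positions.
    Set<String> searchPhrase(String phrase) {
        String[] words = phrase.toLowerCase().split("\\W+");
        Set<String> result = new HashSet<>();
        Map<String, List<Integer>> first = postings.getOrDefault(words[0], Map.of());
        for (var e : first.entrySet()) {
            for (int pos : e.getValue()) {
                boolean match = true;
                for (int i = 1; i < words.length && match; i++) {
                    List<Integer> p = postings
                        .getOrDefault(words[i], Map.of())
                        .getOrDefault(e.getKey(), List.of());
                    if (!p.contains(pos + i)) match = false;
                }
                if (match) { result.add(e.getKey()); break; }
            }
        }
        return result;
    }
}
```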
Ranker: the ranker module sorts the matching documents by their popularity and their relevance to the search query.
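One way to combine the two signals is a weighted sum, as sketched below. The 0.7/0.3 weights and the score maps are illustrative assumptions, not values from the original project:

```java
import java.util.*;

// Ranking sketch: sort documents by a weighted combination of a
// per-query relevance score and a precomputed popularity score.
class Ranker {
    static List<String> rank(Map<String, Double> relevance,
                             Map<String, Double> popularity) {
        List<String> docs = new ArrayList<>(relevance.keySet());
        docs.sort(Comparator.comparingDouble((String d) ->
            0.7 * relevance.get(d) + 0.3 * popularity.getOrDefault(d, 0.0)
        ).reversed()); // highest combined score first
        return docs;
    }
}
```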
Technologies: Java EE, JSP & MySQL Server
I worked in a team of four; my responsibilities were:
- Team leader.
- Database: participating in designing the ER diagram of the search engine.
- Database: implementing the entity and session beans.
- Crawler: the complete recursive crawling operation.
- Indexer: the lookup function that takes a URL and returns the list of keywords under which that URL appears in the search results.
- User Interface: designing and developing the interface of the search engine.
- Participating in the main flow of all servlets for the Crawler, Indexer, Ranker & Lookup.