Sunday, January 31, 2016

ASP.NET MVC web application: Creating a Search Engine using Lucene.NET


Introduction

This article shows how you can create a search engine using Lucene.NET.

Background


This article may be useful for intermediate developers who have some basic knowledge of C#, ASP.NET MVC, HTML, CSS3, jQuery and Ajax.

Using the code

 A. Installation

First, to use Lucene.NET you must add it to your project references (you can use Manage NuGet Packages).

B. Indexing

 

Indexing is the step where the application stores index entries for the uploaded data, so that the data can easily be found during the search process.
In our example, the user selects an image file, writes a description of the image, and clicks the indexing button.
The application then checks whether an index already exists in the index directory and creates an IndexWriter with the chosen analyzer.
Next, it builds a document that stores the following fields:
  • the first field holds the file name,
  • the second field stores the description.
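The analyzer decides which terms end up in the index, so it is worth seeing what each one produces. The following is a minimal sketch, assuming the Lucene.Net 3.0.x API that the rest of this article appears to use; the class name AnalyzerDemo and the method Tokenize are hypothetical helpers, not part of the downloadable source.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Tokenattributes;

// Hypothetical helper, for illustration only.
public static class AnalyzerDemo
{
    // Returns the terms that the given analyzer produces for the text.
    public static List<string> Tokenize(Analyzer analyzer, string text)
    {
        var terms = new List<string>();
        // The field name only matters for per-field analyzers; "description" is used here.
        using (TokenStream stream = analyzer.TokenStream("description", new StringReader(text)))
        {
            // The term attribute exposes the text of the current token.
            var termAttribute = stream.AddAttribute<ITermAttribute>();
            while (stream.IncrementToken())
            {
                terms.Add(termAttribute.Term);
            }
        }
        return terms;
    }
}
```

Calling AnalyzerDemo.Tokenize(new WhitespaceAnalyzer(), "A Red Car") should keep the three words with their original case, while passing new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30) should lowercase the words and drop the stop word. With WhitespaceAnalyzer, a search for "red" therefore does not match a description that contains "Red".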

1. Implementation

  • Code JavaScript (handles the search form shown in section C)


 //jQuery code that intercepts the submit of the search form  
   $("#formSearch").on("submit", function (e) {  
     //prevent the default (full page) form submission  
     e.preventDefault();  
     //text entered by the user  
     var textToSearch = $("#textToSearch").val();  
     //url of the service  
     var url = "/Home/Searching";  
     //send the request to the service mentioned in the url variable  
     $.ajax({  
       type: "POST",//choose your HTTP request type ('POST' or 'GET')  
       url: url,  
       data: { 'textToSearch': textToSearch },  
       success: function (data) {  
         //display the returned data (HTML+CSS) in the output block  
         $("#idOutputSearch").html(data);  
       },  
       error: function (xhr, ajaxOptions, thrownError) {  
         //display an error message in the output block  
         $("#idOutputSearch").html("Loading failed");  
       }  
     });  
  });  
  • Code Html
   <form action="/Home/Indexing" method="post" enctype="multipart/form-data" role="form" class="form-horizontal">  
     <div class="form-group">  
        <label class="col-lg-3">Image File :</label>  
        <div class="input-group col-lg-offset-3">  
           <input type="file" name="filePicture">  
         </div>  
      </div>  
      <div class="form-group">  
         <label class="col-lg-3">Description :</label>  
         <div class="input-group col-lg-offset-3">  
           <input type="text" name="description" placeholder="Add a description">  
         </div>  
     </div>  
     <div class="form-group">  
         <div class="col-lg-offset-3">  
            <input type="submit" class="btn-default btn-sm" value="Go Indexing" />  
         </div>  
     </div>  
   </form>   
  • Code C# Asp.net MVC
  public ActionResult Indexing()  
       {  
         string description = Request.Params["description"];  
         HttpPostedFileBase filePicture = Request.Files["filePicture"];  
         //only index when a description and a non-empty file were posted  
         if (description != null && filePicture != null && filePicture.ContentLength > 0)  
         {  
           //create file path (GetFileName drops any folder part sent by the browser)  
           string fileName = System.IO.Path.GetFileName(filePicture.FileName);  
           string pathFile = Server.MapPath("~/Content/images/");  
           string pathImage = System.IO.Path.Combine(pathFile, fileName);  
           //save the image on the server  
           filePicture.SaveAs(pathImage);  
           //open the index directory  
           string pathIndexFolder = Server.MapPath("~/Content/indexDirectory");  
           Directory directory = FSDirectory.Open(pathIndexFolder);  
           //choose the analyzer :  
           //StandardAnalyzer : most sophisticated analyzer that knows about certain token types, lowercases, removes stop words, ...  
           //WhitespaceAnalyzer : only splits tokens on whitespace (used here)  
           Analyzer analyser = new Lucene.Net.Analysis.WhitespaceAnalyzer();  
           //create a new index only when none exists yet  
           bool create = !IndexReader.IndexExists(directory);  
           //initialize indexWriter; the using block frees it even if an exception is thrown  
           using (IndexWriter indexWriter = new IndexWriter(directory, analyser, create, IndexWriter.MaxFieldLength.UNLIMITED))  
           {  
             //initialize document  
             Document doc = new Document();  
             //create fields  
             Field field1 = new Field("name", fileName, Field.Store.YES, Field.Index.ANALYZED);  
             Field field2 = new Field("description", description, Field.Store.YES, Field.Index.ANALYZED);  
             //remove the older index entry for the same file  
             //note : this TermQuery only matches when the file name contains no whitespace,  
             //because the "name" field is tokenized by WhitespaceAnalyzer  
             var searchQuery = new TermQuery(new Term("name", fileName));  
             indexWriter.DeleteDocuments(searchQuery);  
             //add fields to the document  
             doc.Add(field1);  
             doc.Add(field2);  
             //add the document to the index directory  
             indexWriter.AddDocument(doc);  
           }  
         }  
         return View("Index");  
       }  

2. Result 


C. Searching Part 

Searching is the step where the application looks through the documents stored in the index directory to find the indexed data that matches the searched text.
In our example, the user writes some text and clicks the search button. The application creates a query based on the analyzer (it must be the same analyzer as in the indexing step) and on the name of the field to search. The search then runs over the stored documents using the search text.
If results are found, the application builds the HTML output from the fields of the matching documents.

1. Implementation

  • Code Html
   <form enctype="multipart/form-data" id="formSearch" role="form" class="form-horizontal">  
      <div class="form-group">  
        <label class="col-lg-3">Write a text :</label>  
        <div class="input-group col-lg-offset-3">  
          <input type="text" name="textToSearch" id="textToSearch" placeholder="write a term"/>  
         </div>  
       </div>  
       <div class="form-group">  
          <div class="col-lg-offset-3">  
            <input type="submit" class="btn-default btn-sm" value="find" />  
          </div>  
       </div>  
   </form>  
   <div role="form" class="form-horizontal">  
     <div class="form-group">  
       <label class="col-lg-12">Search Result</label>          
       <div id="idOutputSearch">  
         <!--Result of searching-->  
       </div>  
      </div>   
    </div>   
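The server-side action that answers the Ajax call to /Home/Searching could look like the following sketch. It assumes the Lucene.Net 3.0.x API and follows the description above: the same WhitespaceAnalyzer as in the indexing step, a query on one field, and HTML built from the stored fields. The helper class SearchSketch, the choice of "description" as the searched field, the limit of 10 hits and the result markup are assumptions made for illustration, not the author's code.

```csharp
using System.Net;
using System.Text;
using System.Web.Mvc;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

// Hypothetical helper, kept free of HttpContext so it works with any Directory.
public static class SearchSketch
{
    private const string NoResult = "<p>No result</p>";

    public static string BuildResultHtml(Directory directory, string textToSearch)
    {
        if (string.IsNullOrWhiteSpace(textToSearch) || !IndexReader.IndexExists(directory))
        {
            return NoResult;
        }

        // Must be the same analyzer as the one used at indexing time.
        Analyzer analyser = new WhitespaceAnalyzer();
        // Assumption: the search targets the "description" field.
        var parser = new QueryParser(Version.LUCENE_30, "description", analyser);
        Query query;
        try
        {
            // Escape query syntax characters typed by the user.
            query = parser.Parse(QueryParser.Escape(textToSearch));
        }
        catch (ParseException)
        {
            return NoResult;
        }

        var html = new StringBuilder();
        // Open a read-only searcher and keep the 10 best hits (arbitrary limit).
        using (var searcher = new IndexSearcher(directory, true))
        {
            TopDocs hits = searcher.Search(query, 10);
            foreach (ScoreDoc hit in hits.ScoreDocs)
            {
                Document doc = searcher.Doc(hit.Doc);
                // Encode stored values before placing them into the markup.
                html.AppendFormat(
                    "<div><img src=\"/Content/images/{0}\" alt=\"\" /><p>{1}</p></div>",
                    WebUtility.HtmlEncode(doc.Get("name")),
                    WebUtility.HtmlEncode(doc.Get("description")));
            }
        }
        return html.Length > 0 ? html.ToString() : NoResult;
    }
}

public class HomeController : Controller
{
    // Returns an HTML fragment that the script places into #idOutputSearch.
    [HttpPost]
    public ActionResult Searching(string textToSearch)
    {
        string pathIndexFolder = Server.MapPath("~/Content/indexDirectory");
        using (Directory directory = FSDirectory.Open(pathIndexFolder))
        {
            return Content(SearchSketch.BuildResultHtml(directory, textToSearch), "text/html");
        }
    }
}
```

In an existing project, only the Searching method would be added to the current HomeController. Keeping the Lucene logic in a separate helper makes it possible to try it against an in-memory RAMDirectory without running the web application.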

2. Result

 

D. Reset Indexing

In this step, clicking the Reset Index Directory button removes all documents from the index directory.

1. Implementation

  • Code JavaScript
   //detect the click on the Reset Index Directory button  
    $("#idResetIndex").on("click", function () {  
      //invoke the DeleteIndexes service (absolute path, like the search url)  
      $.post("/Home/DeleteIndexes");  
    });   
  • Code Html
    <input type="button" id="idResetIndex" class="btn-default btn-lg" value="Reset Index Directory" />
  • Code C# Asp.net MVC
  public ActionResult DeleteIndexes()  
    {  
      string pathIndexFolder = Server.MapPath("~/Content/indexDirectory");  
      Directory directory = FSDirectory.Open(pathIndexFolder);  
      //nothing to delete when no index has been created yet  
      if (IndexReader.IndexExists(directory))  
      {  
        //open the reader with readOnly = false so that it can delete documents  
        using (var reader = IndexReader.Open(directory, false))  
        {  
          //get all documents inside the index directory  
          var term = reader.TermDocs();  
          //fetch all available documents  
          while (term.Next())  
          {  
            //remove the document from the index directory  
            reader.DeleteDocument(term.Doc);  
          }  
        } //disposing the reader saves the deletions and frees it  
      }  
      return View();  
    }  
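As an alternative to deleting the documents one by one through a reader, the index can also be emptied through an IndexWriter. The following is a minimal sketch, assuming the Lucene.Net 3.0.x API; the class IndexResetSketch and the method DeleteAllDocuments are hypothetical names and are not part of the downloadable source.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper, for illustration only.
public static class IndexResetSketch
{
    // Removes every document from the index stored in the given directory.
    public static void DeleteAllDocuments(Directory directory)
    {
        // Nothing to do when no index has been created yet.
        if (!IndexReader.IndexExists(directory))
        {
            return;
        }

        // create = false: open the existing index instead of overwriting it.
        using (var writer = new IndexWriter(directory, new WhitespaceAnalyzer(), false,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.DeleteAll(); // marks all documents as deleted
            writer.Commit();    // makes the deletion visible to new readers
        }
    }
}
```

Inside the DeleteIndexes action, this helper would be called with the Directory returned by FSDirectory.Open(pathIndexFolder).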

In closing

I hope you enjoyed this article. Try downloading the source code; I look forward to your feedback.
