Optical Character Recognition Using KNN Classification Algorithm

Abstract:


Optical character recognition (OCR) is the conversion of typed, handwritten, or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo, or subtitle text superimposed on an image. The OCR problem has been widely discussed in the literature. Given an image of handwritten text, the program aims to recognize the characters it contains. Although several approaches exist, it remains an open problem. In this project, we propose an approach that uses the k-nearest neighbors (KNN) algorithm and achieves an accuracy of more than 80%. Training and run times are also very short.
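
To make the idea concrete, here is a minimal sketch of KNN character classification in plain Python. It is not the project's actual code: the abstract does not specify the features, distance metric, or value of k, so the Euclidean distance, majority voting, k=3, the helper names (flatten, knn_predict, accuracy), and the tiny 3x3 glyph bitmaps are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def flatten(bitmap):
    """Turn a list of '#'/'.' rows into a flat 0/1 feature vector."""
    return [1 if ch == "#" else 0 for row in bitmap for ch in row]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(1 for vec, label in test if knn_predict(train, vec, k) == label)
    return correct / len(test)


# Hypothetical toy training set: two hand-drawn 3x3 samples per character.
TRAIN = [
    (flatten([".#.", ".#.", ".#."]), "I"),
    (flatten([".#.", ".#.", "..."]), "I"),
    (flatten(["#..", "#..", "###"]), "L"),
    (flatten(["#..", "#..", "##."]), "L"),
    (flatten(["###", "#.#", "###"]), "O"),
    (flatten(["###", "#.#", "##."]), "O"),
]

# Hypothetical noisy test glyphs, each one pixel away from a clean sample.
TEST = [
    (flatten([".#.", ".#.", ".##"]), "I"),
    (flatten(["#..", "#..", "#.#"]), "L"),
    (flatten(["###", "#.#", "#.#"]), "O"),
]

if __name__ == "__main__":
    for vec, label in TEST:
        print(label, "->", knn_predict(TRAIN, vec, k=3))
    print("accuracy:", accuracy(TRAIN, TEST, k=3))
```

KNN has no explicit training phase beyond storing the labeled samples, which is consistent with the short training time mentioned above. The accuracy on this toy data says nothing about the accuracy reported for the project; a real evaluation would use a held-out set of scanned character images.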

Existing System:

Today there is a growing demand from users to convert printed documents into electronic documents in order to keep their data secure. The basic OCR system was therefore invented to convert the available data into computer-processable documents, so that the documents can be edited and reused. The existing (previous) OCR system in a network infrastructure is plain OCR without grid functionality. It handles only homogeneous character recognition, that is, recognition of the characters of an individual language.

Proposed System:

Our proposed system is OCR in a network infrastructure: a character recognition system that supports recognition of characters in the Telugu language. This feature, which we call the network infrastructure, eliminates the problem of heterogeneous character recognition and supports multiple functionalities on the document. These functionalities include editing and searching, whereas the existing system supports only editing. In this context, the grid infrastructure is the infrastructure that supports groups of a specific set of languages. Therefore, OCR in a network infrastructure is multilingual.

Software Specifications:

  • Image Database: iris
  • Operating System: Windows 10    
  • Programming Language: Python   

Hardware Specifications:

  • Processor: Intel i5
  • Hard disk: 100 GB
  • RAM: 8 GB
