Using artificial intelligence (AI) based on deep learning, EduLab, Inc. has achieved a 93.5% recognition rate for complex handwritten Japanese text, including addresses and names

September 6, 2017
EduLab, Inc.

In 2015, EduLab, Inc. (Location: Minato-ku, Tokyo; President and CEO: Junichi Takamura; hereinafter referred to as “EduLab Group”), which operates businesses in the field of Edtech, started a project to develop deep learning-based recognition technology for handwritten Japanese characters. By June 2016, we had achieved a 98.66% recognition rate for handwritten Japanese characters, which was among the highest levels in the industry. The characters recognized up to that point, however, were individual characters written by hand inside designated frames. We continued developing the technology and have now achieved a 93.5% recognition rate for complex handwritten Japanese text, including addresses and names, written in an area with no designated frames, similar to actual test answer sheets. (The recognition rate is a measure of how closely the reading result matches that of a human being. The data used to test the accuracy comprises approximately 35,000 cases.)

Illustration 1: Image of individual and multiple characters being read

Background

As more weight is placed on problem-solving skills, the number of constructed response tests is rapidly increasing in Japanese education: in entrance exams, in assessments of academic ability run by national and local governments, and in a variety of certification tests. Because constructed response tests are evaluated by humans, they take longer to mark, and rising marking costs have become a major issue.
Against this background, we started examining how to convert handwritten text into electronic text data in order to improve marking efficiency. To date, it has been difficult to substantially shorten the time required to input handwritten answer sheet content and convert it into electronic text data. In addition, conventional OCR (Optical Character Recognition) technology, which reads answer sheets with a scanner and converts them into electronic text data, has struggled to deliver accuracy sufficient for marking, because it restricts where answers can be written and sometimes fails to recognize characters even within the writing area.
We therefore started research and development of highly accurate recognition technology for handwritten characters using artificial intelligence (AI) based on deep learning, and in 2016 achieved a 98.66% recognition rate for individual handwritten Japanese characters.
http://edulab-inc.com/press-release/20160706.html

Characteristics of the multiple-character recognition technology

As of 2016, the recognition technology for individual handwritten characters had the following problem:
Recognition of single characters was highly accurate, but accuracy dropped when recognizing multiple characters because of a characteristic of the Chinese characters used in Japanese: a single character is often made up of several radicals combined. When reading multiple characters, the system would recognize radicals separately, as if each were a character of its own.

Illustration 2: Image of wrong recognition of multiple characters
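
One way this kind of error can arise is sketched below in Python. This is an illustration only, not EduLab's implementation: it assumes a simple segmenter that cuts a line of handwriting wherever it finds blank columns, and the function name `segment_by_gaps`, the `min_gap` parameter, and the ink profile are all hypothetical. The profile imitates two characters, the first of which has a left radical and a right radical separated by a narrow gap.

```python
def segment_by_gaps(column_ink, min_gap=1):
    """Split a 1-D ink profile into (start, end) spans, end exclusive.

    A new span starts after every run of at least `min_gap` blank columns.
    Hypothetical helper for illustration; not taken from the press release.
    """
    spans = []
    start = None      # first inked column of the span being built
    blank_run = 0     # consecutive blank columns seen inside the span
    for x, ink in enumerate(column_ink):
        if ink > 0:
            if start is None:
                start = x
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                # Close the span at the first blank column of the run.
                spans.append((start, x - blank_run + 1))
                start = None
                blank_run = 0
    if start is not None:
        spans.append((start, len(column_ink) - blank_run))
    return spans


# Made-up profile: left radical (columns 0-2), narrow gap (column 3),
# right radical (columns 4-6), wider gap (columns 7-8), second character (9-13).
profile = [1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1]

# Cutting at every blank column splits the first character into its radicals.
print(segment_by_gaps(profile, min_gap=1))  # [(0, 3), (4, 7), (9, 14)]

# Requiring a wider gap keeps the radicals together in this example.
print(segment_by_gaps(profile, min_gap=2))  # [(0, 7), (9, 14)]
```

A fixed gap threshold is fragile with real handwriting, because characters written close together would then be merged. That limitation is consistent with the motivation for reading multiple characters together rather than cutting them apart first.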

We therefore started examining a method that reads multiple characters at the same time, as people do, and outputs the results together. This led to a new recognition technology for handwritten characters that can read (recognize) multiple characters accurately. With this approach, we were able to separate multiple characters into individual characters more accurately.

Illustration 3: Image of current letter recognition of multiple characters

During the development process, we collected handwritten samples of multiple-character Japanese addresses, names, and words and had the artificial intelligence (AI) learn from them. As a result, we achieved a 93.5% recognition rate when reading handwritten multiple-character address data. (The recognition rate is a measure of how closely the reading result matches that of a human being. The data used to test the accuracy comprises approximately 35,000 cases.)
Shown below are examples of handwritten addresses and the results of reading (recognizing) them. In these examples, every address, each consisting of multiple characters, was read (recognized) without any errors (recognition accuracy of 100%).

Illustration 4: Data of handwritten addresses and reading (recognition) results by our letter recognition technology
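
The press release does not state the exact formula behind the recognition rate. As a rough sketch only, one common way to measure how closely a reading result matches a human transcription is character-level accuracy based on edit distance. The Python below assumes that definition. The function names and the sample strings are hypothetical and are not taken from EduLab's test data.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning reference into hypothesis."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ch in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_ch in enumerate(hypothesis, start=1):
            cost = 0 if ref_ch == hyp_ch else 1
            curr.append(min(prev[j] + 1,          # delete ref_ch
                            curr[j - 1] + 1,      # insert hyp_ch
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_accuracy(pairs):
    """Assumed metric: 1 - (total edit errors / total reference characters).

    `pairs` holds (human_transcription, machine_reading) tuples.
    """
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("no reference characters to score")
    total_errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return 1.0 - total_errors / total_chars


# Made-up samples: one address fragment read perfectly,
# one name with a single misread character.
samples = [
    ("東京都港区", "東京都港区"),
    ("山田太郎", "山田大郎"),
]
print(f"{character_accuracy(samples):.3f}")  # 0.889 (8 of 9 characters correct)

# A character split into its radicals counts as two errors under this metric.
print(edit_distance("村", "木寸"))  # 2
```

Under this assumed definition, a rate of 100% for a sample means the machine reading is identical to the human transcription, as in the examples above.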

Future plans

Since this technology is applicable not only to addresses but also to names, general documents, and industry-specific documents, we will keep repeating the cycle of collecting data, having the artificial intelligence (AI) learn from it, testing, and further developing the technology.
We also plan to develop a cloud app so that any group or industry that needs the service can use it easily.

Illustration 5: Image of cloud app screen

We will continue to develop this deep learning-based technology in order to improve the accuracy of handwritten character recognition. We will also strive to develop technology that automatically marks test answers once they have been converted from handwriting into electronic text data, as well as other technologies based on artificial intelligence (AI), in order to make the marking process more efficient and more automated.

About EduLab Group

EduLab Group delivers educational solutions for the next generation based on the latest learning science. Its activities include building new businesses and investing in the field of Edtech, providing IT solutions and platforms for the education industry, and supporting next generation education and school management systems. We have offices in Tokyo, Seattle, Singapore, Hong Kong, Beijing, Shanghai, Bangalore, and Pune.