  • What is deep learning, the hottest learning technology of AI?

    2016.12.27

    introduction

    So far we have seen two AI booms, and it is said that we are now experiencing the third. Deep learning is what is driving this boom.
    Thanks to deep learning, the accuracy of information extracted from data has improved.
    But how does it differ from conventional machine learning? In this article, I will explain deep learning.

    contents

    1-1. Outline of machine learning
    1-2. Outline of deep learning
    1-3. Confusing “machine learning” and “deep learning”
    2-1. Explanation of mechanism of deep learning
    3-1. The trigger of getting attention
    3-2. Background of progressing practical use of deep learning
    3-3. Present situation of practical use
    4-1. Conclusion

    1-1. Outline of machine learning

    • Machine learning is a field of AI research that reproduces the human ability to learn on a computer. In machine learning, an AI program learns rules or knowledge from sample data. Here, learning means “dividing”: the program divides objects into groups, such as drinks and foods, or right and wrong answers. The AI learns by raising its accuracy, that is, the percentage of questions it answers correctly.
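
    To make “learning means dividing” concrete, here is a minimal sketch of a perceptron, one of the simplest learning methods, that learns to divide drinks from foods. The sample items, the two features (water_fraction and firmness) and all the numbers are my own made-up assumptions for illustration, not real data.

```python
# Toy sample data (made-up numbers): (water_fraction, firmness) -> label
# label 1 = drink, label 0 = food
samples = [
    ((0.95, 0.05), 1),  # tea
    ((0.90, 0.10), 1),  # juice
    ((0.88, 0.02), 1),  # milk
    ((0.35, 0.80), 0),  # bread
    ((0.60, 0.70), 0),  # apple
    ((0.10, 0.95), 0),  # cracker
]


def predict(weights, bias, features):
    """Divide: answer 1 (drink) if the weighted score is positive, else 0 (food)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def accuracy(weights, bias, data):
    """Fraction of samples that are answered correctly."""
    correct = sum(predict(weights, bias, x) == y for x, y in data)
    return correct / len(data)


def train_perceptron(data, epochs=20, lr=0.1):
    """Nudge the weights after every wrong answer (perceptron learning rule)."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = label - predict(weights, bias, features)  # -1, 0 or +1
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


weights, bias = train_perceptron(samples)
print("accuracy on the samples:", accuracy(weights, bias, samples))
print("unseen watery item ->", predict(weights, bias, (0.92, 0.03)))
```

    The program starts out answering wrongly and corrects its weights each time it makes a mistake, which is the “improving its percentage of correct answers” described above.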

    1-2. Outline of deep learning

    Deep learning is, broadly speaking, a machine learning technique that uses multilayer neural networks.

    I explained above that learning means dividing, and there are many ways to divide. One typical way is the neural network mentioned above. A neural network, modeled loosely on the human brain, is composed of three kinds of layers: an input layer, a hidden layer and an output layer. As data passes from the input layer to the output layer, each connection applies a weight, and these weights are adjusted so that the network outputs the best possible value.
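
    The flow from input layer to hidden layer to output layer can be sketched as follows. This is only an illustration under my own assumptions: the layer sizes, the sigmoid activation and every weight value are made up, and the weights here are fixed by hand rather than learned.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per unit, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Made-up weights: 2 inputs -> 3 hidden units -> 1 output
hidden_weights = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
hidden_biases = [0.0, -0.1, 0.2]
output_weights = [[1.2, -0.7, 0.5]]
output_biases = [0.05]


def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, output_weights, output_biases)


print(forward([1.0, 0.5]))
```

    Training a network means adjusting the numbers in hidden_weights and output_weights until the output gets close to the desired value for the sample data.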

    It used to be difficult to add more layers to neural networks, because going beyond three layers made the learning results worse. That is why they were built with three layers. Recently, however, it has become possible to train networks with multiple hidden layers. This is deep learning, a high-profile technology that, with its larger number of learning stages, outputs more accurate values than conventional neural networks.

    1-3. Confusing “machine learning” and “deep learning”

    The key difference between machine learning and deep learning is the “feature value”.
    A feature value is a variable used as an input to machine learning that represents a feature of an object quantitatively. In conventional machine learning, humans design the feature values and give them to the computer, because it is difficult for the computer to create them by itself.

    For example, when you want to identify the type of a piece of baggage, you would set feature values such as weight, size and whether it is breakable. If you supply irrelevant features, the computer cannot find the correct type. Computers are inferior to human beings at picking out appropriate features.
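
    The baggage example can be sketched like this. It is a rough illustration under my own assumptions: the parcels, the field names (weight_kg, size_cm, fragile, sticker_id) and the nearest-neighbour method are all hypothetical. The same program gives the right type when a human chooses relevant features and the wrong type when given an irrelevant one.

```python
def to_features(parcel, names):
    """Turn a parcel record into a list of numbers using human-chosen features."""
    return [float(parcel[name]) for name in names]


def nearest_label(query, training, names):
    """Return the type of the training parcel whose features are closest."""
    q = to_features(query, names)

    def squared_distance(example):
        v = to_features(example[0], names)
        return sum((a - b) ** 2 for a, b in zip(q, v))

    return min(training, key=squared_distance)[1]


# Made-up training parcels with their known types
training = [
    ({"weight_kg": 0.2, "size_cm": 30, "fragile": 0, "sticker_id": 7}, "letter"),
    ({"weight_kg": 0.3, "size_cm": 35, "fragile": 0, "sticker_id": 2}, "letter"),
    ({"weight_kg": 5.0, "size_cm": 80, "fragile": 1, "sticker_id": 3}, "glassware"),
    ({"weight_kg": 6.0, "size_cm": 90, "fragile": 1, "sticker_id": 8}, "glassware"),
]

# A heavy, large, fragile parcel: it should be classed as glassware
query = {"weight_kg": 5.5, "size_cm": 85, "fragile": 1, "sticker_id": 7}

relevant = ["weight_kg", "size_cm", "fragile"]
irrelevant = ["sticker_id"]  # an arbitrary number that says nothing about the type

print("relevant features   ->", nearest_label(query, training, relevant))
print("irrelevant features ->", nearest_label(query, training, irrelevant))
```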

    In deep learning, however, the computer can learn and find feature values by itself. We humans no longer have to specify them, and the computer can do much more.

    2-1. Explanation of mechanism of deep learning

    In brief, in deep learning the computer digs through many layers of information.

    As I explained in 1-2, a network is composed of an input layer, hidden layers and an output layer. In deep learning, the computer can drill down through many hidden layers to find hidden feature values.

    The second hidden layer holds combinations of the feature values found in the first hidden layer. Thus, the deeper the computer drills down, the higher the quality of the feature values it can find, which leads to more accurate output values.
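
    Here is a small sketch of how a second hidden layer can combine the features found by the first. It uses a hypothetical 2x2 black-and-white image, and the weights are set by hand purely for illustration. In real deep learning these weights would be learned from data, and the features found would not be this tidy.

```python
def step(z):
    """A unit switches on (1) only when its weighted sum is positive."""
    return 1 if z > 0 else 0


def layer(inputs, weights, biases):
    """One layer of step units."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Pixel order: [top_left, top_right, bottom_left, bottom_right]
# First hidden layer: each unit fires when both pixels of one edge are lit.
EDGE_WEIGHTS = [
    [1, 1, 0, 0],  # top edge
    [0, 0, 1, 1],  # bottom edge
    [1, 0, 1, 0],  # left edge
    [0, 1, 0, 1],  # right edge
]
EDGE_BIASES = [-1.5, -1.5, -1.5, -1.5]

# Second hidden layer: each unit combines edges into a larger shape.
SHAPE_WEIGHTS = [
    [1, 0, 1, 0],  # top-left corner = top edge AND left edge
    [1, 1, 1, 1],  # full square = all four edges
]
SHAPE_BIASES = [-1.5, -3.5]


def describe(pixels):
    """Return (edge features, shape features) for a 2x2 image."""
    edges = layer(pixels, EDGE_WEIGHTS, EDGE_BIASES)
    shapes = layer(edges, SHAPE_WEIGHTS, SHAPE_BIASES)
    return edges, shapes


print(describe([1, 1, 1, 0]))  # three pixels lit in a corner shape
print(describe([1, 1, 1, 1]))  # every pixel lit
```

    The second layer never looks at pixels directly. It works only with the edge features, which is the sense in which deeper layers hold combined feature values.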

    3-1. The trigger of getting attention

    Deep learning first attracted attention in 2012 at ILSVRC (ImageNet Large Scale Visual Recognition Challenge), a worldwide visual recognition competition. In this competition, SuperVision, the team from the University of Toronto, which was participating for the first time, swept aside the AIs of other research institutes such as the University of Tokyo and the University of Oxford.

    • The institutes competed on how accurately their computers recognized images. SuperVision achieved an error rate of 15 to 16%, while the other institutes’ AIs were battling around 26%. This extraordinary result sent a shock wave through the field, because at that time it took a whole year of work to reduce the error rate by 1%.

      The cause of the victory was deep learning, a machine learning method that Geoffrey Hinton, a professor at the University of Toronto, developed with his group. He had already started his research on deep learning in 2006, but this victory confirmed his talent and made him the focus of attention.

    3-2. Background of progressing practical use of deep learning

    With this competition as a trigger, deep learning started to yield practical applications. From the year after the competition, major companies such as Google and Facebook started to focus on AI research.
    Google welcomed Professor Hinton into its research team, and Facebook established three AI institutes around the world.
    These moves raised the profile of applied AI and deep learning, which accelerated their practical use.

    3-3. Present situation of practical use

    Normally, it takes years for a new technology to be put to practical use after it makes headlines around the world. Four years after deep learning first attracted great attention, it is already in practical use in many ways.

    ■Voice recognition
    • Voice recognition is a technology that recognizes the human voice and transforms it into words, then understands and carries out spoken commands. It gained a lot of visibility with the arrival of Siri and Cortana, computers with which we can communicate.

    For example, the cognitive system “Watson”, developed by IBM, one of the leading companies in AI research, beat human quiz champions on the US TV quiz show “Jeopardy!” in February 2011, and this became a hot topic.
    IBM provides “Watson Speech to Text”, a service that converts the human voice into written text. It is used in many fields, such as transcribing conversations in call centers.

    reference : Watson Speech to Text : http://www.ibm.com/smarterplanet/jp/ja/ibmwatson/developercloud/speech-to-text.html
    ■Visual Recognition
    • Visual recognition is a technology that analyzes visual content to recognize shapes.
      Although it is easy for humans to pick out and distinguish objects in images, it has been difficult for computers. However, deep learning made it possible for computers to dig into deep layers, which allowed visual recognition to make dramatic progress and be put to practical use.

    MetaMind, based in Silicon Valley, provides AI solutions for companies. “Vision”, its visual recognition engine, can classify photographed objects into 22,000 categories.
    On the result screen, the original picture is shown on the left and the detailed result on the right. The detailed result lists candidate subjects such as “cat”, “lion” and “tiger”, each with a confidence level shown as a percentage. For example, if you submit a picture of a dog, it can even show precise breeds such as “Shiba Inu” and “Akita Inu”.
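
    A list of candidates with confidence percentages like this is commonly produced with a function called softmax, which turns a classifier's raw scores into values that add up to 100%. The sketch below shows only that general technique. I do not know how MetaMind computes its confidence levels, and the labels and scores are made up.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ranked_confidences(labels, scores):
    """Pair each label with a confidence percentage, highest first."""
    probabilities = softmax(scores)
    pairs = sorted(zip(labels, probabilities), key=lambda p: p[1], reverse=True)
    return [(label, round(100 * p, 1)) for label, p in pairs]


# Made-up raw scores from a hypothetical classifier
labels = ["cat", "lion", "tiger"]
scores = [2.0, 0.5, 1.0]

for label, percent in ranked_confidences(labels, scores):
    print(f"{label}: {percent}%")
```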

    reference : MetaMind : http://metamind.io/
    ■Natural language processing
    • Natural language processing is a technology for processing the natural language that humans use daily. Some of you may find it familiar because it has been used for machine translation. In November 2016, Google Translate became a hot topic when its accuracy improved after a neural network was applied.

      In addition, MetaMind, the company I introduced in the visual recognition section, provides a natural language analysis engine called “Language”, which understands financial statements and assesses risk. Natural language processing is a really useful technology.

    reference : Google Japan Blog : https://japan.googleblog.com/2016/11/google.html
    ■Recommendation system
    • A recommendation system is a technology that presents what each user is expected to be interested in, based on their past behavior.
      Microsoft’s “Recommendations” provides customers with lists of recommended products based on data about their behavior. This raises the conversion rate, which makes it a very effective system for marketing.

      More companies are expected to adopt such systems from now on, since the importance of customer personalization is widely recognized in Web marketing.
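
    The idea of recommending from past behavior can be sketched in a few lines. This is a simple “customers with similar purchases also bought” approach under my own assumptions. It is not how Microsoft's service works internally, and the customers and products are made up.

```python
from collections import Counter


def recommend(histories, user, top_n=2):
    """Suggest items bought by customers whose purchases overlap with the user's."""
    seen = histories[user]
    scores = Counter()
    for other, items in histories.items():
        if other == user:
            continue
        overlap = len(seen & items)  # how similar the two customers are
        if overlap == 0:
            continue
        for item in items - seen:  # only items the user does not have yet
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


# Made-up purchase histories
histories = {
    "ann": {"camera", "tripod", "sd_card"},
    "bob": {"camera", "tripod", "camera_bag"},
    "cho": {"camera", "camera_bag", "lens"},
    "dee": {"novel", "bookmark"},
}

print(recommend(histories, "ann"))
```

    A customer such as “dee”, who shares no purchases with anyone else, gets an empty list here. Real systems need a fallback for that case, for example showing best sellers.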

    reference : Microsoft : https://azure.microsoft.com/ja-jp/services/cognitive-services/recommendations/

    4-1. Conclusion

    Since the 2012 competition, in which Prof. Hinton’s team entered SuperVision, deep learning has been rapidly raising its profile, and major Internet companies like Google have started investing in it.
    Deep learning, a technique that can extract feature values from input data, is said to be a great breakthrough in AI, and it is expected to be put to practical use in various other industries.