ZHEJIANG LAB
PAN Yunhe: AI Will Become Data- and Knowledge-driven
Date: 2022-06-20

Recently, PAN Yunhe, academician of the Chinese Academy of Engineering and chief AI scientist of ZJ Lab, delivered an academic report titled "AI Tends to Be Data- and Knowledge-driven" at the Nanhu Headquarters of ZJ Lab.

Visual knowledge and multiple knowledge representation are the key to AI 2.0

One of the major driving forces behind the current AI boom is the rapid development of pattern recognition based on deep learning. Breakthroughs in pattern recognition have advanced not only the recognition of human faces, fingerprints and medical images, but also smart vehicles, security surveillance, robotics, unmanned aerial vehicles and smart manufacturing. Yet the deep neural network (DNN), as a means of AI knowledge representation, has notable weaknesses: its results are hard to explain, it is unable to perform reasoning, and because it needs a huge amount of labeled data to train its network parameters, it inevitably introduces data bias. Mr. PAN therefore held that we need to introduce a brand-new form of knowledge representation: visual knowledge representation.

According to cognitive psychology, some of our visual memories are called visual images and are used for thinking in images. From an AI perspective, such visual images are visual knowledge. "In fact, there is far more visual knowledge than explicit knowledge in our memories. For example, when a child under five sees a set of cups, he will naturally grasp a cup and drink water from it, not the cup mat. But he can hardly explain in words why he does so." Mr. PAN pointed out that, unlike explicit knowledge, visual knowledge not only captures objects' sizes, colors, textures, spatial shapes and relations, but also comprehends their movements, speeds and time histories, and supports space-time transformations, operations and reasoning.
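The contrast can be sketched in code. The toy below is purely illustrative (all names are hypothetical, not from the report): a visual-knowledge record keeps geometry and affordances rather than just a class label, so simple action reasoning, like the child choosing the cup over the cup mat, becomes possible without any verbal rule.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a visual-knowledge record stores more than a label --
# it also keeps geometry and the actions an object affords.
@dataclass
class VisualConcept:
    name: str
    size_cm: tuple                               # (width, height, depth)
    color: str
    affords: list = field(default_factory=list)  # actions the object supports

def pick_for_action(concepts, action):
    """Minimal affordance reasoning: pick objects that support an action."""
    return [c.name for c in concepts if action in c.affords]

cup = VisualConcept("cup", (8, 10, 8), "white", affords=["grasp", "drink-from"])
mat = VisualConcept("cup mat", (10, 1, 10), "brown", affords=["place-under"])

print(pick_for_action([cup, mat], "drink-from"))  # ['cup']
```

The point of the sketch is only that such a representation supports operations and reasoning over shape and use, which a label-only classifier output does not.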

Research reveals that humans recognize visual scenes differently from deep neural network models. Human visual recognition not only analyzes the data arriving through the retina into short-term memory, but also draws on the images in long-term memory, i.e. visual knowledge. That is why our visual recognition needs little data yet can perform explanation and reasoning. Mr. PAN believes that "visual recognition uses both data and visual knowledge. This is the key to the breakthroughs achieved by AI 2.0."

"At present, the development of visual knowledge faces five fundamental challenges: visual knowledge representation, visual recognition, visual perception and learning, simulation of thinking in images, and multiple knowledge representation," added Mr. PAN.

One of the keys to developing visual knowledge is to make breakthroughs in visual perception, and to carry out analysis and simulation based on the classification of visual perception. As for learning visual knowledge, it is necessary to shift from reconstructing objects' shapes to reconstructing the concepts and propositions of visual knowledge. Mr. PAN pointed out that researchers in AI, computer graphics and computer vision should join hands, focusing in particular on the automatic learning of visual perception and visual knowledge.

The latest findings in brain science support the theories of visual knowledge and multiple knowledge representation. According to Mr. PAN, AI 2.0 should employ multiple knowledge representation: knowledge graphs for semantic memory, visual knowledge for episodic memory, and deep neural networks for sensory memory. "AI 2.0 should combine multiple means of knowledge representation. This will form the technical basis for multi-modal intelligence and big data intelligence," said Mr. PAN.
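The idea of combining representations can be illustrated with a minimal toy sketch (hypothetical names throughout; a stub score stands in for a trained DNN): a data-driven perceptual score is fused with symbolic facts from a small knowledge graph, so context can resolve a recognition that perceptual evidence alone leaves ambiguous.

```python
# Semantic memory as a tiny knowledge graph of symbolic facts (illustrative).
knowledge_graph = {
    "cup": {"is_a": "container", "holds": "liquid"},
    "plate": {"is_a": "dish"},
}

def dnn_score(image_features, label):
    """Stub for a sensory-level DNN classifier score in [0, 1]."""
    return image_features.get(label, 0.0)

def recognize(image_features, context):
    """Fuse perceptual evidence with a knowledge-based prior."""
    best, best_score = None, -1.0
    for label, facts in knowledge_graph.items():
        score = dnn_score(image_features, label)
        if context in facts.values():   # symbolic knowledge boosts the match
            score += 0.5
        if score > best_score:
            best, best_score = label, score
    return best

# Perceptual evidence alone slightly favors "plate"; the context "liquid"
# tips recognition toward "cup" via the knowledge graph.
print(recognize({"cup": 0.4, "plate": 0.45}, "liquid"))  # cup
```

This is only a caricature of the architecture the report describes, but it shows the division of labor: the network supplies perceptual scores, while symbolic knowledge supplies priors that make the result explainable.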

Big data and multi-modal intelligence will lead the fourth round of AI innovation

Looking back over six decades of AI development, there have been three rounds of innovation in mainstream AI technology: rule- and logic-driven AI, knowledge- and reasoning-driven AI, and data- and deep-neural-network-model-driven AI. The third round, starting in 2006, was largely driven by deep learning, proposed by Geoffrey Hinton and other scientists, and by its applications. Data- and model-driven AI excels at visual and auditory recognition. However, it has prominent weaknesses: its results are hard to explain, it transfers poorly to new applications, it relies on labeled data, and it generalizes poorly.

Mr. PAN believes that big data and multi-modal intelligence will lead the fourth round of AI innovation, for which visual knowledge, multiple knowledge representation and visual perception will help pave the way. "AI will become data- and knowledge-driven," said Mr. PAN. "Big data and big models are important, while big knowledge is equally important."

This is an undeveloped area of great potential and a promising frontier worth exploring. Mr. PAN encouraged ZJ Lab's researchers to "do your best, persevere in AI studies, and combine fundamental research with applied research".

He further pointed out that the core driving forces of this round of AI development are not computing power, algorithms or data, but deep neural network theories and models, together with breakthrough innovations and applications such as AlphaGo and AlphaFold. For example, DeepMind's AI program AlphaFold2 successfully predicted the structures of 98.5% of human proteins. This epoch-making breakthrough in structural biology opens new possibilities for drug design and the development of synthetic biology.

Mr. PAN proposed key areas and system models for five aspects of China's AI development, suggesting that we should make breakthroughs in new AI theories to build up data, computing power and algorithm platforms, make breakthroughs in innovative applications to form knowledge and scenario platforms, and promote industrial and social development. "The ultimate measure of a country's AI development is the level of industrial and social application of AI. Both theoretical innovation and algorithm platform innovation should serve industrial and social development."