The 2023 China Computational Power Conference was held in Yinchuan, Ningxia on August 19, 2023. The list of "China's Young Pioneers in Computational Power" was announced, including CHEN Hongyang, a research fellow from Zhejiang Lab (ZJ Lab) who is committed to exploring a new AI for Science path through graph computing. Recently, he gave an exclusive interview to Yi Jian AI Jue Jin Zhi (AI for Medical Care and Health Updates), a channel under Leiphone (www.leiphone.com), to talk about graph computing and AI foundation models.
The following article comes from the Yi Jian AI Jue Jin Zhi, written by WU Tong.
Source: Official WeChat Account of the China Computational Power Conference. The 2023 China Computational Power Conference was jointly hosted by the Ministry of Industry and Information Technology and the People's Government of Ningxia Hui Autonomous Region. The conference featured a youth forum on "Youth as the Strength of the Nation" and launched a nationwide solicitation campaign for "China's Young Pioneers in Computational Power". Since its inception, the event has received extensive attention and strong support from all walks of life. After preliminary screening, review and final expert judgment, 10 youth representatives recommended by Tsinghua University, the Chinese Academy of Sciences (CAS), Shanghai Jiao Tong University, the National Supercomputing Center in Wuxi, Zhejiang Lab, the China Academy of Information and Communications Technology, Southern University of Science and Technology, Ant Group, China Mobile and China Unicom stood out.
CHEN Hongyang from ZJ Lab: It is imperative to address the computational power gap in order to implement foundation models
In the era of foundation models, the "AI arms race" is shifting from competition over algorithms and data to competition over underlying computational power.
According to the China AI Foundation Model Map Research Report, China had developed 79 AI foundation models, each with over 1 billion parameters, by the end of May 2023. In terms of global distribution, the US and China have led the development of such models and together account for more than 80% of the AI foundation models across the world.
Since foundation model development is a long-term endeavor, the competition between the two powers largely comes down to a race in underlying computational power.
Therefore, in the nine-plus months since ChatGPT's launch, many foundation models have been built on domestic ultra-large-scale computational power, creating a full-chain AI R&D system spanning "computational power - data - algorithm - application".
CHEN Hongyang, an expert in network and computing technology, is leading ZJ Lab's Research Center for Graph Computing to engage in research and development of foundation models.
"The first is to develop pretrained foundation models based on graph computing, and the second is to adapt to domestic hardware so as to build an intelligent graph computing system. That's what is called 'software-hardware collaboration'."
CHEN Hongyang has a background in network information. He worked on IoT theory and algorithms, the research and development of wireless communication systems, and the international standardization of information and communications technologies (ICTs) at Southwest Jiaotong University, the CAS Institute of Computing Technology, the Ningbo CAS IC Design Center, the University of Tokyo, the University of California, Los Angeles (UCLA), and Fujitsu Laboratories Ltd.
In July 2020, CHEN Hongyang returned to China and joined ZJ Lab. Since then, he has devoted more attention to "intelligent computing" (computational power). In mid-2022, ZJ Lab and Huazhong University of Science and Technology (HUST) established the Joint Research Center for Graph Computing to create a software-hardware collaborative graph computing system, with CHEN Hongyang serving as the Deputy Director of the Center.
It is reported that the research center has launched the "Zhuque Graph Pretrained Foundation Model" and the efficient "ZJ Zhuque" Graph Computing Platform. The platform provides one-stop support for fields such as pharmaceutical manufacturing and biological breeding. In addition, the research center signed a cooperation agreement with a pharmaceutical company this year.
Recently, Yi Jian AI Jue Jin Zhi under Leiphone (www.leiphone.com) launched "Ten Experts Talk about Medical Foundation Models" to explore how domestic AI foundation models are moving towards ecosystem construction and how different institutions are positioning themselves for transformation and implementation. Below is an edited transcript of the conversation with CHEN Hongyang, with its original meaning unchanged.
Yi Jian AI Jue Jin Zhi: ChatGPT accelerates the advent of the "era of computational intelligence". At present, the "ZJ Zhuque" platform built by your team integrates three technical capabilities: GPT, graph computing, and accelerated drug discovery. Does it build on your past work?
CHEN Hongyang: At present, I am engaged in two areas of research. Before joining ZJ Lab, I had always focused on network information and, with my team, built large-scale ICT systems such as IoT and 5G systems.
From 2007 to 2011, I pursued my Ph.D. at the University of Tokyo in Japan, participating in research on theory and algorithms for wireless sensor networks. During this period, I went to UCLA as a visiting scholar, mainly engaged in research on distributed signal processing in the laboratory led by Prof. Ali H. Sayed.
Then I worked at Fujitsu Laboratories Ltd. for ten years (2011-2020). Around 2017 and 2018, I took part in the research and development of some big data platforms, especially for the mining and analysis of operators' data. Since then, my research has gradually shifted from "connection" to "computing", more precisely, "intelligent computing".
Meanwhile, ZJ Lab was established in 2017. I am from Zhejiang Province and had met ZJ Lab's people on several occasions, so I returned to China in July 2020 and officially joined the lab. Because of my background in networks, computing and data analysis, I worked in an "intelligent network" research center at the beginning. Later, as ZJ Lab strategically focused on "intelligent computing", I started working on projects in this area.
But how did I get into graph computing? In the era of big data, graph computing has emerged as a fundamental enabler for the efficient analysis and mining of massive data, and it is a critical arena that countries including the United States have been striving to dominate in the field of intelligent computing in recent years.
In order to enhance ZJ Lab's research strength and strategic position in the field of graph computing, the Research Center for Graph Computing was jointly established by ZJ Lab and HUST in June 2022. It is expected to advance graph computing step by step from theory to system, from prototype to chip, and from special-purpose to general-purpose computing.
Since ChatGPT became the focus of global innovation last year, I realized I had to adapt and take advantage of what I have learned in the fields of network and computing over the years.
The first is to develop "pretrained foundation models based on graph computing", and the second is to adapt to domestic hardware so as to build an "intelligent graph computing system". That's what is called "software-hardware collaboration".
As you can see, ZJ Zhuque is connected to our "Zhuque Graph Pretrained Foundation Model" and integrates many traditional graph deep learning methods as well as self-developed graph learning algorithms. Therefore, we can perform a lot of scientific computing via the platform, of which drug R&D is an important part.
"Building a large-scale efficient graph computing platform" is just one small step for our team. Our goal covers chips, programming frameworks, and integrated design of hardware and software platforms, in order to build an autonomous and controllable graph computer fully developed by China.
Yi Jian AI Jue Jin Zhi: Domestic and foreign enterprises mostly use graph computing to study consumer behavior, telecom fraud, financial trade, etc. Why do you use this technology for biopharmaceutical manufacturing?
CHEN Hongyang: Indeed, graph computing has extended into many fields in recent years. In July 2021, AlphaFold2 set off a wave of computational drug discovery. That's when I started using graph computing for pharmaceutical manufacturing.
Technically speaking, a drug molecule can be regarded as a graph composed of atoms and chemical bonds. For example, its atoms can be regarded as "nodes" and chemical bonds as "edges". Therefore, graph computing can be well applied in this field to help predict the properties of compounds, drug-drug interactions, and drug-target interactions. At present, our Zhuque Graph Pretrained Foundation Model mainly uses graph structure data to accelerate drug discovery.
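To make this concrete, below is a minimal sketch (an illustration only, not the Zhuque pipeline) of turning a molecule into a graph with atoms as nodes and bonds as edges. It assumes the open-source RDKit library is available; the aspirin SMILES string is just an example input.

```python
# Illustration: represent a drug molecule as a graph (atoms = nodes, bonds = edges).
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)OC1=CC=CC=C1C(=O)O")  # aspirin, used as an example

# Nodes: one entry per atom, keyed by atom index
nodes = {atom.GetIdx(): atom.GetSymbol() for atom in mol.GetAtoms()}

# Edges: one (begin, end, bond type) tuple per chemical bond
edges = [
    (b.GetBeginAtomIdx(), b.GetEndAtomIdx(), str(b.GetBondType()))
    for b in mol.GetBonds()
]

print(f"{len(nodes)} atoms (nodes), {len(edges)} bonds (edges)")
```

Once a molecule is encoded this way, graph learning methods can operate on it directly for tasks such as property or interaction prediction.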
Why must we redevelop such a vertical foundation model? The root cause is that there are still many limitations when ChatGPT is directly used in the biopharmaceutical industry:
No control over credibility, poor performance in specific fields, and high cost.
BERT and ChatGPT, for example, have demonstrated impressive results in natural language processing, but when applied to the field of biopharmaceuticals, they cannot handle non-Euclidean biological structure data, over-smoothing in graph neural networks, the scarcity of data labels, the integration of domain knowledge, or large-scale foundation model engineering.
Therefore, we must build our own "BioGPT". Moreover, we should not blindly pile up data; we should also embed pharmaceutical domain knowledge into the foundation model.
At this point, our Zhuque Graph Pretrained Foundation Model is a complementary combination of "knowledge graph + graph computing + foundation model", which can largely avoid hallucinations where the foundation model generates nonsensical text.
Then, after self-supervised pretraining on large amounts of molecular data, the resulting encoder only needs to be fine-tuned for downstream tasks. DDI (drug-drug interaction), DTI (drug-target interaction) and MPP (molecular property prediction), for example, require only minor adjustments. The whole process follows the foundation model paradigm, as sketched below.
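The following is a minimal sketch of that pretrain-then-fine-tune pattern, assuming a toy PyTorch setup rather than the actual Zhuque model: a small graph encoder is pretrained with a self-supervised masked-feature reconstruction objective, then a lightweight task head is attached and fine-tuned for a downstream task such as molecular property prediction.

```python
import torch
import torch.nn as nn

class SimpleGraphEncoder(nn.Module):
    """Toy message-passing encoder: node features are mixed with the
    average of their neighbours, then mean-pooled into one graph embedding."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, adj):
        # x: (num_nodes, in_dim), adj: (num_nodes, num_nodes) adjacency matrix
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = torch.relu(self.lin1(x))
        h = torch.relu(self.lin2(adj @ h / deg))   # one round of neighbour averaging
        return h.mean(dim=0)                       # mean-pool to a graph embedding

encoder = SimpleGraphEncoder(in_dim=16, hid_dim=64)

# --- Stage 1: self-supervised pretraining (masked-feature reconstruction) ---
decoder = nn.Linear(64, 16)
pretrain_opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
x = torch.randn(10, 16)                  # random stand-in for atom features
adj = (torch.rand(10, 10) > 0.7).float() # random stand-in for molecular bonds
masked = x.clone(); masked[3:5] = 0.0    # mask a few "atoms"
pretrain_opt.zero_grad()
z = encoder(masked, adj)
# toy objective: reconstruct the pooled (mean) atom features from the masked graph
loss = nn.functional.mse_loss(decoder(z), x.mean(dim=0))
loss.backward(); pretrain_opt.step()

# --- Stage 2: fine-tune a lightweight head for a downstream task (MPP here) ---
head = nn.Linear(64, 1)                  # predicts one molecular property
finetune_opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
y = torch.tensor([0.42])                 # dummy property label
finetune_opt.zero_grad()
pred = head(encoder(x, adj))
task_loss = nn.functional.mse_loss(pred, y)
task_loss.backward(); finetune_opt.step()
```

Swapping the head (and label format) is all that changes between downstream tasks such as DDI, DTI or MPP; the pretrained encoder is shared.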
Yi Jian AI Jue Jin Zhi: So drug R&D is only one of the applications of Zhuque Graph Computing Platform. What technical and engineering challenges did you encounter in the research and development process?
CHEN Hongyang: There are three key technical difficulties in the research and development of Zhuque Graph Computing Platform:
First, establishing an efficient adaptive graph learning platform based on knowledge fusion: developing efficient graph neural network and knowledge graph algorithms, and solving the problems of knowledge fusion and sparse learning in scientific graph computing.
Second, solving the insufficient adaptation between computational power and operators in interdisciplinary graph learning, as well as the software-hardware incompatibility of domestic chip clusters: an adaptive intelligent graph operator has been developed that more than doubles the performance of typical algorithms and operators.
Third, addressing the representation difficulties in interdisciplinary scientific graph learning, the limited self-learning ability of graph structures and the lack of domain knowledge for graph generation: graph structure search, graph generation learning, graph representation learning, knowledge graph techniques and prediction algorithm software have been developed using interdisciplinary pretrained models and domain knowledge.
In addition, data is a non-technical difficulty. We have our own large cell sequencers and also work with the sequencing team of Liangzhu Laboratory to share data. Moreover, as a national scientific and technological force, our platforms and data will ultimately be made open source.
The big difficulty, for now, is whether discovered targets and hospital data can be shared through distributed federated learning. In this regard, we have obtained only a small amount of open-source data.
Yi Jian AI Jue Jin Zhi: What are the differences in the development paths of foundation models at home and abroad?
CHEN Hongyang: In the development of foundation models, the United States pays more attention to technology R&D and innovation, and has made important progress in hardware and deep learning frameworks.
For example, NVIDIA's GPUs and Google's TPUs designed for deep learning, as well as open-source frameworks including TensorFlow and PyTorch, are all world-leading. Last year, NVIDIA also launched BioNeMo, a large language model framework for the field of life science.
By contrast, China pays more attention to the application of AI and explores how to commercialize it. So, the ecosystem will be divided into three layers in the future: the foundation model layer, the intermediate layer, and the application layer.
Of course, the foundation model layer offers huge opportunities and a very high ceiling, but it also carries the biggest risks, because only a few platform companies will get involved. For example, only iOS and Android stand out above the rest. However, at present, chip shortages, an immature domestic framework ecosystem and a lack of interdisciplinary talent mean that key technologies underlying intelligent computing are still missing.
The application layer is not as risky, and leading vertical companies will emerge in every productive area, though they might not be as large as the platform companies.
However, there are more foreign open-source foundation models than domestic ones, which leads some companies to simply wrap and fine-tune foreign open-source code as their own models. This is not conducive to ecosystem construction.
Yi Jian AI Jue Jin Zhi: Nowadays, with so many institutions developing foundation models, will they get stuck in a rat race as a result of homogenization?
CHEN Hongyang: Indeed, more and more institutions have begun to engage in the R&D and application of foundation models. As of the end of May this year, China ranked second in the world in the number of foundation models, behind only the United States, with at least 79 foundation models containing over 1 billion parameters developed domestically. In this case, a rat race may occur as a result of homogenization.
Natural language processing, computer vision, and recommendation systems are all hot areas of foundation model research at present. When all research focuses on these areas and similar training datasets and algorithms are chosen, the resulting foundation models end up far less differentiated and creative.
It also consumes a lot of social resources. Overall, China is still in a catch-up phase in the development of foundation models and faces challenges such as limited core algorithms, low-quality training data, weak implementation, and an underdeveloped ecosystem.
Of course, some scholars have turned their attention to new directions, such as optimizing training algorithms and architectures and exploring the interpretability of foundation models. Interpretability is particularly crucial for application scenarios such as autonomous driving, smart homes, financial risk control and life science.
Yi Jian AI Jue Jin Zhi: The "intelligent emergence" in foundation models is exciting, but does it lead to any development mistakes? Or is there a bubble in the market of foundation models?
CHEN Hongyang: Excessive pursuit of parameter scale is a big mistake in the development of foundation models.
Scaling up alone does not necessarily improve model performance, as many other factors, such as network structure and data quality, also affect performance. Simply stacking parameters may cause several problems:
The risk of overfitting. Generalization ability decreases: the model performs well on the training set but poorly on downstream tasks.
Lack of interpretability. The large number of parameters makes it difficult to explain how the model makes decisions. This phenomenon, often referred to as the "black box" problem, makes the foundation model less interpretable and trustworthy.
Resource constraints. Increasing the number of parameters places a greater load on storage, transmission and computing resources. Therefore, model scale needs to be balanced against specific task requirements, available resources and training data size.
However, the development of domestic foundation models should continue to move forward and should not be stopped too early merely because some bubbles may form.
Yi Jian AI Jue Jin Zhi: In what technical direction will foundation models for vertical biology applications develop over the next six months?
CHEN Hongyang: To some extent, a combination of "foundation model + knowledge + industrial application" will surely be adopted in the future. Foundation models will serve as operating systems for AI products, giving birth to a new "Model-as-a-Service" (MaaS) industry.
The current foundation model provides users with basic knowledge services. Just like a less accurate knowledge base or search engine, it can provide only some very basic services with no guarantee of accuracy, controllability and interpretability, which greatly limits its application in practical scenarios.
Users cannot accept nonsensical, inaccurate or irresponsible services. Therefore, knowledge must be incorporated into foundation models to achieve controllability, traceability and interpretability, and to solve more specialized problems more precisely.
Finally, intelligent algorithms and platforms could be implemented by combining foundation models and specific applications, and they would generate value only when users' various personalized needs are met.
The above information is synthesized from the official WeChat accounts of the China Computational Power Conference and the Yi Jian AI Jue Jin Zhi; some truncated questions have been omitted.