That afternoon, Zhang Ya-Qin, academician of the Chinese Academy of Engineering, Chair Professor at Tsinghua University, and Dean of the Institute for AI Industry Research (AIR), spoke at the panel "Generative AI: Friend or Foe". He was joined by Chen Xudong, Chairman and General Manager of IBM; Emilija Stojmenova Duh, Slovenia's Minister of Digital Transformation; Pascale Fung, Chair Professor in the Department of Electronic and Computer Engineering at the Hong Kong University of Science and Technology; and Wang Guan, founder of Kezhi Technology. The panel was moderated by Cathy Li, Head of AI, Data and Metaverse at the World Economic Forum.
Zhang then introduced Tsinghua University's Institute for AI Industry Research (AIR). AIR is an international, intelligence-driven, industry-oriented research institute built for the Fourth Industrial Revolution. On the industry side, it works with partners to solve genuine, practical problems. As a research institute at Tsinghua, it also carries a mission of talent cultivation, aiming to train future CTOs and future architects. AIR's research covers three directions, the three areas where AI will have the greatest impact over the next five to ten years: first, robotics and autonomous driving, also known as smart transportation; second, smart IoT, in particular green computing for China's dual-carbon goals and the deployment of small models on edge devices; and third, smart healthcare, including drug discovery. Take robotics and autonomous driving as an example: this research requires massive amounts of data. Although AIR collaborates with Baidu Apollo and operates its own robots, the real-world data collected is far from enough, so the team proposed the Real2Sim2Real (RSR) concept, using simulation to augment data, model long-tail driving scenarios, and connect real and simulated scenes in both directions.
The other panelists shared their perspectives on generative AI from different angles. Chen Xudong, Chairman and General Manager of IBM, focused on IBM's technical innovations and commercial applications in generative AI, and on how the company helps enterprises develop their own AI and manage data security. Emilija Stojmenova Duh, Slovenia's Minister of Digital Transformation, described her government's policies and measures for advancing digital transformation and supporting generative AI, such as bringing AI into school education, improving the digital skills of civil servants and citizens, and opening new channels of communication with the public. She also pointed to the biases AI can introduce and called for their elimination. Pascale Fung, Chair Professor in the Department of Electronic and Computer Engineering at HKUST, has researched conversational AI for nearly 30 years; she marveled at the emergent intelligence of today's large models, urged attention to AI governance, and suggested exploring how to cooperate with machines rather than treat them as adversaries. Wang Guan, founder of Kezhi Technology, focused on applying large AI models in education, working to scale up quality education and to reduce, or even eliminate, inequality in access to educational resources.
Cathy Li: You are an industry veteran, with your experience at Baidu, Microsoft, and now Tsinghua University. Can you tell us a bit more about the generative AI landscape in China in particular?
Ya-Qin Zhang: It's quite interesting; we had a similar panel about seven years ago at the Winter Davos, and now we're here in China. The technology has completely transformed the industry, including in China. I'll talk about China a little more later, but I'd like to spend one minute summarizing my observations on ChatGPT and Stable Diffusion over the last couple of years.
1. ChatGPT is the first software that has actually passed the Turing test. For computer scientists, developing something that can pass the Turing test has been a major endeavor.
2. This leads toward AGI. It's not exactly AGI yet, but it does provide a pathway towards artificial general intelligence, which is another goal we've been pursuing.
3. More importantly, for industry, I consider GPT an operating system for AI. Back in the PC days, we had Windows and Linux. In the mobile days, we had iOS and Android. This is the new operating system for the era of AI, and it will completely reshape the whole ecosystem, whether the semiconductor ecosystem or the application ecosystem. For example, Professor Wang just talked about education, which is actually a vertical model built on top of the large "operating system" model. The exam data he used for training is not the same data used to train GPT, but it works out, because you can have an operating system that is a large language model, then a number of vertical models for different industries, and then applications built on top of those. The industry landscape will look very different: all the apps and models will be rewritten and completely restructured.
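The layering Zhang describes can be sketched in a few lines. This is a purely illustrative toy, not any real API: the class names, the `generate` method, and the "education" domain tag are all invented to show the base-model / vertical-model / application stacking.

```python
# Toy sketch of the "LLM as operating system" layering: a shared base
# model, industry-specific vertical models, and applications on top.
# All names and behaviors here are hypothetical placeholders.

class BaseModel:
    """The 'operating system' layer: a general-purpose language model."""
    def generate(self, prompt: str) -> str:
        return f"[base completion for: {prompt}]"

class VerticalModel:
    """An industry-specific layer specialized on domain data (e.g. education)."""
    def __init__(self, base: BaseModel, domain: str):
        self.base = base
        self.domain = domain
    def generate(self, prompt: str) -> str:
        # A real vertical model would be fine-tuned on domain data;
        # here we merely scope the prompt to keep the sketch runnable.
        return self.base.generate(f"({self.domain}) {prompt}")

class App:
    """The application layer built on a vertical model."""
    def __init__(self, vertical: VerticalModel):
        self.vertical = vertical
    def answer(self, question: str) -> str:
        return self.vertical.generate(question)

edu_app = App(VerticalModel(BaseModel(), domain="education"))
print(edu_app.answer("Grade this essay"))
```

The point of the analogy is that, as with Windows or Android, applications depend on the layer directly beneath them rather than on the base model itself.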
All these years, China has been doing terrific work in basic research, algorithms, and industry applications across every sector. Even though ChatGPT was not invented in China, almost a hundred companies have emerged in the last six months or so in the generative AI space. Some are developing large models, while others are diving into generative AI vertical models that can generate not only language but also images, videos, robotics behavior, and even outputs in the biological computing space. There is tremendous activity going on in China, and Professor Wang's company is one example.
Cathy Li: I wanted to go back to you in your new capacity as a professor at Tsinghua University and the dean of the Institute for AI Industry Research. Can you elaborate on how your research has integrated and incorporated generative AI, and what are some of the significant outcomes so far that you're allowed to share?
Ya-Qin Zhang: I started this lab when I retired from Baidu about three years ago. We are obviously doing basic research, but a lot of our work involves applying that research to real-world problems. We use generative AI in almost everything we do.
One of our research focuses is robotics and autonomous driving. Obviously, we need to collect a lot of data. We work with Baidu Apollo, which has hundreds of cars driving around China collecting data, and we also have robots collecting data. However, the data we currently have is still very small compared to what we need, so we use generative AI to augment it. We also use generative AI for simulation, because there's a dilemma: when you put a car on the street, you want to avoid accidents, and the goal of model training and the algorithms is to minimize accidents, which means we don't have enough accident data. This is where Stable Diffusion and the techniques we use come in handy. They allow us to generate long-tail cases, which has been extremely helpful. Furthermore, it enables us to establish end-to-end connectivity, from real-world scenarios to simulation and back to real-world scenarios. I call this "RSR," which stands for "real scenario to simulation and simulation back to real scenario."
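The RSR data-augmentation loop described above can be sketched as a tiny pipeline: take real driving scenes, perturb them in simulation to synthesize rare long-tail variants, and feed the augmented set back into training. The scene representation and the perturbation rule below are invented for illustration; AIR's actual system presumably operates on full sensor data and generative models, not dictionaries.

```python
# Minimal, hypothetical sketch of the Real2Sim2Real (RSR) idea:
# real scenes -> simulated long-tail variants -> augmented training set.

import random

def perturb(scene: dict, rng: random.Random) -> dict:
    """Sim step: vary a real scene into a rare variant (e.g. a sudden cut-in
    that shrinks the gap to the lead vehicle). Entirely illustrative."""
    variant = dict(scene)
    variant["lead_vehicle_gap_m"] = max(
        1.0, scene["lead_vehicle_gap_m"] * rng.uniform(0.1, 0.5)
    )
    variant["source"] = "sim"
    return variant

def augment_long_tail(real_scenes, per_scene: int = 3, seed: int = 0):
    """Real -> Sim: synthesize several long-tail variants per real scene,
    keeping the original real data alongside the simulated data."""
    rng = random.Random(seed)
    out = list(real_scenes)
    for scene in real_scenes:
        out.extend(perturb(scene, rng) for _ in range(per_scene))
    return out

real = [{"lead_vehicle_gap_m": 30.0, "source": "real"}]
augmented = augment_long_tail(real)
print(len(augmented))  # one real scene plus three simulated variants
```

The "back to real" half of the loop, validating policies trained on simulated data against real-world scenarios, is what closes the bidirectional connection Zhang describes.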
The second example is biological computing, which is also one of our major efforts. We have built a GPT called BiomedGPT, similar to the education model but focused on the biological and medical field. It doesn't have a trillion parameters; rather, it has only 1.6B. The model gathers data from various sources, including protein structures, molecular structures in cells, genetic structures, literature, and patent data. The advantage of such a model is that once you have it, you can easily generate downstream tasks, such as predicting and generating protein structures, performing molecular docking, and determining binding structures. We also have people working on multimodal models, large models, and model-to-model interactions.
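The "one pretrained model, many downstream tasks" pattern attributed to BiomedGPT can be illustrated as a shared backbone with thin task-specific heads. Everything below is hypothetical: the class names, the toy character-count "embedding", and the scoring are invented stand-ins, not BiomedGPT's actual interface.

```python
# Hypothetical sketch: one pretrained biomedical backbone (~1.6B params
# in the talk) shared by multiple lightweight downstream task heads.

class Backbone:
    """Stands in for a model pretrained on proteins, molecules, genes,
    literature, and patents."""
    def encode(self, sequence: str) -> list:
        # Toy featurization over a few amino-acid letters; a real model
        # would return learned embeddings.
        return [sequence.count(a) / max(len(sequence), 1) for a in "ACDEFG"]

class TaskHead:
    """A downstream head, e.g. structure prediction or molecular docking,
    reusing the same pretrained backbone."""
    def __init__(self, backbone: Backbone, task: str):
        self.backbone, self.task = backbone, task
    def predict(self, sequence: str) -> dict:
        features = self.backbone.encode(sequence)
        return {"task": self.task, "score": sum(features)}

backbone = Backbone()  # pretrained once on the mixed data sources
fold = TaskHead(backbone, "structure_prediction")
dock = TaskHead(backbone, "molecular_docking")
print(fold.predict("ACDF"), dock.predict("ACDF"))
```

The design point is the economics: the expensive pretraining happens once, and each new downstream task only needs a small head on top.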
Xudong just mentioned the ability to use a large model to train more models. In the future, when you attempt to accomplish a task, you will be able to utilize a federation of different models obtained from different companies and sources, including open-source and closed-source, as well as various vertical models. We also have people working on reinforcement learning, and we are deploying large models onto edge devices such as phones, robots, and IoT devices. However, I must note that this poses significant risks: when you connect the information world to the physical and biological world, there will be a plethora of safety issues and risks.
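One way to read the "federation of models" idea is as a dispatcher that routes each task to whichever model, general or vertical, open- or closed-source, advertises the matching capability. The registry, capability tags, and routing rule below are invented for illustration.

```python
# Hypothetical sketch of a federation of models: route a task to the
# model that claims the needed capability, falling back to a general one.

class Model:
    def __init__(self, name: str, capabilities: set):
        self.name, self.capabilities = name, capabilities
    def run(self, task: str) -> str:
        return f"{self.name} handled {task}"

class Federation:
    def __init__(self, models: list):
        self.models = models  # first entry acts as the general fallback
    def dispatch(self, task: str) -> str:
        # Prefer a model that explicitly advertises the capability.
        for model in self.models:
            if task in model.capabilities:
                return model.run(task)
        return self.models[0].run(task)

fed = Federation([
    Model("general-llm", {"chat", "summarize"}),
    Model("edu-vertical", {"grade_exam"}),
    Model("biomed-vertical", {"protein_fold"}),
])
print(fed.dispatch("grade_exam"))
```

A real federation would also have to handle the safety concerns Zhang raises, since routing a task to a model that actuates robots or IoT devices crosses from the information world into the physical one.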