PolyU welcomes AI pioneer who is revolutionising large language models
PolyU is delighted to welcome a new AI scientist to support its mission of advancing AI research and fostering global collaboration. With over a decade of AI research experience, Professor Yang Hongxia, newly appointed to the Department of Computing, has had a distinguished career at Yahoo, Alibaba, and ByteDance US, where she led cutting-edge technology development, including the creation of Alibaba’s ten-trillion-parameter M6 multimodal model.
A “model-over-models” approach
Drawing on her exceptional experience, Professor Yang will spearhead a new research direction in generative AI, introducing a decentralised paradigm that reduces the costs and barriers associated with Large Language Model (LLM) training. Training LLMs has been prohibitively expensive for many universities, given the significant computational resources required. This creates a high barrier to entry and concentrates research among a select few, potentially hindering progress in AI research. Moreover, LLMs are often trained on large general-purpose datasets with limited domain-specific knowledge, making it difficult for them to capture the nuances and complexities of specific fields.
Professor Yang has also observed that while fine-tuning pre-trained language models and Retrieval-Augmented Generation (RAG) are widely adopted approaches to injecting knowledge for specific tasks or domains, both remain tied to the original architecture and hyperparameters of the pre-trained model.
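To illustrate the RAG approach mentioned above, the sketch below shows its core idea in miniature: retrieve the most relevant documents for a query, then prepend them to the prompt before calling the model. The corpus, overlap-based scoring, and function names are illustrative assumptions; production systems use dense embeddings and an actual LLM call, and, as noted, the underlying model itself is unchanged.

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG) prompt assembly.
# Toy word-overlap scoring stands in for real embedding-based retrieval.

def score(query: str, doc: str) -> int:
    """Count query words that also appear in the document (toy relevance)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved context to the user query before the model is called."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Transformers use self-attention over token sequences.",
    "Gradient descent minimises a loss function.",
    "Retrieval augments a model with external documents.",
]
print(build_prompt("How does retrieval help a model?", corpus))
```

The key point for Professor Yang's argument: the knowledge arrives through the prompt, so the pre-trained model's architecture and hyperparameters are never touched.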
Her team advocates a “model-over-models” approach to LLM development, enabling domain experts to train smaller models that can evolve into comprehensive LLMs. This distributed approach not only substantially lowers training costs but also empowers a wider community of domain experts to participate in the development process.
“Our research has verified that small LLMs, once put together, can outperform the most advanced LLMs in specific domains,” Professor Yang noted.
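One way such a composition could work, sketched below purely as an illustration, is a dispatcher that routes each query to the small domain model best matched to it. The keyword-based routing, the expert registry, and the stand-in expert functions are all assumptions for this sketch, not the team's actual method; in practice each expert would be a small LLM trained by a domain specialist.

```python
# Hypothetical sketch of composing small domain models behind a router.
# Each "expert" is a stand-in function; real experts would be small LLMs.
from typing import Callable

# Assumed routing scheme: each expert advertises the keywords it covers.
EXPERTS: dict[str, tuple[set[str], Callable[[str], str]]] = {
    "medicine": ({"dose", "symptom", "drug"}, lambda q: f"[medical model] {q}"),
    "law": ({"contract", "liability", "clause"}, lambda q: f"[legal model] {q}"),
}

def route(query: str) -> str:
    """Dispatch the query to the expert whose keyword set overlaps it most."""
    words = set(query.lower().split())
    _, (_, expert) = max(EXPERTS.items(), key=lambda kv: len(kv[1][0] & words))
    return expert(query)

print(route("What is the standard drug dose per day?"))
```

Because each expert sees only queries from its own domain, a collection of small, specialised models can plausibly answer domain questions better than one general-purpose model, which is the claim Professor Yang's team reports verifying.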
As barriers to entry decrease, researchers can build smaller LLMs that capture diverse representations while maintaining control over proprietary data. The next milestone will be establishing an infrastructure that enables domain experts to develop their own LLMs with ease, through a streamlined and automated pipeline for pre-training, thus democratising access to AI development.
In line with PolyU’s motto, “To learn and apply, for the benefit of mankind”, Professor Yang invites the global research community to join forces with the University to transform LLM development, unlock knowledge potential, and amplify positive societal impacts.