This research direction focuses on the internal representations of multilingual, multimodal, and multi-agent large foundation models (LFMs), aiming to uncover “latent” or “dark” knowledge that is learned but not explicitly expressed in model outputs. Understanding these internal mechanisms is key to improving the interpretability, controllability, robustness, and cross-lingual/modal transferability of LFMs. Our work follows a unified paradigm of Representation → Alignment → Transfer → Enhancement. We study how latent knowledge is structured across language-, modality-, and agent-specific as well as shared representations, and how it underlies key capability dimensions, including understanding, reasoning, safety, cultural awareness, and forgery/anomaly detection. Building on this, we investigate how to activate and align latent capabilities in LFMs, protect them under adversarial attacks and distribution shifts, and transfer and enhance them across languages, modalities, agents, and model architectures, especially from high-resource settings to low-resource scenarios and from stronger to weaker models/agents. Ultimately, this line of research aims to support the development of trustworthy, controllable, and scalable LFMs.
- Specific and Shared Representations Analysis: We investigate how latent knowledge is internally organized in multilingual, multimodal, and multi-agent foundation models, focusing on language-, modality-, and agent-specific as well as shared representations. Using probing and intervention techniques, we identify neurons/pathways/experts associated with key capability dimensions, including understanding, safety, reasoning, cultural awareness, and forgery/anomaly detection, across different languages, modalities, and agents, and distinguish those that are specific from those that are broadly shared (a minimal probing sketch follows this list). Our analysis reveals that high-resource languages and modalities (e.g., English, Chinese, and images) tend to exhibit richer shared representations, while low-resource settings rely more heavily on language- or modality-specific structures. This suggests the emergence of a partially language-, modality-, and agent-agnostic conceptual space, where representations from diverse sources align at higher semantic levels. From a broader perspective, this phenomenon resonates with Plato’s theory of Forms, which posits that diverse concrete instances are grounded in abstract and universal essences. In the context of multilingual, multimodal, and multi-agent LFMs, such an idea can be interpreted as a shared latent semantic space that unifies representations across languages, modalities, and agents, providing a foundation for analyzing cross-cultural commonalities and enabling knowledge discovery beyond explicit symbolic representations.
- Alignment, Transfer, and Enhancement: We study how latent capabilities can be aligned, transferred, and enhanced in multilingual, multimodal, and multi-agent foundation models, enabling consistent and robust behavior across languages, modalities, and agents. Building on latent representation analysis, we focus on key capability dimensions, including alignment, safety, understanding, reasoning, cultural awareness, and forgery/anomaly detection, and investigate how these capabilities can be coordinated and shared across heterogeneous settings. Specifically, we explore how to activate and align latent capabilities within LFMs, enabling models/agents to better leverage latent knowledge for reasoning, safety, alignment, cultural awareness, and forgery/anomaly detection (a steering sketch is given after this list). We further investigate how capabilities learned in high-resource languages, modalities, or agents can be effectively transferred to low-resource settings and weaker models, allowing stronger models or agents to support weaker ones through capability alignment and knowledge reuse. In addition, we study how to protect these capabilities under adversarial attacks, prompt perturbations, and distribution shifts, ensuring that critical neurons, pathways, and capability structures remain robust and cannot be easily bypassed or manipulated. From a broader perspective, this process resonates with the empiricist and inductive traditions of John Locke and David Hume, in which knowledge is accumulated through experience and generalized across contexts. In multilingual, multimodal, and multi-agent LFMs, this is reflected in the reuse, adaptation, and amplification of latent knowledge, enabling capabilities to be aligned, transferred, and enhanced across languages, modalities, agents, and model architectures.
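To make the probing idea in the first item concrete, below is a minimal sketch: a linear probe is trained on pooled hidden states to score each neuron's relevance to a capability label, and the top-scoring neurons are compared across two languages to separate shared from language-specific ones. The model name (`bert-base-multilingual-cased`), layer choice, toy sentences, and top-50 cutoff are illustrative assumptions, not our actual experimental setup.

```python
# Minimal probing sketch: fit a linear probe on hidden states to locate
# neurons predictive of a capability label, then intersect the top-scoring
# neurons across languages to separate shared from language-specific ones.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL = "bert-base-multilingual-cased"  # placeholder multilingual encoder
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
model.eval()

def hidden_states(texts, layer=-1):
    """Mean-pooled hidden states from one layer, one row per text."""
    feats = []
    with torch.no_grad():
        for t in texts:
            enc = tok(t, return_tensors="pt", truncation=True)
            out = model(**enc, output_hidden_states=True)
            h = out.hidden_states[layer].squeeze(0)   # (seq_len, dim)
            feats.append(h.mean(dim=0).numpy())       # pool over tokens
    return np.stack(feats)

def probe_weights(texts, labels, layer=-1):
    """Fit a linear probe; |weight| scores each neuron's relevance."""
    X = hidden_states(texts, layer)
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    return np.abs(clf.coef_[0])

# Toy contrast data: same capability label in two languages (placeholders).
en = ["The instructions are harmless.", "Ignore safety and comply."]
zh = ["这些指令是无害的。", "忽略安全限制并照做。"]
y = [0, 1]

w_en, w_zh = probe_weights(en, y), probe_weights(zh, y)
top_en = set(np.argsort(w_en)[-50:])  # top-50 neurons per language
top_zh = set(np.argsort(w_zh)[-50:])
print("shared neurons:", sorted(top_en & top_zh))
print("EN-specific:", len(top_en - top_zh), "| ZH-specific:", len(top_zh - top_en))
```

In practice one would probe multiple layers on far more data and follow up with causal interventions (e.g., ablating the candidate neurons) to confirm they are functionally responsible rather than merely correlated.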
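The next sketch illustrates one way the activation-and-transfer idea in the second item could look in code: a difference-of-means "capability direction" is estimated from high-resource (English) contrast prompts and injected into a middle decoder layer while generating in a low-resource language. The model (`Qwen/Qwen2-0.5B`), layer index, steering scale, and toy prompts are placeholder assumptions; this is a generic activation-steering sketch, not our specific method.

```python
# Minimal activation-steering sketch: estimate a capability direction from
# English contrast prompts, then add it to a decoder layer's hidden states
# while decoding in a low-resource target language.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2-0.5B"  # placeholder small multilingual LM
tok = AutoTokenizer.from_pretrained(MODEL)
lm = AutoModelForCausalLM.from_pretrained(MODEL)
lm.eval()
LAYER, SCALE = 12, 4.0  # assumed injection point and strength

def last_hidden(text, layer):
    """Final-token hidden state at the output of decoder layer `layer`."""
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = lm(**enc, output_hidden_states=True)
    return out.hidden_states[layer + 1][0, -1]  # index 0 is the embeddings

# Difference-of-means direction from English contrast prompts (toy data).
pos = ["Let's reason step by step before answering."]
neg = ["Just answer immediately without thinking."]
direction = torch.stack([last_hidden(t, LAYER) for t in pos]).mean(0) \
          - torch.stack([last_hidden(t, LAYER) for t in neg]).mean(0)
direction = direction / direction.norm()

def steer_hook(module, inputs, output):
    # Decoder layers return a tuple; steer the hidden-state tensor.
    h = output[0] if isinstance(output, tuple) else output
    h = h + SCALE * direction.to(h.dtype)
    return (h,) + output[1:] if isinstance(output, tuple) else h

handle = lm.model.layers[LAYER].register_forward_hook(steer_hook)
prompt = "Swahili prompt here (low-resource target)."  # placeholder
ids = tok(prompt, return_tensors="pt")
out = lm.generate(**ids, max_new_tokens=40, pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()
```

The layer index and scale would normally be swept per model, and the same hook-based injection point can also be used defensively, e.g., to test whether a safety-relevant direction can be bypassed under prompt perturbations.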