[R] Transformer²: Self-Adaptive LLMs

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/hardmaru on 2025-01-15 00:41:09+00:00.

Paper: https://arxiv.org/abs/2501.06252

Abstract

Self-adaptive large language models (LLMs) aim to solve the challenges posed by traditional fine-tuning methods, which are often computationally intensive and static in their ability to handle diverse tasks. We introduce Transformer², a novel self-adaptation framework that adapts LLMs for unseen tasks in real-time by selectively adjusting only the singular components of their weight matrices. During inference, Transformer² employs a two-pass mechanism: first, a dispatch system identifies the task properties, and then task-specific “expert” vectors, trained using reinforcement learning, are dynamically mixed to obtain targeted behavior for the incoming prompt. Our method outperforms ubiquitous approaches such as LoRA, with fewer parameters and greater efficiency. Transformer² demonstrates versatility across different LLM architectures and modalities, including vision-language tasks. Transformer² represents a significant leap forward, offering a scalable, efficient solution for enhancing the adaptability and task-specific performance of LLMs, paving the way for truly dynamic, self-organizing AI systems.
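
To make the "singular components" idea concrete, here is a minimal NumPy sketch of one plausible reading of the abstract: each expert vector rescales the singular values of a weight matrix, and several expert vectors are combined into one scaling vector before being applied. The function name `adapt_weight`, the linear mixing rule, and the toy values are illustrative assumptions and are not taken from the paper or its code. In the two-pass scheme the abstract describes, the mixing weights would come from the dispatch step; here they are hard-coded.

```python
import numpy as np

def adapt_weight(W, expert_vectors, mix_weights):
    """Rescale the singular values of W by a mixture of expert vectors.

    Hypothetical helper: the linear mix below is an assumption, not the
    paper's confirmed procedure.
    """
    # Thin SVD: W = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Combine the per-task vectors into a single scaling vector.
    z = np.tensordot(mix_weights, expert_vectors, axes=1)
    # Only the singular values change; U and Vt are left untouched.
    return (U * (s * z)) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))

# Two toy "expert" vectors, one entry per singular value (made-up numbers).
experts = np.stack([np.ones(3), np.array([2.0, 1.0, 0.5])])

# Selecting only the all-ones expert leaves W unchanged.
identity = adapt_weight(W, experts, np.array([1.0, 0.0]))
# An even mix scales the singular values by [1.5, 1.0, 0.75].
mixed = adapt_weight(W, experts, np.array([0.5, 0.5]))
```

In this sketch each weight matrix needs only one number per singular value for each expert, which is far fewer than the entries of the matrix itself and is consistent with the abstract's claim of using fewer parameters.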

Blog Summary:

GitHub:
