This is an automated archive made by the Lemmit Bot.

The original was posted on /r/machinelearning by /u/Potential_Duty_6095 on 2025-06-07 08:34:07+00:00.


Super new research from the authors of FlashAttention and Mamba(2):

https://arxiv.org/abs/2506.04761

Long story short: they extend Mamba2 so the state is no longer fixed in size and can grow over time, directly improving long-range performance. This seems like a sweet spot between traditional Mamba2, where the fixed-size state is a bottleneck for long sequences, and attention, which has no recurrent state but needs to store all past KV pairs! All with specialised Triton kernels! A rough sketch of the memory trade-off is below.
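
To make the trade-off concrete, here is a minimal, purely illustrative Python sketch comparing per-step memory footprints: attention's KV cache grows linearly with sequence length, Mamba2's recurrent state is constant, and the middle ground is a state that expands slowly with time. The logarithmic growth rate, the dimensions, and the function names here are my own assumptions for illustration, not the paper's actual formulation or kernels.

```python
# Illustrative memory-scaling comparison (not the paper's method).
import math

def attention_kv_cache(t, d_head=64, n_heads=16):
    # Attention: stores a key and a value vector for every past token.
    return 2 * t * n_heads * d_head

def mamba2_state(t, d_state=128, d_head=64, n_heads=16):
    # Mamba2: fixed-size recurrent state, independent of sequence length.
    return n_heads * d_head * d_state

def growing_state(t, d_state=128, d_head=64, n_heads=16):
    # Hypothetical growing-state model: state expands slowly with t
    # (logarithmic growth assumed here purely for illustration).
    return max(1, math.ceil(math.log2(t + 1))) * n_heads * d_head * d_state

for t in (1_024, 65_536, 1_048_576):
    print(f"t={t:>9}: attention={attention_kv_cache(t):>12,} "
          f"mamba2={mamba2_state(t):>10,} "
          f"growing={growing_state(t):>10,}")
```

Even under these made-up numbers, the point of the paper's framing comes through: the growing state stays far below a full KV cache at long sequence lengths while avoiding the hard capacity ceiling of a fixed state.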