[attention backends] Add Neuron backend for context parallel #13473

Open
aws-zhenguo wants to merge 1 commit into huggingface:main from aws-zhenguo:neuron_backend
@aws-zhenguo

What does this PR do?

This PR adds SDPA forward and backward calls for the Neuron backend to support ring attention.

Users will need torch_neuronx to run Diffusers models with this change on a Neuron (Trainium) device.

torch_neuronx is not publicly available yet. I will provide an example once it is released.
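
For context, ring attention generally needs each attention forward call to return the log-sum-exp (LSE) of the scores alongside the output, so that partial results computed over different key/value blocks can be merged exactly. The pure-Python sketch below illustrates that merge for a single query vector. It is not the code in this PR and does not use torch_neuronx; the function names (`attention_block`, `merge_partials`, `ring_attention_single_query`) are hypothetical and chosen only for illustration.

```python
import math


def attention_block(q, keys, values):
    """Attention of one query vector over one key/value block.

    Returns (out, lse): the softmax-weighted sum of the values and the
    log-sum-exp of the scaled scores, which the merge step needs.
    """
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    lse = m + math.log(total)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]
    return out, lse


def merge_partials(out_a, lse_a, out_b, lse_b):
    """Combine two partial results as if both blocks had been attended jointly."""
    m = max(lse_a, lse_b)
    lse = m + math.log(math.exp(lse_a - m) + math.exp(lse_b - m))
    w_a = math.exp(lse_a - lse)  # share of the softmax mass held by block a
    w_b = math.exp(lse_b - lse)  # share of the softmax mass held by block b
    out = [w_a * a + w_b * b for a, b in zip(out_a, out_b)]
    return out, lse


def ring_attention_single_query(q, kv_blocks):
    """Accumulate over key/value blocks in the order they arrive around the ring."""
    out, lse = attention_block(q, *kv_blocks[0])
    for keys, values in kv_blocks[1:]:
        out_i, lse_i = attention_block(q, keys, values)
        out, lse = merge_partials(out, lse, out_i, lse_i)
    return out, lse
```

Merging two blocks this way gives the same output and LSE as a single attention call over the concatenated keys and values, up to floating-point error. A real backend would do the same on batched tensors and would also need a matching backward pass.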

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul @yiyixuxu Please let me know if any additional information is required. Thanks!

github-actions bot added the models and size/M (PR with diff < 200 LOC) labels on Apr 14, 2026
@sayakpaul
Member

Thanks! Let us know on the PR once it's publicly released so that we can review it. Feel free to also provide some visual examples with popular models like Flux, QwenImage, etc.

sayakpaul added the performance label (anything related to performance improvements, profiling and benchmarking) on Apr 15, 2026