SubQ.ai claims its SubQ 1.1 model runs 56x faster than FlashAttention-2 but omits core architectural details from its report

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex#501inTech

Reading this report and hoping to see them explain their architecture was like

they do go into details of MLA and DeepSeek-V4 though

the "donor model" has a 262K context, they use YaRN

I guess the donor is Kimi

no you can't see the model

Alexander Whedon@alex_whedon

Here is the technical report on SubQ 1.1 Small. https://subq.ai/subq-1-1-small-technical-report

This is the second iteration on our Subquadratic Sparse Attention (SSA) model, and the first to be deployed with design partners in the coming weeks.

The results are compelling and verified by @AppenResearch.

- Near-perfect long-context retrieval up to 12M tokens on the needle-in-a-haystack test, with up to nearly 1,000x attention compute reduction.

- A balance of long-context optimization and general reasoning ability, with strong performance retained across knowledge, coding, and non-coding enterprise agent benchmarks.

- At 1M tokens, SubQ 1.1 Small requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2.

These results highlight a significant scaling advantage thanks to the efficiency gains from the SSA architecture.

We included some details and learnings from the development process which may be helpful to the community.

Comment with questions, I’ll try to respond!

8:29 AM · Jun 16, 2026 · 6.3K Views
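For scale, the compute figures in the announcement can be turned into a rough back-of-envelope estimate. The sketch below is not taken from the report. It assumes a textbook cost model for dense attention (about 4·n²·d FLOPs per head per layer for the score and value products), a hypothetical head dimension of 128, that the "nearly 1,000x" figure applies at the 12M-token length, and that the entire reduction comes from each query attending to fewer keys. The function names are illustrative only.

```python
def dense_attention_flops(n_tokens: int, head_dim: int = 128, n_heads: int = 1) -> float:
    """Approximate FLOPs for one dense, non-causal attention layer.

    Counts 2*n*n*d for Q @ K^T and 2*n*n*d for scores @ V.
    head_dim=128 is an assumed value, not a figure from the report.
    """
    return 4.0 * n_tokens * n_tokens * head_dim * n_heads


def implied_keys_per_query(n_tokens: int, reduction: float) -> float:
    """Keys each query would attend to if the whole claimed reduction
    came from sparsity alone (an assumption, not a stated design)."""
    return n_tokens / reduction


# (context length, claimed compute reduction) pairs; pairing 1,000x with
# 12M tokens is an assumption based on the wording of the announcement.
claims = [(1_000_000, 64.5), (12_000_000, 1000.0)]

for n_tokens, reduction in claims:
    dense = dense_attention_flops(n_tokens)
    sparse = dense / reduction
    keys = implied_keys_per_query(n_tokens, reduction)
    print(
        f"{n_tokens:>10,} tokens: dense {dense:.2e} FLOPs, "
        f"reduced {sparse:.2e} FLOPs, ~{keys:,.0f} keys per query"
    )
```

Under these assumptions both claims work out to a budget on the order of ten thousand keys per query, which is the kind of detail the replies say the report leaves unexplained.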

Zephyr@zephyr_z9

I love these SubQ scammers

Nyanpasu@NyanpasuKA

Le Chaton Fat who? subhuman AI breaks all records with Subhuman1.1-small bolstering a whopping 67% retrieval accuracy over 1 gorillion token window. subhuman intelligence >>>> chaton intelligence.

𝑨𝒎𝒎𝒂𝒓@qwen56125254

@NyanpasuKA you speak exactly like teor, in the same style... unbelievable

Nyanpasu@NyanpasuKA

@qwen56125254 he's a very good poaster.

Revxnge@myisdimitry

@teortaxesTex Calling this a tech report is an interesting choice

BBLHQ ⭐️@BluesBallersHQ

@zephyr_z9 huhh

𝑨𝒎𝒎𝒂𝒓@qwen56125254

@NyanpasuKA yes, but you can compete with him... I'll cheer you on :)

Zeus@ZeusSkils

@teortaxesTex they name MLA, YaRN, 262K donor, then drop "no u cant see it"

lore drop with a door slam is kind of funny
