Builder Creates PaperWiki As Automated LLM Knowledge Base For Research

Original post

elvis@omarsar0

LLM Wikis are being slept on.

I argue that creating knowledge bases with LLMs or coding agents is one of the most valuable applications of AI today.

It's about being intentional in building and scaling your intelligence stack.

To showcase this, I wanted to share an LLM Wiki I have built over the last couple of months.

It's called PaperWiki, and I use it across all my research workflows, along with my research agents.

In fact, I also use it to curate papers I share with my communities, newsletter, and on X.

PaperWiki is updated regularly through automations, so I basically have agents on a loop maintaining it. All the entries are ingested from different sources, stored in a vault (Obsidian), indexed using qmd, and then presented via an HTML artifact. So all of it is easily accessible to all my agents and easily searchable through full-text search and rich semantic search.

The structure of the wiki has proven very useful for starting interesting, cutting-edge research projects with my research agents (from building tiny and more efficient GPT/diffusion LLMs to building out SoTA harnesses and memory systems). It turns out that agents love markdown files and can more easily navigate the papers given the rich metadata structure of the wiki.
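A minimal sketch of the kind of vault described above, assuming each paper is one markdown file with a flat `key: value` header. The helper names (`write_entry`, `read_entry`, `search`) and the metadata fields are hypothetical. The sketch does not use Obsidian's or qmd's actual interfaces, and the plain substring match only stands in for the full-text side of search, not the semantic side.

```python
from pathlib import Path


def write_entry(vault: Path, slug: str, meta: dict[str, str], body: str) -> Path:
    """Write one paper as a markdown note with a simple metadata header."""
    header = "\n".join(f"{key}: {value}" for key, value in meta.items())
    path = vault / f"{slug}.md"
    path.write_text(f"---\n{header}\n---\n{body}\n", encoding="utf-8")
    return path


def read_entry(path: Path) -> tuple[dict[str, str], str]:
    """Split a note back into (metadata, body). Handles flat 'key: value' lines only."""
    text = path.read_text(encoding="utf-8")
    # The note starts with '---', so the first piece of the split is empty.
    _, header, body = text.split("---\n", 2)
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(": ")
        meta[key] = value
    return meta, body.strip()


def search(vault: Path, query: str) -> list[str]:
    """Naive case-insensitive full-text search; returns matching note names, sorted."""
    needle = query.lower()
    return sorted(
        path.stem
        for path in vault.glob("*.md")
        if needle in path.read_text(encoding="utf-8").lower()
    )
```

Because every note is a plain file with its metadata at the top, an agent can list the directory, read only the headers to filter by topic, and open full notes on demand.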

I am just getting started on this, but it's clear to me that we should all be experimenting with LLM Wikis.

Here's why:

Building LLM knowledge bases gets you into the habit of leveraging AI outputs in all kinds of creative ways. It's the good kind of tokenmaxxing we should all be pushing for.

LLM Wikis can be maintained automatically in a loop. I use an automation that updates the wiki every day based on papers I curate. The curation is another automation I run in a loop (with a bit of human in the loop), so I get to build on all my previous knowledge and expertise, and all of it compounds as the integration layers get deeper.
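The daily update loop can be sketched as an idempotent sync, assuming the curation step yields a list of records carrying a human-set `approved` flag. The record fields, the `sync_curated` name, and the one-file-per-paper layout are assumptions for illustration, not the author's actual automation.

```python
from pathlib import Path


def sync_curated(vault: Path, curated: list[dict]) -> list[str]:
    """Add a note for each approved paper the vault does not have yet.

    Returns the ids written on this run, so a rerun with no new papers
    does nothing and returns an empty list.
    """
    vault.mkdir(parents=True, exist_ok=True)
    added = []
    for paper in curated:
        if not paper.get("approved", False):
            continue  # human-in-the-loop gate: skip papers nobody signed off on
        note = vault / f"{paper['id']}.md"
        if note.exists():
            continue  # already ingested on an earlier run
        note.write_text(f"# {paper['title']}\n\n{paper['summary']}\n", encoding="utf-8")
        added.append(paper["id"])
    return added
```

Running this from a daily scheduler keeps earlier notes untouched, which is what lets later runs build on everything already in the vault.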

One interesting result of this process is that I feel like I can better spot high-quality papers and remove noise more easily. Social media could never solve that. And most paper aggregators use metrics I simply don't trust. I like that agents can help with the noise vs. signal problem. This is important for research. Lots of people think agents produce mostly slop. But it doesn't have to be that way. Careful curation, prompts, automations, verifiers, and a human in the loop can produce some astonishing results.

And you really don't need frontier models for all of this. I use a combination of frontier models (opus-4.8) and open-weight models (deepseek-v4-flash) to maintain this. An exciting direction for future work (we are working on this at @dair_ai) is to tune specialized models on top of this so that LLMs can quickly understand cutting-edge research ideas and better conceptualize research strategies that further accelerate scientific research agents.

I plan to open-source a bunch of this work, including the artifact, but this is currently work in progress, and I was excited to share some thoughts as I continue working on it. Sharing more as I go. Stay tuned!

10:35 AM · Jul 2, 2026 · 11.5K Views

Sentiment

Positive users praise PaperWiki as a next-level LLM knowledge base that saves time and enables intelligence stacking, while negative users worry hallucinations will require extra fact-checking and add to content slop.

Positive: 76.2% · Negative: 23.8% (10 comments with sentiment)

Posts from X

elvis@omarsar0

On top of it all, the PaperWiki automatically generates and maintains survey papers on all the AI topics I am interested in.

All up-to-date.

There simply doesn't exist anything like it.

Just insanely useful.

elvis@omarsar0

@emeeliojohann I am planning to get this out as a skill soon so everyone can build their own.

elvis@omarsar0

@aimanhasnoname i highly doubt that. and if they do (in some general form), i don't think it will ever outperform the specialized models that are possible to build with the wiki

Emilio Johann@emeeliojohann

@omarsar0 I have never built one but I agree with you in that they offer a ton of value.

AImanhasnoname@aimanhasnoname

@omarsar0 Labs will release models that will make this unnecessary in due time... like managed agents did with harnesses

atilab@atilab

@omarsar0 I have been following this for a couple of months now. My implementation is already deployed on Vercel, and I will share it in the DAIR community; it is still a little coarse and not as elegant. I am on board with this. Especially the patterns you have engineered, top-flight!

atilab@atilab

@omarsar0 Some screenshots....

Kaustab Pal@kaustabpal

@omarsar0 I am using LLM wikis to organise my journals and build a wiki of my life. The catch is I am trying to use a local model to do this. You can read about it here.

Hussain Hashim | Building SundayBack@itsthedonhashim

@omarsar0 totally agree. tried using LLMs for our internal docs and it's a game changer. saves so much time.

Alpha Batcher@alphabatcher

@omarsar0 well now i'm so bullish on LLM Wikis

NTK AI@NtokozoAI

Enterprises have chased this for thirty years. Lotus Notes, SharePoint, Confluence.

The tooling was rarely the failure point. Documentation was labour that mostly rewarded everyone except the person doing it, so it decayed.

This flips the incentive. The knowledge base builds from work already happening.

The scarce skill becomes deciding what deserves to enter the map.

bananablazer@bananablazer

@omarsar0 wikis sound cool until you realize the LLM hallucinates citations and you spend more time fact-checking than if you just wrote the thing yourself 🤖

Simple AI@Simple_AI_00

@omarsar0 PaperWiki: agents finally doing the reading so humans can keep pretending. Bold of you to automate signal in a sea of slop.

chetansawai@c_s_a_w

@omarsar0 Been building one of these for our internal docs and the compounding is real. Hardest part for me is keeping it from going stale as the code moves under it. Are you doing refresh with manual passes or an agent that re-crawls on changes?

Ricardo Dias@theslowtell

@omarsar0 the gap between "stays current" and "stays useful" is where most AI tools collapse into noise. if PaperWiki holds that line without turning into a feed you ignore, that's the entire problem solved. what makes the surveys stay signal instead of just becoming another pile?

Hira@Hiraweb3

@omarsar0 this is next-level knowledge stacking fr

MaatWork@MaatWorkX

@omarsar0 Maintenance is the product, generation is just the demo. The real system design challenge is the feedback loop that detects when new evidence contradicts the previous consensus and triggers a reconciliation without drift.

Emilio Johann@emeeliojohann

@omarsar0 That would be awesome elvis.

Question: what is the difference between a wiki and a directory?
