28 Oct 2025

What I Learned About Graphs Setting Up an LLM Memory Server


Every engineer who’s played with large language models eventually hits the same wall: there’s only so much you can do with prompt improvements.

A few months ago, I was there. I'd read all the blog posts about writing good instructions, feeding the LLM proper context, yada yada. I was desperate to learn something more.

Then, I happened to come across a conversation in one of the network automation Slack forums about writing code using an LLM, and one person shared his tool setup.

I messaged him on the spot, and he graciously started spilling out all his tricks. That was the first time I heard about memory servers for LLMs. For me, this was a radically new, almost plugin type of thing to help move beyond prompt optimization. It felt like magic.

This is my story about getting started with a memory server, and what it taught me. I share this in the hopes of helping another engineer in their AI journey.

What is a memory server?

No matter how smart it seems, an LLM can only hold a limited number of tokens in its memory at once. Once that buffer fills up, older context falls out the back.

A memory server is like an external hard drive for the LLM’s short-term brain. It’s essentially a knowledge-graph store of everything your LLM learns about you or your work.

With a memory server connected, anything you chat about with your LLM gets parsed and added to the long-term data bank. The memory server is constantly listening, filling itself with data, and mapping connections between those data points all by itself.

The LLM can then draw on that stored data to inform future conversations with you. Imagine being able to reference “Pete” in a fresh chat with your LLM and have it automatically know that you mean Pete Crocker who works at OpsMill and lives in the UK. Powerful!

Setting up my first memory server

A memory server setup needs three parts: an LLM, a database, and the knowledge graph memory server itself.

Right now, a memory server can only connect to a single LLM at a time. For my experiment, I chose Claude because it had good support for MCP servers.

For the database, I went with Neo4j, which has its own open-source knowledge graph memory server. Neo4j is also the database we use to power Infrahub, so this was a great opportunity to get more hands-on with how the technology works.
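Wiring the pieces together mostly comes down to registering the memory server in the LLM client's MCP configuration. Here's a rough sketch of what that looks like in a Claude Desktop config file, assuming the `mcp-neo4j-memory` server and a local Neo4j instance on the default Bolt port. The exact command, environment variable names, and credentials are illustrative and may differ by version, so check the server's README for the current setup.

```json
{
  "mcpServers": {
    "neo4j-memory": {
      "command": "uvx",
      "args": ["mcp-neo4j-memory"],
      "env": {
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USERNAME": "neo4j",
        "NEO4J_PASSWORD": "your-password"
      }
    }
  }
}
```

Once the client restarts and picks up the config, the memory server's tools show up alongside the model's built-in capabilities.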

I started with an empty database. The first thing I fed it was an image of the OpsMill org chart. I wanted to see if Claude could recognize people and relationships from an image, and then build them as nodes and edges in the graph.

It could! The memory server built itself a social graph of my colleagues, my relationships to them, and their relationships to each other. I didn’t even need to give it a schema. I was off to the races.
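To make the idea concrete, here is a toy sketch in Python of the kind of structure the memory server builds: entities as nodes, facts as directed edges between them. The class, relationship names, and facts are all illustrative, not the server's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    """A minimal entity/relationship store, loosely mirroring
    what a knowledge-graph memory server persists."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_fact(self, source: str, relation: str, target: str) -> None:
        # Every fact implicitly registers both entities as nodes.
        self.nodes.update([source, target])
        self.edges.append((source, relation, target))

    def about(self, name: str) -> list:
        """Everything the graph knows about one entity,
        whether it appears as source or target."""
        out = [(rel, tgt) for src, rel, tgt in self.edges if src == name]
        out += [(rel, src) for src, rel, tgt in self.edges if tgt == name]
        return out

g = KnowledgeGraph()
g.add_fact("Pete", "works_at", "OpsMill")
g.add_fact("Pete", "lives_in", "UK")
g.add_fact("Pete", "reports_to", "CEO")

print(g.about("Pete"))
# All three facts come back when the entity "Pete" is mentioned.
```

This is why no schema was needed up front: each fact carries its own relationship label, and the graph grows one edge at a time as the model extracts statements from the conversation.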

Performance and benefits I’ve seen

Though it’s hard to quantify the results from using a memory server, it definitely feels like a big improvement in efficiency and accuracy. I spend less time writing prompts and I get richer context in the answers.

I can start a session and reference a person or project by name, and Claude already knows who or what I mean.

It can even connect to my CRM through the MCP server. Mention a prospect or customer, and it quietly pulls fresh data like recent calls, renewal dates, and notes, and uses that as background for its responses.

Limits and privacy considerations

The biggest limitation is visibility. I can see when the memory server performs a lookup, but I can’t easily inspect which facts it used to build the answer. Transparency tools are still immature here.

Privacy-wise, I’m comfortable because everything runs locally on my laptop. The only external trust boundary is the LLM itself, and that’s why I use a paid plan with Claude that allows me to block my data from being used for model training.

From human graphs to infrastructure graphs

Experimenting with this system changed how I think about infrastructure data. What the memory server does for me—recording relationships between people, companies, and facts—isn’t far from what Infrahub does for infrastructure.

In the memory server, relationships are inferred and sometimes fuzzy. In infrastructure, they’re explicit and deterministic: either two devices are linked or they aren’t. But the model is the same: nodes, edges, and context you can query.

Using Neo4j under the hood gave me a clearer intuition for why a graph database fits our world so well. Infrastructure is deeply interdependent. Seeing those connections visualized in my memory server made that interdependence incredibly clear for me.

Side projects for the win

Experimenting with the memory server started as a curiosity project, but it’s been a great education in how context, relationships, and schema design affect intelligence, both human and artificial.

I found that playing with the memory server forced me to think about things like data structure, persistence, and trust boundaries, which are some of the same issues we face in automation.

If you’re an engineer, try building one. You’ll get hands-on insight into how knowledge graphs work and why the world is moving toward systems that remember.

(Or, if you’d rather jump straight to an infrastructure graph, Infrahub is already that memory layer for managing automation. Our MCP server even lets you connect to an LLM of your choice.)

Pete Crocker

