I am building a graph-based semantic search engine. We can use low-cost LLMs, like Haiku, or local models to extract semantics (named entity recognition).
The nodes in the graph then keep their extracted types (things like people, dates, currencies), which can be used in queries.
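To make the typed-node idea concrete, here is a minimal Python sketch. It is not the project's actual code: the class names, the relation names and the sample entities are all made up for illustration, and the extraction step (the LLM or local NER model) is assumed to have already produced the labels and types.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # text of the extracted entity
    type: str   # entity type assigned by the extractor, e.g. "Person"


class TypedGraph:
    """Hypothetical in-memory graph whose nodes carry an entity type."""

    def __init__(self):
        self.by_type = defaultdict(set)  # type -> nodes of that type
        self.edges = defaultdict(set)    # node -> {(relation, target node)}

    def add_node(self, label, type_):
        node = Node(label, type_)
        self.by_type[type_].add(node)
        return node

    def add_edge(self, source, relation, target):
        self.edges[source].add((relation, target))

    def nodes_of_type(self, type_):
        return sorted(self.by_type.get(type_, ()), key=lambda n: n.label)

    def neighbors(self, source, relation=None, target_type=None):
        """Follow edges from source, optionally filtered by relation and type."""
        return sorted(
            (
                target
                for rel, target in self.edges.get(source, ())
                if (relation is None or rel == relation)
                and (target_type is None or target.type == target_type)
            ),
            key=lambda n: n.label,
        )


# Made-up sample data standing in for extractor output.
graph = TypedGraph()
acme = graph.add_node("Acme Robotics", "Organization")
jane = graph.add_node("Jane Doe", "Person")
amount = graph.add_node("USD 5M", "Currency")
date = graph.add_node("2024-10-24", "Date")
graph.add_edge(acme, "founded_by", jane)
graph.add_edge(acme, "raised", amount)
graph.add_edge(acme, "raised_on", date)

people = graph.neighbors(acme, target_type="Person")
raised = graph.neighbors(acme, relation="raised")
```

Because every node keeps its type, a query can ask for "all Person nodes linked to this Organization" directly instead of ranking text chunks by similarity.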
Very interesting, I've been thinking about this kind of approach but haven't had the time to really work on it. So what kind of business model do you have? Is it a kind of drop-in replacement for vector dbs?
Out of curiosity, if it's not a trade secret, how do you plan to handle conflicting data (two sources saying different things on the same topic/data)?
It will perhaps not be a drop-in replacement for vector dbs in every situation, but yes, it will be when you want accurate results that follow the semantic chains (entities and relations).
At this moment, we have not entered the territory of conflict resolution, but I know what you mean. Interestingly, I just came across this: https://arxiv.org/pdf/2410.18415 (released on Oct 24, 2024).
https://github.com/pixlie/PixlieAI
Currently building a demo where we crawl startup investment data to build a knowledge graph that can be filtered for patterns.
The engine can guide the crawl process to keep the crawl limited to the problem statement.
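A rough Python sketch of what a guided crawl could look like, under heavy assumptions: the relevance check here is a simple keyword overlap standing in for whatever the engine actually does, the pages are an in-memory dict rather than real HTTP fetches, and all names (`is_relevant`, `guided_crawl`, the page keys) are hypothetical.

```python
from collections import deque


def is_relevant(text, keywords):
    """Stand-in relevance check: does the page share any word with the problem statement?"""
    return bool(set(text.lower().split()) & keywords)


def guided_crawl(pages, start, keywords, max_pages=10):
    """Breadth-first crawl that only expands links found on relevant pages.

    pages maps a page key to (text, [linked page keys]); it replaces real fetching.
    """
    queue = deque([start])
    seen = {start}
    kept = []
    while queue and len(kept) < max_pages:
        url = queue.popleft()
        text, links = pages.get(url, ("", []))
        if not is_relevant(text, keywords):
            continue  # off-topic page: do not keep it or follow its links
        kept.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return kept


# Made-up pages; "e" is only reachable through the off-topic page "c".
pages = {
    "a": ("startup funding news", ["b", "c"]),
    "b": ("seed investment round announced", ["d"]),
    "c": ("cooking recipes", ["e"]),
    "d": ("startup raises series A", []),
    "e": ("investment tips", []),
}
keywords = {"startup", "investment", "funding"}
visited = guided_crawl(pages, "a", keywords)
```

The point of the sketch is the pruning: links are queued only from pages judged on-topic, so the crawl never wanders into branches unrelated to the problem statement.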