Complex Query Answering with Neural Link Predictors

By Miniml, March 6, 2026

Many enterprises already store large amounts of information in structured formats such as tables, databases, and knowledge graphs. These systems are good at capturing relationships between entities, such as customers, products, suppliers, or medical records. But asking complex questions that span all of that information can still be difficult.

The paper shows that well-engineered AI can break complicated questions into smaller steps and answer them by combining simple relationships in the data. Instead of needing massive amounts of specialized training data, the approach reuses models trained on basic relationships and applies them to more complex queries.
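As a rough illustration of this decomposition idea, consider a two-hop question such as "which drugs treat diseases caused by gene X?". It can be split into simple atoms, each scored by a link predictor, and the scores combined. Everything below (entity and relation names, the toy DistMult-style scorer, the candidate sets) is invented for the sketch and is not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings standing in for a pre-trained link predictor's parameters.
DIM = 16
entities = {name: rng.normal(size=DIM)
            for name in ["gene_x", "disease_a", "disease_b", "drug_1", "drug_2"]}
relations = {name: rng.normal(size=DIM) for name in ["causes", "treated_by"]}

def atom_score(subj, rel, obj):
    """Truth value of one atom: DistMult-style score squashed into (0, 1)."""
    s = np.sum(entities[subj] * relations[rel] * entities[obj])
    return 1.0 / (1.0 + np.exp(-s))

def conjunction(a, b):
    """Product t-norm: a differentiable AND over two atom truth values."""
    return a * b

def query_score(drug):
    # Query: exists disease. causes(gene_x, disease) AND treated_by(disease, drug)
    # The existential variable is handled by maximising over candidate diseases.
    return max(
        conjunction(atom_score("gene_x", "causes", d),
                    atom_score(d, "treated_by", drug))
        for d in ["disease_a", "disease_b"])

# Rank candidate answers by their query score.
ranked = sorted(["drug_1", "drug_2"], key=query_score, reverse=True)
```

Because each atom reuses the same pre-trained predictor, no training on complex queries is needed; only the way atom scores are combined changes with the query shape.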

The result is a system that reasons over structured datasets more effectively while remaining easier to train and to interpret. It shows that AI systems can work directly with existing structured data, helping businesses extract answers from complex datasets without rebuilding their data infrastructure.

Paper: https://openreview.net/pdf?id=Mos9F9kDwkz

Abstract

Neural link predictors are immensely useful for identifying missing edges in large scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries that arise in a number of domains, such as queries using logical conjunctions (∧), disjunctions (∨) and existential quantifiers (∃), while accounting for missing edges. In this work, we propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the optimisation problem, including gradient-based and combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods --- black-box neural models trained on millions of generated queries --- without the need of training on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across different knowledge graphs containing factual information. Finally, we demonstrate that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online, at https://github.com/uclnlp/cqd.

Stay ahead with research-backed solutions

From papers to production, we translate cutting-edge AI research into practical systems that give your business a competitive edge.

See how we work