RoboData

Toward Trustable Question Answering over Ontologies through Metacognitive Agentic Epistemology

1Sapienza University of Rome, Department of Computer, Control and Management Engineering, Italy 2University of Basilicata, Department of Engineering, Italy

Architecture

RoboData Architecture

The KGQA agentic framework follows the DIKW architecture. The "Data + Information" layer is implemented as a knowledge graph. The "Knowledge" layer consists of an LLM-based orchestrator that uses tools to inspect local and remote data. The "Wisdom" layer provides feedback on the QA strategy.

Abstract

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP), simplifying knowledge extraction from structured and unstructured sources. Despite pervasive usage in various applications, from Question Answering (QA) to goal-driven reasoning, they tend to produce hallucinations, factually incorrect responses that hinder the accuracy and explainability of Knowledge-Based QA (KBQA) tasks. To address this limitation, we introduce RoboData, an Agentic AI approach to verifiable knowledge extraction and reasoning over structured ontologies like Wikidata. Through metacognitive self-reflection and goal-directed commonsense reasoning, an LLM-based epistemic agent dynamically self-orchestrates a query answering process. Unlike traditional information retrieval systems, the proposed architecture incrementally builds a local knowledge graph from remote knowledge sources to answer a natural language query with traceable facts, highlighting a "support set" for each claim: the set of nodes and edges in the local knowledge graph that backs the generated claim. The resulting accumulated knowledge forms an intermediate explainability layer, providing a reliable epistemic substrate for using trusted ontologies in goal-driven query answering applications, such as robotic planning and semantic map enrichment.

Contributions

We present an architecture where a Large Language Model-based agent orchestrates a Query Answering process. This process produces both a local Knowledge Graph, built from remote reliable data sources like Wikidata, and a natural language answer. Crucially, each claim in the answer is supported by "support sets"—subsets of the local KG that provide verifiable evidence, enhancing the trustability of the output.

Our system features an agent capable of reasoning through self-reflection on its past actions. It can dynamically correct its strategy by analyzing its own course of action. This metacognitive capability allows the agent to adapt its exploration strategy, overcome the limitations of LLM context windows, and manage long-term memory more effectively, which is particularly useful for complex, multi-hop queries.

Methodology

The system is designed to self-orchestrate: it explores the remote Wikidata ontology or the local KG, and updates the latter, progressively constructing the local graph and enabling explainable query resolution. Through self-reflection and metacognition, the agent can correct its own strategy by reasoning on its past execution. The framework is organized in layers, following the DIKW paradigm.

Epistemic Substrate: Acting as the Data+Information layer in the DIKW taxonomy, the accumulated local graph, initially empty, is progressively populated during Query Answering. The local KG represents data using a common schema, so as to potentially accommodate multiple remote knowledge sources.
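The common-schema substrate can be pictured as a minimal triple store keyed by remote identifiers (Wikidata Q/P codes in our experiments). The class and method names below are illustrative, not the actual implementation:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Triple:
    """One edge of the local KG: <subject, predicate, object>, identified by remote IDs."""
    subject: str
    predicate: str
    obj: str

@dataclass
class LocalKG:
    """Accumulated epistemic substrate: starts empty, grows during Query Answering."""
    labels: dict = field(default_factory=dict)   # id -> human-readable label
    triples: set = field(default_factory=set)

    def add(self, s, p, o):
        self.triples.add(Triple(s, p, o))

    def nodes(self):
        return {t.subject for t in self.triples} | {t.obj for t in self.triples}

# Example: the graph produced by the consistency experiments below.
kg = LocalKG()
kg.labels.update({"Q16": "Canada", "P36": "capital", "Q1930": "Ottawa"})
kg.add("Q16", "P36", "Q1930")
```

Because nodes carry source identifiers, any claim in the final answer can cite a subset of `kg.triples` as its support set.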

Agentic Layer: Acts as the main operational component, representing the Knowledge layer in the DIKW paradigm. This layer comprises several modules. The main one is the Agentic Orchestrator, modeled as a Finite-State Machine (FSM) that manages how the Agent interacts with the LLM. Each state has a specific prompt and a set of tools that the agent can use.
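The per-state pairing of prompt and tools can be sketched as a simple mapping; the state names mirror the orchestrator states described below, while the prompt texts and tool names are assumptions for illustration:

```python
from enum import Enum, auto

class State(Enum):
    LOCAL_EVAL = auto()       # Local Data Evaluation
    LOCAL_EXPLORE = auto()    # Local Graph Exploration
    REMOTE_EXPLORE = auto()   # Remote Data Exploration
    REMOTE_EVAL = auto()      # Remote Data Evaluation
    LOCAL_UPDATE = auto()     # Local Graph Update
    PRODUCE_ANSWER = auto()   # Answer Production

# Each FSM state pairs a dedicated prompt template with the tools
# the agent may call while in that state (names are illustrative).
STATE_CONFIG = {
    State.LOCAL_EVAL: ("Evaluate whether the local KG answers: {query}", []),
    State.LOCAL_EXPLORE: ("Explore the local KG for facts relevant to: {query}",
                          ["local_neighbors", "local_search"]),
    State.REMOTE_EXPLORE: ("Query the remote ontology for: {query}",
                           ["wikidata_search", "wikidata_get_entity"]),
    State.REMOTE_EVAL: ("Judge relevance of retrieved data to: {query}", []),
    State.LOCAL_UPDATE: ("Select entities and relations to merge into the local KG.",
                         ["add_triples"]),
    State.PRODUCE_ANSWER: ("Answer {query}; attach a support set to each claim.", []),
}
```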

Metacognition: FSM-based orchestration balances freedom of agentic action with guidance over reasoning paths. Hallucinations induced by excessive context lengths are mitigated by a corrective strategy generated by the metacognition module and embedded in subsequent Orchestrator prompts.

Stateful Orchestrator

RoboData Orchestrator

Initially, the query is evaluated in Local Data Evaluation; since the KG is empty, the orchestrator transitions to Remote Data Exploration, where the proper tool calls are executed. Retrieved remote data is then evaluated in Remote Data Evaluation, and self-orchestration determines the remaining steps. The agent dynamically determines the order in which states are visited and tools are called, and performs self-reflection by observing previous successes or failures and evaluating data completeness. Static workflows are replaced with a dynamic transition system: the state sequence is determined by the commonsense reasoning capabilities of the LLM. The orchestrator is allowed to run a predefined number of turns, after which it is forced to produce an answer.

Local Data Evaluation: Evaluates whether the local knowledge graph contains enough information to answer the query. If the graph is empty, the system proceeds directly to the Remote Data Exploration state. If there is enough data to answer the original query, the orchestrator visits the Answer Production state. Finally, if the local KG contents exceed the maximum allowed token count, the Local Graph Exploration state is visited.

Local Graph Exploration: The agent uses tools to explore the local KG, gathering additional information relevant to the query but not initially visible in the truncated graph. The agent can choose to transition again to the Local Data Evaluation state or to keep exploring.

Remote Data Exploration: The agent can invoke tools to collect data from the remote ontology. It can then decide to transition to the Local Data Evaluation state or to keep exploring.

Remote Data Evaluation: The relevance of the retrieved remote information to the original query is evaluated. If the data is considered relevant, the system transitions to the Local Graph Update state to integrate it into the local KG. Otherwise, the system can decide whether to keep exploring or go back to the Local Data Evaluation state.

Local Graph Update: The selected remote entities and relations related to the query are fetched and merged into the local graph; the agent then transitions back to the Local Data Evaluation state.

Answer Production: Final state, where the agent generates an answer. Each sentence is associated with a set of entities and relations in the local KG, acting as proof of that claim.
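The dynamic transition system described above can be sketched as a turn-limited loop: the LLM picks the next state each turn, and when turns run out the orchestrator forces Answer Production. Here `choose_next_state` stands in for the LLM call and is an illustrative assumption:

```python
def run_orchestrator(query, choose_next_state, max_turns=30):
    """Run the FSM until an answer is produced or turns are exhausted."""
    state, trace = "local_data_evaluation", []
    for turn in range(max_turns):
        trace.append(state)
        if state == "produce_answer":
            break
        if turn == max_turns - 2:
            # Out of turns: force the final state regardless of the LLM's choice.
            state = "produce_answer"
        else:
            # The LLM sees the current state and execution trace (self-reflection)
            # and decides where to go next.
            state = choose_next_state(state, trace)
    return trace
```

With a policy that keeps exploring, the forced transition guarantees the trace ends in `produce_answer`; with a policy that answers immediately, the loop exits after two turns.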

Metacognition

FSM-based orchestration balances freedom of agentic action with guidance over reasoning paths. Hallucinations induced by excessive context lengths, due to the need to provide the KG to the LLM in data-related prompts, increase with complex queries requiring long answers and large underlying knowledge graphs. As a mitigation, a corrective strategy is generated by the metacognition module and embedded in subsequent Orchestrator prompts. Metacognition happens in evaluation states at two distinct levels. The first is performed inline in the orchestrator: a self-reflection behavior is elicited in the LLM by instructing the Agent to reflect on the past execution trace and determine corrective advice for the next selected state, with the desired outcome of correcting the short-term strategy in reaction to failures or stagnation in the query answering process. Then, each time the agent reaches an evaluation state, explicit metacognition is performed in the "Wisdom" layer to provide a higher-order strategic evaluation of the overall agentic strategy, by analyzing the sequence of past states, tool calls, and graph updates. If a previous metacognitive observation is available, the inferred strategy is compared against it. The module finally produces a metacognitive observation, a corrective plan to optimize suboptimal tendencies in the agent's strategy. This observation is then injected into the prompts of the orchestrator evaluation states. As turns run out, a variable "turn urgency" message (depending on how many orchestrator turns are left) urges the LLM to consolidate the KG, avoiding isolated nodes and unconnected components and leading to better consistency in output KGs.

Metacognition components:

  1. Strategic Assessment. Infer the current strategy from statistics and memory.
  2. Meta-Observation. The strategy is compared to previous ones, and corrective feedback is generated, enabling the agent to navigate complex queries and reorient sub-optimal behavior.
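The two components above can be sketched as follows: infer a strategy summary from execution statistics, compare it with the previous observation, and emit corrective advice to inject into the next evaluation-state prompt. All thresholds and advice strings here are illustrative assumptions, not the actual heuristics:

```python
def strategic_assessment(trace, nodes_added, edges_added):
    """Strategic Assessment: infer the current strategy from statistics and memory."""
    explore_turns = sum(s.endswith("exploration") for s in trace)
    return {
        "explore_ratio": explore_turns / max(len(trace), 1),
        "growth": nodes_added + edges_added,
    }

def meta_observation(strategy, previous=None):
    """Meta-Observation: compare with the previous strategy and generate corrective feedback."""
    if strategy["explore_ratio"] > 0.7 and strategy["growth"] == 0:
        advice = "Exploration is not growing the KG: refine tool arguments or re-plan."
    elif previous is not None and strategy == previous:
        advice = "Strategy unchanged across evaluations: vary the exploration path."
    else:
        advice = "Strategy looks productive: continue."
    return {"strategy": strategy, "advice": advice}

def inject(prompt, observation):
    """Embed the observation into a subsequent evaluation-state prompt."""
    return prompt + "\n[Metacognitive observation] " + observation["advice"]
```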

Experimental Evaluation

This section evaluates the proposed system across varying query complexity and ambiguity. By default, metacognition is disabled while self-reflection is enabled, as self-reflection is established practice in the current state of the art. Experiments are conducted using GPT-4o as the Reasoning Model. The orchestrator is allowed to run for a limited number of turns, after which it is forced to produce an answer with the collected data. The only remote ontology used in these experiments is Wikidata, accessed through the official API.
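As a sketch of remote access, a tool can address the official Wikidata API via the MediaWiki `wbgetentities` action; the helper below only builds the request URL (no network call), and its name and default parameters are illustrative:

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def entity_request(ids, language="en"):
    """Build a wbgetentities request URL for a list of entity IDs (e.g. ["Q42"])."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "labels|claims",
        "languages": language,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)
```

Fetching this URL with any HTTP client returns the entity's labels and claims as JSON, from which triples can be selected for the local graph.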

In the first batch of queries, Batch A (A1-A7), we test RoboData on a curated set of single and multi-hop queries, at an increasing level of complexity and scope, to assess its basic KGQA capabilities. The results are reported in Table 1, containing, for each experiment, the original query, the answer with its supporting set, and several statistics: number of iterations and tool calls, exchanged tokens, execution time, and number of nodes and edges added. In this batch, we allow the orchestrator to run for at most 30 turns. Queries A1-A4 assess the Question Answering capability of the system at an increasing level of complexity. In particular, A1-A3 require a single hop to answer correctly (essentially a local exploration around the entity Q42 in Wikidata, as all the required elements for the answer are located one relation away from the main subject of the question). A2 and A3, however, require multiple relations to answer, and A2 is contained in A3. The results show consistency in the intersection of the answers to A2 and A3. The answer to A3 is only partial: the part of the answer in common with A2 is only partially found; therefore, the answer is considered wrong. Query A4 instead requires multiple hops to answer correctly, showing longer-term task execution capabilities. Queries A5-A7 are more complex both in the number of hops required and in scope. A5 requires a chain of 4 hops to answer, but only 3 are found: this is because all entities along the chain already contain part of the answer in the textual description of the Wikidata entity (in our case, the name of the German federal district containing the Veltins brewery), tricking the LLM into assuming that the answer is fully supported by the node. In A6, the answer is incomplete.
Both queries A5 and A6 highlight the main limitation of this system when operating without a metacognitive strategy observer: the orchestrator alone tends to lose track of tasks involving multiple hops or large sets of entities and properties. This is confirmed by query A7, showing how the limited context window negatively affects tasks where more working memory is required to keep track of longer-term tasks over large sets of entities (in this case, the large sets of cities to compare to find twin cities). In this query, in fact, even though the full 30 iterations are reached and 148 tool calls are performed, collecting 480 nodes and 623 edges, the limitations due to the working memory and context window prevent the agent from returning an answer.

Batch A: Basic KGQA Functionality

Query Answer Support Sets
Query A1: Who is the author of 'The Hitchhiker's Guide to the Galaxy?'
OK
The 1979 novel 'The Hitchhiker's Guide to the Galaxy' is authored by Douglas Adams. (1) The series 'The Hitchhiker's Guide to the Galaxy,' which spans from 1979 to 1992, is also authored by Douglas Adams. (2) Additionally, Eoin Colfer is an author of the series 'The Hitchhiker's Guide to the Galaxy.' (3) (1) <The Hitchhiker's Guide to the Galaxy (Q3107329), author (P50), Douglas Adams (Q42)>, (2) <The Hitchhiker's Guide to the Galaxy (Q25169), author (P50), Douglas Adams (Q42)>, (3) <The Hitchhiker's Guide to the Galaxy (Q25169), author (P50), Eoin Colfer (Q211893)>
Iterations: 6, Tools: 4, Tokens: 15043, Time (s): 12.54, Nodes: 4, Edges: 3
Query A2: Who are the relatives of Douglas Adams?
OK
Douglas Adams' father is Kristofers Duglass Adamss. (1) His mother is Janet Adams. (2) Douglas Adams was married to Jane Belson. (3) He has a daughter named Polly Adams. (4) (1) <Douglas Adams (Q42), father (P22), Kristofers Duglass Adamss (Q14623675)>, (2) <Q42, mother (P25), Janet Adams (Q14623678)>, (3) <Q42, spouse (P26), Jane Belson (Q14623681)>, (4) <Q42, child (P40), Polly Adams (Q14623683)>
Iterations: 8, Tools: 7, Tokens: 17652, Time (s): 14.88, Nodes: 5, Edges: 4
Query A3: Among the founders of tencent company, who has been member of national people's congress?
OK
Ma Huateng is one of the founders of Tencent. (1) Ma Huateng has held the position of National People's Congress deputy. (2) (1) <Tencent (Q860580), founded by (P112), Ma Huateng (Q1739008)>, (2) <Ma Huateng (Q1739008), position held (P39), National People's Congress deputy (Q10891456)>
Iterations: 28, Tools: 32, Tokens: 101825, Time (s): 84.02, Nodes: 11, Edges: 9
Query A4: Who are the relatives of Douglas Adams and what books did he write?
INCOMPLETE
Polly Adams is a relative of Douglas Adams, specifically his child. (1) Douglas Adams wrote "The Hitchhiker's Guide to the Galaxy." (2) Douglas Adams is also the author of the "Dirk Gently series." (3) Additionally, Douglas Adams co-authored "The Private Life of Genghis Khan." (4) (1) <Douglas Adams (Q42), child (P40), Polly Adams (Q14623683)>, (2) <Q42, notable work (P800), The Hitchhiker's Guide to the Galaxy (Q25169)>, (3) <Q42, P800, Dirk Gently series (Q20736364)>, (4) <Q42, P800, The Private Life of Genghis Khan (Q7758404)>
Iterations: 6, Tools: 5, Tokens: 14166, Time (s): 14.06, Nodes: 5, Edges: 4
Query A5: In which federal state is the Veltins brewery headquarter?
INCOMPLETE
The Veltins Brewery has its headquarters in Meschede. (1) Meschede is a town located in the Hochsauerland district. (2) The Hochsauerland district is in North Rhine-Westphalia, Germany. (3) (1) <Veltins Brewery (Q608350), headquarters location (P159), Meschede (Q5632)>, (2) Meschede (Q5632), (3) Q5632
Iterations: 6, Tools: 4, Tokens: 10303, Time (s): 11.36, Nodes: 2, Edges: 1
Query A6: Which philosophers influenced by Kant were also mathematicians?
OK
David Hilbert was a philosopher influenced by Immanuel Kant and was also a mathematician. (1) Hermann Weyl was influenced by Immanuel Kant and is described as a German mathematician. (2) Luitzen Egbertus Jan Brouwer was influenced by Immanuel Kant and is identified as a Dutch mathematician and logician. (3) Shem-Tov Gefen was influenced by Immanuel Kant and is described as a philosopher and mathematician. (4) (1) <David Hilbert (Q41585), influenced by (P737), Immanuel Kant (Q9312)>, Q41585, (2) <Hermann Weyl (Q71029), P737, Q9312>, Q71029, (3) <Luitzen Egbertus Jan Brouwer (Q155887), P737, Q9312>, Q155887, (4) <Shem-Tov Gefen (Q6992366), P737, Q9312>, Q6992366
Iterations: 22, Tools: 29, Tokens: 124613, Time (s): 70.51, Nodes: 32, Edges: 33
Query A7: Which Italian cities are twinned with Japanese cities?
WRONG
The agent failed to produce an answer. N/A
Iterations: 30, Tools: 148, Tokens: 818300, Time (s): 480.21, Nodes: 480, Edges: 623

Consistency Experiments

These experiments were conducted on the query "What is the capital of Canada?", a simple single-hop query that succeeded every time, to evaluate the consistency of the execution trace in terms of execution statistics. As shown, the resulting final knowledge graph is always the same (the triple <Canada (Q16), capital (P36), Ottawa (Q1930)>), while the statistics show that the execution trace is variable in terms of the number of tool calls and orchestrator iterations.

Metric Min Max Avg
Final KG <Canada (Q16), capital (P36), Ottawa (Q1930)>
Iterations 5 9 5.8
Tool Calls 2 6 3
Tokens 10,751 19,007 15,044
Time (s) 11.47 23.48 16.48
KG Nodes Added 2 2 2
KG Edges Added 1 1 1

Acknowledgments

Thanks to the community of the WikiProject Ontology for their valuable contributions and support, and for organizing the amazing Wikidata Ontology Course.