Yuwei Hu^{1}, Runlin Lei^{1}, Xinyi Huang^{1}, Zhewei Wei^{1}, Yongchao Liu^{2}

^{1}Renmin University of China, Beijing, China
^{2}Alibaba Group, Beijing, China

huyuweiyisui@ruc.edu.cn, runlin_lei@ruc.edu.cn, 2022201342@ruc.edu.cn
zhewei@ruc.edu.cn, yongchao.ly@antgroup.com

Corresponding Author.

###### Abstract

Recent research has explored the use of Large Language Models (LLMs) for tackling complex graph reasoning tasks. However, due to the intricacies of graph structures and the inherent limitations of LLMs in handling long text, current approaches often fail to deliver satisfactory accuracy, even on small-scale graphs and simple tasks. To address these challenges, we introduce GraphAgent-Reasoner, a fine-tuning-free framework that utilizes a multi-agent collaboration strategy for explicit and precise graph reasoning. Inspired by distributed graph computation theory, our framework decomposes graph problems into smaller, node-centric tasks that are distributed among multiple agents. The agents collaborate to solve the overall problem, significantly reducing the amount of information and complexity handled by a single LLM, thus enhancing the accuracy of graph reasoning. By simply increasing the number of agents, GraphAgent-Reasoner can efficiently scale to accommodate larger graphs with over 1,000 nodes. Evaluated on the GraphInstruct dataset, our framework demonstrates near-perfect accuracy on polynomial-time graph reasoning tasks, significantly outperforming the best available models, both closed-source and fine-tuned open-source variants. Our framework also demonstrates the capability to handle real-world graph reasoning applications such as webpage importance analysis.

## 1 Introduction

Graphs, as a crucial data structure for modeling complex real-world relationships, are ubiquitous across various scenarios, e.g., citation networks and recommendation networks. Many important applications, such as drug discovery (Stokes et al., 2020), traffic forecasting (Jiang & Luo, 2022), and financial fraud detection (Motie & Raahemi, 2024), require reasoning over graphs. Noting the powerful general knowledge and language processing capabilities of Large Language Models (LLMs) (Brown et al., 2020), a significant body of work has focused on using LLMs to perform various reasoning tasks, such as mathematical formula derivation (Meadows et al., 2023), commonsense reasoning (Madaan et al., 2022), and multi-hop question answering (Creswell et al., 2023). However, most of these tasks involve only shallow or sequential reasoning. To bring LLM reasoning closer to human thinking, LLMs must master deeper and more complex forms of reasoning, such as graph reasoning.

Despite significant efforts to enable LLMs to memorize, comprehend, and perform basic reasoning on graph structures, several issues persist: 1) The scale of graphs that can be handled is limited. Describing graph structures in natural language inevitably leads to excessively long inputs. Due to context length limitations and the shortcomings of LLMs in handling lengthy text (Liu et al., 2023), previous works (Chai et al., 2023; Fatemi et al., 2024; Perozzi et al., 2024) could only handle graphs of very limited size (e.g., fewer than 20 nodes and 100 edges). 2) The performance on graph reasoning tasks is relatively poor. Unlike text, which can tolerate some degree of semantic deviation, reasoning and computation on graphs must be highly precise. However, current methods achieve poor accuracy (on average 20$\sim$60%) on graph reasoning tasks such as connectivity and shortest path. 3) Explicit reasoning paths are lacking. Taking the shortest path problem as an example, the responses of existing models resemble a heuristic search for a short path rather than the strict execution of an algorithm. This makes it difficult to determine whether LLMs genuinely derive the answer through correct reasoning or merely make educated guesses. Although GraphWiz (Chen et al., 2024a) attempts to generate explicit reasoning paths through fine-tuning, it often fails due to incomplete or incorrect reasoning paths in its training data. Furthermore, GraphWiz exhibits overfitting: it tends to treat new or unrelated questions as one of the fine-tuned problems, which we detail in Section 5.3.

Motivation. The ultimate goal of graph reasoning is to enable LLMs to leverage graph-related knowledge and algorithms to solve real-world graph problems. However, with the development of information science and hardware storage, the scale of graphs and the amount of information per node have become too large for a single LLM to handle. A natural remedy is a distributed approach, in which a large graph is partitioned across multiple LLMs that compute collaboratively. Therefore, just as graph algorithms have generally evolved from non-distributed to distributed forms (Meng et al., 2024b), we hope that LLMs can also learn the concept of distributed processing, thereby harnessing the power of swarm intelligence to solve graph problems in real-world scenarios.

Our Contribution. To address the above limitations, we propose the GraphAgent-Reasoner (GAR) framework, which leverages the power of swarm intelligence to solve graph reasoning problems, as shown in Figure 1. We follow a node-centric approach, assigning an agent to each node so that it focuses on processing its own information and communicating with its neighbors. Thus, we can easily scale up the size of graphs that can be processed simply by increasing the number of agents. Under the direction of a Master LLM, graph problems are decomposed into smaller, node-centric tasks, which are assigned to agents for collaborative resolution. This approach significantly reduces the scale and complexity of the information each agent must process, thereby greatly improving overall accuracy. Furthermore, since agents must clearly transmit the processed information to neighboring agents, the reasoning process becomes transparent, demonstrating that the framework solves graph reasoning problems through clear and correct reasoning rather than lucky guessing. In summary, our contributions are as follows:

- •
We propose GraphAgent-Reasoner, the first LLM-based multi-agent framework for graph reasoning. It requires no fine-tuning and can use any LLM as the underlying reasoning model. Our framework achieves near-perfect accuracy on various polynomial-time tasks, significantly surpassing the performance of existing methods.

- •
Our framework expands the scale of graph reasoning tasks handled by LLMs from 100 nodes to 1,000 nodes, demonstrating exceptional scalability. Furthermore, as the graph size increases, our framework does not exhibit the significant performance degradation seen in other methods and maintains robust accuracy.

- •
We explore the performance of our framework in real-world applications like webpage importance analysis, showcasing its potential for addressing complex graph reasoning problems in real-life situations.

## 2 Preliminaries and Related Works

Preliminaries. In general, when discussing LLMs solving graph reasoning problems, the input is a ($\mathcal{G}$, $\mathcal{Q}$) pair. $\mathcal{G}=(\mathcal{V},\mathcal{E},\{s_{i}\},\{t_{i}\})$ is a graph, where $\mathcal{V}$ is the node set and $\mathcal{E}$ is the edge set. Each node $v_{i}\in\mathcal{V}$ is associated with a sequential text node feature $s_{i}$; similarly, each edge $e_{i}\in\mathcal{E}$ is assigned a sequential text edge feature $t_{i}$. The graph $\mathcal{G}$ is described in natural language, typically using an edge-list or adjacency-list representation. $\mathcal{Q}$ is a task-specific instruction or problem description. The LLM processes the ($\mathcal{G}$, $\mathcal{Q}$) pair and returns an answer string $A$.
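As a concrete illustration, a ($\mathcal{G}$, $\mathcal{Q}$) pair might be serialized into a prompt as sketched below. The edge-list phrasing and function name are our assumptions for exposition, not the paper's exact templates.

```python
# Illustrative only: how a (G, Q) pair might be serialized into a prompt.
# The edge-list phrasing and function name are our assumptions, not the
# paper's exact templates.

def describe_graph(n_nodes, edges):
    """Render an undirected graph as edge-list text for an LLM."""
    lines = [f"In an undirected graph, the nodes are numbered from 0 to {n_nodes - 1}."]
    lines += [f"There is an edge between node {u} and node {v}." for u, v in edges]
    return " ".join(lines)

graph_text = describe_graph(4, [(0, 1), (1, 2), (2, 3)])   # G in natural language
question = "Is there a path between node 0 and node 3?"    # Q, the instruction
prompt = graph_text + " " + question                       # the (G, Q) pair
```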

Large Language Models for Graph Reasoning. To enhance the reasoning capabilities of LLMs, many works have sought to improve their performance on graph reasoning. Wang et al. (2023) first introduced the NLGraph benchmark to evaluate LLMs on various graph reasoning tasks. Fatemi et al. (2024) explored the impact of different graph encoding methods and graph structure types on LLM performance in graph reasoning tasks and introduced another benchmark, GraphQA. Considering the lengthy nature of describing graph structures in text, Chai et al. (2023) and Perozzi et al. (2024) use Transformers and GNNs, respectively, to encode graph structures and attempt to align them with LLMs. Inspired by how humans understand structural information through the visual modality, Wei et al. (2024) generates visual images from graph structures and provides them to visual LLMs for graph reasoning. Chen et al. (2024a) applied Supervised Fine-Tuning and Direct Preference Optimization to LLMs, improving their performance and encouraging them to output explicit reasoning paths.

Large Language Model based Multi-Agents. Recent advancements in LLMs have spurred interest in their application within multi-agent systems. LLM-based multi-agent frameworks leverage the natural language understanding and reasoning capabilities of LLMs to enable agents to collaborate, communicate, and solve complex tasks in a distributed manner. Existing multi-agent work on problem solving focuses primarily on applications such as software development (Dong et al., 2023; Hong et al., 2024; Qian et al., 2024), embodied agents (Zhang et al., 2024; Mandi et al., 2024; Chen et al., 2024b), and scientific debate (Xiong et al., 2023; Chan et al., 2024). However, using LLM-based multi-agents to handle graph data remains underexplored, especially for graph reasoning and graph computation tasks. This may be due to the hallucination issue inherent in LLMs (Huang et al., 2023), where responses can be factually incorrect. The problem becomes more severe in a multi-agent setting, as the hallucinations of a single agent may propagate to other nodes through communication (Guo et al., 2024). This requires that the performance of individual agents be sufficiently stable to ensure the correct operation of the entire multi-agent system.

## 3 Limitations of Single LLM in Graph Reasoning

Although LLMs exhibit strong language processing and logical reasoning capabilities, limitations of the Transformer architecture and its attention mechanism (Vaswani et al., 2017) still constrain the scale and accuracy with which they can process graph problems. There are two primary limitations:

The graph structure is too complex for a single LLM to memorize and understand. Using adjacency or edge lists to describe graph structures in natural language is the most intuitive and direct method, facilitating the processing of graph data by LLMs through text. However, this approach inevitably leads to a lengthy context, as the number of edges can grow quadratically with the number of nodes. As the graph scales up and becomes denser, the structure becomes highly complex, requiring a large number of tokens to describe the edge relationships. When the text becomes too long, LLMs struggle to allocate attention properly and may fail even at simple tasks such as key-value pair matching (Liu et al., 2023). This poses significant challenges for LLMs in identifying the key information for graph reasoning tasks within a lengthy context. Figure 2 shows the performance of a single LLM in memorizing one-hop neighbor nodes. We observe that as the number of nodes in the graph increases, various LLMs exhibit a significant decline in accuracy. If a single LLM cannot even correctly recall basic structural information such as node neighbors, it can hardly proceed to more complex graph reasoning or computation.

Furthermore, the graph structure is described sequentially. LLMs must identify implicit graph structures from sequential text. Since LLM processing is a black-box operation, it is difficult to assert that they truly construct graph structures implicitly and thereby understand them. Huang et al. (2024) conducted extensive experiments on text-attributed graphs (TAGs) to explore whether LLMs treat the input prompts as graphs or merely as paragraphs with keywords. The results show that the performance of LLMs on TAGs stems primarily from the context rather than the graph structure; LLMs tend to process the graph description as linearized paragraphs rather than as graphs.

A single LLM struggles to solve reasoning problems in real-world scenarios. Researchers train LLMs on graph reasoning tasks so that they can apply learned graph-related knowledge or algorithms to real-world graph problems. However, in practical scenarios, the amount of information associated with each node can be enormous. Take citation networks as an example: a single node represents a paper, and its node information includes the title, abstract, and references, which can amount to several thousand tokens. Beyond the complexity of graph structures, the need to handle a large amount of node information further burdens a single LLM and highlights its shortcomings in processing long contexts. Moreover, using a single LLM to handle the entire network is inefficient, as it cannot coherently process the network's problems. Typically, the information for each node must be manually compressed or summarized, and local subgraphs are then fed to the LLM for processing (Guo et al., 2023; Chen et al., 2023).

Furthermore, many current works (Chen et al., 2024a; Perozzi et al., 2024) require training GNNs or fine-tuning LLMs on individual or multiple graph reasoning tasks. However, when transferring to other graph tasks, a certain degree of performance degradation occurs, and retraining or fine-tuning for new graph tasks consumes significant time and resources. Whether LLMs can apply the graph knowledge and algorithms learned during training to actual graph reasoning also remains an open question. We explore this question in Section 5.3 and observe significant overfitting in LLMs fine-tuned on specific graph reasoning tasks. The ideal solution would therefore be to leverage, through an appropriate approach, the powerful general knowledge LLMs acquire during pre-training, enabling them to handle graph reasoning tasks as naturally as they handle natural language problems.

## 4 GraphAgent-Reasoner

To overcome the limitations above, we propose a novel framework based on multi-agent collaboration called GraphAgent-Reasoner, as shown in Figure 3, which aims to solve graph reasoning problems explicitly and correctly. The interface of the framework is a Master LLM, which is responsible for processing the textual input of graph problems, constructing the agent network, directing the agents to collaboratively solve the problem, and finally aggregating the states of all agents to derive the solution. Its implementation is based on the ReAct agent proposed by Yao et al. (2023), which can reason about its environment and execute corresponding actions, as detailed later. The pipeline of GAR consists of four steps: Graph Construction, Algorithm Establishing, Distributed Execution, and Master Summarization.

Graph Construction. Given an input pair ($\mathcal{G}$, $\mathcal{Q}$), the Master LLM first extracts the node and edge information from the textual description of graph $\mathcal{G}$. It then constructs an agent for each node and initializes the node's state and neighbor information, forming an interconnected network of agents. Each agent independently maintains its state and neighbor data, communicates with adjacent agents based on instructions from the Master LLM, and updates its state in each round.

Algorithm Establishing. To accommodate diverse graph tasks and fully exploit the knowledge embedded in LLMs during pre-training, we propose a unified solution approach framed within a distributed paradigm, as shown in Algorithm 1. This approach requires the Master LLM to specify six core components for each problem: State, Message, Initialization, Send, Update, and Termination.

- •
State: The local information maintained by each node, representing its current state. This can include attributes like node features, labels, or any other task-specific data. The states evolve as nodes receive messages and update their information.

- •
Message: The data transmitted between nodes during the communication phase. Messages typically contain information that neighboring nodes need to perform updates, such as feature values, distances, or other task-relevant information.

- •
Initialization: At the start of the execution, each node initializes its state with predefined values, which may be based on node IDs, input features or task-specific requirements. This step ensures that the graph is ready to begin the communication process.

- •
Send: After initialization, each node generates messages based on its current state and sends them to its neighboring nodes. This step is repeated in each iteration, allowing nodes to continuously exchange information with their neighbors.

- •
Update: Upon receiving messages from its neighbors, each node updates its state by aggregating the incoming messages and combining them with its current state. This iterative process enables nodes to refine their information over time.

- •
Termination: The algorithm halts when a predefined stopping condition is met, such as reaching a fixed number of iterations, achieving convergence, or satisfying a task-specific criterion. Once the termination condition is reached, each node will send its final state to the Master LLM, and the execution terminates.
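To make the six components concrete, the following is a minimal sketch (ours, not the paper's implementation) of the Connectivity task expressed in this paradigm: the State is a "reached" flag, Messages flood that flag outward from the source, and Termination fires once no state changes in a round.

```python
# Minimal sketch (ours, not the paper's code) of the Connectivity task in
# the six-component paradigm: State = "reached" flag, Message = that flag,
# Termination = no state changed in a round.

def run_connectivity(neighbors, source, target, max_rounds=100):
    # Initialization: only the source starts as reached.
    state = {v: v == source for v in neighbors}
    for _ in range(max_rounds):
        # Send: every reached node messages all of its neighbors.
        inbox = {v: [] for v in neighbors}
        for v in neighbors:
            if state[v]:
                for u in neighbors[v]:
                    inbox[u].append(True)
        # Update: a node becomes reached once any message arrives.
        new_state = {v: state[v] or bool(inbox[v]) for v in neighbors}
        # Termination: states converged, so the flooding is complete.
        if new_state == state:
            break
        state = new_state
    return state[target]

adj = {0: [1], 1: [0, 2], 2: [1], 3: []}  # node 3 is isolated
connected = run_connectivity(adj, 0, 2)
isolated = run_connectivity(adj, 0, 3)
```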

Since LLMs lack prior knowledge of this distributed paradigm, to facilitate the Master LLM's understanding and application of the framework, we develop a distributed algorithm library that adheres to the paradigm, from which the Master LLM can query relevant algorithm templates to generate distributed solutions. Specifically, we selected classic distributed graph algorithms and documented their implementations under this paradigm. Some examples are presented in Appendix A.1. Drawing on prior work (Zheng et al., 2024; Meng et al., 2024a), we write detailed reasoning steps for each part of an algorithm to encourage the agent to think step by step, which plays an important role in enhancing the success rate of individual agents.

When receiving a problem input, the Master LLM first retrieves the $k$ algorithms most relevant to the problem description from the distributed algorithm library. If an algorithm suits the problem, the Master LLM adjusts it according to the problem description, for example by changing the initialization and termination conditions (e.g., the source node in the shortest path problem). If no appropriate algorithm exists, the Master LLM designs a distributed algorithm that follows the distributed paradigm, based on the retrieved examples. Some generated examples are given in Appendix A.2.

Distributed Execution. After the distributed algorithm is designed, the Master LLM relays the approach to each agent node for execution according to the process outlined in Algorithm 1. Each agent first initializes its state based on its node information and the algorithm rules and then sends an initial message to its neighboring agents. Subsequently, each agent iteratively receives messages, updates its state, and sends messages according to the algorithm rules, synchronizing progress after each communication round. Communication continues until the maximum number of iterations is reached or the termination condition is met.
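The execution flow above can be sketched as an illustrative synchronous driver (the function names are ours), here instantiated with a node-centric Bellman-Ford relaxation for single-source shortest paths.

```python
# Illustrative synchronous driver (names are ours): agents initialize,
# then repeat send/update rounds in lock-step until convergence. Here it
# is instantiated with a node-centric Bellman-Ford for shortest paths.
import math

def distributed_shortest_path(weighted_adj, source):
    # Initialization: distance 0 at the source, infinity elsewhere.
    dist = {v: 0 if v == source else math.inf for v in weighted_adj}
    for _ in range(len(weighted_adj)):  # at most |V| synchronous rounds
        # Send: each node offers dist[v] + w to every out-neighbor.
        inbox = {v: [] for v in weighted_adj}
        for v, edges in weighted_adj.items():
            for u, w in edges:
                inbox[u].append(dist[v] + w)
        # Update: keep the best known distance; Termination: convergence.
        new_dist = {v: min([dist[v]] + inbox[v]) for v in weighted_adj}
        if new_dist == dist:
            break
        dist = new_dist
    return dist

adj = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}
dists = distributed_shortest_path(adj, 0)  # shortest 0 -> 3 goes 0-2-1-3
```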

Master Summarization. Finally, the final states of all agent nodes are aggregated by the Master LLM, which summarizes the results with respect to the problem and returns the final answer in natural language.

## 5 Experiments

In this section, we summarize the key experiments conducted with GAR. We begin by highlighting the most notable results of our analysis:

- •
R1: GAR achieves near-perfect accuracy on polynomial-time graph reasoning problems, significantly surpassing existing closed-source models and open-source models fine-tuned on extensive data.

- •
R2: GAR maintains high accuracy on larger-scale graphs (up to 1,000 nodes), demonstrating superior scalability. In contrast, as the number of nodes increases, other models exhibit a significant decline in performance or become entirely incapable of handling the problem due to context length limitations.

- •
R3: GAR showcases a robust understanding and application of graph algorithms in real-world graph reasoning scenarios, highlighting its potential for addressing complex graph problems encountered in daily life. In contrast, other open-source models that have undergone extensive fine-tuning on graph reasoning datasets fail to apply the learned graph reasoning knowledge when confronted with rephrased real-world graph problems.

Datasets. We conduct our experiments on the graph reasoning tasks proposed in GraphInstruct (Chen et al., 2024a). This dataset contains nine graph reasoning problems with different time complexities, ranging from linear and polynomial complexity to NP-complete.

- •
Linear. Cycle Detection (Detect if a given graph $\mathcal{G}$ contains any cycles), Connectivity (Assess if two nodes $u$ and $v$ in a given graph $\mathcal{G}$ are connected via a path), Bipartite Graph Check (Judge if a given graph $\mathcal{G}$ is bipartite), and Topological Sort (Find a topological ordering of vertices in a directed acyclic graph $\mathcal{G}$).

- •
Polynomial. Shortest Path (Compute the shortest path between two specific nodes $u$ and $v$ in a given graph $\mathcal{G}$), Maximum Triangle Sum (Find the maximum sum of weights for any connected triplet of vertices in a given graph $\mathcal{G}$), and Maximum Flow (Calculate the maximum flow from a source node $s$ to a sink node $t$ in a directed graph $\mathcal{G}$).
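As an illustration of how such a task decomposes node-centrically, here is a minimal sketch (ours, not the dataset's reference solution) of Maximum Triangle Sum with node weights: in one round every node shares its neighbor set, after which each node can detect triangles locally.

```python
# Hedged sketch (ours, not the dataset's reference solution) of Maximum
# Triangle Sum with node weights: in one round every node shares its
# neighbor set, after which triangles can be detected locally.

def max_triangle_sum(neighbors, weight):
    # Send: each node's single message is its own neighbor set.
    received = {v: {u: neighbors[u] for u in neighbors[v]} for v in neighbors}
    best = None
    for v, msgs in received.items():
        for u, n_u in msgs.items():
            # Update: any w adjacent to both v and u closes a triangle.
            for w in neighbors[v] & n_u:
                s = weight[v] + weight[u] + weight[w]
                best = s if best is None or s > best else best
    return best

adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
node_weight = {0: 1, 1: 5, 2: 2, 3: 9}
result = max_triangle_sum(adj, node_weight)  # only triangle is {0, 1, 2}
```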

Due to the complexity of NP-complete problems, no mature exact distributed algorithms are currently available for their solution. Consequently, the Master LLM cannot design correct and effective distributed algorithms based on the knowledge acquired during pre-training. Therefore, in our experiments, we consider only linear and polynomial-time problems. Detailed information about the dataset and partial test results for NP-complete problems are presented in Appendix B.

Setting. The underlying reasoning LLM of each agent node in our framework is ChatGPT-4o-mini-2024-07-18, and the base model of the Master LLM is ChatGPT-4-turbo (OpenAI, 2023). The temperature is consistently set to 0. Our framework is built upon AgentScope (Gao et al., 2024), a platform for building reliable, high-performance multi-agent applications.

### 5.1 Experiment 1: Performance on GraphInstruct

In this experiment, we evaluate the performance of GAR on the polynomial-time tasks of the GraphInstruct dataset. The results are shown in Table 1. GAR achieves near-perfect results on these tasks, significantly outperforming the other models. On the shortest path and maximum triangle sum tasks, which have higher time complexity, GAR substantially improves LLM performance. Problems that a single LLM struggles to solve are effectively resolved through agent collaboration once decomposed into smaller, node-centric tasks.

As the number of nodes increases, the graph structures become more complex, making graph problems increasingly difficult to solve. To investigate how model performance varies with increasing problem complexity, we conduct experiments on the cycle detection and shortest path problems, gradually increasing the number of nodes from 5 to 100. The results are presented in Figure 4.

As the number of nodes increases, both ChatGPT-4 and GraphWiz exhibit a significant decline in performance. The accuracy of GAR, however, remains stable and almost unaffected by graph size, demonstrating robust scalability. Although the scale of the graph increases, the information processed by each agent does not grow significantly: each agent still handles only its own information and communicates with its neighbors. We observe that GAR occasionally makes errors in specific cases, likely because the number of communication rounds grows with the number of nodes and edges. Even when handling simple node-centric tasks, a single agent can still make mistakes; therefore, as the number of agents and communication rounds increases, the overall likelihood of an error also rises. This can be mitigated by enhancing the capability of individual agents (for example, using stronger LLMs as the underlying reasoning model) or by more finely designed prompts.
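This compounding effect can be illustrated with a back-of-the-envelope calculation (the per-step success probability below is an assumed number for illustration, not a measured property of GAR): if each node-centric step succeeds independently with probability $p$, the chance that an entire run is error-free decays exponentially in agents $\times$ rounds.

```python
# Back-of-the-envelope only: the per-step success probability below is an
# assumed number for illustration, not a measured property of GAR.

def run_success_prob(p_step, n_agents, n_rounds):
    # If every node-centric step succeeds independently with probability
    # p_step, an error-free run needs all agents-times-rounds steps to pass.
    return p_step ** (n_agents * n_rounds)

small = run_success_prob(0.9999, 20, 5)    # small graph, few rounds
large = run_success_prob(0.9999, 100, 50)  # larger graph, many rounds
```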

(cycle, connect, bipartite, and topology are linear-time tasks; shortest and triangle are polynomial-time.)

| Models | cycle | connect | bipartite | topology | shortest | triangle | Average |
|---|---|---|---|---|---|---|---|
| *Closed-source Models* |  |  |  |  |  |  |  |
| GPT-4 (zero-shot) | 38.75 | 17.00 | 65.25 | 5.00 | 9.25 | 5.75 | 23.50 |
| ChatGPT (2-shot) | 51.25 | 43.75 | 70.75 | 4.50 | 3.50 | 17.25 | 31.83 |
| GPT-4 (2-shot) | 52.50 | 62.75 | 74.25 | 25.25 | 18.25 | 31.00 | 44.00 |
| *Fine-tuned Open-source Models* |  |  |  |  |  |  |  |
| Naive SFT (LLaMA 2-7B) | 73.75 | 83.50 | 41.25 | 4.00 | 9.50 | 30.00 | 40.17 |
| Naive SFT (Mistral-7B) | 73.75 | 83.50 | 78.50 | 1.00 | 23.00 | 47.00 | 51.13 |
| GraphWiz (LLaMA 2-7B) | 91.50 | 87.00 | 74.00 | 18.00 | 28.00 | 38.25 | 56.13 |
| GraphWiz (Mistral-7B) | 92.00 | 89.50 | 72.00 | 19.00 | 31.25 | 38.75 | 57.08 |
| GraphWiz-DPO (LLaMA 2-7B) | 89.00 | 82.50 | 84.75 | 46.75 | 24.00 | 52.75 | 63.29 |
| GraphWiz-DPO (Mistral-7B) | 85.50 | 79.50 | 85.50 | 85.25 | 12.50 | 29.00 | 62.88 |
| GraphAgent-Reasoner | 99.50 | 100.00 | 100.00 | 96.50 | 99.75 | 93.25 | 98.00 |

### 5.2 Experiment 2: Performance on Large-Scale Graphs

In this experiment, we evaluate the performance of current LLMs on large-scale graphs. The largest graphs handled in existing graph reasoning work have 100 nodes (Chen et al., 2024a), which is still far from sufficient for real-world graph reasoning scenarios. To evaluate the reasoning performance of existing models on larger graphs, we conduct shortest path experiments on graphs with 100, 200, 500, and 1,000 nodes. Due to the excessively long input text (reaching 16,000 tokens for 1,000 nodes) and the monetary cost, we create only 20 test samples for each graph size. The results are shown in Table 2.

| Graph Size | 100 | 200 | 500 | 1000 |
|---|---|---|---|---|
| GraphWiz (LLaMA 2-7B) | 0/20 | 0/20 | NA | NA |
| GraphWiz (LLaMA 2-7B-DPO) | 0/20 | 0/20 | NA | NA |
| ChatGPT-3.5-turbo-16k | 0/20 | 0/20 | 0/20 | 0/20 |
| ChatGPT-4-32k | 0/20 | 1/20 | 0/20 | 0/20 |
| GraphAgent-Reasoner | 20/20 | 20/20 | 20/20 | 18/20 |

The two GraphWiz models fine-tuned on the LLaMA 2-7B (Touvron et al., 2023) base model are unable to handle graphs with 500 or more nodes because of the context length limitation (4,096 tokens for LLaMA 2). Although ChatGPT-3.5-turbo-16k and ChatGPT-4-32k can manage longer contexts, they output wrong answers on almost all test samples; only ChatGPT-4-32k was correct on a single 200-node sample. In contrast, GAR maintains high accuracy on large-scale graphs, failing only on two 1,000-node test samples, further demonstrating its robust scalability.

### 5.3 Experiment 3: Case Study

In this experiment, we explore the application of two graph reasoning models, GraphWiz and GAR, in real-world graph reasoning scenarios. We present a case study of webpage importance analysis in Figure 5.

Although GraphWiz performs well on the tasks it was fine-tuned on, it exhibits severe overfitting when faced with real-world graph problems, failing to apply the graph reasoning knowledge learned during fine-tuning. Since GraphWiz uses a fixed graph node description, the sentence "The nodes are numbered from 0 to ..." appears across all datasets during its mixed-task instruction tuning. When the actual problem has nodes numbered from 1 to 20, it still assumes the existence of node 0; as a result, both GraphWiz models first state that the graph has 21 nodes and an incorrect number of edges. Furthermore, neither GraphWiz model recognizes that this is a webpage importance ranking problem. Instead, they approach it as the bipartite graph check or topological sort problems they were fine-tuned on, and neither generates an explicit, correct reasoning path. These observations indicate that a significant gap remains between excelling at classic graph reasoning tasks and effectively solving real-world graph reasoning problems. In contrast, GAR correctly identifies that the problem should be solved using knowledge related to PageRank (Yang et al., 2024) and designs an algorithm that adheres to the distributed paradigm (note that the distributed algorithm library contains no PageRank template). GAR then assigns the algorithm to the agent nodes for execution, ultimately obtaining the PageRank value for each node and arriving at the correct conclusion. Through the distributed paradigm, GAR effectively bridges the powerful knowledge learned by LLMs with the solving of real-world graph reasoning problems, enabling it to flexibly handle practical issues in a distributed manner. This case study demonstrates the feasibility of using GAR to solve real-world graph reasoning problems, indicating its substantial practical applicability and offering researchers and practitioners a powerful framework for such tasks.
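For intuition, a node-centric PageRank iteration of the kind GAR derives can be sketched as follows (our illustration; in GAR the agents would exchange these values as natural-language messages under a Master-designed algorithm).

```python
# Our illustration of node-centric PageRank in the distributed paradigm;
# in GAR the agents would exchange these values as natural-language
# messages under a Master-designed algorithm.

def node_centric_pagerank(out_links, damping=0.85, rounds=50):
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}  # Initialization: uniform rank
    for _ in range(rounds):
        # Send: each node splits its rank evenly among its out-links.
        inbox = {v: 0.0 for v in out_links}
        for v, targets in out_links.items():
            for u in targets:
                inbox[u] += rank[v] / len(targets)
        # Update: damped combination of the received rank mass.
        rank = {v: (1 - damping) / n + damping * inbox[v] for v in out_links}
    return rank

links = {1: [2], 2: [1, 3], 3: [1]}  # pages numbered from 1, as in the case study
ranks = node_centric_pagerank(links)
```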

## 6 Conclusion

We first summarize three key issues faced by existing LLMs on graph reasoning tasks: limited graph scale, poor performance, and the lack of explicit reasoning paths. We then reflect on the limitations of a single LLM in addressing graph reasoning problems, such as graph structures that are too complex to memorize and understand, and the overwhelming amount of information in real-world graph reasoning scenarios. To address these challenges, we propose GraphAgent-Reasoner, a framework based on multi-agent collaboration for solving graph reasoning problems. The framework demonstrates superior accuracy and scalability, significantly surpassing existing closed-source and fine-tuned open-source models, and maintains high accuracy on large graphs (up to 1,000 nodes). Our case study on webpage importance analysis further illustrates its capability to handle real-world graph reasoning problems. Future work will focus on designing more accurate and scalable LLM-based multi-agent graph reasoning frameworks, aiming to apply them to larger and more complex real-world reasoning scenarios.

## References

- Brown et al. (2020) Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (eds.), *Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual*, 2020. URL https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html.
- Chai et al. (2023) Ziwei Chai, Tianjie Zhang, Liang Wu, Kaiqiao Han, Xiaohai Hu, Xuanwen Huang, and Yang Yang. GraphLLM: Boosting graph reasoning ability of large language model. *CoRR*, abs/2310.05845, 2023. URL https://doi.org/10.48550/arXiv.2310.05845.
- Chan et al. (2024) Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, and Zhiyuan Liu. ChatEval: Towards better LLM-based evaluators through multi-agent debate. In *The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024*. OpenReview.net, 2024. URL https://openreview.net/forum?id=FQepisCUWu.
- Chen et al. (2024a) Nuo Chen, Yuhan Li, Jianheng Tang, and Jia Li. GraphWiz: An instruction-following language model for graph computational problems. In Ricardo Baeza-Yates and Francesco Bonchi (eds.), *Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024, Barcelona, Spain, August 25-29, 2024*, pp. 353–364. ACM, 2024a. URL https://doi.org/10.1145/3637528.3672010.
- Chen et al. (2024b) Yongchao Chen, Jacob Arkin, Yang Zhang, Nicholas Roy, and Chuchu Fan. Scalable multi-robot collaboration with large language models: Centralized or decentralized systems? In *IEEE International Conference on Robotics and Automation, ICRA 2024, Yokohama, Japan, May 13-17, 2024*, pp. 4311–4317. IEEE, 2024b. URL https://doi.org/10.1109/ICRA57147.2024.10610676.
- Chen et al. (2023) Zhikai Chen, Haitao Mao, Hang Li, Wei Jin, Hongzhi Wen, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Wenqi Fan, Hui Liu, and Jiliang Tang. Exploring the potential of large language models (LLMs) in learning on graphs. *SIGKDD Explor.*, 25(2):42–61, 2023. URL https://doi.org/10.1145/3655103.3655110.
- Creswell et al. (2023) Antonia Creswell, Murray Shanahan, and Irina Higgins. Selection-inference: Exploiting large language models for interpretable logical reasoning. In *The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023*. OpenReview.net, 2023. URL https://openreview.net/forum?id=3Pf3Wg6o-A4.
- Dong et al. (2023) Yihong Dong, Xue Jiang, Zhi Jin, and Ge Li. Self-collaboration code generation via ChatGPT. *CoRR*, abs/2304.07590, 2023. URL https://doi.org/10.48550/arXiv.2304.07590.
- Fatemi et al. (2024) Bahare Fatemi, Jonathan Halcrow, and Bryan Perozzi. Talk like a graph: Encoding graphs for large language models. In *The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024*. OpenReview.net, 2024. URL https://openreview.net/forum?id=IuXR1CCrSi.
- Gao et al. (2024) Dawei Gao, Zitao Li, Weirui Kuang, Xuchen Pan, Daoyuan Chen, Zhijian Ma, Bingchen Qian, Liuyi Yao, Lin Zhu, Chen Cheng, Hongzhu Shi, Yaliang Li, Bolin Ding, and Jingren Zhou. AgentScope: A flexible yet robust multi-agent platform. *CoRR*, abs/2402.14034, 2024. URL https://doi.org/10.48550/arXiv.2402.14034.
- Guo et al. (2023) Jiayan Guo, Lun Du, and Hengyu Liu. GPT4Graph: Can large language models understand graph structured data? An empirical evaluation and benchmarking. *CoRR*, abs/2305.15066, 2023. URL https://doi.org/10.48550/arXiv.2305.15066.
- Guo et al. (2024) Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, and Xiangliang Zhang. Large language model based multi-agents: A survey of progress and challenges. *CoRR*, abs/2402.01680, 2024. URL https://doi.org/10.48550/arXiv.2402.01680.
- Hong et al. (2024) Sirui Hong, Mingchen Zhuge, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, Chenglin Wu, and Jürgen Schmidhuber. MetaGPT: Meta programming for a multi-agent collaborative framework. In *The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024*. OpenReview.net, 2024. URL https://openreview.net/forum?id=VtmBAGCN7o.
- Huang et al. (2024) Jin Huang, Xingjian Zhang, Qiaozhu Mei, and Jiaqi Ma. Can LLMs effectively leverage graph structural information through prompts, and why? *Trans. Mach. Learn. Res.*, 2024. URL https://openreview.net/forum?id=L2jRavXRxs.
- Huang et al. (2023) Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, and Ting Liu. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. *CoRR*, abs/2311.05232, 2023. URL https://doi.org/10.48550/arXiv.2311.05232.
- Jiang & Luo (2022) Weiwei Jiang and Jiayun Luo. Graph neural network for traffic forecasting: A survey. *Expert Syst. Appl.*, 207:117921, 2022. URL https://doi.org/10.1016/j.eswa.2022.117921.
- Liu et al. (2023) Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, and Percy Liang. Lost in the middle: How language models use long contexts. *Transactions of the Association for Computational Linguistics*, 12:157–173, 2023. URL https://api.semanticscholar.org/CorpusID:259360665.
- Madaan et al. (2022) Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, and Graham Neubig. Language models of code are few-shot commonsense learners. In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang (eds.), *Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022*, pp. 1384–1403. Association for Computational Linguistics, 2022. URL https://doi.org/10.18653/v1/2022.emnlp-main.90.
- Mandi et al. (2024) Zhao Mandi, Shreeya Jain, and Shuran Song. RoCo: Dialectic multi-robot collaboration with large language models. In *IEEE International Conference on Robotics and Automation, ICRA 2024, Yokohama, Japan, May 13-17, 2024*, pp. 286–299. IEEE, 2024. URL https://doi.org/10.1109/ICRA57147.2024.10610855.
- Meadows et al. (2023) Jordan Meadows, Marco Valentino, and André Freitas. Generating mathematical derivations with large language models. *CoRR*, abs/2307.09998, 2023. URL https://doi.org/10.48550/arXiv.2307.09998.
- Meng et al. (2024a) Lingkai Meng, Yu Shao, Long Yuan, Longbin Lai, Peng Cheng, Xue Li, Wenyuan Yu, Wenjie Zhang, Xuemin Lin, and Jingren Zhou. A survey of distributed graph algorithms on massive graphs. *CoRR*, abs/2404.06037, 2024a. URL https://doi.org/10.48550/arXiv.2404.06037.
- Meng et al. (2024b) Lingkai Meng, Yu Shao, Long Yuan, Longbin Lai, Peng Cheng, Xue Li, Wenyuan Yu, Wenjie Zhang, Xuemin Lin, and Jingren Zhou. A survey of distributed graph algorithms on massive graphs. *CoRR*, abs/2404.06037, 2024b. URL https://doi.org/10.48550/arXiv.2404.06037.
- Motie & Raahemi (2024) Soroor Motie and Bijan Raahemi. Financial fraud detection using graph neural networks: A systematic review. *Expert Syst. Appl.*, 240:122156, 2024. URL https://doi.org/10.1016/j.eswa.2023.122156.
- OpenAI (2023) OpenAI. GPT-4 technical report. *CoRR*, abs/2303.08774, 2023. URL https://doi.org/10.48550/arXiv.2303.08774.
- Perozzi et al. (2024) Bryan Perozzi, Bahare Fatemi, Dustin Zelle, Anton Tsitsulin, Seyed Mehran Kazemi, Rami Al-Rfou, and Jonathan Halcrow. Let your graph do the talking: Encoding structured data for LLMs. *CoRR*, abs/2402.05862, 2024. URL https://doi.org/10.48550/arXiv.2402.05862.
- Qian et al. (2024) Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, and Maosong Sun. ChatDev: Communicative agents for software development. In Lun-Wei Ku, Andre Martins, and Vivek Srikumar (eds.), *Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024*, pp. 15174–15186. Association for Computational Linguistics, 2024. URL https://doi.org/10.18653/v1/2024.acl-long.810.
- Stokes et al. (2020) Jonathan M. Stokes, Kevin Yang, Kyle Swanson, Wengong Jin, Andres Cubillos-Ruiz, Nina M. Donghia, Craig R. MacNair, Shawn French, Lindsey A. Carfrae, Zohar Bloom-Ackermann, et al. A deep learning approach to antibiotic discovery. *Cell*, 180(4):688–702, 2020.
- Touvron et al. (2023) Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, et al. Llama 2: Open foundation and fine-tuned chat models. *CoRR*, abs/2307.09288, 2023. URL https://doi.org/10.48550/arXiv.2307.09288.
- Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (eds.), *Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA*, pp. 5998–6008, 2017. URL https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
- Wang et al. (2023) Heng Wang, Shangbin Feng, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, and Yulia Tsvetkov. Can language models solve graph problems in natural language? In Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine (eds.), *Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10-16, 2023*, 2023. URL http://papers.nips.cc/paper_files/paper/2023/hash/622afc4edf2824a1b6aaf5afe153fa93-Abstract-Conference.html.
- Wei et al. (2024) Yanbin Wei, Shuai Fu, Weisen Jiang, James T. Kwok, and Yu Zhang. GITA: Graph to visual and textual integration for vision-language graph reasoning. 2024. URL https://api.semanticscholar.org/CorpusID:267413180.
- Xiong et al. (2023) Kai Xiong, Xiao Ding, Yixin Cao, Ting Liu, and Bing Qin. Examining inter-consistency of large language models collaboration: An in-depth analysis via debate. In Houda Bouamor, Juan Pino, and Kalika Bali (eds.), *Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023*, pp. 7572–7590. Association for Computational Linguistics, 2023. URL https://doi.org/10.18653/v1/2023.findings-emnlp.508.
- Yang et al. (2024) Mingji Yang, Hanzhi Wang, Zhewei Wei, Sibo Wang, and Ji-Rong Wen. Efficient algorithms for personalized PageRank computation: A survey. *IEEE Trans. Knowl. Data Eng.*, 36(9):4582–4602, 2024. URL https://doi.org/10.1109/TKDE.2024.3376000.
- Yao et al. (2023) Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R. Narasimhan, and Yuan Cao. ReAct: Synergizing reasoning and acting in language models. In *The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023*. OpenReview.net, 2023. URL https://openreview.net/forum?id=WE_vluYUL-X.
- Zhang et al. (2024) Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, and Chuang Gan. Building cooperative embodied agents modularly with large language models. In
- Zheng et al. (2024) Xin Zheng, Qiming Zhu, Hongyu Lin, Yaojie Lu, Xianpei Han, and Le Sun. Executing natural language-described algorithms with large language models: An investigation. In Nicoletta Calzolari, Min-Yen Kan, Véronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue (eds.), *Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy*, pp. 6752–6837. ELRA and ICCL, 2024. URL https://aclanthology.org/2024.lrec-main.596.

## Appendix A Distributed algorithms under the distributed paradigm

### A.1 Example of distributed algorithms in distributed algorithm library

Shortest Path: See Figure 6.

Connectivity: See Figure 7.
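Figures 6 and 7 are not reproduced in this text. As an informal illustration of the style of node-centric template the library contains, a connectivity check can be written as label propagation: each node keeps the minimum node id it has seen as its component label and repeatedly sends it to its neighbors. This sketch is our own assumption, not the library's actual template:

```python
def distributed_connected(num_nodes, edges, u, v):
    # Label propagation: every node starts with its own id as a component
    # label, sends its label to neighbors each superstep, and keeps the
    # minimum it receives. At the fixpoint, two nodes share a label
    # if and only if they lie in the same connected component.
    adj = {i: set() for i in range(num_nodes)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)  # undirected graph
    label = list(range(num_nodes))
    changed = True
    while changed:
        changed = False
        msgs = {i: [] for i in range(num_nodes)}
        for i in range(num_nodes):
            for j in adj[i]:
                msgs[j].append(label[i])
        for i in range(num_nodes):
            if msgs[i] and min(msgs[i]) < label[i]:
                label[i] = min(msgs[i])
                changed = True
    return label[u] == label[v]
```

The number of supersteps is bounded by the component diameter, which matches the iteration bounds used by the templates in the library.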

### A.2 Example of distributed algorithms designed by the Master LLM

PageRank: See Figure 8.

Hamilton Path: See Figure 9.

Subgraph Matching: See Figure 10.

## Appendix B The GraphInstruct Dataset

The statistics and detailed information of GraphInstruct are shown in Table 3. Hamilton Path and Subgraph Matching are NP-complete problems.

| Problem | Definition | Node Range | Test Size |
|---|---|---|---|
| Cycle Detection | Detect if a given graph $\mathcal{G}$ contains any cycles. | [2, 100] | 400 |
| Connectivity | Assess if two nodes $u$ and $v$ in a given graph $\mathcal{G}$ are connected via a path. | [2, 100] | 400 |
| Bipartite Graph Check | Judge if a given graph $\mathcal{G}$ is bipartite. | [2, 100] | 400 |
| Topological Sort | Find a topological ordering of vertices in a directed acyclic graph $\mathcal{G}$. | [2, 50] | 400 |
| Shortest Path | Compute the shortest path between two specific nodes $u$ and $v$ in a given graph $\mathcal{G}$. | [2, 100] | 400 |
| Maximum Triangle Sum | Find the maximum sum of weights for any connected triplet of vertices in a given graph $\mathcal{G}$. | [2, 25] | 400 |
| Maximum Flow | Calculate the maximum flow from a source node $s$ to a sink node $t$ in a directed graph $\mathcal{G}$. | [2, 50] | 400 |
| Hamilton Path | Determine if a given graph $\mathcal{G}$ has a Hamiltonian path that visits each vertex exactly once. | [2, 50] | 400 |
| Subgraph Matching | Verify if there exists a subgraph in $\mathcal{G}$ that is isomorphic to a given graph $\mathcal{G}^{\prime}$. | [2, 30] | 400 |

Hamilton Path Execution Example.

Problem Description:

Determine whether or not there is a Hamiltonian path in an undirected graph. In an undirected graph, (i,j) means that node i and node j are connected with an undirected edge. Given a graph, you need to output Yes or No, indicating whether there is a Hamiltonian path in the graph. Q: The nodes are numbered from 0 to 5, and the edges are: (0, 3) (0, 1) (0, 2) (0, 4) (1, 5) (1, 4) (1, 2) (1, 3) (2, 4) (2, 5) (3, 5) (3, 4). Is there a Hamiltonian path in this graph?

Execution Process:

```
Distributed_Algorithm_Designer:
### State
1. `visited`: Boolean indicating if the node has been visited.
2. `path_length`: Integer representing the current length of the path.
3. `max_path_length`: Integer tracking the longest path found.

### Initialization
Step 1: Set `visited = False`, `path_length = 0`, and `max_path_length = 1` for all nodes.
Step 2: For the initial node (e.g., Node Id: 0), set `visited = True` and `path_length = 1`.

### Message
1. `path_length`: The current path length from the sender.
2. `max_path_length`: The maximum path length known to the sender.
3. `visited_flag`: Boolean indicating whether the sender has visited the node.

### Send
Step 1: For each neighbor (Node A, Node B), if `visited` is `False`, construct the message as follows:
- Message to Node A: `(Node Id: <node_id>, path_length + 1, max(max_path_length, path_length + 1))`.
- Message to Node B: `(Node Id: <node_id>, path_length + 1, max(max_path_length, path_length + 1))`.
Step 2: Send the constructed messages to both neighbors.

### Example
## Input
Node Id: 0
State:
1. visited: False
2. path_length: 2
3. max_path_length: 3
Neighbor Information:
Connected to: Node A, Node B
## Process
Step 1: Check neighbors Node A and Node B. Since `visited` is `False`, proceed to construct messages.
- For Node A:
  - Calculate new `path_length`: `2 + 1 = 3`
  - Calculate `max_path_length`: `max(3, 2 + 1) = 3`
- For Node B:
  - Same calculations apply as for Node A.
Step 2: Construct messages:
- Message to Node A: `(Node Id: 0, 3, 3)`
- Message to Node B: `(Node Id: 0, 3, 3)`
## Output
Message sent to Node A:
1. path_length: 3
2. max_path_length: 3
3. visited_flag: False
Message sent to Node B:
1. path_length: 3
2. max_path_length: 3
3. visited_flag: False

### Update
Step 1: For each received message, check the `visited_flag`. If it is `False`, proceed to the next steps.
Step 2: Update `visited` to `True` for the current node.
Step 3: Update `path_length` to the maximum of its current value and the received `path_length` from the message, incrementing by 1 if the message indicates a potential continuation of the path.
Step 4: Update `max_path_length` to the maximum of its current value and the received `max_path_length` from the message.
Step 5: If any messages were processed, trigger a re-evaluation of outgoing messages to neighbors, if applicable.

### Example
## Input
Node Id: 1
State:
1. visited: False
2. path_length: 1
3. max_path_length: 2
Received Messages:
Message 1:
1. path_length: 2
2. max_path_length: 3
3. visited_flag: False
Message 2:
1. path_length: 3
2. max_path_length: 4
3. visited_flag: True
## Process
Step 1: Process Message 1. Since `visited_flag` is `False`, proceed:
- Update `visited` to `True`.
- Update `path_length`: `max(1, 2) + 1 = 3`.
- Update `max_path_length`: `max(2, 3) = 3`.
Step 2: Process Message 2. Since `visited_flag` is `True`, ignore this message and do not make any changes.
## Output
State:
1. visited: True
2. path_length: 3
3. max_path_length: 3

### Termination
The algorithm continues until:
- A node reaches a `path_length` equal to the total number of nodes (indicating a Hamiltonian path exists).
- Or until a predefined number of iterations equal to the total number of nodes (`N`) is reached without discovering a Hamiltonian path.

Master: ##################### ITER 3, STEP 1: REASONING ######################
Master: {'thought': "The algorithm is designed to check for a Hamiltonian path. Now let's execute it to see if such a path exists.", 'speak': "The algorithm is ready. Let's execute it to determine if there is a Hamiltonian path.", 'function': [{'name': 'Distributed_Execution', 'arguments': {}}]}
Master: ####################### ITER 3, STEP 2: ACTING #######################
0: State: 1. visited: True 2. path_length: 1 3. max_path_length: 1
2: State: 1. visited: False 2. path_length: 0 3. max_path_length: 1
5: State: 1. visited: False 2. path_length: 0 3. max_path_length: 1
3: State: 1. visited: False 2. path_length: 0 3. max_path_length: 1
4: State: 1. visited: False 2. path_length: 0 3. max_path_length: 1
1: State: 1. visited: False 2. path_length: 0 3. max_path_length: 1
Node 5 Send Message to Node 1: 1. path_length: 1 2. max_path_length: 1 3. visited_flag: False
Node 5 Send Message to Node 2: 1. path_length: 1 2. max_path_length: 1 3. visited_flag: False
Node 5 Send Message to Node 3: 1. path_length: 1 2. max_path_length: 1 3. visited_flag: False
...
All agents' state unchanged, terminating early
...
>>> END
system: 1. Execute function Distributed_Execution [ARGUMENTS]: [STATUS]: SUCCESS [RESULT]:
Node: 0 State: 1. visited: True 2. path_length: 5 3. max_path_length: 1
Node: 1 State: 1. visited: True 2. path_length: 2 3. max_path_length: 1
Node: 2 State: 1. visited: True 2. path_length: 4 3. max_path_length: 3
Node: 3 State: 1. visited: True 2. path_length: 5 3. max_path_length: 3
Node: 4 State: 1. visited: True 2. path_length: 4 3. max_path_length: 3
Node: 5 State: 1. visited: True 2. path_length: 5 3. max_path_length: 3
Master: ##################### ITER 4, STEP 1: REASONING ######################
Master: {'thought': 'Since the maximum path length found is 5, which is less than the total number of nodes (6), there is no Hamiltonian path in the graph.', 'speak': 'No, there is no Hamiltonian path in this graph.', 'function': []}
```
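The trace above follows a fixed Initialization/Send/Update/Termination loop. A minimal, hypothetical sketch of such a superstep engine (our own reconstruction of the control flow, not the framework's actual implementation) makes the loop explicit:

```python
def run_supersteps(states, adj, send, update, max_iters):
    # Generic node-centric loop: each superstep, every node emits messages
    # via send(node, state, neighbors); each node then folds its inbox into
    # a new state via update(node, state, msgs). Stops early at a fixpoint,
    # mirroring the "state unchanged, terminating early" behavior above.
    for _ in range(max_iters):
        inbox = {u: [] for u in states}
        for u in states:
            for v, msg in send(u, states[u], adj[u]):
                inbox[v].append(msg)
        changed = False
        for u in states:
            new_state = update(u, states[u], inbox[u])
            if new_state != states[u]:
                states[u] = new_state
                changed = True
        if not changed:
            break
    return states

# Demo: propagate the maximum node id through a triangle graph.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
states = {u: u for u in adj}
send = lambda u, st, nbrs: [(v, st) for v in nbrs]
update = lambda u, st, msgs: max([st] + msgs)
result = run_supersteps(states, adj, send, update, max_iters=10)
# result == {0: 2, 1: 2, 2: 2}
```

In GAR, the `send` and `update` rules are the natural-language steps the Master designs, and each node's part of the loop is executed by an agent rather than a Python function.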

## Appendix C Execution examples of GraphAgent-Reasoner

Shortest Path Execution Example.

Problem Description:

Find the shortest distance from a source node to other nodes in an undirected graph. In an undirected graph, (i,j,k) means that node i and node j are connected with an undirected edge with weight k. The graph has 8 nodes, and the edges are: (0,7,9) (0,1,7) (0,4,9) (1,7,1) (2,7,7) (2,6,5) (2,5,8) (3,5,9) (3,4,8) (3,6,1) (4,7,7) (4,5,6) (5,7,6). Give the weight of the shortest distance from node 1 to other node.

Execution Process:

```
Master: ##################### ITER 1, STEP 1: REASONING ######################
Master: {'thought': 'I need to find the shortest distances from node 1 to all other nodes in the given undirected graph. First, I will build the graph using the provided edges and then design a distributed algorithm to find the shortest paths.', 'speak': 'Building the graph and designing the distributed algorithm to find the shortest distances from node 1.', 'function': '[{"name": "Build_Graph", "arguments": {"Edges": [[0, 7, 9], [0, 1, 7], [0, 4, 9], [1, 7, 1], [2, 7, 7], [2, 6, 5], [2, 5, 8], [3, 5, 9], [3, 4, 8], [3, 6, 1], [4, 7, 7], [4, 5, 6], [5, 7, 6]], "Nodes_Num": 8, "Directed": false}}]'}
Master: ####################### ITER 1, STEP 2: ACTING #######################
>>> Edges: [[0, 7, 9], [0, 1, 7], [0, 4, 9], [1, 7, 1], [2, 7...
>>> Nodes_Num: 8
>>> Directed: False
>>> END
system: 1. Execute function Build_Graph [ARGUMENTS]: Edges: [[0, 7, 9], [0, 1, 7], [0, 4, 9], [1, 7, 1], [2, 7, 7], [2, 6, 5], [2, 5, 8], [3, 5, 9], [3, 4, 8], [3, 6, 1], [4, 7, 7], [4, 5, 6], [5, 7, 6]] Nodes_Num: 8 Directed: False [STATUS]: SUCCESS
Master: ##################### ITER 2, STEP 1: REASONING ######################
Master: {'thought': 'The graph has been successfully built. Now, I will design a distributed algorithm to find the shortest distances from node 1 to all other nodes.', 'speak': 'Designing a distributed algorithm to find the shortest distances from node 1.', 'function': '[{"name": "Design_Distributed_Algorithm", "arguments": {"Problem_Str": "Find the shortest distance from node 1 to all other nodes in an undirected graph with 8 nodes and given edges."}}]'}
Master: ####################### ITER 2, STEP 2: ACTING #######################
>>> END
Master: ##################### ITER 3, STEP 1: REASONING ######################
Master: {'thought': 'The distributed algorithm has been successfully designed. Now, I will execute the algorithm to find the shortest distances from node 1 to all other nodes.', 'speak': 'Executing the distributed algorithm to calculate the shortest distances from node 1.', 'function': '[{"name": "Distributed_Execution", "arguments": {}}]'}
Master: ####################### ITER 3, STEP 2: ACTING #######################
Initialization:
1: State: 1. distance: 0
0: State: 1. distance: infinity
2: State: 1. distance: infinity
4: State: 1. distance: infinity
5: State: 1. distance: infinity
7: State: 1. distance: infinity
3: State: 1. distance: infinity
6: State: 1. distance: infinity
Node 6 Send Message to Node 2: 1. new_distance: infinity
Node 6 Send Message to Node 3: 1. new_distance: infinity
Node 1 Send Message to Node 0: 1. new_distance: 7
Node 1 Send Message to Node 7: 1. new_distance: 1
Node 3 Send Message to Node 5: 1. new_distance: infinity
...
All agents' state unchanged, terminating early
...
>>> END
system: 1. Execute function Distributed_Execution [ARGUMENTS]: [STATUS]: SUCCESS [RESULT]:
Node: 0 State: 1. distance: 7
Node: 1 State: 1. distance: 0
Node: 2 State: 1. distance: 8
Node: 3 State: 1. distance: 14
Node: 4 State: 1. distance: 8
Node: 5 State: 1. distance: 7
Node: 6 State: 1. distance: 13
Node: 7 State: 1. distance: 1
Master: ##################### ITER 4, STEP 1: REASONING ######################
Master: {'thought': 'The distributed algorithm has been successfully executed, and I have obtained the shortest distances from node 1 to all other nodes.', 'speak': 'The shortest distances from node 1 are as follows: Node 0: 7, Node 1: 0, Node 2: 8, Node 3: 14, Node 4: 8, Node 5: 7, Node 6: 13, Node 7: 1.', 'function': []}
```
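The distributed algorithm executed in this trace behaves like node-centric Bellman-Ford: each superstep, every node offers `distance + weight` to its neighbors and keeps the minimum offer it receives, terminating when no state changes. A minimal simulation of this scheme (our own sketch, not the framework's code) reproduces the distances in the trace:

```python
import math

def distributed_sssp(num_nodes, edges, source):
    # Node-centric Bellman-Ford sketch: each node keeps a `distance` state,
    # sends `distance + weight` to its neighbors every superstep, and keeps
    # the minimum candidate it receives. Stops when no state changes.
    adj = {u: [] for u in range(num_nodes)}
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))  # undirected edge
    dist = [math.inf] * num_nodes
    dist[source] = 0
    changed = True
    while changed:
        changed = False
        # Send phase: every reachable node messages its neighbors.
        msgs = {u: [] for u in range(num_nodes)}
        for u in range(num_nodes):
            if dist[u] < math.inf:
                for v, w in adj[u]:
                    msgs[v].append(dist[u] + w)
        # Update phase: each node keeps the best candidate it received.
        for v in range(num_nodes):
            if msgs[v] and min(msgs[v]) < dist[v]:
                dist[v] = min(msgs[v])
                changed = True
    return dist

edges = [(0, 7, 9), (0, 1, 7), (0, 4, 9), (1, 7, 1), (2, 7, 7), (2, 6, 5),
         (2, 5, 8), (3, 5, 9), (3, 4, 8), (3, 6, 1), (4, 7, 7), (4, 5, 6),
         (5, 7, 6)]
print(distributed_sssp(8, edges, source=1))
# [7, 0, 8, 14, 8, 7, 13, 1]
```

The output matches the per-node states the agents report at the end of the trace.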