Plan-Augmented Multi-Agent LLM Systems for Enterprise Workflow Automation: A Thinking Fast and Slow Decision Framework

Francesco M. Ferguson; Brendan Young

Authors

Francesco M. Ferguson, Department of Computer Science, George Mason University, Fairfax, VA, USA
Brendan Young, Department of Computer Science, University of Houston, Houston, TX, USA

Keywords:

multi-agent systems, large language models, workflow automation, thinking fast and slow, decision framework, enterprise architecture, governance

Abstract

The integration of large language models (LLMs) into enterprise workflows has introduced new capabilities in natural language understanding, generation, and reasoning. However, deploying single-agent LLMs for complex, multi-step, coordination-intensive business processes reveals significant limitations in reliability, consistency, and adherence to organizational constraints. This paper proposes a plan-augmented multi-agent LLM architecture built on a dual-process decision framework inspired by Kahneman’s Thinking, Fast and Slow. In this system, a set of specialized LLM agents operates under the supervision of a planning module that distinguishes between rapid reactive decisions and deliberative analytical reasoning. The architecture supports enterprise workflow automation by dynamically assigning tasks to fast or slow reasoning pathways based on task complexity, risk level, and temporal constraints. We discuss structural trade-offs, governance mechanisms, robustness considerations, and fairness implications. Through cross-domain comparisons and illustrative case studies, we show how the proposed framework can improve operational efficiency while maintaining accountability. The paper further examines deployment challenges, sustainability metrics, and policy implications for large-scale socio-technical infrastructures. Our analysis suggests that plan-augmented multi-agent systems offer a promising path toward reliable and scalable enterprise automation, provided that careful attention is given to interpretability, bias mitigation, and human oversight.
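The routing idea described in the abstract, assigning each task to a fast or slow reasoning pathway from its complexity, risk level, and time budget, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Task` fields, the 0-to-1 scoring scales, the threshold constants, and the `route` function are all hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Pathway(Enum):
    FAST = "fast"  # rapid, reactive handling
    SLOW = "slow"  # deliberative, analytical handling


@dataclass(frozen=True)
class Task:
    name: str
    complexity: float        # hypothetical scale: 0.0 (trivial) to 1.0 (highly complex)
    risk: float              # hypothetical scale: 0.0 (harmless) to 1.0 (high stakes)
    deadline_seconds: float  # time available before a decision is needed


# Hypothetical thresholds; the abstract does not specify any values.
COMPLEXITY_THRESHOLD = 0.6
RISK_THRESHOLD = 0.5
MIN_SECONDS_FOR_DELIBERATION = 30.0


def route(task: Task) -> Pathway:
    """Choose a reasoning pathway from complexity, risk, and time budget."""
    # A task calls for deliberation if it is complex or risky.
    needs_deliberation = (
        task.complexity >= COMPLEXITY_THRESHOLD or task.risk >= RISK_THRESHOLD
    )
    # Deliberation is only feasible if the deadline leaves enough time.
    has_time = task.deadline_seconds >= MIN_SECONDS_FOR_DELIBERATION
    if needs_deliberation and has_time:
        return Pathway.SLOW
    return Pathway.FAST


if __name__ == "__main__":
    for task in (
        Task("reset password", complexity=0.1, risk=0.1, deadline_seconds=5.0),
        Task("approve vendor contract", complexity=0.8, risk=0.9, deadline_seconds=3600.0),
    ):
        print(task.name, "->", route(task).value)
```

In this sketch a complex or risky task with a very tight deadline still falls through to the fast pathway. A deployment that takes the abstract's emphasis on human oversight seriously would probably escalate such cases to a person instead; that behaviour is left out here to keep the example small.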

References

1. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.

2. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., ... & Zhang, Y. (2023). Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712.

3. Park, J. S., O'Brien, J. C., Lai, C. J., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (pp. 1-22).

4. Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.

5. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., ... & Lowe, R. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730-27744.

6. Wu, Q., Bansal, G., Zhang, J., Wu, Y., Zhang, S., Zhu, E., ... & Wang, W. Y. (2023). AutoGen: Enabling next-gen LLM applications via multi-agent conversation. arXiv preprint arXiv:2308.08155.

7. Madaan, A., Tandon, N., Gupta, P., Hallinan, S., Gao, L., Wiegreffe, S., ... & Clark, P. (2023). Self-refine: Iterative refinement with self-feedback. arXiv preprint arXiv:2303.17651.

8. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023). ReAct: Synergizing reasoning and acting in language models. Proceedings of the 11th International Conference on Learning Representations.

9. Georgievski, I., & Aiello, M. (2015). HTN planning: Overview, comparison, and beyond. Artificial Intelligence, 222, 124-156.

10. Dohan, D., Xu, W., Lewkowycz, A., Austin, J., Bieber, D., Lozhkov, A., ... & Tenenbaum, J. B. (2022). Language model cascades. arXiv preprint arXiv:2207.10342.

11. Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., ... & Chi, E. (2023). Least-to-most prompting enables complex reasoning in large language models. Proceedings of the 11th International Conference on Learning Representations.

12. Shinn, M., Cassano, F., Gopinath, A., Narasimhan, K., & Yao, S. (2023). Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems, 36.

13. Dou, Z., Cui, D., Yan, J., Wang, W., Chen, B., Wang, H., ... & Zhang, S. (2025). DSADF: Thinking fast and slow for decision making. arXiv preprint arXiv:2505.08189.

14. Gao, H., Zeng, W., Zhang, J., & Liang, Y. (2025, December). A large model API response quality prediction model based on least squares vector machine and SHAP interpretability analysis. In 2025 5th International Symposium on Artificial Intelligence and Big Data (AIBDF) (pp. 438-442). IEEE.

15. Shih, K., Deng, Z., Chen, X., Zhang, Y., & Zhang, L. (2025, May). DST-GFN: A Dual-Stage Transformer Network with Gated Fusion for Pairwise User Preference Prediction in Dialogue Systems. In 2025 8th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE) (pp. 715-719). IEEE.

16. Bansal, G., Chamola, V., & Sikdar, B. (2024). Metacognition in AI systems: A survey. ACM Computing Surveys, 56(4), 1-38.

17. Raji, I. D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., ... & Barnes, P. (2020). Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 33-44.

18. Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narasimhan, K., ... & Zhou, D. (2023). Self-consistency improves chain of thought reasoning in language models. Proceedings of the 11th International Conference on Learning Representations.

19. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Computing Surveys, 54(6), 1-35.

20. Patterson, D., Gonzalez, J., Le, Q., Liang, C., Munguia, L. M., Rothchild, D., ... & Dean, J. (2021). Carbon emissions and large neural network training. arXiv preprint arXiv:2104.10350.

21. Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650.

22. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.

23. Schwarting, W., Alonso-Mora, J., & Rus, D. (2018). Planning and decision-making for autonomous vehicles. Annual Review of Control, Robotics, and Autonomous Systems, 1, 187-210.

24. European Commission. (2021). Proposal for a regulation laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). COM(2021) 206 final.
