Multi-Agent Reinforcement Learning for Labor Supply Optimization in Digital Platforms under Self-Set and Platform-Assigned Goals

Authors

  • Anil Parekh, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA
  • Colin L. Ramos, Department of Computer Science, George Mason University, Fairfax, VA, USA
  • Arthur M. Lyons, School of Computing, Clemson University, Clemson, SC, USA
  • Hudson Pichards, School of Information Technology, University of Cincinnati, Cincinnati, OH, USA

Keywords:

multi-agent reinforcement learning, labor supply optimization, digital platforms, goal setting, platform governance, fairness, socio-technical systems

Abstract

Digital labor platforms increasingly mediate the supply of contingent work by matching independent workers with task demand through algorithmic allocation systems. These platforms face a fundamental tension between letting workers set their own labor goals and imposing platform-assigned targets to optimize aggregate output. This paper develops a multi-agent reinforcement learning framework to model and analyze labor supply optimization under both goal structures. We conceptualize each worker as an independent learning agent with a private utility function, while the platform acts as a centralized or decentralized goal-setting mechanism that shapes the reward environment. Through a system-level discussion, we examine how self-set goals foster worker autonomy and intrinsic motivation but may lead to suboptimal system-wide coordination, whereas platform-assigned goals can align individual behavior with global efficiency but risk fairness violations and worker disengagement. The proposed architecture integrates hierarchical reinforcement learning with meta-governance layers that dynamically adjust goal assignment based on worker state, platform congestion, and long-term sustainability metrics. We explore structural trade-offs between exploration and exploitation at both the agent and platform levels, and we draw parallels with other large-scale socio-technical systems such as ride-hailing fleets and crowd work markets. Governance implications are analyzed through the lenses of distributive justice, transparency, and algorithmic accountability. We also discuss deployment challenges, including computational scalability, communication overhead, and the need for robust mechanisms against adversarial worker behavior. Policy recommendations are offered for platform regulators and designers seeking to balance productivity with worker well-being. The paper concludes that hybrid goal structures, in which platforms offer voluntary default targets while preserving opt-in self-set options, provide a promising middle ground that can be optimized using multi-agent reinforcement learning with appropriate fairness constraints.
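
The abstract's setup of workers as independent learners whose reward environment is shaped by a goal can be illustrated with a minimal sketch. It is not the authors' implementation: the quadratic effort cost, the wage, the bonus rule, and every name (`HOURS`, `reward`, `train_worker`) are assumptions made for illustration, and a stateless epsilon-greedy learner stands in for the hierarchical architecture the paper describes. Passing `goal=None` is taken here to mean that no platform target is imposed.

```python
import random

HOURS = [0, 1, 2, 3, 4]  # hypothetical discrete labor-supply choices


def reward(hours, cost, wage=4.0, goal=None, bonus=0.0):
    """Private utility (assumed: linear pay, quadratic effort cost),
    plus an assumed platform bonus when an assigned goal is met."""
    utility = wage * hours - cost * hours ** 2
    if goal is not None and hours >= goal:
        utility += bonus
    return utility


def train_worker(cost, goal=None, bonus=0.0, episodes=200, epsilon=0.1, seed=0):
    """Independent stateless learner: epsilon-greedy with sample-average values."""
    rng = random.Random(seed)
    q = {h: 0.0 for h in HOURS}  # estimated value of each hours choice
    n = {h: 0 for h in HOURS}    # visit counts
    for t in range(episodes):
        if t < len(HOURS):
            h = HOURS[t]                 # try every action once first
        elif rng.random() < epsilon:
            h = rng.choice(HOURS)        # explore
        else:
            h = max(HOURS, key=q.get)    # exploit current estimate
        r = reward(h, cost, goal=goal, bonus=bonus)
        n[h] += 1
        q[h] += (r - q[h]) / n[h]        # incremental sample average
    return max(HOURS, key=q.get)


costs = [1.0, 2.0]  # two workers with different private effort costs
no_target = [train_worker(c) for c in costs]
assigned = [train_worker(c, goal=3, bonus=2.0) for c in costs]
print("no platform target:", no_target, "total", sum(no_target))
print("assigned goal of 3:", assigned, "total", sum(assigned))
```

In this toy setting the assigned goal raises the hours of the low-cost worker, while the high-cost worker ignores a target that is not worth the effort. That loosely mirrors the tension the abstract draws between aggregate efficiency and uneven effects across workers.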

Published

2026-05-30

How to Cite

Multi-Agent Reinforcement Learning for Labor Supply Optimization in Digital Platforms under Self-Set and Platform-Assigned Goals. (2026). Journal of Advanced Artificial Intelligence Research, 5(1). https://www.jaair.org/index.php/home/article/view/72