Abstract: Large Language Models (LLMs) have gained popularity due to their high performance in natural language processing. This capability is underpinned by their ability to contextualize relations ...
We introduce OfficeBench, one of the first office automation benchmarks for evaluating how well current LLM agents handle tasks in realistic office workflows. OfficeBench requires LLM ...