On Thursday, OpenAI introduced Operator, a new AI agent designed to carry out tasks on the web on a user's behalf. The company says the agent can handle a variety of repetitive tasks on its own, such as filling out forms, ordering groceries, and even creating memes.
The main goal behind Operator is to save users time on routine tasks and open up new possibilities for businesses. The tool interacts with websites the way people do: by clicking buttons, scrolling, and typing on pages.
OpenAI explained in a blog post, “Today we’re releasing Operator, an agent that can go to the web to perform tasks for you. It uses its own browser and can interact with web pages by typing, clicking, and scrolling. While this is a research preview, it’s a big step toward building AIs that can do tasks independently. You give it a job, and it gets to work.”
Currently, Operator is available to Pro users in the United States via a dedicated site (operator.chatgpt.com). OpenAI is offering it as a research preview to gather user feedback and improve the tool. The company plans to extend access to Plus, Team, and Enterprise users and, eventually, to integrate Operator into ChatGPT.
Key Features of Operator:
- Operator is powered by a new model called Computer-Using Agent (CUA), which combines GPT-4o's vision capabilities with advanced reasoning through reinforcement learning.
- It can “see” what’s on a screen using screenshots and can act using mouse clicks and keyboard input, which lets it navigate web pages easily.
- No need for special integrations: Operator works directly with the graphical user interfaces (GUIs) found on most websites, like buttons and text fields.
- If Operator encounters difficulties or makes a mistake, it uses reasoning to correct itself. If it can’t resolve the issue, it hands control back to the user to ensure the task gets done.
- Operator works on a collaborative model: users can take control at any point, and Operator asks them to do so for tasks like logins, payments, or CAPTCHAs.
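The features above amount to a loop: look at the screen, decide on an action, act, and hand control back when a step is sensitive or the task stalls. The Python sketch below illustrates that loop in miniature. It is not OpenAI's implementation or API; every name in it (`Action`, `FakeBrowser`, `toy_policy`, `run_agent`, the `SENSITIVE` keyword list) is hypothetical, the "screenshots" are plain strings, and the self-correction step is reduced to a simple step limit.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical keywords marking screens the user should handle personally.
SENSITIVE = ("login", "payment", "captcha")


@dataclass
class Action:
    """One step the agent wants to take (hypothetical, for illustration)."""
    kind: str          # "click", "done", or "handoff"
    detail: str = ""


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: each page is just a descriptive string."""
    pages: List[str]
    index: int = 0
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; here we return the page text.
        return self.pages[self.index]

    def apply(self, action: Action) -> None:
        # Record the action and advance to the next page if there is one.
        self.log.append(f"{action.kind}:{action.detail}")
        if self.index < len(self.pages) - 1:
            self.index += 1


def toy_policy(task: str, screen: str) -> Action:
    """Toy decision rule standing in for the model's reasoning."""
    if any(word in screen for word in SENSITIVE):
        return Action("handoff", screen)
    if screen == "confirmation":
        return Action("done")
    return Action("click", screen)


def run_agent(
    task: str,
    browser: FakeBrowser,
    policy: Callable[[str, str], Action],
    max_steps: int = 10,
) -> str:
    """Observe, decide, act; return "done" or "handoff"."""
    for _ in range(max_steps):
        screen = browser.screenshot()
        action = policy(task, screen)
        if action.kind in ("done", "handoff"):
            return action.kind
        browser.apply(action)
    # Out of steps without finishing: give control back to the user.
    return "handoff"


if __name__ == "__main__":
    shop = FakeBrowser(["search results", "cart", "payment form"])
    print(run_agent("order groceries", shop, toy_policy))  # prints "handoff"
    print(shop.log)
```

In this toy run the agent clicks through two pages and then stops at the payment form, which mirrors the collaborative model described above: routine steps are automated, sensitive ones go back to the user.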
While CUA (the model behind Operator) is still in its early stages and has limitations, OpenAI reports strong results on browser benchmarks such as WebArena and WebVoyager.
How to Use Operator:
Using Operator is simple. You just describe the task you want completed, and the AI does the rest. If the task requires login information or payment details, Operator will ask you to take over. You can also step in anytime during the process if needed.
For now, Operator remains a research preview, and OpenAI plans to keep improving it based on user feedback. It marks a new direction for AI tools, one that promises to make everyday tasks easier and more efficient.
Stay tuned as OpenAI refines this tool and looks to expand access in the future.