There's a new AI agent ready to browse the web and fill in forms without the need to touch your mouse
Date:
Sat, 10 May 2025 02:00:00 +0000
Description:
Hugging Face's new Open Computer Agent will surf the web and perform online tasks like a human.
FULL STORY ======================================================================
Hugging Face has debuted an AI tool for navigating the web on your behalf.
The Open Computer Agent uses a real web browser to complete tasks like getting directions or booking tickets.
The agent and its open-source demo can see what's on screen, click buttons, fill out forms, and move step-by-step through tasks like a human.
Hugging Face has introduced its own take on the growing number of semi-independent AI agents that can run online errands for people. The new
and free (if limited) Open Computer Agent is like having a personal assistant living inside your web browser.
Part of the company's ongoing smolagents initiative, the Open Computer Agent can engage with websites and apps like you would, handling an invisible mouse and keyboard to complete requests. The AI can open a browser, type things
into forms, click buttons, and more. Ask it to find directions, and it'll go
to Google Maps, enter the origin and destination, and show you the route like a dutiful digital chauffeur.
You can try it yourself with the live demo. Fair warning, its popularity is causing some delays and errors due to a backlog.
"We're launching Computer Use in smolagents! As vision models become more capable, they become able to power complex agentic workflows. Especially Qwen-VL models, that support built-in grounding, i.e. ability to locate any element in an image by its coordinates, thus to [...]" (pic.twitter.com/mI8MuWZkIS, May 6, 2025)
Agent AI
The Open Computer Agent takes a different approach to an idea that has already produced similar tools like OpenAI's Operator, Browser Use, Proxy 1.0, and Opera's Browser Operator. Like those tools, Hugging Face's AI agent is all about being an active participant instead of a passive source of information.
Like Browser Use, Open Computer Agent is open-source, meaning anyone can see how it works and build on top of it, or at least tweak it for niche use
cases. The agent is the start of something more flexible, not a finished product with a million legal disclaimers. That also means the demo is exactly that, a demonstration, not a polished package. It can get things wrong and require you to jump in for logins and CAPTCHA tests.
Booking tickets, checking store hours, doing searches, looking up directions, and clicking through menus are all things a lot of people would like to be able to do with a single natural language prompt. It's one thing to ask
ChatGPT how to find cheap flights. It's another to watch a tool go to a travel website, scroll through listings, and attempt to click "Book now."
It might be flawed and far from flashy, but Open Computer Agent represents an approach to AI that might become as common as the now ubiquitous AI image generators.
======================================================================
Link to news story:
https://www.techradar.com/computing/artificial-intelligence/theres-a-new-ai-agent-ready-to-browse-the-web-and-fill-in-forms-without-the-need-to-touch-your-mouse
--- Mystic BBS v1.12 A47 (Linux/64)
* Origin: tqwNet Technology News (1337:1/100)