Holy Mary mother of everything! ChatGPT AI agent is a game changer!

By Stephen, 4 August, 2025

Forum
Assistive Technology

OK, so I put this under Assistive Technology for reasons that you will learn.

I have recently been playing around with the new ChatGPT AI agent and, my fellow blind folks, this is something else entirely! I am having it use the back end of Wix to design and format an entire website that I’ve been wanting to build. It’s created the text, it’s created the images, it’s placed things in the proper location, it is uploading audio files to the website for me, it’s changing all of the fonts to make it look appealing to the eyes, it’s changing layouts to what I’ve prompted it to do. This is nuts. This pretty much makes every single web design platform or website accessible, if it can take the actions on your behalf.

Comments

By JC on Tuesday, August 5, 2025 - 00:36

Hi, is it free or not? And can it handle something like emailing a friend?

By xenacat3 on Tuesday, August 5, 2025 - 00:46

I’ve tried creating websites before with minimal success, so I would really love to know how you get ChatGPT to communicate with Wix.

By Stephen on Tuesday, August 5, 2025 - 00:53

No, you get it with either the Plus or the Pro plan of ChatGPT.

By Stephen on Tuesday, August 5, 2025 - 00:58

I wish I had time to do podcasts because this one would’ve been a great one!

So what happens is ChatGPT opens up its own internal computer, and you tell it what you want it to do. You do have to provide login information so ChatGPT can log in for you and do what you’ve asked it to do. For example, I’ve been giving it quite extensive prompts regarding color schemes for the website, search engine optimization, writing and inserting a terms of service, and adding files to the website, and my goodness, it has been 6 hrs of omg lol. It will also tell you what it is doing and how it is doing it as it navigates webpages, clicks on buttons, and makes selections based on your instructions. There’s way too much to put in here…

By Stephen on Tuesday, August 5, 2025 - 00:59

It will also go through and check for any design flaws and give you feedback. Then you can tell it whether you want it to make the improvements it suggested, or whether you’ve changed your mind and want to do something else completely.

By Stephen on Tuesday, August 5, 2025 - 01:07

The coolest thing is watching the system problem-solve. It does move slowly, but it only came out about a week ago, so I don’t expect it to be as fast as the latest ChatGPT model. Still, this thing has reasoning, problem-solving, navigation abilities, etc.

By Winter Roses on Tuesday, August 5, 2025 - 01:23

I had access to the ChatGPT Agent before my plan expired, and honestly, at first it was difficult to even log into the portal. I remember trying to use it literally the second day it launched, and I needed extra help to get into the interface. I’m hearing a lot of mixed reviews—some users say it’s accessible, others say it isn’t. When I started using it, I went in with the expectation that I’d be able to book flights, pay the bills, schedule meetings, do a bit of shopping, maybe even order food. But in reality, it wasn’t able to handle most of those things. To be fair, I haven’t tested it extensively, especially since my plan has expired now. I keep meaning to renew it and mess around with it more once I have the free time. From what I experienced, it mostly felt like an upgraded version of the memory feature. That seems to be the main use case right now—handling files like documents, scheduling reminders, creating calendar events. But I’m wondering: can it do anything with media, like videos?
One important thing to remember is that when it comes to ChatGPT Agents, a lot depends on which websites are integrated or onboard. For example, I thought I’d be able to order groceries from Amazon through the agent, but that didn’t happen. Amazon, being the giant that it is, doesn’t really want third-party systems getting involved in their operations. So because of that limitation, those kinds of shopping features aren’t available—at least not with the bigger names like Amazon, Walmart, or Target. That said, I think there’s still potential here. If smaller independent retailers and grocery stores are smart, they could partner or integrate with systems like this and carve out a space. That might be the route forward. I’ve seen people doing some impressive tasks with the Agent—writing code, designing flyers, formatting documents, creating spreadsheets, prepping PowerPoint presentations. I have no idea how they’re doing all that, but it looks like there’s a lot of potential for power users. Still, for me personally, I found it a little underwhelming compared to what I was hoping for. It’s also a bit tricky to get the hang of.

I would love to see a full walkthrough or demo of what this thing can actually do. I’m surprised they haven’t done one yet, but maybe it’s because it’s still early—only been out for a couple of weeks, if I remember right. Ultimately, I think it needs more time to mature and really live up to the hype. But if someone wants to try it now just to get a sense of where it’s starting and compare that to where it’s heading later on, $20 isn’t a bad price—if you can afford it.

By Stephen on Tuesday, August 5, 2025 - 01:38

Honestly, it just uses its own internal computer and accesses the website like you would if you were accessing it yourself. I mean, it works with files, audio, and video, but the coolest thing is that it navigates the web and takes actions on your behalf. The actions it takes are obviously based on your instructions. Using it is fully accessible. I use NVDA. To be fair, though, I have not tried it with JAWS. It’s as accessible as a regular ChatGPT window. I’ve posted it here because I’ve just been using it to design an entire website from scratch. You do have to ensure you’re prompting properly and that your prompts are as detailed as possible.

By Stephen on Tuesday, August 5, 2025 - 01:39

If for some reason it can’t access the website, it problem-solves, works around the issue, and finds a way to access the site.

By Stephen on Tuesday, August 5, 2025 - 02:21

ChatGPT AI agent can assist with a wide range of tasks. It can generate text, answer questions, write and rewrite copy, translate languages, summarize articles, brainstorm ideas, produce outlines or scripts, and even help with coding tasks. It can interact with websites and applications on your behalf, reading content aloud or filling in forms, making technology more accessible. Because it's conversational, you can refine the results by asking follow‑up questions until you get what you need. This makes ChatGPT an extremely versatile tool for productivity and accessibility.

By Gokul on Tuesday, August 5, 2025 - 02:43

I've been playing around with it for a little while, and I've made it add stuff to my Amazon shopping cart and book me movie tickets on BookMyShow, so... I haven't tried booking a flight because I haven't had to yet.
And @JC, certainly you can make it email your friend, so long as you are ready to give it your email credentials.
Generally speaking, it's truly a game-changer as far as web accessibility is concerned... Should try it with, say, WordPress or YouTube...

By João Santos on Tuesday, August 5, 2025 - 03:12

One of the problems with current large language models is that they get relatively good at tackling the so-called happy path in coding problems, which is when all the interactions are easily predictable, but sometimes fail to tackle even trivial edge cases, potentially resulting in security problems. Furthermore, they are prone to hallucinate, and this problem has actually been getting worse lately, with consequences like referencing misspelled dependencies that don't really exist. That opens a window of opportunity for bad actors to register them and perform supply chain attacks, similar to the typosquatting that targets humans. Beyond this there are also code quality problems: the AI tends to generate extremely verbose solutions to problems that experienced programmers can solve a lot more efficiently, which makes the generated code unnecessarily hard to reason about.
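
As a rough illustration of the hallucinated-dependency risk described above, here is a minimal Python sketch that compares a dependency name suggested by a model against a list of packages a project has already vetted. The allowlist contents, the function name `check_dependency`, and the 0.8 similarity cutoff are all assumptions made up for this example, not part of any real tool.

```python
import difflib

# Hypothetical allowlist of packages the project has already vetted.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "flask"]


def check_dependency(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Classify a dependency name as 'known', 'suspicious', or 'unknown'.

    'suspicious' means the name is not on the allowlist but is very close
    to one that is, which is the pattern typosquatting relies on.
    """
    if name in known:
        return "known", None
    # get_close_matches returns allowlist entries whose similarity ratio
    # to `name` is at least `cutoff`, best match first.
    close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    if close:
        return "suspicious", close[0]
    return "unknown", None


if __name__ == "__main__":
    for candidate in ["requests", "reqeusts", "leftpadder"]:
        status, nearest = check_dependency(candidate)
        print(candidate, status, nearest)
```

A check like this only flags near-misses of names you already trust. It says nothing about whether an entirely new package name is safe, so anything reported as unknown would still need a manual look.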

All of the above combined results in a huge pile of bloated code with lots of technical debt and skyrocketing costs from token usage. Since the time and memory complexity of context windows increases quadratically with their size, it's not even that hard for a medium-sized codebase to hit resource limits, so the whole thing is extremely unsustainable. While I think it's perfectly possible to build hybrid models that take as much advantage of existing algorithmic solutions as possible to significantly improve their efficiency, and I have my own theories about them that I will start experimenting with soon, I think that doing so will require a huge paradigm shift that may not happen before the current AI hype bubble pops.
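
To make the quadratic growth concrete, here is a small Python sketch. It assumes naive self-attention, which builds one score for every pair of tokens in the context, and it counts a single attention matrix only. The function names and the 2 bytes per entry are assumptions for illustration, and optimized implementations may avoid storing the full matrix.

```python
def attention_matrix_entries(context_tokens):
    """Number of pairwise scores in one naive attention matrix."""
    # Every token is scored against every token, giving n * n entries.
    return context_tokens * context_tokens


def attention_matrix_bytes(context_tokens, bytes_per_entry=2):
    """Memory for one such matrix, assuming 2 bytes per score."""
    return attention_matrix_entries(context_tokens) * bytes_per_entry


if __name__ == "__main__":
    for n in (4_000, 8_000, 16_000):
        print(n, attention_matrix_entries(n), attention_matrix_bytes(n))
```

Doubling the context from 4,000 to 8,000 tokens multiplies the number of entries by four rather than two, which is why a growing codebase can run into limits sooner than a linear estimate would suggest.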

One potential time bomb for this technology is model collapse: training new AI models on the output of other AI is known to degrade them, through a not yet understood increased tendency to hallucinate. This is becoming a problem given the proliferation of AI-generated content on the Internet, and might already be adversely affecting the latest frontier models significantly. This content is often called AI slop, mostly because it's easy to generate without providing much in terms of actual value.

By kool_turk on Tuesday, August 5, 2025 - 05:23

If using this requires me to share my login credentials with ChatGPT, then that’s a definite no from me. Even if there’s a way to do it manually, I’d want to know if the process is accessible.

Until those concerns are addressed, I think I’ll be steering clear of it.

Also, once you're logged in to a service through ChatGPT, how do you end that session? Is it as simple as deleting the conversation?

By Kushal Solanki on Tuesday, August 5, 2025 - 08:34

Hey guys, has anyone used ChatGPT agent on the iPhone, and if so, what has your experience been?
Is it easier to use on the computer, or can you use it on the iPhone too?

By Winter Roses on Tuesday, August 5, 2025 - 09:48

Hi guys, so I wanted to let you know that I finally activated a ChatGPT plan to try out the Agent. What I attempted to do was to log into this website to post a comment—similar to the example shown above. Unfortunately, when using the iPhone, that doesn’t seem to be fully possible. I turned on Screen Recognition and was able to confirm that the username and password fields were on the screen, but they weren’t accessible with VoiceOver. If this is a bug or an accessibility oversight, it needs to be reported to OpenAI so it can be addressed as soon as possible.

On the bright side, ChatGPT did successfully manage to navigate to the website and locate the login page, which worked well. I was also able to type my username and password directly into the chat, and the agent was able to enter those details and log me in. That said, from what I’m seeing so far, you have to be extremely specific with your instructions. In some cases, you need to know exactly what you’re looking for in order to get the results you want. For example, I wanted to post a comment on this specific post, but I couldn’t remember the exact name or title. That ended up confusing the model a bit, so maybe websites with a clearer structure or layout might work better. I realized that ChatGPT doesn’t automatically recognize that a post is about itself, which makes sense, but it means you’ll need to be extra clear when giving instructions on sites with dynamic content.
Let me see if I can explain this a little clearer. So imagine you’re on a virtual supermarket website. You decide that for breakfast today, you want a box of Cocoa Puffs, a bottle of Pepsi, and a loaf of bread. Now, on these virtual supermarket shelves, ChatGPT is scanning through categories like “Cereals,” “Beverages,” and “Bakery.” If the Pepsi is sitting in the “Refrigerated Drinks” section or the bread is in “Bakery,” then ChatGPT will likely find those items pretty quickly because it knows where to look and what those categories typically mean. But let’s say there’s another person who owns a completely different website—like Mary, who runs a baking site. She sells chocolate chip cookies. Now you say, “ChatGPT, order me a box of chocolate chip cookies and a sugar-free glazed blackberry doughnut.” If the doughnut section is clearly labeled or easy to access, the model might find it right away. But if Mary filed her cookies under something more abstract like “Mary's Confectionaries” or “Sweet Bites,” ChatGPT might still be able to get there—it’ll just take a bit more time and work. That’s the part I’m trying to highlight. For the model to be most effective, you need to be specific. The reason I couldn’t post my comment on the site was literally because I didn’t remember the title of the post, and I couldn’t recall which section it was under. If you don’t have a good mental layout of the website, it can be much harder for the model to perform the task, even if it gets you in the right general area.

It was able to locate the username and password fields easily because those are common across websites and clearly labeled. ChatGPT understands those elements well—it knows, “This is the login box, and this is where I need to input credentials.” But if something is tucked away under an unusual label or section that isn’t visible on the screen directly, I don’t know how many places the model actually searches before it gives up or times out. Unfortunately, I didn’t get to explore that part much because, like a lot of people are discovering, there’s a time limit. Once you hit it, you’re no longer able to interact with the agent for the rest of the day, and I had already used up my window.
Right now, many of the more advanced features are limited. It looks like you only get 15 minutes per day—or maybe per session—with the browser, though I’m not entirely sure yet. I assumed I’d be able to talk to the agent hands-free in voice mode and have it carry out the tasks for me, but that doesn’t seem to be possible. I noticed that when the task is completed, my phone vibrates and I get a notification—which is a nice touch. It’s definitely a bit slow, but that’s expected given that we’re still in the early stages. If someone were going to do a full review of the product, I imagine they’d need to edit the pause time or task to fit while the model processes everything in the background. Anyway, I couldn’t get it to post the comment, but this is only my first time using it. I’m assuming things will improve in the future as they continue building it out.

By Tara on Tuesday, August 5, 2025 - 13:44

Hi,
So when using this on Windows, both through my browser and the desktop app, the virtual browser is totally inaccessible. That's the browser you use to enter your username and password, or to take over from the agent in general if you need to click something the agent won't, like a Captcha. I've tried JAWS and NVDA, NVDA object navigation and OCR, and the JAWS cursors and OCR, but nothing works. And it won't go to amazon.co.uk or amazon.com at all, even if I tell it to go to the page without completing any task. There's a checkbox on audiogames.net that it won't click because it's a Captcha, and if I take over from the agent, I can't access the checkbox no matter what JAWS or NVDA commands I try. I mean, I could give it my username and password for something, but I'd have to keep changing the password just in case it stores it and my security is compromised. I'd only give it my credentials to log into something if something was really inaccessible, and I'd be changing my password after logging out, that's for sure.

By Stephen on Tuesday, August 5, 2025 - 14:04

You only have it for 15 minutes? That’s strange, because I was using it for 6 hours yesterday editing my website and still have time left. I’m on the $20-a-month plan, but I might go to Pro now. I’m loving this thing because it can work on my business while I work my regular job.

By Travis Roth on Tuesday, August 5, 2025 - 16:38

I was wondering too about how accessible interactions are, as it is using a VM. So it seems Stephen is doing tasks that do not require him to interact with the virtual browser?...
As for usage, according to a chatgpt.com page, the Pro plan allows you 400 messages a month. So I guess try to pack a lot into those messages?
It is an interesting project for sure and I will keep monitoring it, but it needs some more advances before it can help me with my job.
By the way, Claude has a similar agent, but it’s not been in the news lately.

By Winter Roses on Tuesday, August 5, 2025 - 16:42

When I was using the ChatGPT agent this morning, it disconnected, and I couldn’t get it to reconnect again. I’m pretty sure I saw a time and date saying when it would be working again, though I could be totally wrong about that. But the second I saw the message, I instantly assumed the product was limited in some way, kind of like how the advanced voice feature is restricted. A lot of the more advanced features with ChatGPT seem to come with limitations, which makes sense. With the agent especially, it’s pretty obvious why: many members are trying to use it, and the system needs to keep up and handle all those tasks efficiently. I don’t even think anyone using the free version is going to get access to the agent. If they do, it’s gonna be extremely limited. So if I want to explore more of what it can do, I’m gonna need to play around with it some more when I have the time.

Now, regarding Amazon and shopping—based on what I’ve been reading online, Amazon is not one of the supported shopping websites you can use through the ChatGPT agent. And again, this isn’t that surprising. Amazon has worked hard to become one of the biggest names in online shopping, and the last thing they want is some third-party AI stepping in and acting as a middleman. They’re not going to give that kind of access freely. My thinking is this: smaller businesses, if they’re smart, will absolutely jump on this opportunity. If they can integrate with the agent, lower their prices, maybe offer free delivery or other perks to shoppers—then I could see customers choosing to shop with them instead. This could be a major advantage for smaller vendors looking to grow. As for whether there’s an official list of supported shopping partners, I’m not sure we have this feature as yet, but it certainly seems like the next logical step in the chain of evolution based on current trends.
I haven’t played around with the agent enough to speak definitively on everything. But I do think it depends on what you already know. ChatGPT can browse the internet and get relevant info, sure—but the more you understand about the site you’re trying to use, how it works, and what to look for, the more effective it seems to be. Some tasks are always gonna be easier because they’re direct and straightforward. Others, though, are going to be more obscure or ambiguous—and that’s probably where a lot of the confusion and inconsistency comes in.
I didn’t know that Claude had an agent-style product of its own. I might have to subscribe and check it out. I’ve never subscribed to any of Claude’s plans, and that’s mostly because I know the context window, meaning how many messages you can send in a chat, is limited. Even on the paid plan, I’ve heard it fills up quickly. And instead of starting a new thread when you hit the limit, you only find out when your message doesn’t go through. Another thing I don’t like about Claude is that if I’m typing a message and I accidentally close the app or something interrupts me, the entire message disappears. It’s not like ChatGPT, which keeps the text in the box, so when you reopen it, your content is still there. That’s one of those little details that makes a big difference.

Don’t get me wrong—Claude gives grounded, logical responses. It's more human than ChatGPT in certain ways. But because of those limitations, I’ve been hesitant to give it a serious try. I’m going to take a closer look and do some research myself. My biggest issue with Claude has always been the censorship and restrictions—it’s more limited than ChatGPT in that sense. They're trying to be that “ethical, moral” AI, but in doing that, they might be missing the mark a bit. Not trying to knock them too hard—they do have a solid product. It just needs a bit of refinement… or loosening up.

By Gokul on Wednesday, August 6, 2025 - 02:54

First off, I haven't noticed any time limits per session as such. The limitation, however, is that Plus users get 40 agent chats per month. That's like 40 tasks. Also, the virtual browser, as some of you mentioned, is inaccessible. I guess for it to work, the screen reader providers will have to work with OpenAI to implement a solution. That's why, as of now, we have to provide the login credentials to the agent. What Claude has is computer use, which is architecturally different from the GPT agent. Agents generally create a VM in the cloud, whereas what Claude's computer use does is take over your computer, which means it can also access your files etc. on the computer.