Holy Mary mother of everything! Chat GPT AI agent is a game changer!

is it free?

Hi, is it free or not? And can it work with, say, emailing a friend?

i’d love a podcast on this

i’ve tried creating websites before with minimal success, so I would really love to know how you get ChatGPT to communicate with wix.

@ JC

No, you get it with either the Plus or Pro plan of ChatGPT.

@ xenacat3

I wish I had time to do podcasts because this one would’ve been a great one!

So what happens is ChatGPT opens up its own internal computer, and you tell it what you want it to do. You do have to provide login information so ChatGPT can log in for you and do what you’ve asked it to do. For example, I’ve been giving it quite extensive prompts regarding color schemes for the website, search engine optimization, writing and inserting a terms of service, and adding files to the website, and my goodness, it has been 6 hrs of omg lol. It will also tell you what it is doing and how it is doing it as it navigates webpages, clicks on buttons, and makes selections based on your instructions. There’s way too much to put in here…

@ @ xenacat3 Bookmark

It will also go through and see if there’s any sort of design flaws, it will give you feedback and then you can tell it whether or not you want it to improve what it had suggested or if you’ve changed your mind and completely wanted to do something else.

@ xenacat3

The coolest thing is watching the system problem solve. It does move slower, but this only came out about a week ago so I don’t expect it to be as fast as the latest ChatGPT model but this thing has reasoning, problem-solving, navigation abilities etc etc.

My initial thoughts

I had access to the ChatGPT Agent before my plan expired, and honestly, at first it was difficult to even log into the portal. I remember trying to use it literally the second day it launched, and I needed extra help to get into the interface. I’m hearing a lot of mixed reviews—some users say it’s accessible, others say it isn’t. When I started using it, I went in with the expectation that I’d be able to book flights, pay the bills, schedule meetings, do a bit of shopping, maybe even order food. But in reality, it wasn’t able to handle most of those things. To be fair, I haven’t tested it extensively, especially since my plan has expired now. I keep meaning to renew it and mess around with it more once I have the free time. From what I experienced, it mostly felt like an upgraded version of the memory feature. That seems to be the main use case right now—handling files like documents, scheduling reminders, creating calendar events. But I’m wondering: can it do anything with media, like videos?
One important thing to remember is that when it comes to ChatGPT Agents, a lot depends on which websites are integrated or onboard. For example, I thought I’d be able to order groceries from Amazon through the agent, but that didn’t happen. Amazon, being the giant that it is, doesn’t really want third-party systems getting involved in their operations. So because of that limitation, those kinds of shopping features aren’t available—at least not with the bigger names like Amazon, Walmart, or Target. That said, I think there’s still potential here. If smaller independent retailers and grocery stores are smart, they could partner or integrate with systems like this and carve out a space. That might be the route forward. I’ve seen people doing some impressive tasks with the Agent—writing code, designing flyers, formatting documents, creating spreadsheets, prepping PowerPoint presentations. I have no idea how they’re doing all that, but it looks like there’s a lot of potential for power users. Still, for me personally, I found it a little underwhelming compared to what I was hoping for. It’s also a bit tricky to get the hang of.

I would love to see a full walkthrough or demo of what this thing can actually do. I’m surprised they haven’t done one yet, but maybe it’s because it’s still early—only been out for a couple of weeks, if I remember right. Ultimately, I think it needs more time to mature and really live up to the hype. But if someone wants to try it now just to get a sense of where it’s starting and compare that to where it’s heading later on, $20 isn’t a bad price—if you can afford it.

@ Winter Roses

Honestly, it just uses its own internal computer and accesses the website like you would if you were accessing it yourself. I mean, it works with files, audio, and video, but the coolest thing is it navigating the web and taking actions on your behalf. The actions it takes are obviously based on your instructions. Using it is fully accessible. I use NVDA. To be fair though, I have not tried it with JAWS. It’s as accessible as a regular ChatGPT window. I’ve posted it here because I’ve just been using it to completely design an entire website from scratch. You do have to ensure you’re prompting properly and your prompts are as detailed as possible.

@ Winter Roses

If for some reason, it can’t access the website, it does problem-solving and it works around that and it finds a solution to access the site.

I’m recording a little demo for you guys now.

I’m going to do my best at recording a little demo. I’m not the most entertaining so bear with me. lol.

What ChatGPT AI agent can do for you

ChatGPT AI agent can assist with a wide range of tasks. It can generate text, answer questions, write and rewrite copy, translate languages, summarize articles, brainstorm ideas, produce outlines or scripts, and even help with coding tasks. It can interact with websites and applications on your behalf, reading content aloud or filling in forms, making technology more accessible. Because it's conversational, you can refine the results by asking follow‑up questions until you get what you need. This makes ChatGPT an extremely versatile tool for productivity and accessibility.

don’t mind that last comment

I actually was doing an audio demonstration for you guys of how the ChatGPT AI agent can work. I just essentially had it navigate to the AppleVis website with my login information, and it posted that under my account. Here is the audio file.

https://www.dropbox.com/scl/fi/y0ah9bogluanro9yh8vkm/audio1798085360.m4a?rlkey=j0es6fliaa6oq3u8bgnis3c99&st=55htxlsd&dl=0

Winter Roses

I've been playing around with it for a little while, and I've made it add stuff to my Amazon shopping cart and book me movie tickets on BookMyShow, so... I haven't tried booking a flight because I haven't had to yet.
And @JC, certainly you can make it email your friend, so long as you are ready to give it your email credentials.
Generally speaking, it's truly a game-changer as far as web accessibility is concerned... Should try it with, say, WordPress or YouTube...

Happy path

One of the problems with current large language models is that they get relatively good at tackling the so-called happy path in coding problems, which is when all the interactions are easily predictable, but sometimes fail to tackle even trivial edge cases, potentially resulting in security problems. Furthermore, they are also prone to hallucinate, and this problem has actually been getting worse lately, with consequences like naming dependencies that don't really exist, which opens a window of opportunity for bad actors to register those names and perform supply chain attacks similar to typosquatting against humans. Beyond this there are also code quality problems, in which the AI tends to generate extremely verbose solutions to problems that experienced programmers can solve a lot more efficiently, which makes the generated code unnecessarily hard to reason about.
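To make the hallucinated-dependency risk a bit more concrete, here is a minimal sketch of one possible guard: comparing the package names an assistant suggests against a list you have already vetted. The allowlist, the function name `audit_dependencies`, and the example package names are all made up for illustration; this is not how any particular tool works.

```python
from difflib import get_close_matches

# Hypothetical allowlist of packages that have already been vetted.
VETTED_PACKAGES = {"requests", "numpy", "pandas", "flask"}


def audit_dependencies(requested, vetted=VETTED_PACKAGES):
    """Split requested package names into approved and suspicious ones.

    Each suspicious name is paired with the closest vetted name (or None),
    which helps a reviewer spot a likely misspelling or an invented package.
    """
    approved, suspicious = [], []
    for name in requested:
        normalized = name.strip().lower()
        if normalized in vetted:
            approved.append(normalized)
        else:
            close = get_close_matches(normalized, sorted(vetted), n=1, cutoff=0.8)
            suspicious.append((normalized, close[0] if close else None))
    return approved, suspicious


if __name__ == "__main__":
    # "reqeusts" and "fastjsonx" stand in for names an assistant might emit.
    ok, flagged = audit_dependencies(["requests", "reqeusts", "fastjsonx"])
    print("approved:", ok)
    print("needs review:", flagged)
```

A check like this only catches names outside the list; it says nothing about whether a vetted package is itself safe, so it would be one layer among several rather than a complete defense.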

All the above combined results in a huge pile of bloated code with lots of technical debt, skyrocketing costs from token usage, and since the time and memory complexity of context windows increases quadratically with their size, it's not even that hard for a medium-sized codebase to hit resource limits so the whole thing is extremely unsustainable. While I think it's perfectly possible to build hybrid models that take as much advantage of existing algorithmic solutions as possible to significantly improve their efficiency, and I have my own theories about them that I will start experimenting with soon, I think that doing so will require a huge paradigm shift that may not happen before the current AI hype bubble pops.
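As a rough illustration of the quadratic growth mentioned above, the sketch below counts the pairwise scores in a single naive self-attention matrix. It assumes plain full attention with no optimizations, which is a simplification; real systems use various tricks that change the constants or the memory footprint, and the function names here are made up for the example.

```python
def attention_score_count(context_tokens: int) -> int:
    """Number of pairwise scores in one naive self-attention matrix."""
    return context_tokens * context_tokens


def growth_factor(small: int, large: int) -> float:
    """How many times more scores the larger context needs than the smaller."""
    return attention_score_count(large) / attention_score_count(small)


if __name__ == "__main__":
    for tokens in (1_000, 10_000, 100_000):
        print(f"{tokens:>7} tokens -> {attention_score_count(tokens):,} pairwise scores")
    # Ten times the context means a hundred times the pairwise work.
    print("growth from 1,000 to 10,000 tokens:", growth_factor(1_000, 10_000))
```

Under that assumption, making the context ten times longer costs about a hundred times more per attention layer, which is the kind of scaling that makes large codebases expensive to keep in context.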

One potential time-bomb issue for this technology is a phenomenon in which training new AI models on the output of other AI is known to lead to model collapse due to a not yet understood increased tendency to hallucinate, which is becoming a problem given the proliferation of AI-generated content on the Internet, and might already be adversely affecting the latest frontier models significantly. This content is often called AI slop, mostly because it's easy to generate without providing much in terms of actual value.

Login sessions?

If using this requires me to share my login credentials with ChatGPT, then that’s a definite no from me. Even if there’s a way to do it manually, I’d want to know if the process is accessible.

Until those concerns are addressed, I think I’ll be steering clear of it.

Also, once you're logged in to a service through ChatGPT, how do you end that session? Is it as simple as deleting the conversation?

Using chat gpt agent on iphone

Hey guys, has anyone used ChatGPT agent on the iPhone, and if so, what have your experiences been?
Is it easier to use it on the computer, or can you use it on the iPhone too?

Other things that people have tried with Chat GPT agent.

Hi guys.
So with ChatGPT agent, what else have people tried to do with it?

Using ChatGPT agent on the iPhone

Hi guys, so I wanted to let you know that I finally activated a ChatGPT plan to try out the Agent. What I attempted to do was to log into this website to post a comment—similar to the example shown above. Unfortunately, when using the iPhone, that doesn’t seem to be fully possible. I turned on Screen Recognition and was able to confirm that the username and password fields were on the screen, but they weren’t accessible with VoiceOver. If this is a bug or an accessibility oversight, it needs to be reported to OpenAI so it can be addressed as soon as possible.

On the bright side, ChatGPT did successfully manage to navigate to the website and locate the login page, which worked well. I was also able to type my username and password directly into the chat, and the agent was able to enter those details and log me in. That said, from what I’m seeing so far, you have to be extremely specific with your instructions. In some cases, you need to know exactly what you’re looking for in order to get the results you want. For example, I wanted to post a comment on this specific post, but I couldn’t remember the exact name or title. That ended up confusing the model a bit, so maybe websites with a clearer structure or layout might work better. I realized that ChatGPT doesn’t automatically recognize that a post is about itself, which makes sense, but it means you’ll need to be extra clear when giving instructions on sites with dynamic content.
Let me see if I can explain this a little clearer. So imagine you’re on a virtual supermarket website. You decide that for breakfast today, you want a box of Cocoa Puffs, a bottle of Pepsi, and a loaf of bread. Now, on these virtual supermarket shelves, ChatGPT is scanning through categories like “Cereals,” “Beverages,” and “Bakery.” If the Pepsi is sitting in the “Refrigerated Drinks” section or the bread is in “Bakery,” then ChatGPT will likely find those items pretty quickly because it knows where to look and what those categories typically mean. But let’s say there’s another person who owns a completely different website—like Mary, who runs a baking site. She sells chocolate chip cookies. Now you say, “ChatGPT, order me a box of chocolate chip cookies and a sugar-free glazed blackberry doughnut.” If the doughnut section is clearly labeled or easy to access, the model might find it right away. But if Mary filed her cookies under something more abstract like “Mary's Confectionaries” or “Sweet Bites,” ChatGPT might still be able to get there—it’ll just take a bit more time and work. That’s the part I’m trying to highlight. For the model to be most effective, you need to be specific. The reason I couldn’t post my comment on the site was literally because I didn’t remember the title of the post, and I couldn’t recall which section it was under. If you don’t have a good mental layout of the website, it can be much harder for the model to perform the task, even if it gets you in the right general area.

It was able to locate the username and password fields easily because those are common across websites and clearly labeled. ChatGPT understands those elements well—it knows, “This is the login box, and this is where I need to input credentials.” But if something is tucked away under an unusual label or section that isn’t visible on the screen directly, I don’t know how many places the model actually searches before it gives up or times out. Unfortunately, I didn’t get to explore that part much because, like a lot of people are discovering, there’s a time limit. Once you hit it, you’re no longer able to interact with the agent for the rest of the day, and I had already used up my window.
Right now, many of the more advanced features are limited. It looks like you only get 15 minutes per day—or maybe per session—with the browser, though I’m not entirely sure yet. I assumed I’d be able to talk to the agent hands-free in voice mode and have it carry out the tasks for me, but that doesn’t seem to be possible. I noticed that when the task is completed, my phone vibrates and I get a notification—which is a nice touch. It’s definitely a bit slow, but that’s expected given that we’re still in the early stages. If someone were going to do a full review of the product, I imagine they’d need to edit the pause time or task to fit while the model processes everything in the background. Anyway, I couldn’t get it to post the comment, but this is only my first time using it. I’m assuming things will improve in the future as they continue building it out.

not too impressed with this

Hi,
So when using this on Windows, both through my browser and the desktop app, the virtual browser is totally inaccessible. That is the browser you use to enter your username and password, or to take over from the agent in general if you need to click something the agent won't do, like a CAPTCHA. I've tried with JAWS and NVDA, NVDA object navigation and OCR, and the JAWS cursors and OCR, but nothing works. And it won't go to amazon.co.uk or amazon.com at all, even if I tell it to go to the page without completing any task. There's a checkbox on audiogames.net that it won't click because it's a CAPTCHA, and if I take over from the agent, I can't access the checkbox no matter what JAWS or NVDA commands I try. I mean, I could give it my username and password for something, but I'd have to keep changing the password just in case it stores it and my security is compromised. I'd only give it my credentials to log into something if something was really inaccessible, but I'd be changing my password after logging out, that's for sure.

@ Winter Roses

You only have it for 15 minutes? That’s strange, because I was using it for 6 hours yesterday editing my website and still have time left, and I’m using the $20 a month plan, but I might go to Pro now. I’m loving this thing because it can work on my business while I work my regular job.

Pro and Accessibility

I was wondering too about how accessible interactions are, as it is using a VM. So it seems Stephen is doing tasks that do not require him to interact with the virtual browser?...
As for usage, according to a chatGPT.com page, the Pro plan allows you 400 messages a month. So I guess try to pack those messages?
It is an interesting project for sure and I will keep monitoring it, but it needs some more advances before it can help me with my job.
By the way, Claude has a similar agent but it's not been in the news lately.

Answers and clarifications

When I was using the ChatGPT agent this morning, it disconnected, and I couldn’t get it to reconnect again. I’m pretty sure I saw a time and date saying when it would be working again, though I could be totally wrong about that. But the second I saw the message, I instantly assumed the product was limited in some way, kind of like how the advanced voice feature is restricted. A lot of the more advanced features with ChatGPT seem to come with limitations, which makes sense. I mean, with the agent especially, it’s pretty obvious why: many members are trying to use it, and the system needs to keep up and handle all those tasks efficiently. I don’t even think anyone using the free version is going to get access to the agent. If they do, it’s gonna be extremely limited. So if I want to explore more of what it can do, I’m gonna need to play around with it some more when I have the time.

Now, regarding Amazon and shopping—based on what I’ve been reading online, Amazon is not one of the supported shopping websites you can use through the ChatGPT agent. And again, this isn’t that surprising. Amazon has worked hard to become one of the biggest names in online shopping, and the last thing they want is some third-party AI stepping in and acting as a middleman. They’re not going to give that kind of access freely. My thinking is this: smaller businesses, if they’re smart, will absolutely jump on this opportunity. If they can integrate with the agent, lower their prices, maybe offer free delivery or other perks to shoppers—then I could see customers choosing to shop with them instead. This could be a major advantage for smaller vendors looking to grow. As for whether there’s an official list of supported shopping partners, I’m not sure we have this feature as yet, but it certainly seems like the next logical step in the chain of evolution based on current trends.
I haven’t played around with the agent enough to speak definitively on everything. But I do think it depends on what you already know. ChatGPT can browse the internet and get relevant info, sure—but the more you understand about the site you’re trying to use, how it works, and what to look for, the more effective it seems to be. Some tasks are always gonna be easier because they’re direct and straightforward. Others, though, are going to be more obscure or ambiguous—and that’s probably where a lot of the confusion and inconsistency comes in.
I didn’t know that Claude had an agent-style product of its own. I might have to subscribe and check it out. I’ve never subscribed to any of Claude’s plans, and that’s mostly because I know the context window, like how many messages you can send in a chat, is limited. Even on the paid plan, I’ve heard it fills up quickly. And instead of starting a new thread when you hit the limit, you only find out when your message doesn’t go through. Another thing I don’t like about Claude is that if I’m typing a message and I accidentally close the app or something interrupts me, the entire message disappears. It’s not like ChatGPT, which keeps the text in the box, so when you reopen it, your content is still there. That’s one of those little actions that makes a big difference.

Don’t get me wrong—Claude gives grounded, logical responses. It's more human than ChatGPT in certain ways. But because of those limitations, I’ve been hesitant to give it a serious try. I’m going to take a closer look and do some research myself. My biggest issue with Claude has always been the censorship and restrictions—it’s more limited than ChatGPT in that sense. They're trying to be that “ethical, moral” AI, but in doing that, they might be missing the mark a bit. Not trying to knock them too hard—they do have a solid product. It just needs a bit of refinement… or loosening up.

A couple of things.

First off, I haven't noticed any time limits per session as such. The limitation, however, is that Plus users get 40 agent chats per month, which is like 40 tasks. Also, the virtual browser, as some of you mentioned, is inaccessible. I guess for it to work, the screen reader providers will have to work with OpenAI to implement a solution. That's why, as of now, we have to provide the login credentials to the agent. What Claude has is Claude Compute, which is architecturally different from the GPT agent. The agent generally creates a VM in the cloud, whereas Claude Compute takes over your own computer, which means it can also access the files etc. on your computer.

Claude Compute

Hi,
If people want to try Claude Compute, the Guide AI Assistant for Windows uses it as their model.
https://www.guideinteraction.com/
It's about $8 a month at the moment, which I imagine is cheaper than Claude.

I always wonder...

Honestly, I am wondering if the output, in this case the website, was checked by sighted people as well. Sure, AI will tell you that it did what you asked it to do. But even Apple admits that there is at best a 72% chance that the info AI gives is correct. Or in other words: do not trust AI to do stuff for you that you cannot a) do yourself and b) verify has reached the expected outcome.

I get why the hype regarding AI seems so amazing. But honestly, in most cases it is bloated machine learning which wastes so much water and energy. Creating a website should not waste gallons of water. Thanks to the fairly easy HTML, you could write one yourself which would load lightning fast. And as a bonus, you learn how stuff works. If AI put in something that you did not want but everything else was perfect, you would have a hard time getting just this part out. Instead, AI will attempt to rewrite the entire thing because of your new prompt. That might change the complete page.

I am not against AI. But I am against the hype with more and more promises even though the last couple of releases did not work as advertised. Maybe one should take a step back and evaluate how much AI really does do correctly all of the time. Because computers generally are pretty good at doing the exact same task over and over again without suddenly injecting other stuff no one was asking about. The only benefit I can see is that we have a better way to get pictures described to have an idea what they show. And even that will fail when you ask for details. When you take a step back and reflect on this, you might discover how the output is curated under miserable working conditions for not a lot of money. In the end, human labour is trying to correct for the flaws that are inherent in AI. That should, in my opinion, not be acceptable.

@ Dennis Westphal

According to my spouse, it did the website perfectly. Not only that, but it put the comment on AppleVis perfectly too. When it comes to things like website design, you really need to be specific in your prompting. While you could learn HTML, that’s not gonna help you when it comes to the format and look of the website. While it may be functional to us blind users, it may not look visually appealing to the sighted population. If you’re running a business, you do kind of want to appeal to the majority.

Get into the interface

How can I even get into the interface? I asked it to log in to a website and it told me to type in the credentials manually in a browser window. Then there is a window called Virtual browser. But how exactly can I see the website then? In the Virtual browser window it just tells me that I have control over the virtual browser, and there is a button called Stop.

@ Jokyboy129

When I get it to log into websites, I just put the username and password in the prompt field along with the specific task I would like it to do, and then it should do it for you. The only thing it won’t be able to do is a CAPTCHA, if one is needed.

The virtual browser on ChatGPT isn't accessible with VoiceOver

The virtual browser on the website isn’t accessible. Yes, ChatGPT can manually type in my username and password for the site, but it can’t complete the CAPTCHA—which is understandable. That part is fine. The real issue is this: if I navigate to a website like American Eagle, Netflix, eBay, or Instacart, and I need to take over manually—whether it’s to read my messages, browse the site, or move around until I’m ready for ChatGPT to assist—I’m stuck. Once I take over on the virtual browser, the screen is taken over by that interface. Using the screen recognition feature, I can tell I’m on the correct website, and I can even tell that the information is somewhere on the screen—but I can’t actually interact with it. That’s a huge, huge accessibility issue. Until this is fixed, the ChatGPT agent is not as useful as it could be to blind users. Yes, I’ve already sent an email to ChatGPT, but I don’t know if or when this will be resolved. Hopefully they fix it, but you never know. ChatGPT can log into Instacart for me, sure—but it can’t let me take over and browse. I can’t explore the interface, go to the produce section, pick out the milk I want, then ask ChatGPT to help finish the checkout. That only works if I can interact with the site—and right now, I can’t, unless I spell out every step, which completely defeats the purpose if I don't know the exact elements that are on the website.

@ Winter Roses

What? It’s most certainly not useless to blind users. You just really need to be specific as to what you want in your prompting. It has read messages to me and has done almost everything you mentioned above. The only barrier I have found with it is the CAPTCHAs. Otherwise it’s helping me do quite a lot that normally I would need sighted assistance for. This is one of those scenarios of input versus output. The better your input, the better the output. Prompting with the agent is very sophisticated. And it’s only gonna get more sophisticated.

My thoughts

Well then, with all due respect, I guess I must be sophisticatedly stupid—because no matter how hard I try, it never seems to work in my favor. I’m getting 40 messages with the agent. And if I have to waste those messages trying to prompt the agent to do a task that I could easily do on my own through the browser, then that’s a complete waste in my book. I went on Project Gutenberg and had it read the first two chapters of Alice in Wonderland for me, and that worked pretty well. I wish the ChatGPT agent supported voice commands properly. If it worked with the voice mode, I could use the microphone to give instructions directly—but I’m not sure if that’s a feature yet. From what I understand, the advanced voice feature is limited too, so I’m not sure how far I’d even get with that.

Yes, if I get stuck at a CAPTCHA, I’m done. If I need to manually interact with a website for any reason, I have to go back and forth with it—and that takes a lot of time. Maybe it’s different on a computer, I don’t know. But on a phone, it’s pretty slow, and tasks that should take seconds end up dragging for minutes. Sometimes it takes 2–3 minutes to complete a task that shouldn’t take more than 15 seconds. Of course, I know that the product is new, so this is to be expected. For now, at least. I'm not holding this against the developers. This is one aspect I can confidently say will most likely be improved in the future.

Now I’m not saying the agent is completely useless, but the fact remains—it’s not as accessible as it could be. You said it’s all about sophisticated prompting, but I’ve been using ChatGPT since it launched. I’ve seen most versions. I’ve paid for different features when I needed them. I know what I’m doing. So I’m not new to this space—not to ChatGPT, not to Gemini, not to any of these tools. I’ve been as specific as I possibly can. So no, I don’t believe the issue is on my end. The virtual browser isn’t accessible for certain tasks. If I try to take over the browser and interact with the elements myself, nothing happens. That’s a serious accessibility issue. But hey—different strokes for different folks. Plus, I’m not sure if this is only happening to me, but whenever the agent disconnects for whatever reason, I can’t seem to find the option to reconnect within the app itself—I always have to go to the website to reconnect, and it tends to disconnect quite a lot during sessions, so I’m not sure if it’s different on the pro plan or what, but there are a few issues that need to be ironed out.

Think you have it backwards

According to my spouse, it did the website perfectly. Not only that, but it put the comment on AppleVis perfectly too. When it comes to things like website design, you really need to be specific in your prompting. While you could learn HTML, that’s not gonna help you when it comes to the format and look of the website. While it may be functional to us blind users, it may not look visually appealing to the sighted population. If you’re running a business, you do kind of want to appeal to the majority.

You're kinda trying to sell the idea that a non-deterministic service can do a better job than a deterministic language that gives you full control over all visual aspects and is also produced by the aforementioned non-deterministic service, which makes absolutely zero sense. Just because the site might look good doesn't mean it's not possible to accomplish the same or even much better by writing the code yourself, and the fact that the AI itself has to express your intent in that code is irrefutable proof of that.

I'm totally blind and do both user interface design and computer graphics, taking advantage of having lived most of my life with sight as well as the fact that in the end it all boils down to math. You can definitely do at least just as well as the AI. It requires getting creative with your solutions, like investing in a graphics embosser like I'm on the verge of doing, but it's all within the reasonable realm of possibility. Just a couple of weeks ago I designed the logo for the international brand that I am in the process of registering in vector graphics, because I knew exactly what I wanted visually and I knew how to express that mathematically. However, since I don't fully trust other people's opinions, as they have failed to tell me about important visual details in the past, I really need to feel the visual stuff that I make, hence my plan to invest in a graphics embosser in the near future.

@João Santos, A thought on the edge of my mind...

I do think I understand what you're saying, João Santos. The AI is helping me on the other side of all this. I am taking pictures with my phone. I can't exactly tell if an unstaged nature shot, for example, has anything interesting until it is described, and then it is very rare that the picture is well composed, as is.
Instead of having the ability to line up a well composed shot through the viewfinder, which I had experience with when I was sighted, I am having to take on the iOS cropping tool. The AI models I've been using seem to be fairly good at geometry and compositional analysis. So I can ask the AI to describe the contents of the bottom, horizontal eighth of an image, or the contents of any number of grids, building a model in my imagination. I can also ask the AI to describe the image in relation to the intersections of a grid of thirds. It can tell me about leading lines and negative space in an image.
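The grid descriptions above come down to simple arithmetic on the image dimensions. As a minimal sketch, assuming pixel coordinates with the origin at the top left, the regions being asked about could be computed like this. The function names and the 1200 by 900 image size are hypothetical, chosen only for the example.

```python
def thirds_intersections(width, height):
    """Return the four rule-of-thirds intersection points as (x, y) pairs."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    return [(x, y) for y in ys for x in xs]


def horizontal_strip(width, height, index, strips=8):
    """Return (left, top, right, bottom) for one horizontal strip.

    Strips are counted from the top starting at 0, so with eight strips
    the bottom eighth of the image is index 7.
    """
    if not 0 <= index < strips:
        raise ValueError("index must be between 0 and strips - 1")
    strip_height = height / strips
    top = index * strip_height
    return (0, top, width, top + strip_height)


if __name__ == "__main__":
    print("thirds intersections:", thirds_intersections(1200, 900))
    print("bottom eighth:", horizontal_strip(1200, 900, 7))
```

Numbers like these could be used as crop boundaries, or quoted back to an AI describer to say exactly which region is meant.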
It is horrendously tedious, and can take many hours to achieve what a sighted person could do in a few seconds with a flick of a finger on the screen, because every change has to be saved, then loaded into and described by an AI app of some sort or another, many times restarting all over from the original image etc.
I think the AI could help analyze the arrangement of a web site being designed by a blind person for visual people after being coded by a person in similar ways, by giving feedback. I want to have some hands on control over what is taking place, rather than the AI generating something that may or may not have weird artifacts or code... An image embosser would be great, but also might be as tedious and much more costly than asking AI. We'd still probably crawl over broken glass to get it done. The audience doesn't suffer for the artist...

Quick Moderator's Note

Hi all,

For transparency, I wanted to share that yesterday, a number of off-topic comments were removed from this thread. All comment authors have been contacted privately. If you have any questions, please reach out via our Contact Form.

Onward and upward.

A point we seem to be missing

Somebody who had vision to start with could in theory imagine what might be appealing for an audience that consists of more than 99% people with vision, and for them, a higher degree of control over the visual elements of whatever they're doing could be preferable, and even beneficial. But what about somebody who had no vision in the first place? Who couldn't even conceptualize what might be considered aesthetically appealing, yet has to work with such concepts? Would it be more beneficial to keep taking control without much idea of what they're trying to accomplish, or to let an AI do the work and have a sighted person check it?

That's a good question, @Gokul

And I'm not sure I have a good answer.
There are rules or guidelines corresponding to what is aesthetically appealing, and as João Santos points out, they are mathematically represented in code, but can also be measured in physical objects, golden ratios and such.
I'm pretty sure a person with no mental vision, for lack of a better description, could memorize and use all these rules and principles, and be successful most of the time. It never hurts, even for the sighted, to ask someone else if it looks right. It's like preparing a manuscript, you learn the general guidelines of what is accepted, like proper font size and font style, placing the title a certain percentage of the way down from the top margin and so on. It even works in growing a bonsai tree, branch placement, base width, pot size and so on, all ratios that can be memorized and approximated in the art. If you follow the guidelines, it usually works out fairly well on the end result.
Does the AI have mental vision, or is it using all the rules and guidelines to do things, then spitting out the result?

re: OldBear you're right

Hi,
There's no reason why a blind person couldn't do this. It's all about memorising or writing down what CSS corresponds to which colours and formatting, and using percentages and so on. You could look at other sites' CSS to see how they're coded for help anyway. But the problems come as João said above, if GPT's code is bloated and inefficient, surely the more you add to that, the more buggy it could get later causing you problems down the line. I like the GPT agent in theory, but if I can't even log in as securely as someone sighted, then I'm totally discouraged from using it.
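
As a rough sketch of the "percentages" idea, here is a small Python example that turns one of the ratios mentioned earlier in the thread, the golden ratio, into CSS widths for a two-column layout. The selectors `.main` and `.sidebar` and the float-based layout are assumptions chosen only for illustration. Whether a golden-ratio split suits a given site is a design judgment the code cannot make.

```python
PHI = (1 + 5 ** 0.5) / 2  # golden ratio, roughly 1.618


def golden_split(total=100.0):
    """Split a total into a larger and a smaller part in golden-ratio proportion."""
    larger = total / PHI
    smaller = total - larger
    return round(larger, 2), round(smaller, 2)


def two_column_css(main_selector=".main", side_selector=".sidebar"):
    """Build CSS text for a two-column layout; the selectors are hypothetical."""
    main, side = golden_split(100.0)
    return (
        f"{main_selector} {{ width: {main}%; float: left; }}\n"
        f"{side_selector} {{ width: {side}%; float: left; }}\n"
    )


print(two_column_css())
```

The point is only that the numbers can be worked out and checked without seeing the page: the main column comes out at 61.8% and the sidebar at 38.2%.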

@OB

I guess with AI it is the latter, and that's where my question comes into play. For an artist/creator, 'mental vision', as you put it, is an instinct. It is a corollary of their creative expression. It's something natural that they do with their mental faculty. Definitely there will be a lot of pattern recognition etc. underlying it, but still it's natural to them. But for someone without 'mental vision', even when they're taking the effort to painstakingly learn and work with all the proportions and such, aren't they basically doing what AI is doing--applying the rules and spitting out the end result?

AI saves me from writing CSS

As someone who knows how they want their website to look but doesn't want to spend a lot of time hand-crafting CSS or PHP, AI is a huge labor saving innovation. But as has already been noted, you can't just take what the AI spits out. You have to review it with a critical eye to ensure it's what you want. Coincidentally, code reviews result in better code in general, even when the coder is a human.

Design

Lots of good points in the discussion here.
I like the analogy that a blind person who memorizes the design rules is basically acting like AI. That seems reasonable to me. And if you enjoy that kind of thing, by all means. I'd rather memorize something else, or better yet not memorize.
I would point out not all sighted persons are good at design, or even drawing. I mean, the image generators are wildly popular. There are also front end design specialists and backend design and developer specialists for a reason. Not everyone has the "eye for appealing design." Taking it a step further from visual, some people are better at working out nice usability than others. And that is ok.
I also want to point out the original post was discussing the powers of ChatGPT Agent, and how this will change things, and I mean they needed to pick a task. That said, having an Agent design a website by running actions through a virtual machine is not the optimal way to use AI to design a website or application in my opinion. AI models can write code directly, which you can then review. So yes, this is a cool proof of concept, but not how I'd do it if I needed a production website, especially since there are more accessible ways that keep inaccessible virtual machines out of the picture...
For automating other computing tasks, I think an agent that runs locally on a machine we have access to and manage will be more accessible for us screen reader users for the time being.

Personal stance

My personal stance when it comes to choosing between doing things myself or letting the AI do it is to always do it myself, because even if the AI were perfectly capable of reading my mind, understood exactly what I want, and did not hallucinate, it would still be depriving me of potential learning opportunities. Since wisdom is the only thing that I truly value in life, this is totally unacceptable for me, and I predict that this inability to delegate challenging engineering tasks will be a problem that I'll have to deal with in my future business, as I can be really greedy when it comes to learning opportunities.

When it comes to expressions of art and aesthetics, while I honestly have no interest in that, I do pay a lot of attention to detail and usability, because my ultimate goal is to make what people consider hard actually easy. My future products will be full of little details that most people may not notice, but they'll definitely feel a difference if they try a competitor's product, and this kind of quality is something that no AI junky will ever be able to vibe code. For this reason I will only ever let an AI do my own work over my dead body, and if I ever need an artistic touch, I will always seek a skilled human to do it properly, in a way that stands out for its uniqueness and is not just yet another rehash of a combination of the same old ideas.

At this point machines are yet to grow an actual consciousness, which I define as some kind of driving force that enables living beings to overrule otherwise automated decisions entirely based on statistics and pattern matching, and as a result they have no natural ambition or curiosity to try new things completely outside the box. We can artificially implement that behavior, much in the same way we used to implement artificial intelligence in games like DOOM and Quake, where monsters were scripted to react to hearing the player's actions in the vicinity, and even fought amongst themselves if accidentally hit with friendly fire. But until we actually figure out what consciousness even is and how it actually works, the result will always feel rudimentary and unnatural, so at least for now I don't think there's a replacement for human ingenuity.

The problem with the AI hype in my opinion is not the technology itself. It's that instead of placing it in an advisory role, where its statistical nature and super-human pattern matching abilities could actually help bring our civilization to the next level, it's being used to replace humans in tasks where we are still vastly superior. So while the technology has a lot of potential, our collective choice to reverse the roles, possibly driven by some people's addiction to control coupled with an obsessive pursuit of AGI, makes it totally unsustainable in both economic and ecological terms.

Using AI as a first cut

Although AI is very capable these days and getting better day by day, I tend to use most AI tools as a first cut or first draft of what I am trying to do. I don't think that AI is at the stage yet where you can say "do this" and it gets done exactly as you want, perfectly well. First there is the consideration of crafting a precise and specific prompt. Even when giving a human directions, if you aren't precise, they might interpret your instructions a bit differently. Secondly, as I already said, AI isn't perfect. It is sort of like working with an intern to whom you can give directions but whose result you wouldn't rely upon without checking and perhaps modifying before stamping it as the "final" product.

All of that being said, AI can be not only a tremendous time saver, but can often come up with ideas that you either weren't knowledgeable about or that you might not have thought of.

All in all, AI is a tool just like any other tool. It can make some jobs more efficient to do and even make some jobs possible, but the ultimate result still relies on the user.

--Pete

It's all our fault...

Perhaps, we just haven't figured out how to issue prompts to the AI. It will become one of those extremely complex processes that becomes a smooth art through practice.
Right now, we've slapped the lumpy cone of clay on the wheel for our first try and mud is flying everywhere. At some point the hands and the spinning clay together will become a single system to make interesting vessels.

musings on the future.

Maybe we'll be able to talk to our computers in a year or so. For example, you could say "use Ninite's website to download VLC, Firefox, Dropbox and Thunderbird", and it just does it. That would be awesome.

I like doing things myself but it'll be interesting to have an AI take the wheel once in a while.

Tools and Prompts

There are two things that are written over and over again.

* AI is just another tool
* You are just prompting wrong

AI cannot be considered a tool. Why? A tool is something you use to get something specific done. An RSS reader is a tool for reading news. A hammer is for putting nails into things to connect them. But AI promises to do absolutely everything, and as a result it does nothing very well. It does an okay job for certain things. And yes, that is impressive. But is "good enough" really the endgame of what one wants to deliver? AI certainly makes things one was never trained to do much easier. But it also just gives you something resembling a mediocre version of whatever already exists.
A tool gives you the means to get exact results. AI just gives you something which you might have wanted, or at least what everyone else already did, which you then have to check. Troubleshooting the code it spits out will be a nightmare and also leads to security risks.

The second thing I always read is "You are just prompting wrong/not detailed enough". That shifts the blame to the individual user, away from systems that are still not working reliably and, because of the way they function, never can. "You are doing it wrong" is a deflection to shift responsibility. Apple does that. Think of the famous "you are holding it wrong" from back in the day. Not picking on Apple here. Just thought of that as an example which got news coverage.
Imagine you have a certain sound in your head which was never produced before. You have access to a DAW and an AI. You know how the DAW works. By the time you have written a book's worth of a prompt to get all the details right, you might have worked on that for hours or even days. You have used lots of energy and probably heated up the planet in the process. Chances are that the resulting sound does not sound like the one you had in mind. With the DAW it takes experimenting, sure. And you have to put some work in, but you will get exactly what you wanted. And if you forgot to save the audio in both cases? With the AI method you will never get that exact result again, no matter the prompting! With the DAW you know how to produce the sound. Thus you will be able to get the exact result every time you need to.

This turned into a lengthy comment. I am not writing this to be overly negative. I am just amazed by how uncritical many people are when it comes to AI. I think there is a high value in work and art that is created by humans. Reading the words someone came up with to express their fears, their desires, their wishes or explanations to teach someone. Same goes for other media as well. I personally like the saying: If you didn't even bother taking the time to make something, why should anyone take the time to listen/read/watch it? They could just as easily prompt an AI to get a similar result. So in the end, content which only one person each will consume is flooding the data centers. I personally think that is very sad!

@denis, I think tool is being used in a different way here.

It's more a tool in the way Dropbox is a tool to hold your folders and files. It's a program, but "tool" can be used like that too.

AI can be said to be a tool to help you get things done, it might not complete things for you but it'll help you along the way.

my thoughts. Please don't come for me

Artificial intelligence, to me, is like a multipurpose tool with dice for buttons—you roll it, and you don’t necessarily know what the outcome will be. If I’m creating a sound from scratch that’s only ever existed in my head, stepping into that digital workstation is a little like stepping into a dark room with a box of mystery instruments. I might have a concept—maybe I want jingle bells, maybe I want a sharp guitar riff, maybe I’m thinking 220 BPM with a smear of echo on the drums and a bright kick. But until I start building it, I don’t know exactly what it’s going to sound like. If we don’t use AI right here, right now, and provide both positive and negative feedback on what works and what doesn’t, how are we supposed to refine it into the best version it can be? Just like human beings, AI is constantly learning—or at least being trained to learn—within its parameters. The real discussion shouldn’t be “should we use AI or not?” but rather “how do we understand its process and guide it toward better results while working within its limits?” Artists deal with this all the time. Musicians, painters, photographers, architects—nobody steps into the workspace knowing exactly what the final product will be. They may have parameters and goals in mind, but the piece evolves as they work on it. I’m sure even Da Vinci had multiple versions of his most famous works—versions we’ve never seen, some possibly just as good or even better than the ones we know today. The Mona Lisa we see in the Louvre might not have been the only Mona Lisa he painted; there could have been variations in expression, background, or color that never made it to public view.
The same applies outside of music. Imagine you’re designing a new television model. You might specify: 65-inch OLED display, 4K resolution, ultra-thin bezels, brushed titanium frame, side-mounted speakers, and a voice-controlled interface with a built-in AI assistant. You’ve set the parameters, but until you see the prototype, you can’t be certain whether it will look the way you envisioned, whether the colors feel balanced, or whether the frame’s texture matches your mental picture of the final product. It’s the same with emotions in art. If I tell an AI art tool, “Paint me a scene that captures sadness, with a muted color palette, dim afternoon light through a rain-streaked window, and a child sitting alone at a kitchen table with an untouched plate of food,” it might deliver a version that technically matches my description, but the emotional weight could be entirely different from what I imagined. Maybe the child’s face is too calm, maybe the room feels cozy instead of lonely. That’s when I realize: the tool didn’t get it wrong—it interpreted the concept differently. When you complete a task by your own hand, you understand the mechanics on an intuitive level. You know the process from the inside out. Years ago, I watched a video where someone said, “If you don’t know how to cook, just watch a YouTube tutorial and follow along.” But here’s the thing—if you know how to cook at even the most basic level, there are steps you don’t need explained. You already know to wash your vegetables before chopping them. You know to preheat the oven before baking. You know that garlic burns faster than onions, so you add it later in the pan. These aren’t details a recipe has to spell out—they’re aspects you carry into the process from experience. Right now, artificial intelligence isn’t at that stage. It doesn’t “bake the cake” the same way twice. We know it’s still being programmed, trained, and refined. 
We know it’s operating within limitations, and the more we use it, the more we learn how to work inside those parameters. The choice of words in the prompt, the specific tools, and even the style references all matter—and they matter every single time. With the introduction of “thinking” models, we’re starting to see more of the process AI takes to arrive at an output. Even if we don’t fully understand every technical step, we can watch the system break down the task, analyze the request, and put the pieces together. We can identify where it might be falling short—whether it’s missing context, misreading style cues, or weighting some elements too heavily. That gives us a clearer view of the current limitations.
The fact of the matter is, if you work with ten different producers and give them the exact same parameters—same instruments, same tempo, same style, same mood—you’re still going to get ten completely different versions of the song. Some might match the vision you had in your head. Others will be way off. But that’s part of the process. Experimentation is what gets you to the version that feels right. People forget that so many things we love today came from chance, accident, or pure luck. Potato chips were invented because a chef got frustrated with a customer who kept sending back his fries for being too thick—so he sliced them paper-thin and fried them until crisp. Post-it Notes came from a failed attempt to make a super-strong adhesive; instead, they got a low-tack glue that stuck just enough to be useful. The microwave was discovered when Percy Spencer walked past a radar machine and noticed the candy bar in his pocket had melted. Penicillin came from mold accidentally contaminating a petri dish. Bubble wrap was originally meant to be wallpaper. All of these products came from mistakes, surprises, and moments nobody planned. Yes, some of us are born with what I call “unfair advantages.” Your natural talent. Your money. Your connections. Your location. Your age. Your physical appearance. All of these can give you a head start. It doesn’t mean you don’t have to work hard. It means your starting line is a little further ahead than others. When I generate a picture or a song with artificial intelligence, I’m not stealing someone else’s work off the shelf. The fact is, if I create a picture of a giraffe lying on a beach, there are already plenty of images out there showing that concept. There are thousands of photos, paintings, songs, videos, recipes, clothing styles, and fashion trends that are either the same or so similar that you’d never know they weren’t connected. Life is full of coincidences. 
Two businesses can have the exact same name in two completely different places, and neither knows the other exists. That doesn’t mean one stole from the other. AI doesn’t hand me a file that Johnny Jones made and that I personally saw—it doesn’t work like that. If I tell AI to make an image that’s never existed before, like a tree made entirely of umbrellas or a tree with a single umbrella canopy, that picture exists only in my head until I describe it. And honestly, it’s never going to exist in reality because the logistics to make it happen are impossible.
Personally, I’d love AI to help me with physical tasks—cooking, cleaning, laundry, transportation. Someone might say that’s “lazy” and that it doesn’t make you think. Fine. I can accept that. But I don’t have to justify my choices. If I want AI to make my dinner or even wipe my butt, that’s my business and no one else’s. AI, for better or worse, is here to stay. And the truth is, if people were as kind, compassionate, and empathetic as they like to believe, fewer people would feel the need to turn to AI for creative or emotional support in the first place. The logic that “if AI can complete a task, you shouldn’t need other people” doesn’t hold up. By that thinking, if I’m a doctor, I should only ever treat myself. If I’m a dentist, I should handle all my own dental work. If I’m a therapist, I should self-analyze forever. That’s ridiculous. We live in a consumer-creator society. People still buy bread instead of baking it. They still go to the movies instead of making their own films. They still listen to music rather than writing and recording every song themselves. Two artists can create the same song structure with different lyrics and give it entirely different emotional weight. If I go to a record store, hear an old man say “good music doesn’t exist anymore,” then go home and write lyrics based on that conversation, my work carries my personal emotions—happiness, sadness, frustration, nostalgia, melancholy, anger. AI doesn’t erase that.
I don’t have all the right answers. I can’t say for sure whether artificial intelligence increases or decreases the value of a product. What I do know is that there are bigger issues in the world—ones that started long before AI and will still be here long after it’s gone, if it ever is. Personally, my motto is “live and let live.” If what someone else is doing isn’t hurting you directly, then let them do what works for them. If I can’t complete a task on my own, AI is the next best option—better than not doing it at all. The only other alternative is hiring someone, and if I don’t have the money to pay a professional, I’m going to turn to AI and make the best of it. You can’t fault someone for using the tools available to them. Everyone takes advantage of their circumstances—whether it’s money, talent, location, or connections. I do criticize AI, but I try to make my criticism constructive. Criticizing for the sake of complaining helps no one. I focus on what I can actually control. The arguments about whether AI should or shouldn’t be used will keep happening, but I can’t stop its development. I can only decide how I use it in my own life. Humans make mistakes, overlook details, and get tired. AI does too. Whether something has more or less value with AI involved is a question above my pay grade. Some criticism of AI is valid, some is baseless, and some falls into a grey area. The key here is education. If you're going to use artificial intelligence to code, always check the output afterwards. But I’m also not naïve—AI, like the entertainment industry, can be predatory. It wants your time, your attention, your loyalty, your love, and your respect. And once it’s done with you, it’ll move on to the next shiny trinket, leaving you like a dusty childhood toy in the back of their closet. That’s why I want AI to develop in ways that make me less dependent on people—not more. 
My microwave has never disrespected me, never talked down to me, never played mind games with me. If AI can work the same way—helping me without the emotional baggage—that’s the direction I’m heading in.

I completely agree with Winter Roses.

I love using Copilot to write fan fics back and forth. No one will see them and I don't care. It's fun and that's good enough for me.

Is it perfect? No way, but it's better than what I could do alone. I even used Copilot to write some batch scripts, so what I'm getting at is that this AI stuff can be useful and fun and make our lives easier.

Like Winter Roses said, I'm looking forward to the day that AI can do things for me like clean, cook, and so on. I don't think that makes me lazy. It means I'm using a tool to make my life easier, like using a Roomba vacuum cleaner/robot hoover to make my life better.

It'll be very expensive in the beginning but I'm all for it.

Guilty pleasures

Ok, not really guilty as I am happily sharing mine, but similar to Brad, I like to use AI to create my own Choose Your Adventure story and play through it as it's being created. It is a complete time killer, but oh so fun! 😄

I love escape rooms

I love interactive fiction. I know a lot of persons label that as weird, nerdy, or geeky—fine, I’ll take it. I’m not a gamer, but yeah, I love them. I love those stories that make me think. I don’t really understand the ones where you have to type in every single command, but I get the ones where you’re given choices and then you pick one.

I love escape rooms. If anyone can create one, I’d absolutely play. I love those stories that keep me guessing, where you never figure it out until you reveal all the clues. Alexa has a couple of games like that. I think I played one of them—maybe three?—and then you have to pay for the rest. I got halfway through the first one before getting distracted, so I never finished. I want more. I love escape room games—they’re fun. If someone made an interactive fiction game right here on ChatGPT, I’d play it in a heartbeat.
