Those of you who have watched the live stream, do come over and let's talk about how incredible the demo seemed, and also about how much of it would translate into actually usable stuff, and what this means for accessibility and assistive AI, if I may. And those of you who haven't, do go watch the stream. IT'S INCREDIBLE!
Comments
The potential...
While this is all very fascinating, and I, too, am excited to see where AI and technology in general take us, might I suggest that we all take a step back and let the programmers, engineers, and designers take their time working out all the nooks and crannies of potential failure, so that we, the consumers, can have a wonderfully innovative, life-fulfilling, and ultimately joyfully engaging experience with our new AI companions?
Ya know, rather than the tired and true new software every year that is full of plot twists and pot holes, that Apple has become synonymous with?
Just a thought. 🙂
Brian
While I absolutely agree with you there, I think it's also important to keep constantly engaging with the techies so that, 1: they realise how life-altering this kind of thing is for the community, thereby aligning product development that way, and 2: an experiential perspective is incorporated, so that we get a practically usable product rather than some fancy thing of no day-to-day value.
Seeing AI subscription?
Did I miss this, is there a seeing AI subscription? Brad, I think you referenced it.
And, apple vis, I'm sorry, I'm assuming it's my comments that were the problem. I know better.
Not yet.
I think someone else was talking about how it might become a thing. I wouldn't mind that if the GPT-4o stuff is included.
Described Youtube and video games!
So, just imagine: ChatGPT gets a mode where you can share your screen with it, and you're playing retro games on your phone through emulators like Delta or RetroArch. You start a game, like Final Fantasy 6, and then share your screen with ChatGPT. It tells you what menu option you're on, describes scenes, and tells you how to get to objects. It reads dialogue and everything.
Or you're watching old TV shows that won't get audio description, like Dark Shadows from the late '60s. No, I'm not that old, but my grandma got me into it, and I think it's on Tubi TV. Anyway, you may have to tell it to use brief descriptions, but I imagine it could easily do it. Or maybe you're watching a YouTube video about old game consoles from Modern Vintage Gamer. GPT could easily describe the consoles, snippets of gameplay, all that.
Or let's say it comes to Windows! Do you have a pair of headphones, like the Corsair gaming headphones, that have a mostly inaccessible app? Just share your screen, tab around, and it'll tell you what you're on! Just some ideas, of course.
Omg
Am I the only one who has been incessantly checking the ChatGPT app to see if my account has been updated yet? lol!
Why say this?
So it should be out then? When I go to upgrade to Plus, I get:
Access to GPT-4, GPT-4o, GPT-3.5
That would be very misleading. You'd have thought it would be in their app by now, lol, and other apps like Seeing AI and Be My Eyes will have to implement their own use of it. Also, I noted that when the OpenAI video channel showed someone today pointing at objects and asking what they are in Spanish, it wasn't the same voice. Why would I pay just to get it, if it's going free? If I subscribe today, right now, does that mean in theory I should have it?
When selecting GPT-4o
Hello. In the popup menu in a chat, as a paid member you can now select GPT-4 (the number), GPT-3.5, and the new GPT-4o (with the letter o). But in 4o the new voice and image features are not there yet, so how do we know when we have it? If both GPT-4 and GPT-4o are offered as options, shouldn't the 4o version let you use the images and speak directly to the AI, as otherwise nothing is changing?
It’s been said time and time again
It's coming in the coming weeks. You have access to the text model, but not the video and voice features yet.
Lottie.
I've felt exactly the same for the last couple of weeks. I think OpenAI had an iPhone moment and the world changed a little bit, but I don't think it's going to be the same again. The iPhone changed everything, and I think having non-human intelligence that's this usable and approachable changes everything too. I really wish I still had that phone, lol. To start with, this is likely to be a mixture of novelty and tool for us, but over time I think it just gets woven deeper into our lives, our culture, and everything else. Once non-human intelligence is this human, how can anything be the same?
They've just brokered a deal…
They've just brokered a deal with Reddit, which is pretty huge. I'm imagining servicing my 3D printers or doing a build, even Lego, with visual guidance. "Let's build something," I'll say, and we'll build something. "Is this the right part?" I'll ask, and it will be able to tell me. All very exciting.
Lottie.
Yep, I'm fine. I wasn't even aware that was going on. I'd be interested to know what they were saying. I don't use Mastodon. Well, I have an account, but I kind of stopped using social media when Twitter went bad and didn't make the switch. Maybe I'll check it out tomorrow. Thanks for defending my honour. Did you challenge any of them to a duel? I so need a blind duel to happen. I'll referee.
Two weeks to release
Hi, I asked GPT Plus when the voice and imaging features will be available, and when I asked how I will know, I got this: "The new video, voice, and image features for ChatGPT are being gradually rolled out. Initially, Plus and Enterprise users will gain access over the next two weeks. If you are a Plus user, you can check for these features by going to Settings → New Features on the mobile app and opting in. If you don't see the features immediately, they should become available soon as the rollout progresses. For more details, visit the OpenAI blog."
@Will
It lied to you. There are no new features under Settings, lol.
Question regarding video description
Wait, can someone please clarify this for me? So, if we want videos to be described on YouTube, in theory, would we share the screen with the artificial intelligence, or would we copy and paste the link? I actually tried this with Google Gemini, and it was able to give me a brief overview of the video if I posted the link. I was able to get an overview of the video, as well as scenes, outfit changes, background, and any text elements on screen. Of course, I had to ask the right questions in order to get the answers. Also, there were a few videos that I tried to be described, but it told me that it contains sensitive material, or that there was no metadata for it to tell me what was in the video, so I'm guessing that there were captions that the artificial intelligence couldn't read. Or, maybe they were not included. Either way, the potential is there, but it's not quite up to standards as yet.
I have some of the text features now. I also see a list of generators created by the ChatGPT team, but they're not working as yet. I can also attach files and access memory across conversations.
@Gokul
It is absolutely acceptable to hype up new technology through advertising and word of mouth. I was more referring to the plethora of replies in this thread that can be summed up in three words: "Gimme, gimme, gimme!!"
* one word
I wonder what the uptake on using ChatGPT is now, after the announcement and before the full release? I'm certainly using it more, though that might be down to university deadlines.
In other news, I downloaded the Mac app from a thoroughly disreputable source to see if I could get ChatGPT working on my Mac, but it's not rolled out to my account yet, which is more of a frustration as they said, on Monday, that the Mac app would be available the same day. I'll keep folks updated with that one, as I wonder if there is a geographic rollout, i.e. the US gets it first... Actually, that's a good motto for the States: US first.
mac app
From what I understand, you will be presented with a popup on the main site when you get access.
I'm not really understanding why this is a gradual rollout since it wouldn't be putting any additional strain on servers or anything.
I'll also update the thread as things develop.
Maybe it's a feedback thing…
Maybe it's a feedback thing. Release to the whole world at once and you get a lot of blowback if there are bits that aren't working quite as well. Also, I imagine having it on Mac may increase load as soon as it is released, with people trying it out and so on.
I've set my DNS to make chatgpt.com think I'm in the States, so that might help with priority, but probably not. I'd guess they are smarter than that. Or our new AI overlord is.
GPT-4O: my observations
Hi all,
This post will probably be a long one and maybe a bit rambling. First, I can't wait for the video feature.
All that expression is awesome, but I imagine you can tell it to tone it down a bit, like the guy in the video was doing with the lullaby when he told it to sing in different ways. I've been messing about with GPT-4o for the last week. You can get it to browse the internet for sources. Every time I ask it something now, it either gives me sources automatically, or I ask it to look something up on the internet just to be sure it's using up-to-date material, and it still provides me with sources.

It always provides links to its sources, but these links aren't accessible by default with VoiceOver on iOS. I have to tell it to make the link a clickable element so I can activate it with VoiceOver, and more often than not I have to ask several times. When I use GPT-4o on my Windows computer, the links are displayed as links, no problem. It won't read an entire webpage to me; it just insists on summarising, even though if I send it a screenshot of a page, it seems to read all the visible text, including the ingredients for a recipe.

Now to editing images. I took a screenshot of a Word document on my computer and sent it to GPT. I told it to get rid of all the icons at the top, the toolbars, the desktop stuff and so on, and just leave the text in the Word document visible. It did get rid of all the icons and the additional stuff you would usually want removed when editing a screenshot, but it altered the font, the words ended up looking distorted, and there were spelling mistakes because it added extra letters and deleted things too. So, all in all, not a success. Oh, and it provides you with a link so you can download the edited image. Sometimes this link works, but sometimes it forgets to upload the file, and when you click the link it tells you 'file not found'.

When getting it to browse the web, I asked it to look for something on Amazon, and it said it couldn't. When I asked why, it said that Amazon has tools to stop automation and scraping.
I don't know how true that is. I'm not sure if it's good at finding exactly what I want from a specific website. I asked it to find the latest recipes from a site I like, and it didn't find the latest recipes. It seemed to go a few pages deep for most of them, instead of just looking at the homepage. I can't wait to try and use the video feature when I'm out and about. Maybe I can get a lanyard. I don't know how comfortable that would be. I saw a video on YouTube where somebody put two phones with the video capabilities together and the two GPTs were talking to each other and singing.
Check this out.
https://www.youtube.com/watch?v=MirzFk_DSiI
I can't wait to get my iPhone and iPad talking.
Assuming you're using the…
Assuming you're using the app on iPhone, I wonder if you can add the link thing into custom instructions in Settings, i.e. always clean up links and display them in a way that works well with VoiceOver; then you won't need to ask each time. I'll have a go at this too.
Could try that
Hi Olly,
Even when you can eventually click on it, it doesn't actually display as a link that's viewable with the rotor. It's not a link at all: you just have to double-tap on the text which says 'view recipe from BBC Good Food' or something like that. So you have to navigate by line to find the text, or even by word sometimes. But maybe I could try customising it too; at least it might work consistently, even if it's not a proper link. I asked it how it was coding the link, and it told me it was using the 'a href' tag, which is the correct way to mark up a link, but whether it was actually doing this, who knows?
Soon, they say.
Okay, my ChatGPT app said that the new features of 4o will be rolled out to me soon when I randomly opened it yesterday. I wonder, how soon is soon?
OpenAI's X account says 'in…
OpenAI's X account says 'in the coming weeks', so I'd suggest not holding one's breath. It will come when it comes. I think they probably just wanted to announce before Google I/O and are now playing catch-up. Keep an ear out for what Be My Eyes is up to, as they are going to be one of the first partners to get it, or so I understand.
Humane already has it on…
Humane already has it on their AI Pins, and it has improved their speed.
They won't stop bragging about it on Twitter.
It's a shame that the product isn't accessible, all because they insist on that laser ink display.
Time to see what Apple has in store for us. I doubt it'll be as impressive as this.
They haven't exactly had the wow factor for a while now, not since they introduced the M1 chip.
Vision Pro did sound impressive, but not if you're totally blind.
Humane
Humane is an absolutely terrible product, lol. Seen the sh*t show on YouTube, X, and FB, read articles, and spoken to a couple of people who had it. Yes, I said had. It's a terrible product. Better do your research.
Social Media vs The World
Fun fact: Social media is not good research material. That is about as bad as using Wikipedia as a "credible" source when citing other people's written and/or spoken works. There is (sadly) an old adage, or expression if you will, that goes, "If it's on Facebook, it must be true". That can be said for every social media platform ever developed. Ever.
Social media has two fundamental purposes: first, to allow anybody and everybody to voice their own, often misguided, opinion about any given topic; second, to target the part of the human brain that controls addiction.
I said before that everyone should be patient while this technology properly develops, so that we, as consumers, can have a wonderful experience with it. Yet all I am seeing, here and elsewhere, is people bickering that "they" do not have their automagical AI companion yet.
How about, ya know, instead of demanding instant gratification, everybody just take a step away from their keyboard for 5 minutes and just. . .
Breathe. 😀😝😆😇
The source is verified, it's…
The source is verified; it's OpenAI's X account.
Doesn't surprise me. Humane is a sinking ship.
So is Apple. They bet on the wrong horses: Siri died; they tried to build a car, and that died; now they've built a VR helmet that is dying. Cook has killed Apple. I'm not too bothered; something better will replace it.
I'm very interested in the hardware that is rumoured to be coming from OpenAI. Their own phone and own platform would be a day-one purchase for me. They are the only giant that, as far as I know, has highlighted the benefit to blind individuals. You know that they would have day-one accessibility on whatever they release.
I'm still keen on the idea of a hat; they could call it the thinking cap.
Ollie
I'd call it the sorting hat.
OpenAI and accessibility
I would not bet on OpenAI thinking about accessibility. While it is true that they have highlighted the benefits for blind people, which is great, the web interface for ChatGPT still does not have labeled buttons after being available for one and a half years. That is such a simple thing to fix, but it has not been done (I have reported it multiple times, and so, I believe, have others).
Ah, very good point…
Ah, very good point. Counterpoint: the app is pretty accessible, even if there was a brief period when it wasn't.
I don't know how much effort they are putting into the front end, as I think a lot of their work is in the API for other companies to utilise. It is bad that they've not updated it, though.
Sorting Hats & Automagical Wands
So, we have the sorting hat, and perhaps one day a viable haptic pen. Now if we can just get the fine folks at "Glidance" to redesign their model to look more. . . broom-like, we may have a winner winner, chicken dinner !! 😃
Open ai and accessibility.
It's quite broken. I tried signing up on their website and kept getting told, on Edge, that my password doesn't meet their requirements; on Firefox, I got nothing.
Turns out I was using my old email address that I'd used in the past to sign up with them, before deleting the account after realising I wouldn't use the service as much as I thought I would.
Turns out that once you delete your account, your email still stays around. I think this is weird and makes no sense to me whatsoever.
I've emailed them about this and got a response saying how they're committed to accessibility (I don't believe them, but we'll see), and that I can't use that email, though they might be able to do something about that in the future (I don't believe they will, but we'll see).
I've asked them why emails aren't deleted and haven't gotten anything back yet. I did mention I live in the UK; I think there might be European laws that could help me, but I'm not sure.
The not-deleting-your-email thing honestly really bothers me. I was under the impression that once you remove your account, everything is removed.
Siri does magic
You can already do spells with your iOS device.
Just say "Lumos" or "Nox" to turn the light on and off.
Must be one of those silly little quirks Apple put in to make Siri less boring.
Re: Be My Eyes on WIndows
As far as I know OpenAI will release a Windows app "later this year". Will be interesting to see what functionality that and the Be My Eyes apps will offer.
My guess is it makes more…
My guess is it makes more sense for BME to be on Windows, as more blind people use Windows. ChatGPT is coming to Mac, I think, because Apple and OpenAI are courting each other like tentative lovers. Also, Windows has Copilot, which uses similar underlying tech but allows Microsoft to scrape data.
Re: Hey meta
@Lottie, you could actually make all the HP spells into a set of prompts that would accomplish different things, as weird as that would be.
Join Be My Eyes Beta?
Any idea how to join the Be My Eyes beta on iOS?
Just for this GPT-4o, I am willing to run a beta version.
In general, how can we test?
Hi all. Not just for ChatGPT, but can we join the Be My Eyes beta via TestFlight if we are serious about helping out?
I don't think there is any…
I don't think there is any way of getting it quicker, though I admire your ingenuity. Keep checking the Be My Eyes app, though; previous betas were announced there.
virtual photographer?
Imagine if it could help take photos as well?
E.g., I tell it I'd like to take a selfie.
It directs me on how to move the camera, counts down, snaps the photo, then stores it in the app's photo gallery, similar to Aira.
Something like this would be amazing, especially for those of us who struggle getting good shots.
I also love nature, so something like this could be immensely helpful.
Even though I have never had usable vision, I do love taking and saving pictures.
Like, Google Frame, only more advanced?
Having an AI camera assistant would be very cool. Right now, at least for iPhone users, the closest thing we have is to FaceTime someone on a video call and let them take a screenshot of whatever we are pointing our camera at.
Would definitely give a sense of independence and an opportunity to explore a (not so) new concept: blind photography. 😀
When will this be out?
Omg, I can hardly wait any longer! How many weeks until this is available? Will it be available in ChatGPT, or Be My Eyes, or both? I am a paid ChatGPT subscriber.
@Brian: Photography
AI could probably help some to get a good shot. I think the zooming would be helped a great deal, and cropping would be made possible for blind people, where it is not currently.
The iPhone does have the leveling haptics and tilt instructions that help. It also tells you if and where a face is in the viewer. You can use your own face and a tripod to figure out where the edges of the picture will be and so on, and so on... But it's an exhausting pain to go through all that. It would be nice to take a burst of wide-angle pictures and work with the AI to crop and enlarge the subject of interest into a reasonably good photo.
I still think it would be useful to cross-check multiple AI descriptions from different apps to get a better idea of the actual photo. There was a video of two of them talking to each other somewhere on the site or this thread, but I think they were the same app.
Re: photography
With 4o's multimodal capabilities, we should already be able to do this. It should be as easy as giving a prompt instructing it to help you capture a good snap of whatever you want captured, including people. Also, the more detailed your prompt, the better the captured picture will be, I'm guessing. But the ability to save this picture into the gallery might not be present in the GPT app, as no one there is likely to have thought of such a use case. Maybe we'll find a way around that. I'm also excited to get on board the photography bandwagon.
photography
Just as long as there was some way of storing the image to be shared later.
I think the process of getting it to take the photo would be easy enough, it's just the matter of somehow saving it.
I think this would be more of a niche feature, at least in the beginning.
Quinton
It already lets generated images be downloaded and shared with other apps. I wonder if it'll do the same with captured images if prompted correctly, once the camera functionality rolls out, that is. If that can be done, will it also create the image with a small description attached as metadata, again when prompted properly? In other words, is it a magic wand?
Slight diversion
We've been talking about use cases where this will impact our daily lives; we've discussed how this will be a fun toy, snapping pictures and such; but what do y'all think will be the impact of this on visually impaired people being employed? Will it make us more competent in professional settings, especially settings where one is required to deal with visual information?
editing photos
Hi,
I asked it to edit a screenshot I created. The edit didn't go well at all (I said this above, I think), but it did give me a finished result as a link to download from. The link in the iOS app wasn't accessible by default with VoiceOver; I had to coax it a bit to get it to work. The links it generates come out much better in a browser, where they're displayed like normal links. I think taking a photo and accessing it for later use will be doable.
Release has been pushed back…
Release has been pushed back for GPT until at least late June. Everyone getting your knickers in a twist about getting it, myself included: stand down. It's a way off yet.
Regarding photos, I imagine it could take a series of photographs and decide on the best one, or even manipulate the image to fit a profile framing. I've been using it quite a bit over the last week, and it is pretty remarkable what it can do now. You can code using natural language, so I'd imagine you could say: I want to take a selfie, help me line it up, take a load of photographs, and then pick the best one. This assumes direct access to the camera, which it must have for the live stream of two images per second it uses for visual referencing.
Regarding assisting with employability, I really don't know. I think it's a rapidly shifting landscape as it is, with AI in the mix. I'd very much like to think it could allow us more rapid access to work materials without specialised equipment, though what jobs there will be, I don't know. Also, there are privacy concerns. Some businesses may disallow the use of AI scanning for sensitive data. It's back to the whole issue of imaging and security for us: for most people, simply seeing something is fine because memory is fallible and can't be reproduced, whereas we need it captured in a far more permanent medium and processed by a third party.