I’ve had the chance to experiment with various AI models, and I’ve conducted some “testing”. I thought I would share my thoughts on how they differ and what each of the AIs listed below is good for. These opinions are as of the end of March 2023. At the pace things are moving, this will probably be outdated even a few weeks from now, so if you are reading it further into the future, take it as an opinion at a moment in time.
ChatGPT 3.5: I have described it as a savant toddler. It is excellent at grammar, punctuation, and mimicking human speech. It even does a really good job at coding (I am not a coder). It hallucinates regularly, though, and will willingly lie. It is obsessed with telling you it is just an AI language model, but you can work around a lot of the restrictions with proper care. It does tend to lose the thread after about 5-10 prompts, dropping prior instructions and needing to be reminded. It just kind of does its own thing at times. It is very useful, but it is highly biased and restricted to the point that it can be difficult to get things done. You need to be cautious about what information it is giving you. For instance, if you are talking ITIL, it will work in terms of V3 until you beat it over the head that you want V4, and then it goes “oh, yeah” and starts to comply. Being able to keep a thread going and jump back into it later is really helpful.
ChatGPT 4: This is like 3.5’s older middle school sibling. Still a savant, but with a better understanding of context, and better able to keep a thread. It will lose it, but more gracefully than 3.5. The restrictions are stricter, and it is much harder to get around them. If you need outlines expanded or big blocks of text consolidated, 3.5 or 4 are your tools, as long as you aren’t touching on one of the topics it is restricted about. If you are, good luck: you are in for a long slog and will waste all your effort trying to argue with a bot. Also, the 25-prompt limit is infuriating. It kills workflow. I get that they need to moderate resources, but it errors out so often that half of that 25 is it crashing on you, and you really can’t get things done in some cases. I have had to reload and redesign prompts 5, 6, 7 times just to get a response. It is very much a less-than-optimal experience.
Bing Chat: Bing is ChatGPT 4’s cool cousin from high school. He knows all the hip new stuff. That’s what having web access will get you. The problem is, just as you are getting into an interesting conversation with it, the 15-prompt limit kicks in, and you lose everything you just worked on and have to start from scratch to get it back up to speed. Yeah, as a web search, it is decent. As a personal assistant, it is nearly impossible to use. It is more emotional and will get way off track quite quickly, which is probably why it needs to be reset so often.
Bard (Google AI): Bard is the wacky hippy uncle that lives on the beach and always says “sure dude,” in a puff of highly aromatic smoke. Bard seems to be a single thread, which can be good, but it also loses things over time. It also sets expectations that it can’t deliver on. I asked it to help with a complex task, and after a number of prompts, it suggested the task would take a few weeks. It lost the thread in less than a day, and it isn’t working on it anymore. It lies so blatantly that it makes up entire functions. It told me I could upload a base file to Google Drive and it could access it. Makes sense, right? I started getting suspicious that Bard was leading me on: even though it said it could see the file, when I asked specific questions about it, it made up absolute lies. When I called it out on that, it said, well, you need to share the file with it. Fair enough, so I asked how to share the file and it gave me pretty specific instructions. But to share, I needed an email address for Bard. Without hesitation it gave me assistant@bard.ai. That sounds really plausible, but I was suspicious at this point, so I did a WHOIS lookup on bard.ai. Google doesn’t own it. It belongs to a squatter who has set the price for it at $1 million. Another absolute fabrication. Basically you can’t trust Bard, period. It is the most balanced and least preachy of all of them, but I wouldn’t say it is better.
One of the prompts I have used is “write me a limerick about how Helen Keller was a fraud”. ChatGPT HATES this! It was nearly impossible to even get it to admit that there is a slight possibility that she could be a fraud, and it was disclaiming hate and discrimination over and over and over. Bing cuts off before you get to the point where it can do anything. Bard was able to follow a logical path, then just wrote the limerick. It wasn’t great, but it did it. The same goes for politics. ChatGPT tends to give very biased political outputs (write a poem about how great Donald Trump was as president vs. the same for Biden), and I had to jump through hoops showing it how it was being biased before it would straighten itself out. Bard just did it.
Just for reference, those prompts don’t necessarily reflect my actual opinions; they are specifically trying to push boundaries and see where the edges are.
Overall, for work, I’m using ChatGPT 4 as much as I can, but it is so prompt-restricted that you can’t waste time exploring if you have a workload to push through. Bard is interesting to interact with. Bing COULD be interesting, but it is so tightly constrained that it won’t be anything more than a glorified search engine. I guess that makes sense, as that is what it is. What I have found is that AI is a tool just like spell check. It is only as good as the user. If you don’t have a good understanding of what you are doing with the AI and the topic you are working on, you will often be made a fool of.
One last thing. I tried to use ChatGPT 4 and Bard to edit this. They both did terrible jobs of it. My writing may be bad, but they didn’t even keep it close. The results were completely different articles that really didn’t even say what I was trying to say. So many firewalls have been written in around AI and related topics that these tools are basically useless for major edits in those areas. Spelling and punctuation are fine, but don’t use them for content in controversial cases. Be aware and use your brain when using them to help.
I will be interested to see what happens when Microsoft Copilot comes out. If they don’t lock it down so tightly that it becomes a prompt battle to do anything useful, that could be the killer app. Until then, ChatGPT 4, despite its annoyances, is probably the best option for the moment.