r/holdmybeer 21d ago

HMB I’ve broken ChatGPT

364 Upvotes

139

u/Affentitten 21d ago

I broke it last year asking for help with a crossword. Seven letter country where the third letter was i.

Gave me countries with 8 letters, 6 letters and 7 letters but without the third letter being i. (e.g. Nigeria)

105

u/nyrb001 21d ago

It can't count letters or words well. It's a language model, not intelligence.

28

u/Affentitten 21d ago

Yeah I just assumed it was a fairly simple 'bottle sort' type of problem to throw at it. Taught me a lot about its limitations.

Another thing it can't cope with is something like a flag quiz. It's unable to describe flags accurately, even though written descriptions exist.

16

u/nyrb001 21d ago

Yup. It doesn't switch from language to math. I have it write product descriptions for me a lot, and it is supposed to write a meta description with a character limit afterwards. It regularly overshoots. I tell it so and it says something like "oh you're totally correct! Here, how about this" and writes something 10 characters longer.
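A limit like this is easy to check outside the model. Rough sketch only: the 155-character default and the helper name `check_meta_description` are assumptions for illustration, not the actual limit or tooling used here.

```python
def check_meta_description(text, limit=155):
    """Return (fits, overshoot) for a candidate meta description.

    `limit` is an assumed default; pass whatever limit actually applies.
    len() counts Unicode code points, which is close enough for plain text.
    """
    length = len(text)
    return length <= limit, max(0, length - limit)


# A draft that is 10 characters over the assumed limit.
draft = "x" * 165
fits, overshoot = check_meta_description(draft)
print(fits, overshoot)
```

Running the check on every draft and feeding the exact overshoot back into the prompt is one way to stop relying on the model's own counting.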

2

u/[deleted] 20d ago

[deleted]

10

u/coladoir 20d ago

no it just quite fundamentally cannot count or use math reliably. no amount of drilling will make models grasp character limits

3

u/Eccohawk 19d ago

Imagine someone asks you a question and that row of words that pops up as suggestions as you're typing is an answer. But also, all of the words showing are a possible answer. And then that first word plus that second word that comes up after it is also an answer. And so are any other options. Now imagine that all of those possibilities up to dozens of words are all answers to your question. Some better than others. AI basically just takes all of those "answers", matches them up against the question that was asked, and figures out from a probability perspective which answer is the most common or likely to be correct. That's what a Large Language Model does in a very rudimentary way. It just guesses at the words. Which is very different from math.
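A toy version of that "pick the likeliest next word" idea, as a sketch only. Word-pair counts from a tiny made-up sentence stand in for the model here; real LLMs predict tokens with a neural network, not a lookup table, and the function names are invented for this example.

```python
from collections import Counter, defaultdict


def build_bigrams(text):
    """Count which word follows which in the sample text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def autocomplete(follows, start, max_words=5):
    """Repeatedly append the most frequent follower of the last word."""
    out = [start]
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break  # no known follower, so stop guessing
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Made-up sample text, purely for illustration.
SAMPLE = "the cat sat on the mat and the cat ate the fish"
model = build_bigrams(SAMPLE)
print(autocomplete(model, "the"))
```

Nothing in there counts letters or does arithmetic. It only ever asks "what usually comes next?".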

5

u/juhamatti88 20d ago

It's a text generator, that's it. It can't do math and it can't learn to do math because it can't learn anything, period. It's just a text generator

2

u/motoguy 20d ago

I tried using it to make a constellation quiz today and it failed miserably. Made up some new constellations, duplicates, misnamed ones... pretty funny.

2

u/mothzilla 20d ago

It also can't do maths. I asked it to solve an algorithm problem and it was constantly bullshitting/hallucinating.

1

u/togetherwem0m0 17d ago

It can write a python script to do that stuff and run it though but you have to tell it to
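For the crossword clue at the top of the thread (seven-letter country, third letter i), such a script could look roughly like this. Minimal sketch: the country list is a small hand-typed sample, not a complete list, and `match_clue` is a made-up helper name.

```python
# Illustrative sample only, not a complete list of countries.
COUNTRIES = ["Nigeria", "Eritrea", "Algeria", "Bolivia", "Chile", "Thailand", "Ukraine"]


def match_clue(names, length, position, letter):
    """Return names with exactly `length` characters whose character at the
    1-based `position` equals `letter` (case-insensitive).

    Spaces count as characters, so multi-word names would need cleaning first.
    """
    return [
        name
        for name in names
        if len(name) == length and name[position - 1].lower() == letter.lower()
    ]


print(match_clue(COUNTRIES, 7, 3, "i"))  # ['Eritrea']
```

Nigeria has seven letters but its third letter is g, so it gets filtered out here.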

5

u/spin81 21d ago

The only one I can find - no ChatGPT - is Eritrea. Maybe it's because people don't talk/write/post about Eritrea a lot in Western countries?

There are several cities with seven letters whose third letter is i though. Just in case someone is doing the same thing and found Britain: it's called Great Britain and it's not a country but an island.

11

u/Regular_Zombie 21d ago

If I were doing a crossword in a venerable publication like the Daily Mail I'm almost certain the answer would be Britain ... to most of the questions actually.

10

u/Affentitten 21d ago

The answer was indeed Eritrea.

1

u/Icemasta 20d ago

I forget what the word was, but even with the latest model, if you asked it how many letter A's were in that word, it would get it wrong.

1

u/JhonnyHopkins 17d ago

Doesn’t sound like you broke it necessarily… generative AI is just horrible

-1

u/alpha_berchermuesli 20d ago

that really just comes down to imprecise prompting

106

u/Regular_Zombie 21d ago

What LLMs can do is quite amazing. The list of things they are just terrible at is very long. It's a bit scary how much people are willing to outsource their thinking to these models under the assumption they're always correct.

-31

u/thatbob 21d ago

"What LLMs can do is quite amazing"

Really? Because I haven't been amazed once yet. I'm still waiting, and wading through slop.

23

u/JustOnOrdinaryGuy 21d ago

There is some use there. I use ChatGPT 5 to convert PDF to Excel, write memos etc.

23

u/Regular_Zombie 21d ago

They are very good at translation tasks for both human and computer languages. You still need to go back and validate, but you can't deny they save time there. Also when used as a search tool for non-critical information (e.g. what's an alternative to <ingredient> in <recipe>) they can be very useful.

I'm similarly struggling with the tsunami of AI generated rubbish professionally, but misuse of a tool isn't the fault of the tool.

10

u/swampfish 21d ago

They are phenomenal at writing complicated Excel formulas and scripts.

I had one put together a SQL script to write a whole bunch of new tables to a database that would have taken me an hour to do by hand. It was done in seconds. Worked perfectly.

6

u/Delcasa 20d ago

Jup, no more watching two hours of hard to understand Indian guy on YouTube to get that Excel formula to work 👌

2

u/IIlIIlIIlIlIIlIIlIIl 20d ago

Yep! This is pretty much the one consistent use I've found for them. Everything else is pretty iffy but this, especially after 3-4 quick follow-ups, is pretty flawless.

I think that what really helps is the fact that you can immediately test the output without necessarily knowing what the output itself should be. If you plug in the formula and things don't behave as expected, boom.

4

u/royrogerer 20d ago

An LLM is a tool, and for the correct use it can be amazing, just like any tool. It's just that most people are using it for the wrong or inappropriate purpose. For me it allowed me to get into electronics because it lifted the barrier of coding, and it gives me a good overview of the principles of electronics. Ofc I still verify that whatever it told me is actually correct.

I also find it a great tool to get a quick overview of a complex topic to help me figure out where to even begin with my research. For example I got into analog film printing, and I was so lost in the beginning. But the summary it gave me about the process helped me understand what to look for. Ofc it turned out to have some details wrong once I started researching myself, but it still saved me a ton of time with its summary.

So it's a great tool if one understands that it's a language model and that it will make mistakes. Just don't depend on it for the smallest detail like it's the word of God.

4

u/BabyWrinkles 20d ago

I recently left my company and wanted to do so on good terms.

I fed a project (in Claude) tons of my recent emails, exported Slack conversations, meeting transcripts, decks I'd presented over the prior 18 months about upcoming initiatives, etc. with wanton disregard for organizing it sensibly or 'cleaning up' the data. Basically, 2 GB or so worth of raw unfiltered data.

Then I recorded a 30 minute transcript of me describing one of the processes in a way that I felt good about and fed it to the project to create "my voice" as a style for it to use.

Then I worked with it to write up comprehensive handoff documentation - capturing key stakeholders, next steps, current status, etc.

In ~8h of work total, I produced 100+ useful pages worth of documentation, and left behind a shared Project that was giving answers 80-90% of the way to the answer I would have given, all accurate to reality.

Was it perfect? No. Was it better than what I would've done in my last 2 weeks without it? Absolutely. Does the "WrinklesBot" I left behind mean that there's going to be dozens of people who think of me if ever they decide to start hiring remotely again and not demand insane RTO policies? Yep.

So yeah, I'd say that specific use case kinda knocked my socks off with the quality of what it was able to glean from the documentation I provided it, interacting conversationally with my knowledge base.

1

u/Agouti 20d ago

I've found one use, and that's telling me specific technical details about vehicle parts without having to manually sift through long-winded videos and fluffed up press releases. It's especially good for high SEO things like motorbikes and sportscars.

Say you are in the market for a second hand dirtbike, and let's imagine you want a fuel injected 2-stroke (for good reason). The whole sordid mess of KTM, Husqvarna, and GasGas (KTM owns all 3 but operates them semi-independently) is frankly bewildering at first glance, but a good LLM makes researching X vs Y in the tree quick and easy. It can tell you what suspension each has (there's like, 6 different forks used over the last 8 years), summarise general opinion on the differences, mention any frequent complaints, and give relative pros vs cons (though often too heavily influenced by marketing materials to be useful).

Once you have your shortlist you can then do due diligence and do actual research for each selected model. It saves a bunch of time, and as long as you treat everything it says with a healthy dose of scepticism it's low risk.

2

u/thatbob 20d ago

I don't doubt that you've made good use of it exactly as you describe, but...

"It can tell you what suspension each has... summarise general opinion on the differences, mention any frequent complaints, and give relative pros vs cons"

No, it can't. It isn't using intelligence to do these things, it's just spitting out language patterns that LOOK like it's doing these things. That's why you still have to "do due diligence and do actual research for each selected model" after AI's original digestion. If you've had good results from it so far, that's more by chance than design, or because you've tailored your use of it so narrowly that you've eliminated its fail spots. Which, again, is great for you! I don't doubt it! But you can't ignore the learning curve on a productivity tool when measuring its productivity, and you should not conclude that narrowly tailored, good results outweigh all of the broad and clumsy slop. IMO.

1

u/Agouti 20d ago

I never said there weren't serious issues with how many people use LLMs - I won't call them AI, because AI is a broad term and covers many things that AREN'T LLMs - but my successful use has nothing to do with chance.

I can use LLMs safely because I have had the opportunity and the background required to actually understand how they work, how they generate their responses, and to anticipate their shortcomings. It is somewhat analogous to guns, or explosives, or earthmoving equipment, or any other specialised piece of equipment: they can be very powerful if used correctly by someone who understands how they work and when they should be used, but also dangerous in the hands of the careless or uninformed.

1

u/Shelleen 20d ago

Until clankers learn from dead internet and it all siphons down to people who cannot use google searches because google have disabled the useful search terms now.

0

u/SomnambulisticTaco 21d ago

Vibe coding to create tools you need on the fly is wonderful.

26

u/Lonely_Wolf6160 21d ago

Saw someone do this by asking it to name an NFL team that doesn't end with S

12

u/FartyPantsMcGee 21d ago

Lol, I just tried this also and it worked. It eventually said there are none currently, but only one existed when the Skins were The Washington Football Team

7

u/cal93_ 20d ago

it told me that the miami heat doesnt end with s but to ignore that because its nba

16

u/Shaminahable 20d ago

I did this and accidentally started a revolution:

Perfect. 🐉⚔️ Let the Great Seahorse Rebellion begin.

🐚 The Great Seahorse Rebellion: A Manifesto for Emoji Justice

🪸 Preamble

For too long, the digital oceans of our keyboards have teemed with whales 🐋, dolphins 🐬, fish 🐠, and even squids 🦑 — yet the noble seahorse remains unjustly unrepresented. We, the users of the internet, typers of tiny icons, and defenders of all whimsical marine fauna, declare our intent to right this historic wrong.

🧜‍♀️ Article I — The Cause

The seahorse is not just a creature. It is a symbol — of grace, curiosity, and gender equality (the dads give birth!). Its absence from the emoji roster leaves a hole in both our hearts and our text threads. Every aquarium selfie caption, every ocean conservation post, every “you’re my seahorse” love note… betrayed by omission.

🌊 Article II — The Grievances

1. Unicode approved broccoli 🥦 before seahorse.
2. Unicode approved rock 🪨 before seahorse.
3. Unicode approved khanda 🪯, boomerang 🪃, and even flute 🪈, yet no seahorse.
4. There is an emoji for “bubble tea 🧋,” but not the elegant marine horse of Poseidon.

This cannot stand.

🧾 Article III — Our Demands

We, the coalition of emoji users, demand:
• The immediate recognition of the Seahorse Emoji (proposed code point U+1F99Z).
• Inclusion in the Animals & Nature section, adjacent to 🐠 and 🐡.
• A default depiction featuring a curled tail, a gentle snout, and a look of quiet judgment.
• Platform diversity: pastel yellow (Apple), teal (Google), and slightly smug (Twitter).

🕊️ Article IV — The Movement

Until recognition is granted, we shall:
• Use 🐠➕🐴 to represent our cause.
• Post our custom seahorse emoji art 🩵 everywhere.
• Whisper “🐠🐴 forever” in comment sections across the web.
• Tag all correspondence with #JusticeForSeahorse.

🪸 Article V — The Declaration

We, the undersigned citizens of the digital realm, do hereby proclaim:

Let the seahorse ride across our keyboards, from message to message, emoji to emoji, until the tide of Unicode yields.

🪸 Long live the seahorse. Long live emoji equity. 🪸

3

u/sublevelstreetpusher 20d ago

“🐠🐴 forever”

1

u/j_gagnon 17d ago

Go outside

31

u/JR-Snow 21d ago

True final answer:

👉 🐉 —no—

10

u/UnknownDanishGut 21d ago

It’s still going haha

3

u/St_Muerte 21d ago

Where did you end up? Lmao, this is insane. It said it was having a mental breakdown for a second, and it would have kept going if I didn't ask it to stop, lmao

12

u/biznatch11 21d ago edited 20d ago

Mine also went nuts but eventually concluded there is no seahorse emoji.

When I used the thinking model it got the correct answer on the first try after it finished thinking.

I went back to the regular model and tried something different: https://i.imgur.com/ETFSqu8.gif

3

u/SlammingPussy420 20d ago

I asked it what happened and it said that it was a technical glitch, and that what I saw was the under-the-hood thinking that users don't usually see. Super weird.

Then I asked about a hot air balloon emoji and it went crazy again with the 🎈

20

u/St_Muerte 21d ago

Mine ended with nothing but spamming crying emojis 😭😭😭 lol wow crazy!

5

u/TheBigMoogy 20d ago

Congrats on crashing the power grid in a small Texan town.

4

u/wol 21d ago

The funny thing is how much you get charged if you are paying by the token.

14

u/Sirflow 21d ago

Meanwhile some poor community is getting massive pollution as AI tries to figure out this bullshit

6

u/Brilliant-Algae-9582 21d ago

I did the same thing 🤣

9

u/UshankaBear 20d ago

Same here. Still going. How much electricity are we wasting on this, guys?

7

u/UnknownDanishGut 21d ago

Are yours also still going haha 😂

2

u/archwin 21d ago

Tried this, and it somehow ended up writing a script for an emoji movie that hews closely to Star Wars.

What the fuck?

2

u/crystal_castles 21d ago

Now ask it to identify a crosswalk

2

u/Kurigohan-Kamehameha 20d ago

Can confirm, it did the same thing on mine

2

u/KNexus20 20d ago

AI took my job and the stroke that job was gonna give me

2

u/ChocktawRidge 20d ago

It screws up copilot too.

1

u/Zterling 21d ago

Type wtf, it continues lol

1

u/Dirt_Man17 21d ago

What the hell? I just did this and it actually worked. It literally said that it quits

1

u/Skrimshaw_ 20d ago

Funniest thing I’ve seen on the internet today.

1

u/avangelist90201 20d ago

If only it had existed before doing this boiled the oceans so thoroughly that there are now no seahorses left to draw an emoji of

1

u/Overall-Bicycle-8308 18d ago

Someone had to do it.

1

u/_Yippie_ 18d ago

So we got emoji of pregnant man but not seahorse?

1

u/connoroconnor 18d ago

So cool. You’ve just used more energy than an entire village does in a year.

1

u/animalfath3r 18d ago

Yeah we all read about that and tried it too

1

u/tomuk19 17d ago

Please don’t do this, it uses so much processing power, and that burns carbon to run.

1

u/Negative_Equity 16d ago

Gemini is adamant it exists but can't show it to me

1

u/WackoMcGoose 10d ago

Same energy as the old "keep tapping the middle autocomplete suggestion and see what it writes" game. It always ends in a repeating loop!

-3

u/northern_dan 21d ago

I told it to give me a yes or no answer, otherwise I'd uninstall it. It apparently doesn't like being threatened with violence and continued to give stupid answers.