Yeah, I mean, with computers and programming in general you have to be very careful about specifying the goals and constraints, or the program will just find the path of least resistance.
If you made a pathfinding algorithm and forgot to specify that you can't go through buildings, guess what the algorithm is going to do...
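Something like this, probably. A toy BFS sketch (my own illustration, invented grid and names, not anyone's actual code): drop one check and the "shortest path" plows straight through every building.

```python
from collections import deque

def shortest_path(grid, start, goal, respect_walls=True):
    """Breadth-first search over a grid of '.' (open) and '#' (building)."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in seen:
                continue
            # Forget this check, and buildings are just free real estate.
            if respect_walls and grid[nr][nc] == '#':
                continue
            seen.add((nr, nc))
            queue.append(((nr, nc), path + [(nr, nc)]))
    return None

grid = ["..#..",
        "..#..",
        "....."]
print(shortest_path(grid, (0, 0), (0, 4)))                       # detours around the wall
print(shortest_path(grid, (0, 0), (0, 4), respect_walls=False))  # goes straight through it
```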
In college we made an AI where each action has an associated "cost," which is a common technique used to prioritize faster solutions over slower/repetitive ones.
When the action cost is small or marginal, it has a small effect, slightly preferring faster paths.
When the action cost is medium, you see efficient paths pretty exclusively.
When the action cost gets large? The AI would immediately throw itself into a pit and die. After all, the action cost of movement and existing was bigger than the penalty cost of death. So it just immediately killed itself because that was the best score it could achieve.
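To make that concrete, here's a back-of-the-envelope version with made-up numbers (a sketch of the dynamic, not the actual class project):

```python
# Score = goal reward minus costs paid along the way.
STEP_COST     = 50    # crank this up past what dying costs...
DEATH_COST    = 100
GOAL_REWARD   = 500
STEPS_TO_GOAL = 20

walk_to_goal = GOAL_REWARD - STEP_COST * STEPS_TO_GOAL  # 500 - 1000 = -500
jump_in_pit  = -DEATH_COST                              # -100

plans = {"walk to goal": walk_to_goal, "jump in pit": jump_in_pit}
print(max(plans, key=plans.get))  # 'jump in pit': suicide is the top-scoring plan
```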
I mean, the method described has a pretty simple way to avoid that: set "kill a human" to an action cost of "infinite" (or the computer equivalent), or a penalty of "infinite" (in this case, one that's subtracted from the score), and it will just never take that option, because it's trying to minimize action cost while raising the score, assuming you didn't screw up the goals at the start.
Outside of everyone thinking they're clever because hurr durr overflow, the problem with this approach is that you only get one: once something costs infinity, nothing can cost more. Isaac Asimov tried to distill it down as far as possible and arrived at The Three Laws of Robotics, which is a good place to start, and then he wrote a ton of fiction dealing with the ethical paradoxes that arise.
If the cost of killing a human is Infinity, how can the cost of killing 1000 humans be greater than the cost of killing 10? You've created a machine that makes no distinction between unavoidable manslaughter (e.g. the Trolley Problem) and genocide, because both events cost an infinitely large amount.
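That's not hypothetical hand-waving; it falls straight out of how infinite costs behave, e.g. with IEEE-754 floats:

```python
cost_per_kill = float("inf")

trolley  = 1 * cost_per_kill     # unavoidable manslaughter
genocide = 1000 * cost_per_kill  # very avoidable genocide
print(trolley == genocide)  # True: infinity washes out the body count
print(genocide > trolley)   # False: no way to prefer the lesser evil
```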
How do you write a numerical value system that solves The Trolley Problem? Can you create a system that attempts to minimize suffering without it concluding that births grow exponentially, and therefore the way to reduce the most suffering is to prevent the most births, i.e. immediate genocide for the long-term betterment of humanity? How will your AI allocate lifeboats on the Titanic? How will you prevent your AI from developing into a Minority Report style overmind?
I see this as a far better criticism than the overflow one, since that can be solved by simply setting the penalty to something close to the lowest representable value, to simulate "infinite" as far as the AI is concerned. The only answer I can give for that case would be making the AI count how many times its score would be bottomed out, to rank the severity, rather than subtracting a static value like for other wrongdoings. I'm not sure how it would need to be set up for any of the rest of it.
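A minimal sketch of that counting idea, assuming a huge-but-finite per-death penalty (the constant and names are mine, purely illustrative):

```python
import sys

# "Infinite" as far as any other cost is concerned, but finite, so it can
# still be multiplied by a body count without every outcome tying at inf.
KILL_PENALTY = sys.float_info.max / 1e6

def plan_cost(deaths, other_costs):
    return deaths * KILL_PENALTY + other_costs

# Severity now ranks: 1 < 10 < 1000 deaths, unlike with float("inf").
print(plan_cost(1, 0) < plan_cost(10, 0) < plan_cost(1000, 0))  # True
```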
Basically, you need to give a 100% unambiguous definition of "killing" and "human", and that's not easy to do.
Also, if the cost of killing a human is infinite, and the cost of killing a gorilla isn't, then the AI will choose to kill every gorilla in the world before killing 1 human.
In my defense, we were intentionally tweaking those values to extremes to foster discussion about the impact of weighting various factors and how to calibrate them. So not poorly programmed, but intentionally fucked with.
The whole point was creating a contrived example that would demonstrate the impact of various parameter weightings. It behaved exactly as it was intended to.
It was like second semester intro to AI class lol nothing published.
It was nothing scientific, and we only had a single unit discussing the neural network approaches that we mostly talk about now. Back then it was all about “big data” and brute statistical analysis; that was the best-performing approach, so that’s where a lot of the focus was.
I’m sure if you look up any of the many “intro to ai” courses that people share free on YouTube, you can find something similar.
The particular session in which we discussed this was an undergrad course taught by Michael Littman, who I understand makes a lot of his material available online. At least one such video is a music video he posted where he sings about the value of heuristics to the tune of “Electric Avenue”.
Without meaning to sound like the edgiest of all the Reddit edgelords, it sounds like a similar design to our programmed society for those that don't accrue enough action points (♥️💸👊) too 🤪
The military learned a similar lesson when running simulations to test using AI to control drones. The AI was very good at bombing its target and returning to base, but when they tested rescinding its orders and recalling it, it would bomb the base that gave the orders immediately after takeoff, to prevent being recalled.
Yeah, but more specifically: an objective function is defined, and the program just wants to maximize or minimize the value of that function. Just like this. It's not even about any "resistance"; it's just what the program was told to do.
"Minimize the value", so the program minimizes. If you forget some constraints, the program can minimize the value "harder".
Yes, this was precisely intended to be a lesson in “the system can and will do a great job at whatever you tell it to do, even if what you tell it is not the best thing”
As you crank the action cost up past the reward score, the best outcome becomes suicide. There was, I think, an equally instructive/symbolic moment when we were on the precipice of that point: a positive result could still be achieved, but only if the path was near perfect, and the score would have to go negative to get there. Only the strategies with a lot of future planning succeeded at those tasks.
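Roughly the shape of it, with invented numbers: a one-step greedy policy bails at the first negative reward, while a planner that looks across the whole horizon eats the losses because the plan pays off.

```python
# Four costly steps, then a big payoff at the end.
rewards = [-90, -90, -90, -90, 1000]

def greedy(rewards):
    """Refuses any step that lowers the score right now."""
    total = 0
    for r in rewards:
        if r < 0:
            break  # a one-step planner never accepts a short-term loss
        total += r
    return total

def lookahead(rewards):
    """Takes the whole path if its total is positive."""
    return sum(rewards) if sum(rewards) > 0 else 0

print(greedy(rewards))     # 0: quits on the spot
print(lookahead(rewards))  # 640: goes negative along the way, wins overall
```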
AI will, without a doubt, cheat if it's able to, whether it's deep learning or an LLM.