Note to reader: if the idea of “AI alignment” rings empty to you, feel free to skip this one; it will be uninteresting. Recently, Gwern wrote a story about an AI taking over the world. While well thought out and amusing, it is unrealistic. However, people have been using it to reinforce their fear of “unaligned AGI killing all humans,” so I think it’s dangerous, and it might be worth looking at it line by line to see why its premise is silly, and why each step in his reasoning, taken individually, is impossible.
The argument “an AI couldn’t do X, because I am ‘oh so close’ to doing X myself” doesn’t convince me. I suspect Newton would have invented relativity if he had been a bit smarter. A small difference in intelligence can lead to large differences in conceptual framework.
I think the differences in intelligence between humans are like the differences in running speed between cheetahs, compared to the speed of light. The differences between humans are fairly small, so we don’t see humans with magic godlike powers (we appear pretty magical to other animals, but we don’t usually count that). The theoretical limits are really high.
"Assuming we live in a world where a “generic” machine learning model can figure out new exploits and infect hardware, we also live in a world where thousands of “purpose-specific” machine learning models have figured out those same tricks long ago."
This doesn’t really follow. Sometimes being purpose-specific gives a large advantage; sometimes it doesn’t. The best code-writing AIs and the best fiction-writing AIs are basically the same large language models.
In this hypothesis, the genetic algorithm has found something better than conventional ML. And the new AI doesn’t need to beat the pants off specialized algorithms; it just needs to be able to compete with them somewhat.
Most of this argument amounts to “if X were possible, someone would have done it already.” This doesn’t really hold. There can be, and probably are, bugs that existing ML and humans would have great difficulty finding, and that a new, better ML algorithm could find with ease. Some ML tricks are general-purpose: once you find a good general-purpose trick, performance goes up on many tasks at once.
I do believe AI is a potential threat, but it is often overhyped.
Awesome post, George. You hit on all the issues I have with the AI risk debates: it’s a real thing, and there are definitely things we can do, but the risks are more on the level of the Therac-25 — people can be hurt, killed, traumatized, or financially impacted, but it won’t “kill everybody and turn the solar system into paperclips.”
In my own thinking about problem solving — coming in essays on my Substack real soon, I promise — I think what is happening here is the invention of problems. Invention is distinct from discovery. If you create a fun game, like Dwarf Fortress or Cataclysm: Dark Days Ahead, you have invented a problem that did not previously exist, one that people are free to try to solve or not at their leisure, and only as long as it is fun. That’s honest problem creation.
Dishonest problem creation is the creation of problems that are not real, and thus not, properly speaking, solvable. In philosophy, I came to believe that the subject was riven with Wittgensteinian fly-stuck-in-a-bottle “problems,” based on conceptual mistakes made several concepts back from where the discussion was taking place. The rewards in philosophy — publications, peer esteem, grad students — accrue to the inventors of problems who keep the field stimulated, keep special issues of journals being published, and so on.
Think of all the ink spilled over Marxism, Freudianism, etcetera for examples of what I am talking about outside of philosophy proper.
For atheists, consider the Christian conflicts over the idea of predestination, the real presence of Christ in the eucharistic bread and wine, etcetera.
This problem invention does not have to be consciously dishonest, but I think self-deception is often involved. Ask yourself: what would [currently prominent AGI risk thinker] be doing if the field did not exist? Probably nothing with anywhere near the fame, peer esteem, and job prospects they presently enjoy — instead working anonymously as a (well-paid but not particularly esteemed) software developer. It makes sense that they would persist in a self-serving delusion about the powers of Intelligence with a capital I, discount adversarial conflict in constraining growth, and the like.
Love that you wrote this. The first points here were sitting in a draft of mine on the AI risk debate, but now I can safely just link here instead.