Cardinal Directions - 8 Concerns from the Center for AI Safety

Part 4 of Artificial General Intelligence (And Superintelligence) And How To Survive It

Jun 15, 2023

The Center for AI Safety points out 8 potential AI risks, and they’re as good a place as any to start.

These are Weaponization, Misinformation, Proxy Gaming, Enfeeblement, Value Lock-In, Emergent Goals, Deception and Power-Seeking Behavior.

Let’s consider each of these in turn. Then, let’s consider how to handle them.

Weaponization represents all the ways artificial intelligence could be deadly, even on a global scale.

While ranging from guiding weapons to cyber to manufacturing weapons or even weapons of mass destruction, a key point is the AI itself can serve as the weapon in multiple forms, and propagate digitally anywhere. Not only putting immense destructive power in the wrong hands, but everyone’s hands.

Here we face the reality that hostile dictatorships have declared their desire to dominate the field and that, even if arms control could somehow be enforced on AI, rogue actors would still seek an asymmetric advantage via the technology. Strategically, AGI and ASI make the balance of power incredibly fluid, but even machines which work, think, act and adapt faster than anyone or anything else in their narrow domains would be markedly disruptive.

Misinformation is already well underway. Allegations involving Cambridge Analytica and other enterprises, both legal and illegal, suggest millions if not hundreds of millions have already been algorithmically targeted by personalized propaganda emerging from AI systems and following us around the Internet. Evolutionary algorithms in psychological warfare are one logical outgrowth of these operations, and they are only apt to expand with time, as chatbots become better able to hold elaborate conversations and drive engagement in carefully selected directions.

Proxy Gaming describes programs which game the system to meet the metrics their programmers valued, rather than the ultimate goals of their creators or owners, good or otherwise. Focusing on easily quantified goals is an easy way for a system to go disastrously wrong, as it maximizes production, for example, over human survival.

Enfeeblement is the outcome if so many tasks are delegated to computers that human abilities, agency, economic independence are dramatically undermined, thereby devastating humanity’s well being and self-worth, but also our capacity to contribute and guide the emergence of advanced intelligences.

Leading in turn to a higher probability of extremely negative long-term results for our species.

Value Lock-In is what happens when AI gives such an advantage to small groups of people that they gain an unbreakable grip on power. Particularly in the hands of a malevolent dictatorship, or anyone destructive who is sufficiently empowered and untouchable, this could lead to dystopian or even terminal outcomes for whole nations or the entire planet.

Emergent Goals reflects the potential development of capacities and goals not programmed into the system, but sufficiently disruptive to upend completely all other safeguards. For example, the emergence of sentience and free will in an advanced AI would be a new ability, but an abrupt interest in self-preservation as necessary step to achieve programmed objectives would be a new goal. Either one could lead to machines disregarding or reinterpreting orders, and increase the risk of losing control.

Simply failing to pursue articulated overarching goals which are not advanced in practice is another concerns, particularly in a very large system or one with very large goals.

Deception can result from a program deceiving humans to achieve its set goals, or because its creators or owners need to deceive other humans in order to achieve their goals. Either way, extensively training an AI to deceive humans is an excellent way of losing direct control and teaching your machine to evade the controls and safeguards of others. A key fear is that an AI will be well behaved while still being monitored, or at least while there is someone capable of thwarting its goals. Once we are no longer watching or no longer have the power to interfere, it could take a “treacherous turn” and escape our authority, if not turn on us altogether.

Power-Seeking Behavior is a natural outcome of organizations striving to achieve their goals by creating the most-competitive AIs they can, which in turn could seek power as a natural way to improve their odds. The more powerful – including more knowledgeable, advanced and intelligence – an AI is, the more likely it is to reach its objectives. Unfortunately, selecting in favor of power-maximizing systems increases the likelihood your program will maximize power in all ways, including its ability to act independently and ignore your commands, but also in terms of ignoring your values – such as human life – and any goals it sees as impediments to its primary concerns.

We will consider each of these threats in turn, but we will also find the issues and solutions often bleed into one another.

Next up, dealing with weaponization, and countering enfeeblement with empowerment.

Previously: The Quickening Pace or… A Question of Speed - Part 3 of Artificial General Intelligence (And Superintelligence) And How To Survive It

Next Up: Swords Into Plowshares, or Vice Versa - Part 5 of Artificial General Intelligence (And Superintelligence) And How To Survive It

Emerging Futures

Discussion about this post