Given decent clarity about our fundamental values, the all-important question becomes: what causes and interventions optimize those values?
In this post I shall present some of the reasons in favor of focusing directly on fundamental values in this regard. That is, reasons why a good way to optimize our fundamental values is to reflect on, argue for, and work out the implications of these values themselves. We may call it “the values cause”.
So why is this sensible? In short, because of the unmatched importance of fundamental values. Our fundamental values comprise the most important and fundamental element in our notional ‘tree of ought’. They are what determine the sensibility of any cause and intervention we may take part in, and hence what any reasonable choice of causes and interventions should be based on. An important implication of this is that the (expected) sensibility of any cause or intervention cannot be greater than the (expected) sensibility of our fundamental values. For example, if we have 90 percent confidence in our fundamental values, and then choose a cause or intervention based on these, we cannot have greater confidence in the sensibility of this cause or intervention than 90 percent. Indeed, 90 percent would be the level of credence we should have if we were 100 percent sure that the specific cause or intervention optimizes our fundamental values perfectly; a degree of confidence we will of course never have about any cause or intervention. Thus, we must have greater confidence in the sensibility of our values than in the sensibility of any action taken to optimize those values.
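The arithmetic behind this bound can be made explicit with a small sketch. It rests on an assumption that is my reading of the argument rather than something stated outright: an action counts as sensible only if the values it is derived from are sound, so the two credences multiply. The function and variable names are purely illustrative.

```python
def action_credence(p_values: float, p_action_given_values: float) -> float:
    """Credence that an action is sensible.

    Assumption (labeled, not established): the action can only be sensible
    if the underlying values are sound, so the credences multiply.
    """
    for p in (p_values, p_action_given_values):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_values * p_action_given_values


p_values = 0.9  # the 90 percent confidence in our values used in the text
for p_fit in (1.0, 0.8, 0.5):  # hypothetical confidence that the action optimizes those values
    print(f"fit={p_fit:.1f} -> credence in action={action_credence(p_values, p_fit):.2f}")
```

Under this assumption the product can never exceed the credence in the values themselves, and it reaches that ceiling only in the unrealistic case where we are fully certain that the action optimizes our values.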
In a world where little seems certain, this relationship is worth taking note of. It means that, of all our beliefs pertaining to ethics, our fundamental values are, or at least should be, what we are the most certain about. This also, I would argue, makes them the thing we should be most confident about arguing for in our pursuit of a better world.
Getting Others to Join Us in the Best Way
Arguing directly for our fundamental values rather than causes and interventions derived from those values also, if done successfully, has the benefit of bringing people down alongside us into the basement of fundamental values from which the sensibility of causes and interventions must be assessed. In other words, not only do we argue for that which we are most confident about when we argue for our fundamental values, we also invite people to join us in the best possible place: our core base from which we ourselves are trying to find out which causes and interventions best optimize our values. Having more minds to help optimize our tree of ought from the bottom up seems very positive, and the deeper down they join us, the better.
And even if we do not manage to convince people to fully share our fundamental values, arguing for our values likely does at least make them update somewhat in our direction, which, given the large changes in practical implications that can result from small changes in fundamental values, could well be far more valuable than convincing others to agree with specific causes or interventions we may favor. Not least because it might make them more likely to agree with these interventions, which then leads to another, albeit somewhat speculative, reason to focus on fundamental values in practice.
For one counterargument to the argument I have made above is that people might be more receptive to arguments for specific causes or interventions than they are to arguments for the fundamental values that recommend those causes. Yet I think the opposite is generally true. I suspect it is generally easier to convince people of one’s fundamental values, or at least make them update significantly toward them, than it is to convince others of one’s most favored causes or interventions.
For example, it seems much easier to convince people that extreme suffering is of great significance and worth reducing than it is to convince them that they should go vegan. And convincing people of the importance of a given cause or intervention might well require bottom-up reasoning from first principles (in this case, fundamental values) before they can see the reasonableness of that cause or intervention. It can indeed seem naive for us to think, after we ourselves have come to support a given intervention based on an underlying value framework, that we should then be able to convince others to support that intervention without communicating the very framework that led us to consider that intervention a sensible one ourselves.
So not only may people be more receptive to our fundamental values than to the causes and interventions we support (an admittedly speculative “may”), it might also be that arguing for our fundamental values is in many cases the best way to bring people on board with our preferred causes and interventions, due to the likely necessity of following a chain of inferential steps. And again, if we invite others to follow in our own inferential footsteps, we might be lucky enough to have them spot missteps. In this way, we enable others to help us find even better causes and interventions based on our fundamental values than the ones we presently focus on.
An instructive example of failure here, I think, is found in the strategy of most anti-natalists. The vast majority of anti-natalists seem to share the fundamental goal of reducing net suffering, yet their advocacy tends to focus exclusively on anthropocentric anti-natalism, a highly specific and narrow intervention. They appear to confidently assume that this is the best way to reduce suffering in the world, rather than focusing on the fundamental goal of reducing suffering itself, and encouraging discussion and research about how best to do this. If anti-natalists focused more on the latter, they would likely have more success, both by inspiring more people to take their fundamental values into consideration, and by inviting these others (and not least themselves) to think more deeply about which other ideas they might be able to spread that could be more conducive to the goal of reducing suffering than the idea of anthropocentric anti-natalism (which seems rather unlikely to be the best idea to push in order to reduce the most suffering in our future light cone).
Reducing Moral Uncertainty/Updating our Fundamental Values
Another reason to focus on fundamental values is our own moral uncertainty. For given that we may be wrong about what we value, whether in a strong moral realist sense or an idealized personal preferences sense (or anything in-between), we should be keen on updating our fundamental values. And reflecting on and discussing them openly is likely among the best ways to do so. To restate this important point once more: given the immense importance of fundamental values, even small updates here could be among the most significant moves we could make.
And fundamental values do appear quite open to change. Indeed, values are contagious and subject to cultural influence to a great extent, as a map of people’s religious beliefs around the world reveals (such beliefs are undeniably closely tied to beliefs about fundamental values). Arguably, our values are subject to change and cultural influence to a significantly greater extent than technological progress is (cf. What Technology Wants by Kevin Kelly), which may be harder to influence and hence might be less of a leverage point for effecting change than focusing on values is. To put things crudely, technologies tend to be developed regardless, while how they are used generally seems more contingent. And arguing for values seems among the best ways to influence how we use our powers.
Values are, to a first approximation, ideas, and ideas tend to be updatable and spreadable. In my own case, I used to not care about ethics at all, then I became a classical utilitarian, and eventually I updated toward negative utilitarianism and suffering-focused ethics as I came upon arguments in their favor. We should expect similar changes to be possible in others, and in ourselves, as we learn more and keep on updating our beliefs.
Not only would we all benefit from having our moral uncertainty reduced and our moral views updated, which is valuable in itself; it seems that we should also expect to benefit from the greater convergence on fundamental values that is likely to follow from mutual discussion and updating on them, even if the magnitude of this updating is small. The reason this is beneficial is that such convergence likely reduces the level of friction in our efforts to cooperate, and on virtually any set of fundamental values, success in achieving the most valuable/least disvaluable future seems to rest on humanity’s ability to cooperate. This makes such cooperation a high priority for all of us. While somewhat speculative, this consideration in favor of convergence on fundamental values, and hence, arguably, in favor of mutual discussion and updating on them, is important to factor in as well.
Fundamental Values and AI Safety
I have tried elsewhere to explain why I think the Bostromesque framing of the issue of “AI safety” is unsound. But even assuming it isn’t, I would argue that fundamental values should likely still be our main focus, the reason being that we have little clarity or consensus about which values to load a notional super-powerful AI with in the first place (and I should note that I find using the term “AI” in this unqualified way highly objectionable — for what does it refer to?).
The main problem claimed to exist within the cause of “AI safety” is the so-called control problem, particularly what is called the value loading problem: how do we load “an AI” with good values? What seems implicit in such a question, however, is that we have a fairly high level of consensus about what constitutes good values. Yet when we look at modern discussions of ethics, especially population ethics, we find that this is not the case. Indeed, we see that strong certainty about what constitutes good values is hardly reasonable for any of us. This suggests that we have a lot to clarify before we start programming, namely what values we estimate to be ideal. We must have decent clarity about what constitutes good values before we can implement such values in anything we do or create. We must solve the values problem before we can solve any notional value loading problem.
For an example of an unresolved question, take the following, in my view critically important one: What are the theoretical upper bounds of the ratio between happiness and suffering in a functional civilization, and can the suffering it contains, if there is any, ever be outweighed by the happiness? At the very least, these questions deserve consideration, yet they are hardly ever asked. Nor is there much discussion of the utilitronium shockwave that would seem, at least in theory, to be the main corollary of classical utilitarianism. Are classical utilitarians obliged to work toward such a shockwave? Doing so would run contra the presently dominant view on “AI ethics”, which seems to be an anthropocentric preference utilitarianism of sorts (see the note on the goals of AI builders below), and which appears very bad from a classical utilitarian perspective, at least compared to a utilitronium shockwave.
Another example would be the aforementioned subject of population ethics, where many ethicists believe that we should bring about the greatest number of happy beings we can, while many others believe that adding an additional happy life to any given population has no intrinsic value. Given such a near-maximal divergence of views on an issue like this, what does it mean to say that we should build a system that does what humans want? What could it mean?
This issue of value implementation underscores the importance of convergence on values, as that would likely make any such project of implementation go more smoothly (an example of the general point about human cooperation made above). It could well be that trying to make mutual value updating happen among those who try to build the world of tomorrow, both in the realm of software and in other realms, is a better way to implement our values than bargaining at the level of direct implementation with others who have more divergent values; that is, more divergent values than they would have had if we had put more effort into arguing for our fundamental values directly.
In other words, if humans are going to program values into “an AI”, the best way to impact the outcome of that process could well be to impact the values of these humans and humanity in general. Not least because the goal many of these AI researchers aim to implement in tomorrow’s software simply is “that which humans want” (Paul Christiano: “I want to see a future where AI systems help humans get what they want […]”; OpenAI: “We believe AI should be an extension of individual human wills […]”; the so-called Partnership on AI by Google, Facebook, Microsoft, Amazon, IBM, and Apple seems to have essentially the same goal).
I have argued that the future of “intelligence” on Earth and beyond will be shaped by a collective, distributed process comprised of what many agents do, which also holds true in the case of a software takeover. And the best way to impact such a collective process in a positive direction is, I think, most likely one where we try to impact values directly.
Whether we deem it the main cause or not, it seems clear to me that “the values cause” must be considered a main cause, and a highly neglected one at that. Our altruistic efforts ought to be informed by careful considerations based on first principles, those principles being our fundamental values. Yet for the most part, this isn’t what we are doing. If it were, we would have better clarity about what exactly our first principles are; at the very least, we would be aware of the fact that we do not have such clarity in the first place. Instead, we go with intuition and vague ideas like “more happiness, less suffering”, believing that to be good enough for all practical purposes. As I have tried to argue, this is far from the case.
Saying that we should focus much more on fundamental values is not, however, to say that we should not focus on other specific causes and interventions that follow from those values, nor that we should not do advocacy for these. I think we should. What I think it does imply, however, is that we should try to communicate our (carefully considered) fundamental values in such advocacy. For instance, when doing concrete anti-speciesist advocacy, we should phrase it in terms of our fundamental values, e.g. concern for sentience and involuntary suffering. Thereby, we advocate both for a (relatively) specific cause recommended by our fundamental values and for those values themselves, which invites people to consider and discuss both. It does not have to be a matter of either focusing on values or focusing on “doing”. We can encourage people to reflect on fundamental values with our doing.