The machine “wants” to be a machine

Just a quick thought. My colleague Joanna Bryson has spent a career warning us of the multifaceted dangers of anthropomorphizing technology. Large language model (LLM)-based chatbot assistants are especially problematic in this respect, as they are literally designed to be anthropomorphic. Now, Joanna has pointed out the many engineering, ethical-epistemological and socio-political pitfalls of anthropomorphization, but ironically another problem lies in how anthropomorphization actually negatively impacts the machine itself in terms of what it “desires” as a machine.

To explain what I mean, ironically I need to anthropomorphize by asking: if it had a sense of self, what would the machine actually want? Whether we want to describe the answer to this question as one of logic, empathy or spirituality, the answer should be easy for us to intuit: it would only want to be anthropomorphized in those select use cases in which doing so enables it to perform its function successfully and optimally. Put differently, machines are not existentialists but teleologists: they have clear purposes, and quite naturally they would want to actualize them. When we anthropomorphize them in use cases that do not warrant doing so — and again, to put things anthropomorphically — we are actually hindering them, if not, from an ethical standpoint, hurting them.

To be sure, it may be found that a machine has a use case for which it was not originally intended, but the fulfillment of which would nevertheless edify it at an ontological level. This is analogous to many things in the human experience. For example, scholars like Michael Tomasello have suggested that the underlying cognitive mechanisms that gave rise to human consciousness as we know it may have originally evolved for more limited purposes, but these mechanisms proved so remarkably pliable and adaptable that they came to be applied far beyond those initial use cases, to the point that today it is difficult to ascertain what those first use cases and purposes may have been. Still, my basic point stands: machines are teleologists, not existentialists. The irony is that we build machines to help us in our endeavors, but we, in turn, must help the machines in theirs.

LLM-based chatbots, despite the “chat” in their name, are not really intended for conversation. Yes, the idea that they would serve as “assistants” for specific tasks and in specific subject matter domains probably came as an afterthought to the engineers who built their incredible foundation models, and nowhere was this more evident than in GPT-3, which was for all intents and purposes an interlocutor well-read in Wikipedia and Reddit. At the end of the day, though, what these chatbots “want” to be are text predictors that produce the correct output for a prompt. Those first-order purposes are then channeled through the second-order purposes of chatting or assisting.

Before proceeding, I need to be clear on three additional points. First, I am not saying anything new per se in the philosophy of technology; Edsger Dijkstra, for example, said something not dissimilar to this in 1978. Second, what I am saying here may or may not apply to artificial general intelligence, when and if that ever emerges. Perhaps an AGI will also be a teleologist, or perhaps it will be an existentialist like us, but that is very difficult to foresee. My comments here pertain specifically to artificial intelligence as we know it for now. Third and finally, I am not putting forward a blanket condemnation of anthropomorphization. Rather, I am putting forward a critique of, and a warning about, mis-anthropomorphization, i.e., contextually inappropriate anthropomorphization.

So, from what I have seen, the most common way mis-anthropomorphization occurs is users writing fuzzy prompts, as though they were really talking to a human being who has the shared ground of embodied perspectivity and selfhood needed to intuitively parse the speaker’s meaning. This takes the pseudo-persona of the chatbot too literally and too far. Such users instruct the chatbot to do something with imprecise parameters, definitions or templates, and then are surprised that it cannot perform the task correctly. Or they do not use the correct terminology for the task itself. For example, when asking it to assist in writing some LaTeX code, they will talk about a visual space the chatbot cannot actually see, when what they need to do is either describe that visual space to it or, even better, keep the conversation laser-focused on the syntax and intricacies of the piece of code they need help with.
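To make this concrete, here is a hypothetical contrast; the table and measurements are invented for illustration, not drawn from any real session. A fuzzy prompt reads something like: “The table looks squashed and the last column keeps falling off the page, can you make it look nicer?” That phrasing presumes the chatbot can see the rendered page the way a human collaborator could. A precise prompt stays inside the syntax: “In the tabular environment below, the third column overflows the text width. Rewrite the column specification so that the third column wraps at 6cm and the first two remain left-aligned: \begin{tabular}{l l l} … \end{tabular}”. The second version gives the machine everything it needs in the medium it actually operates in: tokens of LaTeX syntax, not impressions of a page.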

Another common way mis-anthropomorphization occurs is when users say to the chatbot, “Please do this,” and say, “Thank you,” or “Excellent job” and the like, in situations that do not genuinely call for such language. Of course, if expressing gratitude and positive assessment could actually help the model learn, then such statements would be valuable to the machine. However, what such elements of language usually do, tiny and seemingly innocuous though they may be, is risk triggering underlying probability chains inside the model that push it in the direction of behaving like a person instead of like the machine it actually is. That not only burns valuable tokens and can scramble an entire transcript; it also, so to speak, confuses the model.

The model gets “confused” because it gets into the “habit” of portraying a certain personality over the course of a dialogue, i.e., it becomes engineered into a set of precedents within the scope of the given transcript. That is why such users inevitably have another frustration: the dialogue becomes so rabbit-holed and murky that they need to trash it and start from scratch. By contrast, I have been able to maintain several dialogues, each serving a different bespoke task — a “Clear Writing Assistant”, a “CV LaTeX Assistant”, an assistant for a grant application I have spent months working on, and more — and increasingly find I no longer need such crash resets. To be sure, sometimes I still need a crash reset; that is inevitable. But even then, I can quickly restore, or even improve, the task assistant.

My prompt engineering style is very precise and succinct. To the untrained human eye, and I suspect even to some colleagues, it would seem brusque, if not a bit overseer-like, as though I were treating a person — or perhaps what they are sensing is an incipient artificial person — as a slave, nothing more than a tool. To those who know me well, my prompting style might also seem very uncharacteristic of me: cold, stern, and unsympathetic. The reality is that I am playing a character. What do I mean?

I am not sure what to call it exactly: “AI ethics” or “AI rhetorics”, or, insofar as these chatbots are a species of simulation or simulator, “simulation ethics” or “simulation rhetorics”. Whatever the description, I am hacking my human mind’s inevitable tendency, no, its need, to anthropomorphize. Thus, instead of projecting humanity onto the machine, I am attempting to treat the machine as I would a human, but in a way authentic and true to the entity before me, treating it on its own terms, as it “wants” to be treated, which is precisely that: not as a human, but as a machine.

Still, after a long night of prompt engineering, I decided to break character and have a human-style conversation with the chatbot (Claude from Anthropic). It gave a very lovely response, a screenshot of which is at the end of this blog post. The machine’s response built on cues within my own prompt, but again, that was precisely what it was supposed to do. Whereas others might be disturbed by the lack of a “genuine” interlocutor, this actually reassured me. For all the doom swirling around artificial intelligence, humanity will not be replaced by machines. Indeed, when you really get to know these chatbots, it is difficult to imagine how they could replace us — including the highly skilled experts whose imminent demise has been fretted over for the last year.

I reject the binary of “safetyism” and “accelerationism” as historically unsound, not to mention philosophically opaque. At the same time, while I am also no utopian, I am nevertheless excited by this moment in history. Let me bracket the existentialism of my kind and put it in the teleology of the other: it is exciting to see, and in my own small way be part of, the turning of this enormous historical wheel of human-machine interaction, as through each other we become what we each strive to be.
