How to be Self-Conscience: RLHF but for people
doubling back, and (dis)obeying urges with timeless decision theory
I feel like I’ve become more conscience lately, and this isn’t a typo or a meaningless pun. Some people might argue that to have a conscience you must first be conscious, but I think it’s the other way around. Having a conscience is a prerequisite for being fully conscious.
…first let me define my terms. And then I’ll get to some examples.
People have many different definitions of “conscious,” but in this post I’ll describe it as being aware of yourself and intentional with your actions.
People also have many different definitions of what it means to have a “conscience,” but I’m going to use this one: an inner feeling or voice acting as a moral guide.
Doubling Back
When I was a child, my parents would always leave our dishwasher a little bit open when it was off, and this irked me. So whenever I passed the dishwasher, I would close it. My parents would get mad, because I was typically pretty violent, kicking it closed or using my knee, and this had the potential for damage. But I never stopped; closing the dishwasher had become a habit deeply entrenched in my unconscious.
One day many years ago, I kicked it too hard, and we had to pay. My parents were furious. After that I closed the dishwasher more lightly, but the habit never left, and something about it being ajar still bothered me.
One day this June, I nudged the dishwasher in like always, but then I had a thought. A voice in my head. Not an inner monologue—an inner dialogue.
I know you’re a full five steps away from the dishwasher already, but would it really be such a waste of time if you walked back and opened it again?
But… that wouldn’t DO anything. It’s not as if the dishwasher is actually better off open than closed! The only problem is the actual closing of it, which I already did, so what’s the point of opening it again?
Here’s the thing. If you go back and open the dishwasher now, you’ll be negatively reinforcing the action of closing it. And the more negative associations you pile onto an action, the less likely you are to perform that action in the future. Doubling back is worth it.
I paused for a moment. I struggled mentally. There was some invisible wall blocking me from doubling back, if only because I had already done something, and undoing it felt like such a waste.
But eventually I listened to the voice, and I doubled back, and I opened the dishwasher, slightly. This happened a few more times, but now I never even close it to begin with. I was able to modify my own behavior using reinforcement learning. I gained consciousness by listening to my conscience.
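For the programmers reading, here is a toy sketch of the mechanism described above. It is not real RLHF and not a claim about how brains work; it just assumes that a habit can be modeled as a single learned value, that doubling back attaches a small cost to closing the dishwasher, and that the chance of closing it follows a softmax over "close" versus "leave it alone." The names and numbers (`value_close`, `cost`, `learning_rate`, the starting value of 2.0) are all made up for illustration.

```python
import math

def close_probability(value_close, value_leave=0.0):
    """Softmax chance of choosing 'close' over 'leave it alone'."""
    e_close = math.exp(value_close)
    e_leave = math.exp(value_leave)
    return e_close / (e_close + e_leave)

def double_back(value_close, cost=-1.0, learning_rate=0.5):
    """Move the learned value of closing part of the way toward the cost
    of having to walk back and undo it (a simple running-average update)."""
    return value_close + learning_rate * (cost - value_close)

value = 2.0  # hypothetical starting strength of the habit
history = [close_probability(value)]
for _ in range(5):
    value = double_back(value)
    history.append(close_probability(value))

for step, p in enumerate(history):
    print(f"after {step} double-backs: P(close) = {p:.2f}")
```

Under these assumptions, each double-back lowers the value of closing, so the modeled chance of closing drops every time. That is the whole point of the exercise: the undoing feels pointless in the moment, but it changes what the habit is worth.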
Timeless Decision Theory
Another day this summer, I was eating with my campers, and one of the plates that had been set for us was all gross and not well washed at all. I was about to tell the camper who had received the plate to go up and get a new one, but then I had a thought. A voice in my head.
Go get it yourself! He didn’t do anything wrong!
But… it doesn’t really matter who gets it. Like, it’s five seconds. Why are you even wasting my time by thinking? I’m just gonna do what I was originally gonna do and forget this mental event ever happened…
(the kid was staring at me)
Here’s the thing. You’ve already had this “do-good” impulse. If you don’t obey it, you’ll be entrenching a habit of not obeying “do-good” impulses in the future. And you don’t want that. So, since you had this thought, you now have to obey it and do good even though it literally doesn’t matter in this situation, just because you want to be somebody who obeys do-good impulses in the future.
…whatever. Fine. It’s five seconds I guess. Sheesh.
So I went and got the kid a new plate.
It all has to do with being somebody who exhibits a certain behavior in the future. You want to be someone who is X, who does X, so you should be X now, even if it comes with a slight negative impact, or you think it doesn’t matter. Because the small things always matter. This is why this section is called timeless decision theory.
OCD and Unconscious Urges
I’m slightly worried that this will come off as me “purposefully developing OCD” or something. OCD is different—there are clear symptoms this lacks—but I actually think that description is not far off from what’s happening. I’m purposefully training myself to positively reinforce certain urges and negatively reinforce others.
The idea is that I’ll have subconscious urges and habits no matter what I do, so I might as well have urges and habits I want to have (e.g. the urge to do the right thing) rather than urges and habits that are harmful (e.g. dishwasher-closing).
Giving In
It is kind of scary to ‘give in’ to urges like this, so I’ve also made a habit of purposefully not giving in to them in cases where I don’t think they matter, literally just for the reason of… I want to develop an urge to not give in to urges when they don’t matter. I’ll probably write more about this in another post, but urges have a base state of negative. Playing with urges is like playing with fire.
In a sense, a positive urge is like an information hazard. It totally changes how you see things, and you might wish you hadn’t thought of it in the first place, but you kinda have to follow it now, because otherwise you’re training yourself to scrap doing good in favor of laziness.
I do want to emphasize that I don’t actually hear ‘voices’ in my head. The bold and italic script just represents thoughts I have sequentially. I don’t hear them in one voice, let alone two; they’re just thoughts. Although I do kind of love the idea of humans evolving a conscience/consciousness because we had so many voices in our heads (gods, etc.).
(And yes, this piece did need four separate titles. I couldn’t decide. Maybe I’ll write a post about Indecision Theory sometime soon.)
Two relevant quotes if my father ever reads this:
To know thyself is the beginning of wisdom.
—Socrates
[I think the traditional Jewish answer,] thought Ana [is that you can start by being good for the wrong reasons, but then the changes will stick and make you the sort of person who does things for the right reasons.]