The Bright Side of Positive Reinforcement Training... Understanding Counter Conditioning
Generally, my blog posts are geared toward the pet dog owner. This one is for the trainers, so if you are not a trainer, hang in there. I'll write something a bit more comprehensible soon. And if you are a behavior academic, my apologies. The language here is a mix of correct terminology and that which is comprehensible to other trainers. Please be gracious. A recent Facebook post titled “The Dark Side of Positive Reinforcement… ‘Faux Conditioning’” is circulating in various animal training communities. It brings up a really important point that positive reinforcement trainers should take note of and learn from. I also think the article is deeply wrong.
The post is geared toward horses but applies to all animals because the science of behavior change is consistent across species, and since I’m a dog trainer, I’m going to apply this to dogs. Here's the crux of the article:
But what might surprise you is how many of [the qualified trainers] will somewhat bashfully admit that these protocols fail just about as often as they succeed-if not more so.
Less-experienced but well-read trainers will scoff and assume the protocol wasn't carried out well; that pieces were missing, steps were skipped, etc.
Certainly this is the case some of the time, but what about when a well-designed, expertly-executed CC/D protocol fails? Why is this happening?...
It might be that our currently held information about CC/D is what is slightly off, and it might be that traditional CC/D protocols are less-capable than we thought...
First, the article seems to make no distinction between positive reinforcement and desensitization and counter conditioning. While all of these typically happen simultaneously, they are not the same processes, and good trainers are constantly considering both types of learning in their training plan. A problem with a desensitization and counter conditioning protocol is not necessarily indicative of a problem with a positive reinforcement protocol because they are two different things.
Positive reinforcement occurs when something added after a behavior, given a specific set of antecedents, makes that behavior more likely to be repeated under those same antecedents in the future. A simple example: if I ask a dog to sit, the dog sits, and then I give the dog a treat that the dog wants, he is more likely to sit in the future when I ask him to in order to acquire the treat. The antecedent is me asking the dog to sit. The behavior is sitting. The positive reinforcer is the treat.
However, desensitization and counter conditioning are two completely separate processes that are not dependent upon the dog's behavior. Desensitization is the act of gradually helping an animal become accustomed to something that was previously distressing by introducing it at such a low level that it does not bother the animal and gradually increasing the intensity of the stimulus. For example, a dog that is too scared of a garbage can to walk by it on leash can eventually learn to feel safe doing so by first seeing the garbage can so far away that the dog hardly cares that it’s there. Maybe you practice some training. Maybe you play with the dog. Maybe the dog is just in a fenced yard, and the can is outside the yard but where the dog can see it. Gradually, over time, and only as the dog is comfortable, either the can gets closer to the dog, or the dog moves toward the can. At no point do we allow such proximity that the dog is worried about the can. They may look at it, make sure it’s not a threat, and then dismiss it before moving on to chewing sticks. Eventually, the dog and can are right next to each other, and the dog doesn’t care. The dog has been desensitized to the presence of the trash can.
Counter conditioning is yet a third process. This is when something unpleasant predicts something that is even more wonderful than the unpleasant thing is unpleasant, resulting in a change in how the once-unpleasant thing makes the learner feel; the once-unpleasant thing becomes a now-pleasant thing. To be successful at this, one needs to be cognizant of just how bad an animal perceives something to be while also recognizing that good things have varying values as well. I frequently counter condition dogs to my presence since I work with dog aggression cases for a living. If a dog doesn't like me, I figure out what the dog absolutely loves, and I make sure that my presence predicts the arrival of that wonderful thing.
Often, in order to be successful at overcoming fear of something, it's necessary to combine desensitization and counter conditioning. In the example above, if I'm too close, I may be considered so threatening to the dog that there is nothing in the world that is more good than I am scary. In these cases, I start far away from the dog and toss wonderful things. Or I place a visual barrier and produce meals or treats through slots so that the dog experiences my sounds and smells but not the visual stimulation of my presence, and gradually, as the dog feels more comfortable and begins to associate my presence with good things, I increase the amount of me that the dog experiences before the dog gets a super wonderful item. This is combining desensitization and counter conditioning. In the article I’m discussing, it's abbreviated CC/D. It's often written as DS/CC.
The author implies in her title and by referring to all the unsuccessful “expert” or “very skilled” trainers that positive reinforcement training doesn’t usually work to help an animal with fear or aggression. The author states that “Less experienced but well-read trainers will scoff and assume the protocol wasn’t carried out well.” And while I absolutely agree with the author’s point about overshadowing, I’m going to have to join the company of my “less experienced” colleagues who say it’s a problem with the training plan. And for those who don’t know me, my “less experience” includes thousands of private aggression cases over the last decade; working as a Manager of Behavior and Training for a large, inner-city shelter and a suburban shelter in the Bay Area; and running a board and train facility for aggressive dogs.
So here are some reasons why a dog that can tolerate triggers when food is involved may have a meltdown when the food goes away. This is what overshadowing really is and what we need to watch out for as trainers.
1. While desensitization and counter conditioning may be happening with the trigger when we reinforce behavior in the presence of the trigger, it also might not be because these are different processes. A high-drive dog is a dog that will work for a reinforcer in spite of difficult environmental circumstances. That’s the very definition and purpose of high drive. This is why a Malinois will bite a suit while being thwacked with a stick or hearing a gunshot, why a Labrador will tear up his feet while running through the woods and never slow down, why a Terrier will keep going after the rat even after the rodent has put holes in his face, why a Cattle Dog will continue herding those cows even if he gets tossed by a set of horns, and why a SAR dog will search and search and search when most dogs would just get bored and wander off. The reinforcers are so strong to these dogs that they are willing to endure or ignore and possibly don’t even notice the things they otherwise wouldn’t tolerate.
And so it is true for the highly food motivated dog who will work in spite of proximity to a trigger to get the food even though they still don’t like the trigger. And the moment the food is absent, the trigger becomes very apparent, and the next best reinforcer available is threat elimination, so the behavior changes in order to meet that function.
To be successful with desensitization and counter conditioning for these dogs, it becomes necessary to actively concentrate on the desensitization and counter conditioning without focusing on reinforcing behavior. Can the dog tolerate the trigger from 100 yards with no food? Good. Start there. At what point does the dog start noticing the trigger? Don’t go any closer. In fact, go back a couple steps. Let him look long and hard if he wants. Then, you can offer food for counter conditioning. Let them process what they are seeing, take it in, recognize on their own that it isn’t a threat. Then, deliver food. Or utilize BAT training so that the only real reinforcers are those the dog gets from enjoying the environment.
2. Sometimes, what is being desensitized and counter conditioned is extremely specific. For example, I’ve successfully desensitized and counter conditioned thousands of people-aggressive dogs to my presence, but that doesn’t mean that they like anyone else. The DS/CC is specific to me. When out in public, I may have succeeded in desensitizing and counter conditioning a very specific setup where people walk by, I have food, and I’m marking and reinforcing behavior I like. The dog may love this setup whereas a month ago, they would have lost their mind. But when the food is gone, this is an entirely different scenario. The DS/CC wasn’t specific to just seeing strangers. It was specific to the entire setup, and the presence of food was part of that setup. Here’s a study that demonstrates just how context specific counter conditioning is: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4793199/
Another component is that trainer and owner are different environmental setups. Very frequently, I can work with a dog who is dog or human aggressive, make a bunch of progress, and eventually take them to very busy and public places like the hardware store or a busy public park. However, as soon as the leash changes hands and the dog is with their owner, the dog acts like no training has ever been done. This is because of the associative learning that has taken place with that specific owner as part of the environmental setup. In the past, when the dog saw a trigger while the owner handled the leash, from the dog’s perspective bad things happened. The desensitization and counter conditioning has to involve the owner holding the leash and learning how to do the training themselves. If it doesn’t, it’s not likely that they will be successful with their dog.
3. And finally, there’s one other tough pill to swallow about associative learning. The association that is first learned is permanent. However, the association learned through counter conditioning only remains over time through continued pairing. This isn’t new science. We’ve known this for several decades. Here is just one of many studies demonstrating this: https://link.springer.com/article/10.3758/BF03197954 (you can download the entire study from this abstract). This means that if a dog is afraid of other dogs and is successfully counter conditioned so that she can comfortably walk past other dogs in public, she will eventually revert to her former feelings of fear unless walking past other dogs is at least occasionally paired with good things for the rest of her life. Counter conditioning is something that has to be maintained for life. Sometimes, this is really easy. For dogs that used to be terrified of me, the relationship we build, the petting, the play, the positive reinforcement training we do, the fun we have together all maintain the good associations. When it comes to harnessing and leashing, if the dog enjoys what follows the harnessing and leashing process, that maintenance work is done for us (but remember that the introduction of food with the harness might be part of the context that has been counter conditioned!).
So it’s true. Overshadowing is a real thing. People can think they have successfully desensitized and counter conditioned when they haven’t at all because the dog is ignoring the trigger in order to have access to the reinforcer. This isn’t a problem with positive reinforcement or a shortcoming of DS/CC. When the DS/CC plan didn’t work, it simply wasn’t as well executed as we thought. Perhaps we didn’t account for all the different contexts in which we need to desensitize and counter condition. Maybe we’ve been reinforcing our high-drive dog for behavior and not giving them a chance to even process the existence of the trigger. It’s possible we got a little sloppy and thought we were done instead of maintaining our training for life. And sometimes, we just need more time because desensitization and counter conditioning is a process, and it doesn’t typically happen quickly.
Can a super skilled trainer make these mistakes? Absolutely! I know I have. And I’ve got one dog in my board and train program right now keeping me on my toes for this very thing. Wonderful trainers make these mistakes because they may be working with a dog slightly outside their current wheelhouse; because sometimes we get so invested that we develop tunnel vision, and we need to reach out to other trainers and collaborate in order to get an outside perspective; and sometimes just because even great trainers are still learning stuff. We may have already learned it, forgotten it, and need to learn it again. Been there and done that, too.
But ultimately, none of this is because there is a dark side to positive reinforcement training. Desensitization and counter conditioning works provided we understand the parameters in which it is taking place, account for all the various contexts, and make sure the dog isn’t ignoring triggers in order to access reinforcers. It also needs to be maintained. And failure at DS/CC means one of three things: the job is merely incomplete, there really was an issue with the training plan or its execution no matter how good the trainer was, or the training wasn't maintained. I hate being fallible, but we all are. And do you know what I think is the most awesome thing about positive reinforcement training? It does not create negative conditioned emotional responses that we then need to counter condition. Well-executed R+ training almost always creates initial positive associations, and those are permanent.