The single most effective and efficient technique available to those who work in the field of Dog Training is Differential Reinforcement. The problem is that most “trainers” don’t even know what this term means! It is a fancy term that comes out of HUMAN studies, but I find that it applies directly to dog training.
Properly implemented, this approach will “solve” more than 80% of all the problems that you encounter when working with dogs. It will allow you to accomplish this by focusing on building new skills through the use of positive reinforcement, rather than punishing existing behaviors. It is not a new concept. It is not difficult to explain. It is very difficult to do, on a consistent basis. In order to use the technique effectively, it is critical that you understand how and why it works the way that it does. This is a very complex subject, which I have attempted to put into understandable terms and concepts. Hopefully, this article will help to provide you with that understanding.
Learning: A relatively permanent change in behavior that occurs as a result of reinforced practice. Relatively is used in the definition because not all of the things that we learn are things that stay with us for all of our lives. Few of us can remember all of our teachers’ names from K thru 12. Practice is used in the definition because there are very few things that we learn in our lives that take only 1 trial.
Behavior: A behavior is anything that we say or do. It must be observable and measurable. You can observe and measure “walking” for example, but you cannot directly observe or measure things like “thinking”. Behaviors are described in terms of frequency, latency, duration, intensity, and topography. Frequency refers to how often the behavior happens. Latency refers to how long after an event the behavior happens (e.g., after eating, after waking up, after being denied something). Duration refers to how long the behavior lasts. Intensity refers to the severity of the behavior. Topography refers to what the behavior looks like. Keeping all of these points in mind when describing a behavior will minimize the possibility of confusion or misunderstanding among your friends and family members. All behaviors, according to Learning Theory, occur only because they are reinforced. Some clarification on this point may help. In new situations, a dog does not always know the “rules”. As a result they may do a few things. First, they may look around to see what others are doing. Second, they may approach the handler and (in dog body language) ask, “What is it that you want of me?” Third, they may begin to interact with the environment in an exploratory manner. Those things that they do which have positive outcomes, they will do again. (This is called Thorndike’s Law of Effect.) Those things that have a negative outcome, they don’t do again. Those things that have no discernible outcome are also things that they don’t do again.
Reinforcement: Anything that increases the frequency of the behavior that it is paired with is a reinforcement. It is important to emphasize the point that anything can be a reinforcer. It does not need to be something that is appealing to you personally! Never, ever second-guess the value of a reward or reinforcer for Fido. Let Fido tell YOU what has high value. Reinforcement and reward are often confused with each other. Both typically involve giving something that they like in return for something that they do. A reward however, does not lead to a long term increase in the frequency of the behavior that it is paired with.
To illustrate, imagine that you are walking along the sidewalk and you see a sign on a telephone pole that says “REWARD”. Under the heading is a picture of a Dalmatian dog, together with relevant information on how you could get $100 for returning this valuable pet to its owner. You look up after reading the sign and see a Dalmatian walking toward you. Naturally, you catch the dog, return it to its owner and collect your reward from obviously happy owners. What you do not do is go back out looking for more Dalmatians! You have your one-time payoff for what you were “asked” to do. The likelihood of your returning “lost” dogs to their owners has not changed. Imagine now that instead of “REWARD” the sign had said “LOST” and appealed to the public to return a distraught little girl’s dog. You see the dog and deliver it to the owner. Not only does the little girl shower you with hugs and kisses, but the parents are so thankful that they insist that you take $100 as a token of their appreciation for your time and effort! What do you think would be your response the next time you see an apparently lost dog? Was the action you took REINFORCED by the little girl and her family?
The difference between the two scenarios is the process. The reinforcement process involves giving a positive after the behavior occurs and not in any contractual way. It is not a “You did that because I told you I’d give you this” process, but rather an “I’m giving you this because you did that” process. This may seem like a subtle difference, but it is crucial in terms of the impact that the two processes have on a dog’s behavior. In dog training lingo, this illustrates a dog who is self-motivated to work (he enjoys it), vs. a dog who is simply food or reward motivated.
There are two processes for reinforcement, positive and negative. Positive Reinforcement involves giving something positive (a treat, a pat on the back, a toss of a ball or Frisbee, etc.). Those things that meet basic needs (food to a hungry dog, warmth to a dog who is cold, etc.) are referred to as Primary reinforcers. Those things that acquire their value by being paired with a Primary reinforcer (verbal praise, the sound of a clicker, etc.) are referred to as Secondary reinforcers.
Negative Reinforcement is a little more complex. Technically it involves either escaping or avoiding something that is aversive, thereby increasing the likelihood of the escaping or avoiding behavior. Most of us have touched a hot stove at some point in our lives. Jerking our hand away is a behavior that is negatively reinforced in that we escape the aversive heat.
Furthermore, keeping our hands away from hot stoves is negatively reinforced in the future because we avoid getting burned! Theoretically, negative reinforcement results in faster learning and learning which stays with us longer than positive reinforcement. Because it requires an aversive to be present “up front”, it is rarely, if ever, used in a systematic approach to teaching dogs (by reputable trainers). There are, however, situations and dogs which REQUIRE the use of aversives and negative reinforcers in order to accomplish the goal. An example might be training a K9 patrol dog. This is a situation where one BRINGS OUT prey drive and aggression in a dog, under controlled situations. Most companion animals don’t do too well under this type of approach.
Punishment: Anything that decreases the frequency of the behavior that it is paired with is a punisher. As is the case with reinforcement, anything can be a punisher. There are two ways to punish, by presentation and by removal. Punishment by presentation occurs when a specific aversive consequence follows a behavior. Punishment by removal occurs when something positive is removed following a behavior. So withholding a reward if a dog refuses to perform a known behavior IS a form of punishment. Punishment does not have to be physical towards the dog. Indeed, I have found that non-violent, “message sending” methods of punishment work best with companion animals. Within this category is a technique called Extinction, which involves not reinforcing a behavior that has been reinforced in the past.
Schedules of Reinforcement: We cannot reinforce a dog’s behavior every time that it occurs, forever. It’s impractical in terms of our time. It’s artificial and intrusive in public settings. Most of all, the dog that we are teaching will eventually experience Satiation, which basically means that he/she gets “full” of the reinforcer (You may like steak, but not for every meal, for weeks on end!). To avoid satiation, and to help the dog to internalize the behaviors that we are teaching so that the behavior becomes second nature, schedules of reinforcement are used. As the term implies, these are a set of different ways that we can plan for the gradual fading out of planned reinforcement, without having a negative effect on the learning process.
Continuous Reinforcement (CRF) is the first and most basic of the Schedules. Under this schedule, every time the target behavior occurs, it is reinforced. Basic math comes into play here: the ratio of behavior to reinforcement is then 1:1. This type of schedule results in a reasonably steady learning curve. It is most often used when we are teaching a brand new behavior to a dog, or when it is considered to be critical that the dog “get” the message as soon as possible (e.g., safety skills). When extinction is introduced, the behavior rapidly disappears.
Fixed Schedules of reinforcement are an extension of the CRF concept. Instead of one reinforcement for each behavior, a predetermined number of behaviors are required to earn a reinforcer. A Fixed Ratio of 3:1 then would mean that the dog would have to demonstrate the target behavior 3 times in order to receive a reinforcement. In the same manner a Fixed Interval of 3:1 would mean that the dog would be expected to demonstrate the target behavior in each of 3 intervals before being given a reinforcer. Ratio Schedules refer to the exact number of behaviors that are required, while Interval Schedules refer to time periods wherein the behaviors must be in evidence. (In theory, there is no limit to how high the ratio could go.) This type of schedule produces a learning curve that has “plateaus”, interspersed with fairly high rates of behavior. The plateaus occur when the dog pauses to enjoy the reinforcement. When extinction is introduced, the frequency of the behavior drops off fairly rapidly, although not as rapidly as with CRF. Examples of Fixed Schedules with humans are “piece work” (Ratio) and being on salary (Interval).
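For those who like to see the “basic math” spelled out, here is a minimal sketch of CRF and a Fixed Ratio schedule in Python. It is only an illustration of the counting involved, not a training tool; the function name fixed_ratio_schedule is my own invention, and it treats CRF as a Fixed Ratio of 1:1.

```python
def fixed_ratio_schedule(ratio):
    """Return a function that decides, response by response, whether to reinforce.

    ratio=1 behaves like Continuous Reinforcement (CRF);
    ratio=3 is a Fixed Ratio of 3:1 (three behaviors per reinforcer).
    The name and interface are hypothetical, chosen for illustration.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    count = 0

    def should_reinforce():
        nonlocal count
        count += 1
        if count == ratio:
            count = 0  # start counting toward the next reinforcer
            return True
        return False

    return should_reinforce


crf = fixed_ratio_schedule(1)
fr3 = fixed_ratio_schedule(3)

crf_results = [crf() for _ in range(6)]
fr3_results = [fr3() for _ in range(6)]
print(crf_results)  # every response is reinforced
print(fr3_results)  # only every third response is reinforced
```

Each call to crf() or fr3() stands for one demonstration of the target behavior; a True result means “deliver the reinforcer now”.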
Variable Schedules of reinforcement are the ultimate goal of any training system. Like the Fixed Schedules, they come in both Ratio and Interval form. A Variable Ratio Schedule of 3:1 means that on the average the dog is reinforced for every 3 demonstrations of the target behavior. Reinforcements are administered on an apparently random basis, as far as the dog is concerned. Variable Schedules produce the highest rates of responding and the most resistance to extinction of any of the Reinforcement Schedules. Human examples of the Variable Schedules are lotteries (Ratio) and hunting or fishing (Interval). Most of us have most of our social behaviors reinforced on a Variable Schedule (Think of how often you are complimented!). It can be said that these Schedules induce a kind of paranoia in the dog, who never knows when the next reinforcement is coming. The reality is that the Schedule has to be carefully set in advance in order to ensure that enough reinforcement comes often enough to avoid a phenomenon called Ratio Strain. This happens when the Schedule of Reinforcement is set too high and the dog “gives up” before the next reinforcement becomes available. This is an example of DISENGAGEMENT. Those who train with me know that an engaged dog is easy to train; a disengaged dog is a nightmare to train. An example of this would be the dog who has been rewarded every two times a behavior was exhibited, whose handler suddenly decides that reward pay checks will only be issued once every 200 times the behavior is displayed. For most dogs, this would constitute Ratio Strain. On the other hand, if the handler gradually moved toward a “once in 200” pay schedule, most dogs would be able to adapt. Moving through CRF, Fixed, and Variable Schedules in a gradual manner, based on the dog’s abilities, serves to reduce the likelihood of Ratio Strain.
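To make the Variable Ratio idea and the “gradual move” concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the function names, the choice to draw each requirement uniformly between 1 and twice the average minus 1, and the rule that each stage raises the average by no more than about half. A real plan has to follow the dog in front of you, not a formula.

```python
import random


def variable_ratio_schedule(mean_ratio, rng):
    """Reinforce after a random number of responses that averages mean_ratio.

    Each requirement is drawn uniformly from 1 .. 2*mean_ratio - 1, so the
    long-run average is mean_ratio. That distribution is an assumption.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0

    def should_reinforce():
        nonlocal required, count
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)  # dog cannot predict this
            return True
        return False

    return should_reinforce


def thinning_plan(start, target, max_step_factor=1.5):
    """List the average ratios to move through, from start up to target.

    max_step_factor=1.5 is an assumed cap on how fast the schedule thins,
    intended to avoid the sudden jump that produces Ratio Strain.
    """
    plan = [start]
    while plan[-1] < target:
        next_ratio = max(plan[-1] + 1, int(plan[-1] * max_step_factor))
        plan.append(min(next_ratio, target))
    return plan


rng = random.Random(0)  # fixed seed so the sketch is repeatable
vr3 = variable_ratio_schedule(3, rng)
responses = 3000
reinforcers = sum(vr3() for _ in range(responses))
average_ratio = responses / reinforcers
print(average_ratio)  # close to 3 behaviors per reinforcer

plan = thinning_plan(2, 200)
print(plan)
```

The plan moves from “every 2” to “once in 200” through a dozen intermediate stages rather than in one leap, which is the point of the pay check example above.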
Superstitious Behavior: (also referred to as “Extinction Burst”) This term refers to the “burst” of behavior that happens when we, as trainer/handlers, start a new program. Invariably, the dog responds to the change by increasing the frequency of the target behaviors for a short period of time. If you think about it, this makes perfect sense. Imagine what your behavior would be like if the front door to your home was “re-hinged” when you were away for a little while. Now, instead of opening “in”, it opens “out”. It’s probably safe to bet that the first few times that you use the door, you are going to end up “pushing” on it a couple of times (and even hitting your face), before you remember that the “rules” have changed and pull instead. Superstitious behavior, then, is like the person “pushing” as he/she learns what the new rules are. It is “habit” based, and often almost subconscious.
Consistency: This is an essential requirement for any training program, and it is critical to Differential Reinforcement. It refers to the need to, as much as is humanly possible, provide the same response to the dog that we want to teach. The closer that we can get to 100% consistency, the faster the teaching / learning process will proceed. Consistency might also be seen as synonymous with structure and technique. It takes practice and repetition.
Communication: Communication is obviously at least a two party process. One party has the intent of expressing something. The second party, by definition (or default!), must listen. There is almost an implied contract in the communication process that goes something like this: “I’ll listen to you if you agree to listen to me, and if we are speaking the same language.” Invariably, the reason for communicating is to achieve a task or a goal that requires action by others. That action may be as simple as acknowledgment of what was communicated (I’m going to the store, bye!) or so complex as to require the cooperation of the dog being communicated to (Can you sit quietly for me while I trim your nails?). The nature of the task and the relationship that we have with the dog that we communicate with help us to decide what communication style to use. A deep understanding of the canine mind and communication system helps immensely!
Now let’s apply what we know to the most common cause of inappropriate behaviors – attention seeking. A typical formula, in Learning Theory, includes an Antecedent (A), a resulting Behavior (B) and the Consequence (C) that follows. The formula is often called an ABC. In an attention seeking situation, we see the following:
A. Need for Attention
B. The Behavior
C. Handler Attention
The handler’s attention towards the dog which follows the behavior is invariably a positive reinforcement for the dog. How do we know? Remember that a reinforcer is something, anything, which increases the behavior that it is paired with. It is logical then to conclude that attending to an inappropriate behavior can and does reinforce that behavior.
One of the obstacles that is necessary to overcome in attention seeking situations is the impact of the whole process on the handler involved. From the handler perspective, the behavioral formula above looks like this:
B. The Behavior
A. Attend to the Behavior
C. Peace and Quiet
Say we are working on a jumping dog, or an excessive barker, or a leash-lunger. The handler definitely resorts to a mental process like BAC. Yet the dog is resorting to ABC. Remember that escaping or avoiding something that is uncomfortable (aversive) is what Negative Reinforcement is all about. What we have then, is a situation where both the attention seeking dog and the handler involved are being reinforced by the process: the dog is Positively Reinforced by the attention, and the handler is Negatively Reinforced by the peace and quiet, which has a stronger and longer lasting impact!
We can get more clues as to what is happening in the attention seeking dynamic if we look at baseline information in a different way. Typically handlers begin to mentally document attention seeking behavior as it occurs over time, thus developing a baseline of how often it is happening. To optimize the use of baseline data, we need to remember that, according to the theory, behaviors only occur because they are Reinforced. With that in mind, the baseline then tells us that, aside from all of the reinforcement that we are providing through contact and programs and aside from the reinforcement available in the environment, the dog is telling us that at least x (where x is the baseline frequency) more is required. When we decide that the behaviors that we see are “inappropriate” and that we want to assist the dog to stop what they are doing (or at least decrease the frequency), what we are really saying is that we want the dog to “give up” reinforcement. In the absence of a real solid understanding of the reasons for what we are asking (and that is generally not going to happen), there is not likely to be any motivation on the dog’s part to participate. An example of this scenario might be if you decided to drop the temperature in my home from 68 degrees to 64 degrees, during the winter, to save money. Heat to someone who is cold is a primary reinforcer. If I don’t clearly understand, and agree with, your plan and all of its potential benefits, I’m very likely to react negatively. On the other hand, if you inform me and go out of your way to offer alternative sources of heat (read reinforcement!), such as a big fire in the fireplace, nice warm sweaters and slippers, and perhaps the occasional hot chocolate, my motivation to change and accept your plans is likely to increase significantly! People, and dogs, accept change much more easily if they are informed as to the WHYs. The same is true when we begin a Differential Reinforcement program.
It is absolutely essential that a wide variety of reinforcement is made available, for any and all behaviors that are appropriate and compatible with those defined as the target behaviors. The reinforcement that is given need not be primary reinforcers (food, etc.), nor does it always need to be time consuming. Positive comments, short interactions, even the offer to interact by playing a game or helping with a task can be highly effective reinforcers. The message that you want to give to the dog is that there are other ways to get your attention, ways that you like and appreciate. How do you know how often you should offer reinforcement? Look to your baseline information. It’s a clear communication from the dog concerning how often they need reinforcement. If the inappropriate behaviors are happening 25 times a day, then you will need to offer more than 25 reinforcers for alternative behaviors. A good rule of thumb is “The more the better”. Basically, offering more means that you are providing more teaching/learning opportunities, and the more of those that you have, the quicker the process.
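The baseline arithmetic is simple enough to sketch in a few lines of Python. Treat this strictly as an illustration: the rule here is only “more than the baseline”, so the 1.5 margin, the 14 waking hours, and both function names are assumptions of mine that you should adjust to your own dog and household.

```python
import math


def daily_reinforcement_target(baseline_per_day, margin=1.5):
    """Suggest how many reinforcers to offer per day for alternative behaviors.

    The only firm requirement is "more than the baseline"; the 1.5 margin
    is an assumption for illustration, not a fixed rule.
    """
    if baseline_per_day < 0:
        raise ValueError("baseline_per_day cannot be negative")
    return max(baseline_per_day + 1, math.ceil(baseline_per_day * margin))


def minutes_between_reinforcers(target_per_day, waking_hours=14):
    """Average spacing if reinforcers are spread across the waking day.

    waking_hours=14 is an assumed figure.
    """
    return waking_hours * 60 / target_per_day


baseline = 25  # inappropriate behaviors per day, as in the example above
target = daily_reinforcement_target(baseline)
spacing = minutes_between_reinforcers(target)
print(target)          # 38
print(round(spacing))  # 22
```

In other words, with a baseline of 25 a day, this sketch suggests looking for something to reinforce roughly every 20 minutes or so, and “the more the better” still applies.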
The second component of Differential Reinforcement involves ignoring the behaviors that you do not want to see. Once again, remember that all behaviors occur because they are reinforced, therefore we can have an impact on their frequency by not providing the attention that we used to provide (read Extinction!). The simplest way to ignore a behavior is to turn and walk away. It is also possible to ignore by not responding to what has happened at all. Instead, you behave as if nothing has happened and introduce a new subject or activity. In other words, you ignore what has happened, and redirect the dog to something else. Activities that are chosen for redirection efforts should not be things that the dog does not like. They should be things that are presented as opportunities to interact with you and not directives or compliance episodes. Not all behaviors can be “ignored”, especially if they involve danger to a dog or person. In these cases, you can intervene as much as is necessary to ensure the safety of those involved, but, do so without comment and with a neutral expression. Remember that the dog that you are working with is expecting specific types of responses from you, and those responses are his/her reinforcement. If you provide something that has not been experienced before it is unlikely to meet the requirements of a reinforcer.
A major problem with the use of Differential Reinforcement is that handlers tend to focus on the “ignore” portion of the technique. It is not uncommon to see reports that detail how a dog “acted out”, was ignored, continued to “act out” and was essentially ignored for a whole session. When this dynamic develops, it is very hard on both the handler and the dog being supported. Functionally, what has happened is that the behavior is being responded to (by ignoring it) but the communication from the dog goes unnoticed. The communication is a clear message, “I need attention!”. If you respond to the behavior but not the communication, you will not be successful.
In any situation that involves attention seeking, it is your reactions that are responsible for maintaining (reinforcing) the behavior.
I say it to every one of my clients, and it’s worth repeating again……
Unless you change the way that you do things, there will be little reason for the dog that you are supporting to change their behavior.
Until YOU change, your dog won’t change.
His passion, enthusiasm and love for the dog are evident in his many years of experience as well as his hunger to learn more, and it is all this that has made him what he is today! He has had extensive training in the area of canine behavior and training! His studies have included 2 summers in the kennels of the New Skete Monastery, 1 year mentoring with Dr. Ian Dunbar, 1 year mentoring with Ed Frawley, and 2 years association with Michael Ellis!
He is a current Professional Member of the International Association of Canine Professionals and owns and operates his own dog training business with 45+ years of professional Canine Training experience in his kitty! You are in good hands with Scott!