Reward-Based Training
Landsberg G, Hunthausen W, Ackerman L 2003 Handbook of Behavior Problems of the Dog and Cat. Saunders, Edinburgh
© 2003, Elsevier Science Limited. All rights reserved
The key to using rewards effectively is to give the reward immediately when the desired response is
exhibited (contiguous) but only when the response is exhibited (contingent). A reward delivered this way
acts as positive reinforcement: it increases the chance that the response will be repeated.
Reward selection and timing
1. Anything that your pet enjoys can be a reward. This can include treats, food, a toy, attention, play,
affection, going for a walk, or even a rub of its head or belly. Since there is a great deal of
individual variation you must first choose the rewards that most appeal to your pet.
2. Whenever you give the pet something it enjoys, you are positively reinforcing whatever behavior
the pet is performing at that time, whether desirable or undesirable. Therefore, give a reward only
when it immediately follows a behavior you wish to encourage. Otherwise, the very problems
that you need to correct (e.g., fearful responses, grasping and biting, barking) may
inadvertently be rewarded by your responses.
3. Learn to earn: rewards should be used only as positive reinforcement for desirable responses.
Rewards must be withheld at all other times. In fact, before the dog gets anything of value, you should
give a training command to ensure that the dog is behaving appropriately.
4. Depriving the dog of rewards at all times except for training increases their motivational value. In
principle, depriving a pet of a reward increases the pet’s ‘hunger’ for the reward, while rewards that
are given too often may lose appeal. For example, food rewards are most effective when the pet is
most hungry, around meal time. Therefore, if your pet is fed free choice, it might be better to switch
to a feeding schedule. Training can be held just prior to meal times in order to increase the appeal
of the food rewards.
5. Reinforcer assessment: assess the motivating value of rewards and place them in order from most
desirable to least desirable. Use your dog’s most favored rewards or multiple rewards (reward
jackpot) to shape and reward newer, more difficult, or more exact training responses and use
lesser rewards for intermittently reviewing and rewarding previously learned responses or less
exact responses.
6. Timing: dogs learn fastest if the rewards are given every time, immediately as the dog displays the
desirable behavior. Later, a switch to a variable intermittent reward schedule will help to ensure
that the pet continues to perform indefinitely.
7. Secondary reinforcers: a clicker can be paired to a food reward by consistently sounding it just
prior to giving the food until it becomes a conditioned stimulus for food. The value of a clicker is
that it can then be used as a reward to immediately mark correct responses in a convenient and
precise manner, with the food being given shortly afterwards. Although secondary reinforcers can become
effective enough to reinforce responses without the need for food, intermittently giving the food treat
after them will help to maintain their value. In addition to
clickers, praise, stroking, or petting can be paired with favored food rewards.
8. Extinction: if you stop reinforcing a previously reinforced behavior, it will eventually stop being
performed. This is often the best way to stop undesirable behaviors that have been reinforced by
attention, praise, affection, or food (e.g., jumping up, barking). However, behavior problems that
have been rewarded intermittently will take much longer to extinguish.
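For readers who find it easier to think in rules, the reward schedule in item 6 can be sketched as a short Python function. This is only an illustration: the function name `should_reward`, the 20-trial learning phase, and the average of one reward per three correct responses are hypothetical values chosen for the example, not recommendations from the handbook.

```python
import random

def should_reward(trial, learned_after=20, mean_ratio=3, rng=random):
    """Decide whether a correct response on this trial earns a reward.

    Assumed, illustrative numbers: the first `learned_after` correct
    responses are rewarded every time (continuous reinforcement); after
    that, rewards come unpredictably, on average once per `mean_ratio`
    correct responses (variable intermittent reinforcement).
    """
    if trial < learned_after:
        return True  # early learning: reward every correct response
    return rng.random() < 1.0 / mean_ratio  # later: reward at random
```

In practice the switch is not made at a fixed trial number; it is made once the pet responds consistently.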
Command–response–reward training
There are a number of training methods that might be considered and your veterinarian or trainer can help to
determine which might be most suited to you and your dog. In simple terms you need to give a command
and reward the desired response immediately and every time until the pet consistently responds. Begin the
training in an environment with few distractions when the pet is calm. Start with simple commands, and
gradually progress to more difficult commands in more difficult environments. Use mildly appealing treats at
first, and save the highly favored rewards for later, when the pet is giving better responses in
difficult situations. This will encourage the dog to progress and improve.
If your dog does not obey immediately, the command may not be understood or the pet may be
distracted. In that case there are two alternatives: either give
no reward and progress a little more slowly, or consider a physical control device such as a leash
and head halter to guide the dog into the correct response. Punishment should not be used for
training. Punishment for incorrect responses may lead to fear and anxiety, and while it may stop the
inappropriate response, it in no way encourages the pet to display the desired response.
Training with rewards: command–response–reinforce
If a command is paired with a response and there is immediate reinforcement, the pet should learn the
desired response for each command. Once a response can consistently be achieved on command, shaping
can be used to progress to more difficult responses in a variety of environments.
1. Food lure training
a) The movement of food is used to lure the pet into performing the desired behavior. Holding
and wiggling the food in front of the dog should lure the dog into a come, moving the
food upward and back should lure the dog into a sit, and moving it down and forward should lure the
dog into a down.
b) A cue word (command) is spoken as the pet is performing the movement.
c) The food is given immediately upon completion.
d) As training progresses, the lure is made less obvious by being presented in a closed hand,
and praise and stroking are intermittently substituted for the food reward.
2. Observe and reward
Observe the pet for desired behaviors and reward immediately. If a behavior can be anticipated, a
command can be given just prior to the behavior and then an immediate reward can be given once
the behavior is completed. Some dogs can learn to eliminate on command with this technique.
3. Physical prompt and fade
Give a command as you use a prompt such as a head halter or hand prompt (e.g., guiding the pet
into a sit position) to get the desired response, and then reinforce. Over time, the prompt can be
faded (i.e., gradually removed).
4. Negative reinforcement
Apply a prompt or correction that the subject dislikes until the desired response is achieved, and
then immediately remove it to indicate that the correct response has been achieved.
5. Shaping
Determine the desired response and reward behaviors that approximate the response. Once the pet
succeeds, reward only behaviors that are slightly closer to the desired goal and stop rewarding less
accurate responses.
Punishment
1. No physical punishment should be used. Never hit the pet, throw it on its back, shake it by the
scruff, push the lips against the teeth forcefully, or use any other type of physical correction.
2. If you observe the pet doing something that is undesirable, interrupt the behavior in a manner that
is sharp, startling, and strong enough to immediately stop the undesirable behavior without causing
the pet to be overly anxious.
3. If you cannot generate enough volume with a loud ‘no’ to stop the pet, then a shake can, air horn,
alarm, citronella spray, or water bottle may be used immediately as the word ‘no’ is given. Care
should be taken when scolding cats. Many cats will begin avoiding family members if yelled at
loudly or frequently. It is preferable to use a neutral stimulus (e.g., water gun) without saying
anything or even looking at the cat when attempting to interrupt misbehavior.
4. After interrupting the undesirable behavior, you should guide your pet into the proper behavior and
reward it, if possible.
5. A leash and head halter can be used to guide the dog into position if it does not immediately obey,
with a release of pressure and positive reinforcement given for success.
6. If the undesirable behavior occurs when you cannot interrupt and guide your pet into the proper
behavior, an unpleasant consequence associated with the behavior may deter recurrence. Aversive
noises and devices (e.g., Snappy Trainer™, spray avoidance devices, motion detectors) and
aversive odors and tastes may be effective deterrents depending on the pet and the problem.