Treat or Edible Reward – Understanding the Science of Learning

Indiscriminately offering treats can transform a lovely horse into a beggar, mugger, and biter, turning the horse owner into a candy dispenser. This article will teach you how to use treats in a way that improves, rather than harms, your horse-human relationship. I will use some technical terms in this article, so be prepared to read the technical section a few times. However, understanding these concepts will increase your ability to construct precise lesson plans that develop your horse’s potential quickly and without stress.

Providing edible rewards to motivate animals to perform a task is not a new concept. Edible rewards encourage dolphins to assist deep-sea divers, dogs to search and rescue, and mice to maneuver through complicated mazes. In fact, the principle of letting a working animal eat as it works was codified in the Law of Moses several thousand years ago: “You shall not muzzle an ox when it treads out the grain” (Deuteronomy 25:4).

Many horse trainers have found that edible rewards produce well-trained horses who engage in happy partnerships with their riders. One of the most unusual uses of edible rewards in horse training was by a veterinarian, Dr. William Key, in the late 1800s. By sugar-coating cards inscribed with numbers and letters of the alphabet, Dr. Key taught his horse, Jim, to read, write, and solve simple math problems.

Around the same time Dr. Key was working with his horse, Edward L. Thorndike, an early pioneer in psychology and behavior, was conducting experiments about learning. Thorndike’s Law of Effect (published in 1905) states that behaviors (movements, actions, responses to questions, etc.) followed by satisfying consequences tend to be repeated, while behaviors that produce an unsatisfying consequence tend not to be repeated.

In the 1930s through the 1950s, behavioral psychology researcher B.F. Skinner carried Thorndike’s research further and coined the term “operant conditioning” to describe how animals learn. He labeled each step of the learning process and hypothesized that all behavior can be described as a result of reinforcement (which increases the frequency of a behavior) and consequences (which decrease the frequency of a behavior).

Reinforcement (those things that increase the frequency of a specific behavior) is broken down into two categories:

Positive reinforcement: The use of a reward that satisfies the subject’s natural desire, such as food, affection, physical contact, etc.
Negative reinforcement: Withdrawing an unpleasant stimulus when the requested behavior is exhibited, such as pressure and release.

Consequences (those things that discourage a specific behavior) are broken down into three categories:

Positive punishment: Imposing an uncomfortable stimulus for exhibiting a specific behavior, such as smacking a horse for biting.
Negative punishment: Removing something comfortable to discourage a behavior, such as depriving a horse of the companionship of other equines for rough-housing too aggressively.
Extinguishment: Ignoring unwanted behavior so that it is no longer exhibited, such as not offering a reward when a horse performs an unrequested trick.
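
For readers who find a table easier to remember than definitions, the five categories above can be written as a small lookup. The Python sketch below is only an illustration: the function name `classify` and the labels "add," "remove," and "ignore" are assumptions made for the example, and the category names follow the ones used in this article rather than any formal standard.

```python
def classify(handler_action: str, effect_on_behavior: str) -> str:
    """Name the category for what the handler does and how the behavior changes.

    handler_action: "add" (give something), "remove" (take something away),
                    or "ignore" (do nothing at all).
    effect_on_behavior: "increase" or "decrease".
    These labels are hypothetical and chosen only for this example.
    """
    categories = {
        ("add", "increase"): "positive reinforcement",     # e.g. offering food
        ("remove", "increase"): "negative reinforcement",  # e.g. pressure and release
        ("add", "decrease"): "positive punishment",        # e.g. a smack for biting
        ("remove", "decrease"): "negative punishment",     # e.g. a time-out from the herd
        ("ignore", "decrease"): "extinguishment",          # e.g. no reward for an unrequested trick
    }
    key = (handler_action, effect_on_behavior)
    if key not in categories:
        raise ValueError(f"no category described for {key}")
    return categories[key]


print(classify("remove", "increase"))
```

Reading the table row by row shows that "positive" and "negative" describe whether something is given or taken away, not whether the experience is pleasant.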

Reinforcers are the reward tools used in positive reinforcement to motivate the subject to change behavior. Primary reinforcers are those that the subject naturally enjoys, including food, affection, praise, and free time. Secondary and tertiary reinforcers are things and actions a subject learns to like because they are “bridges” to primary reinforcers. They include money, words, clicks, and other signals indicating that a primary reinforcer is coming.

Primary reinforcers are offered when an animal has responded to the best of its understanding and physical ability to the handler’s request to perform a task. The act of offering primary reinforcers for each improvement of the performance of the task is called “shaping.” Consistently rewarding and extinguishing (ignoring) behaviors until the desired movement is perfected is called “molding.” Asking an animal to do several things in sequence is called “chaining.”

In human terms, consider the emotional difference between these three statements:

“If you complete this project by July 29th, your paycheck will contain a substantial bonus.” (Positive reinforcement)
“Your job description includes completing this project by July 29th.” (Negative reinforcement)
“If you don’t complete this job by July 29th, you will be fired.” (Positive punishment to discourage laziness)

The average employee will complete the job by July 29th, regardless of the motivation. However, numerous scientific studies indicate that subjects who are offered a reward, such as the bonus in the first example, tend to feel less anxious and more motivated, and to finish the project more quickly, than the subjects in the second or third example. The subjects in the third example will feel more stress and finish the project more slowly than the others. Similar studies show that the results are the same whether the subject is a mammal, fish, bird, or other creature.

I teach how to use positive reinforcement to develop equine intellectual and physical potential and to rehabilitate unsafe horses. At the beginning of each training session, we ensure the horse is relaxed and focused by asking for responses to simple requests during the grooming and tacking process. We may ask it to touch various grooming utensils and tack, lift feet, stay, come, cross front or hind legs, turn on the haunches or hindquarters, and back or bring forward individual feet.

Our seven-point procedure for requesting a behavior is as follows:

1. Preparation – We say the word “Aa-nd,” which is our half-halt word that tells the horse to focus or prepare for a request.
2. Request – The vocal word for the requested behavior and/or the body language gesture. This is the most difficult step of the process. Sometimes we act out the action we desire so that the horse can imitate us. For example, we may put our feet on a pedestal to encourage a horse to do the same. At other times, we may actually place the horse’s foot on the pedestal.
3. Tertiary reinforcer – The encouraging word “Good” is a tertiary reinforcer or bridge that informs the horse it is on the right track.
4. Response – The behavior the horse exhibits in response to the request.
5. Secondary reinforcer – Once the horse has exhibited the behavior to the best of its understanding and ability, the handler says, “Nice” (the secondary reinforcer or bridge).
6. Primary reinforcer – This is offered immediately after the secondary reinforcer. It may be an edible reward, a caress, or praise.
7. Extinguishment – If the behavior does not approximate the request, we skip steps 5 and 6 and ignore the behavior. We either repeat the request, break down the request into smaller steps, or move on to another activity.

Sometimes the tertiary reinforcer (step 3) is skipped if the horse responds quickly. At other times, especially when chaining (more about chaining below), the secondary and primary reinforcers (steps 5 and 6) are skipped, and another request is made.
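
The procedure and its shortcuts can be written out as a short script. This is a rough sketch in Python, not a definitive account of the method: the function name `request_steps` and its parameters are hypothetical, and it simplifies the handler’s judgment calls into yes/no flags.

```python
def request_steps(cue, approximates_request, quick_response=False, chaining=False):
    """List what the handler and horse do for one request.

    cue: the word or gesture for the behavior (any text).
    approximates_request: did the horse's response come close to the request?
    quick_response: if True, the encouraging "Good" is skipped.
    chaining: if True, "Nice" and the reward are held back for a later request.
    All names here are assumptions made for illustration.
    """
    steps = ['say "Aa-nd"', f"request: {cue}"]
    if not quick_response:
        steps.append('say "Good"')            # tertiary reinforcer
    steps.append("horse responds")
    if not approximates_request:
        steps.append("ignore the behavior")   # extinguishment: no "Nice", no reward
        return steps
    if not chaining:
        steps.append('say "Nice"')            # secondary reinforcer
        steps.append("offer primary reinforcer")
    return steps


for step in request_steps("foot on pedestal", approximates_request=True):
    print(step)
```

After an ignored response, the handler would repeat the request, break it into smaller steps, or move on to another activity, which the sketch leaves to the reader.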

Once the horse is tacked up, this procedure is used during in-hand work, longeing, and mounted work. Special attention is given to ensure that the horse stays relaxed and focused at all times.

When a horse has learned several behaviors, the behaviors can be chained together in a variety of sequences such as: walk four strides, trot five strides, canter three left-lead strides, flying lead change, canter three right-lead strides, trot two strides, and halt. Each transition of the sequence is initiated by a half-halt: “Aa-nd” and a squeeze of the reins, lead-rope, or longe line. The handler says, “Nice!” (the secondary reinforcer) and offers a primary reinforcer at the end of the sequence. The handler may say “Good,” the tertiary reinforcer, during the sequence.
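
A chain like the one above can also be sketched as a simple list-building routine. The Python below is a minimal illustration under stated assumptions: `chain_script` is a made-up name, and the optional “Good” during the sequence is left out to keep the example short.

```python
def chain_script(behaviors):
    """Build the handler's script for a chained sequence of known behaviors.

    Each transition starts with the half-halt; the secondary and primary
    reinforcers come only once, at the end of the whole sequence.
    The function name and output wording are hypothetical.
    """
    script = []
    for behavior in behaviors:
        script.append('half-halt: "Aa-nd"')
        script.append(f"request: {behavior}")
    script.append('say "Nice"')                 # secondary reinforcer
    script.append("offer primary reinforcer")   # reward at the end of the chain
    return script


sequence = ["walk four strides", "trot five strides", "halt"]
for line in chain_script(sequence):
    print(line)
```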

Appropriate Edible Rewards

Sugar-free extruded hay pellets are ideal as they don’t cause pockets to become soggy and are liked by most horses. Although messy, tiny bits of apple, carrot, grapes, or watermelon are loved by many horses. Do not feed sugar, fruit, or carrots if your horse is overweight or insulin-resistant, or has Cushing’s disease, founder, laminitis, or other metabolic problems.

Size of an Edible Reward

The pieces of food should be no larger than the tip of your little finger. Horses are nose-breathers, so when they move, they close the back of their throat and seal their mouth to create a clear airway. Consequently, edible rewards must be small enough to be chewed and swallowed before the horse is asked to move. Edible tidbits motivate the horse to stay relaxed, focused, and eager to exhibit the requested behavior. If the horse becomes satiated, the “reward” will no longer be an effective incentive. However, most horses, once they learn a skill, are seldom interested in an edible reward and are quite content with praise.

Timing for Offering an Edible Reward

It takes practice to perfect one’s timing. The basic rules are:

Never offer food without asking your horse to do something. It can be as simple as taking a step forward or backward.
Always say “Nice” or another secondary reinforcer before offering a food reward.
The secondary reinforcer should be used immediately after the horse executes the requested behavior, and the primary reinforcer should be given no more than a few seconds later.

Be patient, because insisting that a horse exhibit a specific behavior transforms a request (positive reinforcement) into a demand or pressure (negative reinforcement), which is guaranteed to take the enjoyment out of the task and slow the learning process. Our rule at EBHRC is to request something no more than three times and then go on to something else. We never want to be so goal-oriented that we lose sight of our horse’s emotions. We do not want to insist or nag.

Body Position for Offering an Edible Reward

Body position is important because you do not want your horse nosing in your pocket or fanny pack. If you are standing in front of your horse, maintain a three-foot or one-meter distance between you and your horse’s head. Put the reward in your hand and reach under your horse’s head close to his chest so that he puts his head down, not toward you. Open your hand flat so that he doesn’t accidentally put a finger in his mouth.

If you are walking beside your horse, working in-hand, or your horse is on a longe line, your horse should halt and face the direction of travel without turning his body or head toward you. To offer an edible reward:

Stand at your horse’s shoulder facing the direction of travel.
Place the reward in the hand nearest the horse’s shoulder.
Reach your arm under your horse’s chin at his chest and open your hand.

If the horse is on the longe line, you can:

Ask your horse to stay.
Drop the longe line and quickly walk up to the horse.
Position yourself at its shoulder, facing the direction of travel.
Reward the horse as described above.
Ask your horse to stay again.
Pick up the longe line and let it slide through your hand as you back up into position.
Ask your horse to walk on.

Horse’s Age

The age of the horse makes no difference. Younger horses have excellent stamina but can be impulsive and lose focus easily. Older horses may tire quickly but excel in muscle coordination, focus, and self-control.

Using a Mix of Positive and Negative Reinforcement

When a horse feels relaxed and comfortable with positive reinforcement learning, negative reinforcement in the form of pressure and release can be used as long as it is momentary and the pressure is light enough that the horse does not feel the least bit forced. For example, when we teach the horse the turn on the forehand, we use a slight touch (pressure) with our hand behind the girth while we say “cross.” We do not expect a horse to understand our request the first or even the tenth time, but as the horse shifts its weight away from the pressure and/or makes a little bit of movement in the correct direction, we move our hand away (release), say “Nice,” and reward as we gently mold the behavior.

Unlike positive reinforcement, negative reinforcement in the form of pressure and release does not work on every horse. In fact, if not used carefully and sparingly, it takes the fun out of the horse-human relationship. Luke, a horse now in my care, spent the first few years of his life in a training program that relied solely on negative reinforcement in the form of pressure and release, and it caused him to be very distrustful of humans. Unfortunately, he eventually turned the tables and began applying pressure and release on his previous owners. His “pressure” included attacking, biting, kicking, striking, stomping, and rearing, and he didn’t “release” until they left him alone. Luke’s owners couldn’t sell him with those problems, so I agreed to take him in. Positive reinforcement is working very well with him, and he seldom exhibits those behaviors.

Using Consequences

When Luke first came here, I was compelled to send him away from us (humans and horses) a few times due to his aggression. Those time-outs would last only two to five minutes. When he approached with his head down in apology, he was allowed back into the herd; otherwise, he had to stay away and think about his behavior. Most of all, prevention has been our best strategy to solve problems with aggressive behavior.

Final Thoughts

Understanding the basic concepts of operant conditioning allows the handler to problem-solve and devise strategies to teach horses a variety of movements using praise and edible rewards. The advantages I have personally noticed with this training method are that horses are calm, curious, and eager to work with their handlers and are well-behaved and engaged when they are around humans in general. What begins as a teaching technique quickly develops into a mutually enjoyable conversation. The handler asks a question; the horse rewards the handler with as good a response as he can come up with, based upon his skill and knowledge; the handler rewards the horse for the response and asks another question. As these conversations develop, so does the relationship, and the two parties, although different species, look forward to spending time together.

Best wishes,

Chris Forte
