Treat or Edible Reward – Understanding the Science of Learning

 

Indiscriminately offering treats can transform a lovely horse into a beggar, mugger, and biter, and the horse owner into a candy dispenser. This article will teach you how to use treats in a way that will improve rather than harm your horse-human relationship. I will use some technical terms, so be prepared to read the technical section a few times. An understanding of these concepts, however, will increase your ability to construct precise lesson plans that develop your horse’s potential quickly and without stress.

Providing edible rewards to motivate animals to perform a task is not a new concept. In fact, it was codified in the Law of Moses several thousand years ago: “You shall not muzzle an ox when it treads out the grain” (Deuteronomy 25:4). Edible rewards encourage dolphins to assist deep sea divers, dogs to search and rescue, and mice to maneuver through complicated mazes.

Many horse trainers have found that edible rewards have produced well-trained horses who engage in happy partnerships with their riders.  One of the most unusual uses of edible rewards in horse-training was by a veterinarian, Dr. William Key, in the late 1800s.  By sugar coating cards inscribed with numbers and the letters of the alphabet, Dr. Key taught his horse, Jim, to read, write, and solve simple math problems.

About the same time Dr. Key was working with Jim, Edward L. Thorndike, an early pioneer in psychology and behavior, was conducting experiments on learning. His work laid the foundation for what B.F. Skinner later named “operant conditioning,” a term that describes how animals learn. Thorndike’s Law of Effect (published in 1905) states that behaviors (movements, actions, responses to questions, etc.) followed by satisfying consequences tend to be repeated, while behaviors that produce an unsatisfying consequence tend not to be repeated.

In the 1930s through the 1950s, a behavioral psychology researcher, B.F. Skinner, carried Thorndike’s research on operant conditioning even further. He labeled each step of the learning process. Skinner hypothesized that all behavior can be described as a result of reinforcement (increasing the frequency of a behavior) and consequences (decreasing the frequency of a behavior).

Reinforcement – (those things that increase the frequency of a specific behavior) is broken down into two categories:

  • Positive reinforcement (the use of a wage or reward that satisfies the subject’s natural desire, such as food, affectionate physical contact, etc.) and
  • Negative reinforcement (withdrawing a negative stimulus when the requested behavior is exhibited, such as pressure and release).

Consequences – (those things that discourage a specific behavior) are broken down into three categories:

  • Positive punishment (imposing an uncomfortable stimulus for exhibiting a specific behavior such as smacking a horse for biting),
  • Negative punishment (removing something comfortable to discourage a behavior such as depriving a horse of the companionship of other equines for rough-housing too aggressively), and
  • Extinguishment (ignoring unwanted behavior so that it is no longer exhibited such as not offering a reward when a horse performs an unrequested trick).
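
For readers who think in code, the five categories above can be summarized as a small lookup. The Python sketch below is only an illustration: the names `Category` and `classify` and the labels "add", "remove", and "ignore" are hypothetical, and it assumes each case can be described by what the handler does with a stimulus and whether the horse finds that stimulus pleasant.

```python
from enum import Enum


class Category(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"
    EXTINGUISHMENT = "extinguishment"


def classify(action, pleasant=True):
    """Name the category for a handler's response to a behavior.

    action:   "add", "remove", or "ignore" (hypothetical labels)
    pleasant: whether the stimulus is something the horse likes
    """
    if action == "ignore":
        return Category.EXTINGUISHMENT
    if action == "add":
        # Adding something: a reward if pleasant, a punishment if not.
        return (Category.POSITIVE_REINFORCEMENT if pleasant
                else Category.POSITIVE_PUNISHMENT)
    if action == "remove":
        # Removing something: a punishment if pleasant, a release if not.
        return (Category.NEGATIVE_PUNISHMENT if pleasant
                else Category.NEGATIVE_REINFORCEMENT)
    raise ValueError(f"unknown action: {action!r}")


# Examples taken from the lists above
print(classify("add", pleasant=True).value)      # offering food
print(classify("remove", pleasant=False).value)  # pressure and release
print(classify("add", pleasant=False).value)     # smacking a horse for biting
print(classify("remove", pleasant=True).value)   # taking away companionship
print(classify("ignore").value)                  # no reward for an unrequested trick
```

In this sketch, “positive” and “negative” describe only whether something is added or taken away, not whether the experience is good or bad for the horse.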

Reinforcers are the reward tools used in positive reinforcement to motivate the subject to change behavior. Primary reinforcers are those that the subject naturally enjoys, including food, affection, praise, and free time. Secondary and tertiary reinforcers are things and actions a subject learns to like because they are “bridges” to primary reinforcers. They include money, words, clicks, and other signals that indicate a primary reinforcer is coming.

Primary reinforcers are offered when an animal has responded to the best of his understanding and physical ability to the handler’s request to perform a task. The act of offering primary reinforcers for each improvement of the performance of the task is called “shaping.”  Consistently rewarding and extinguishing (ignoring) behaviors until the desired movement is perfected is called “molding.”  Asking an animal to do several things in sequence is called “chaining.”

In human terms, consider the emotional difference of these three statements:

  1. If you complete this project by July 29th, your paycheck will contain a substantial bonus. (positive reinforcement)
  2. Your job description includes completing this project by July 29th. (negative reinforcement)
  3. If you don’t complete this job by July 29th, you will be fired. (positive punishment to discourage laziness)

The average employee will complete the job by July 29th, regardless of the motivation. However, numerous scientific studies indicate that subjects who are offered a reward, such as the bonus in the first example, tend to feel less anxious and more motivated, and to finish the project more quickly, than the subjects in the second or third example. The subjects in the third example will feel more stress and finish the project more slowly than the others. Similar studies show that the results are the same whether the subject is a mammal, fish, bird, or other creature.

I teach how to use positive reinforcement to develop equine intellectual and physical potential and to rehabilitate unsafe horses.  At the beginning of each training session we make sure that the horse is relaxed and focused by asking for responses to simple requests during the grooming and tacking process.  We may ask it to touch various grooming utensils and tack; lift feet; stay; come; cross front or hind legs; turn on the haunches or hindquarters; and back or bring forward individual feet.

Our seven-point procedure for requesting a behavior is as follows:

  1. Preparation – We say the word “Aaand,” which is our half-halt word that tells the horse to focus or prepare for a request.
  2. Request – The vocal word for the requested behavior and/or the body language gesture. This is the most difficult step of the process. Sometimes we act out the action we desire so that the horse can imitate us. For example, we may put our feet on a pedestal to encourage a horse to do the same. At other times we may actually place the horse’s foot on the pedestal.
  3. Tertiary reinforcer – The encouraging word “Good” is a tertiary reinforcer or bridge that informs the horse that it is on the right track.
  4. Response – The behavior the horse exhibits in response to the request.
  5. Secondary reinforcer – Once the horse has exhibited the behavior to the best of his understanding and ability, the handler says “Nice” (the secondary reinforcer or bridge).
  6. Primary reinforcer – Offered immediately after the secondary reinforcer. It may be an edible reward, a caress, or praise.
  7. Extinguishment – If the behavior does not approximate the request, we skip steps 5 and 6 and ignore the behavior. We either repeat the request, break the request down into smaller requests, or move on to another activity.

Sometimes the tertiary reinforcer (step 3) is skipped if the horse responds quickly.  At other times, especially when chaining (more about chaining below), the secondary and primary reinforcers (steps 5 and 6) are skipped and another question is asked.
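
The seven-point procedure and its two shortcuts can be read as a simple decision sequence. The following Python sketch is one way to write it down, not a definitive version: the function `request_steps` and its parameters `response_ok`, `quick_response`, and `chaining` are hypothetical names chosen for illustration.

```python
from typing import List


def request_steps(cue: str, response_ok: bool,
                  quick_response: bool = False,
                  chaining: bool = False) -> List[str]:
    """Return the handler's actions, in order, for one request."""
    steps = ['say "Aaand" (preparation)',
             f'give the cue "{cue}" (request)']
    if not quick_response:
        # Step 3 is skipped when the horse responds quickly.
        steps.append('say "Good" (tertiary reinforcer)')
    steps.append("observe the horse's response")
    if not response_ok:
        # Step 7: skip steps 5 and 6 and ignore the behavior.
        steps.append("ignore the behavior (extinguishment)")
        return steps
    if not chaining:
        # Steps 5 and 6 are skipped mid-chain; the next request follows.
        steps.append('say "Nice" (secondary reinforcer)')
        steps.append("offer the primary reinforcer")
    return steps


for step in request_steps("cross", response_ok=True):
    print(step)
```

Calling the function with `response_ok=False` ends the sequence at extinguishment, after which the handler repeats the request, breaks it into smaller requests, or moves on.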

Once the horse is tacked-up, this procedure is used during in-hand work, longeing, and mounted work.  Special attention is given to ensure that the horse stays relaxed and focused at all times.

When a horse has learned several behaviors, the behaviors can be chained together in a variety of sequences such as: walk four strides, trot five strides, canter three left-lead strides, flying lead change, canter three right-lead strides, trot two strides, and halt. Each transition of the sequence is initiated by a half-halt: “Aaand” and a squeeze of the reins, lead-rope, or longe line. The handler says “Nice!” (the secondary reinforcer) and offers a primary reinforcer at the end of the sequence. The handler may also say “Good,” the tertiary reinforcer, during the sequence.
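
As a rough illustration, the chained sequence above can be laid out as an ordered list of cues. In this Python sketch the function name `chain_cues` is hypothetical, the optional “Good” is left out, and it assumes the first behavior is also preceded by a half-halt, as in step 1 of the procedure.

```python
def chain_cues(behaviors):
    """Return the handler's cues, in order, for a chained sequence."""
    cues = []
    for behavior in behaviors:
        # Every transition starts with the half-halt.
        cues.append('"Aaand" + squeeze (half-halt)')
        cues.append(behavior)
    # The reward comes once, at the end of the whole chain.
    cues.append('"Nice!" (secondary reinforcer)')
    cues.append("primary reinforcer")
    return cues


sequence = [
    "walk four strides",
    "trot five strides",
    "canter three left-lead strides",
    "flying lead change",
    "canter three right-lead strides",
    "trot two strides",
    "halt",
]

for cue in chain_cues(sequence):
    print(cue)
```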

Appropriate edible rewards – Sugar-free extruded hay pellets are ideal, as they don’t cause pockets to become soggy and they are liked by most horses. Although messy, tiny bits of apple, carrot, grapes, or watermelon are loved by many horses. Do not feed sugar, fruit, or carrots if your horse is overweight or insulin resistant, or has Cushing’s disease, founder, laminitis, or other metabolic problems.

Size of an edible reward – The pieces of food should be no larger than the tip of your little finger.

  • Horses are nose-breathers so, when they move, they close the back of their throat and seal their mouth to create a clear airway. Consequently, edible rewards must be small enough to be chewed and swallowed before the horse is asked to move.
  • Edible tidbits motivate the horse to stay relaxed, focused, and eager to exhibit the requested behavior. If the horse becomes satiated, the “reward” will no longer be an effective incentive.

Timing for offering an edible reward – It takes practice to perfect one’s timing. The basic rules are:

  • Never offer food without asking your horse to do something. It can be as simple as taking a step forward or backward.
  • Always say “Nice” or another secondary reinforcer before offering a food reward.
  • The secondary reinforcer should be used immediately after the horse executes the requested behavior, and the primary reinforcer should be given no more than a few seconds later.
  • Be patient. Insisting that a horse exhibit a specific behavior transforms a request (positive reinforcement) into a demand or pressure (negative reinforcement). Forcing a horse to exhibit a specific behavior turns pressure into punishment, which is guaranteed to take the enjoyment out of the task and slow the learning process. Our rule at EBHRC is to request three times and then go on to something else, so that we do not get so excited about reaching our goal that we lose sight of our horse’s emotions.
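
The “request three times, then go on to something else” rule can be sketched as a short loop. This Python example is only an illustration: `ask_up_to_three_times` and `try_request` are hypothetical names, and `try_request` stands in for making the request once and judging whether the response approximates it.

```python
def ask_up_to_three_times(try_request, max_requests=3):
    """Make a request at most max_requests times.

    try_request: hypothetical callable that makes the request once and
                 returns True when the response approximates it.
    Returns "reward" on the first acceptable response, else "move on".
    """
    for _ in range(max_requests):
        if try_request():
            return "reward"   # say "Nice", then offer the reward
    return "move on"          # go on to something else


# Example: the horse answers correctly on the second request.
responses = iter([False, True])
print(ask_up_to_three_times(lambda: next(responses)))
```

Capping the number of requests keeps the handler from sliding from a request into a demand.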

Body position for offering an edible reward – Body position is important because you do not want your horse nosing in your pocket or fanny pack. If you are standing in front of your horse, maintain a distance of three feet (one meter) between you and your horse’s head, put the reward in your hand, and reach under your horse’s head close to his chest so that he puts his head down, not toward you. Open your hand flat so that he doesn’t accidentally put a finger in his mouth.

If you are walking beside your horse, working in-hand, or your horse is on a longe line, your horse should halt and face in the direction of travel, not turn his body or head toward you. So, when offering an edible reward, stand at your horse’s shoulder facing the direction of travel, place the reward in your inside hand, reach your arm under your horse’s chin at his chest, and open your hand. If the horse is on the longe line, ask your horse to stay, drop the longe line, walk quickly up to the horse, and position yourself at its shoulder, looking toward the direction of travel. Reward the horse as described above, ask your horse to stay, then pick up the longe line and let it slide through your hand as you back up into position before asking your horse to walk on.

Horse’s age – The age of the horse makes no difference.  Younger horses have excellent stamina but they can be impulsive and lose focus easily. Older horses may tire quickly but they excel in muscle coordination, focus, and self-control.

Using a mix of positive and negative reinforcement – When a horse feels relaxed and comfortable with positive reinforcement learning, negative reinforcement in the form of pressure and release can be used as long as it is momentary and the pressure is light enough that the horse does not feel the least bit forced.  For example, when we teach the horse the turn on the forehand we use a slight touch (pressure) with our hand behind the girth while we say “cross.”  We do not expect a horse to understand our request the first or even the tenth time, but, as the horse shifts its weight away from the pressure and/or makes a little bit of movement in the correct direction, we move our hand away (release), say, “Nice” and reward as we gently mold the behavior.

Unlike positive reinforcement, negative reinforcement in the form of pressure and release does not work on every horse. Luke spent the first few years of his life in a training program that relied solely on negative reinforcement in the form of pressure and release. It appears that, at times, the negative reinforcement became positive punishment. Unfortunately, he is a very bright youngster who eventually turned the tables and began applying pressure on his owners. His “pressure” included attacking, biting, kicking, striking, stomping, and rearing, and he didn’t “release” until he was left alone. Luke’s owners couldn’t sell him with those problems, so I agreed to take him in. Positive reinforcement is working very well with him, and he is no longer exhibiting those behaviors.

Using Consequences – While we have never found a need to use positive punishment at EBHRC, in the past I used negative punishment when I was compelled to send Luke away from us (humans and horses) a few times due to his aggressive behavior. Those time-outs would last only two to five minutes. When he approached with his head down in apology, he was allowed back into the herd; otherwise, he had to stay away and think about his behavior. Most of all, prevention has been our best strategy to solve problems with aggressive behavior.

We use extinguishment frequently to help mold behaviors.  We simply ignore incorrect answers and only reward those that are correct.  In this way, horses quickly navigate their way to the correct behavior.

Final thoughts –

Understanding the basic concepts of operant conditioning allows the handler to problem-solve and devise strategies to teach horses a variety of movements using edible rewards and other motivators. The advantages I have personally noticed with this training method are that horses are calm, curious, and eager to work with their handlers, and are well-behaved and engaged when they are around humans in general. What begins as a teaching technique quickly develops into a mutually enjoyable conversation. The handler asks a question, the horse rewards the handler with a correct response, and the handler rewards the horse for the correct response and asks another question. As these conversations develop, so does the relationship, and the two parties, although different species, look forward to spending time together.

Best wishes,

Chris Forte

Two more articles on the importance of edible rewards:

USING A REWARD SYSTEM

DEVELOPING CONFIDENCE