ACQ
From USCBPWiki
Preliminary draft: please send comments to James Bonaiuto
Background
Augmented Competitive Queuing (ACQ) is inspired by the scenario of Alstermark's Cat and based on Competitive Queuing (CQ).
Competitive Queuing (CQ)
CQ is a classic method of sequence production based on a parallel representation. Entire sequences are represented by neurons in a sequence storage layer. Each of these neurons projects to every neuron in a parallel planning layer, whose neurons represent sequence elements. In this layer, the position of an element in the sequence is determined by the firing rate of its corresponding neuron (with the highest firing rate denoting the first element in the sequence). Therefore the connection weights between neurons in the sequence storage layer and those in the parallel planning layer encode the serial order of elements in each sequence. Each neuron in the parallel planning layer projects to a corresponding neuron in a competitive choice layer. In addition to input from the parallel planning layer neurons, each neuron in the competitive choice layer receives recurrent excitatory input (from itself) and lateral inhibition from all other neurons in the layer. In this way, the competitive choice layer implements a winner-take-all (WTA) process. The winning neuron then inhibits its corresponding neuron in the parallel planning layer (inhibition of return) so that it is removed from subsequent competition. In this way the spatial pattern of activity in the parallel planning layer is parsed into a temporal pattern of activation of neurons in the competitive choice layer.
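The read-out described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the model's implementation: the recurrent excitation and lateral inhibition of the competitive choice layer are abstracted to a simple argmax, and the function name, element labels, and weight values are hypothetical.

```python
def competitive_queuing(weights):
    """Replay one stored sequence from its planning-layer activity gradient.

    `weights` maps each sequence element to a hypothetical connection weight
    from the sequence storage neuron to that element's planning neuron.
    A larger weight stands for an earlier position in the sequence.
    """
    # Parallel planning layer: all elements are active at once, graded by weight.
    planning = dict(weights)
    produced = []
    while planning:
        # Competitive choice layer: the WTA dynamics (self-excitation plus
        # lateral inhibition) are abstracted here as picking the most active
        # planning neuron.
        winner = max(planning, key=planning.get)
        produced.append(winner)
        # Inhibition of return: the winner suppresses its own planning neuron,
        # removing it from subsequent competition.
        del planning[winner]
    return produced


if __name__ == "__main__":
    # Hypothetical weights for a three-element sequence.
    print(competitive_queuing({"B": 0.6, "A": 0.9, "C": 0.3}))
```

With these illustrative weights, the spatial gradient over A, B, and C is parsed into the temporal order A, B, C.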
Alstermark's Cat
Alstermark et al. (1981) investigated the role of propriospinal neurons in forelimb movements of the cat through a series of lesion studies. While intended to illuminate the motor control circuitry in the cat's spinal cord, these experiments also happened to illustrate interesting aspects of the cat’s motor planning and learning capabilities. The experimental setup in this study consisted of a piece of food placed in a horizontal tube facing the cat. In order to eat the food, the cat was required to reach its forelimb into the tube, grasp the food with its paw, and bring the food to its mouth. Lesions of the cortico- and rubrospinal tracts in spinal segment C5 interfered with the cat’s ability to grasp the food, but not to reach for it. In contrast, both reaching and grasping were defective after a corresponding transection of the cortico- and rubrospinal tracts in C2. It was concluded that the C3-C4 propriospinal neurons can mediate the command for reaching but not for grasping, which instead can be mediated via interneurons in the forelimb segments C6-Th1. Not reported in the paper is the account (B. Alstermark, personal communication, 1990) that after the lesion, the cat would reach inside the tube and repeatedly attempt to grasp the food and fail. However, these repeated failed grasp attempts would eventually succeed in displacing the food from the tube by a raking movement, and the cat would then bend its head down, grasp the food from the ground with its jaws, and eat it. After only 2 or 3 trials, the cat began to rake the food out of the tube, a more efficient process than random displacement by failed grasps.
It is assumed that before the lesion the cat already had a motor program for getting the food out of the tube and eating it. Rather than learning to parameterize the motor schemas of an entirely new skill, or refining and tuning the motor schemas of an already-learned skill, the cat appears to be modifying some sort of decision variable that controls which motor schema to execute at a particular time. The fact that after lesioning it took only a few trials for the cat to modify this motor program suggests that this is a form of learning that takes place on a faster time scale than classical models of motor learning.
Augmented Competitive Queuing (ACQ)
Augmented Competitive Queuing (ACQ) seeks to explain the scenario of Alstermark's cat in terms of interacting and competing motor schemas, in which the most desirable action out of the set of physically possible actions is chosen at each time step. Note that we dissociate the executability of an action (determined by available affordances) from its desirability (learned via temporal difference (TD) learning).
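The separation of executability from desirability can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration rather than part of the ACQ model as specified: executability is reduced to a boolean per action, desirability is a single scalar per action, the update is a generic TD(0)-style rule, and the function names, action labels, learning rate, discount factor, rewards, and initial values are all hypothetical.

```python
def select_action(executable, desirability):
    """Return the most desirable action among those currently executable."""
    # Executability acts as a gate: only physically possible actions compete.
    candidates = [a for a in desirability if executable.get(a, False)]
    if not candidates:
        return None
    return max(candidates, key=desirability.get)


def td_update(desirability, action, reward, next_action, alpha=0.5, gamma=0.9):
    """Apply one generic TD(0)-style update to the desirability of `action`.

    `alpha` (learning rate) and `gamma` (discount factor) are illustrative
    values, not parameters taken from the ACQ model.
    """
    # A trial that ends after `action` has no successor value.
    next_value = desirability[next_action] if next_action is not None else 0.0
    td_error = reward + gamma * next_value - desirability[action]
    desirability[action] += alpha * td_error
    return td_error


if __name__ == "__main__":
    # Hypothetical values: grasping starts out more desirable than raking.
    desirability = {"grasp": 1.0, "rake": 0.2}
    executable = {"grasp": True, "rake": True}
    print(select_action(executable, desirability))

    # Assumed outcome after a lesion: grasp attempts earn no reward,
    # while a raking movement does.
    td_update(desirability, "grasp", reward=0.0, next_action=None)
    td_update(desirability, "grasp", reward=0.0, next_action=None)
    td_update(desirability, "rake", reward=1.0, next_action=None)
    print(select_action(executable, desirability))
```

In this toy run, two unrewarded grasps and one rewarded rake are enough to reverse the preference. This loosely mirrors the few-trial switch described for the cat, but the speed of the switch depends entirely on the assumed learning rate and initial values.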
