Present at the January 2014 LA Workshop:
My first book, Brains, Machines and Mathematics (McGraw-Hill, 1964) set the theme that the brain is not a computer in the current technological sense, but we can learn much about machines from studying brains, and much about brains from studying machines. I have thus always worked for an interdisciplinary environment in which computer scientists and engineers can talk to neuroscientists and cognitive scientists. At the University of Massachusetts I helped found the Center for Systems Neuroscience, the Cognitive Science Program (where my contribution focused on the linkage of computer science, linguistics and computational neuroscience), and the Laboratory for Perceptual Robotics.
My research has long included a focus on mechanisms underlying the coordination of perception and action. This is tackled at two levels: via schema theory, which is applicable both in top-down analyses of brain function and human cognition as well as in studies of machine vision and robotics; and through the detailed analysis of neural networks, working closely with the experimental findings of neuroscientists. My group prepared the first computational model of mirror neurons and conducted some of the key initial imaging studies of the human mirror system. We continue to develop further insights into the monkey brain and use them to develop our theory of the evolution of human language. More specifics are offered in the abstracts below for a few papers plus my introductory talk at the Workshop.
1. Metaphorical processing in the brain: 1) understanding how metaphorical language related to disgust/morality activates emotion-related brain regions, how this activation is modulated by context, and how it impacts decision making; 2) understanding how lesions in motor-related brain regions impact metaphorical processing of action phrases. The goal of these studies is to understand the involvement of sensory-motor brain regions during metaphorical processing and whether these regions are essential to it.
2. Mirror Neurons: 1) how does interacting with dissimilar others modulate activity in the MNS? 2) how is the MNS impacted in neurological disorders such as dyspraxia and stroke? The goal of these studies is not only to gain a better understanding of the human MNS, but also to build a solid scientific basis for translational work.
My PhD work is organized around the question of how our nervous system allows us to talk about the world we perceive, whether to describe in a compact way what has caught our attention in the environment, or conversely to match descriptions we hear with a subpart of a visual scene. Focusing on language as embedded within the action-perception cycles in which living organisms are constantly engaged, I am developing a (neuro)computational system-level model that captures some key aspects of the dynamic interactions between language and vision/attention (in particular through the proxy of eye movements). Jinyong Lee, a former PhD student in our lab, proposed a model of the production of visual scene descriptions. Turning to comprehension, my goal is to offer an explicit computational model of how visual processes, grammatical processes, and world knowledge, distributed over multiple functional routes, can dynamically cooperate during a sentence-picture matching task. The hope is to integrate and simulate some key empirical results from neuropsychology (agrammatism), psycholinguistics (eye tracking based on the visual world paradigm), and neurolinguistics (the multi-stream distributed architecture of the language system).
In order to achieve this goal, I am currently exploring a few sub-issues:
- Schema Theory. Given the scope of the model, I am exploring the possibility of defining the computations in terms of schema theory, focusing on coarse-grained functional decompositions that start from the top down (from behavior toward brain systems). Such a coarser but integrative approach complements the more bottom-up (from neuron to function) narrow-scope approaches that have focused on providing detailed models for specific sub-functions (e.g. syntactic processing, saliency maps, …) but which often lack insight into how those integrate into an overall process that supports behavior. Briefly stated, schema theory offers a distributed model of computation in which functional units can dynamically self-organize, based on a cooperative computation paradigm, to adaptively support ongoing behavior. Schema-level models offer a necessary intermediate step in bridging from behavior to neural circuits (or, stated differently, in bridging between psychology and neuroscience).
- Template Construction Grammar. I am working on expanding schema theory to incorporate language schemas, exploring the use of construction grammar as a way to model their functional content. Since schema theory uses a cooperative computation paradigm, it yields a grammatical processing model in which, as utterance inputs are received, constructions dynamically compete and cooperate to self-organize and generate (competing) construction assemblages, each representing a form-meaning mapping hypothesis. The Semantic Representation (SemRep) models the language/vision interface. It offers a way to model the linguistic conceptualization process, which can interact on one side with the visual and attentional systems, and on the other with the language schemas (as input during production or as output during comprehension). This general effort can therefore be seen as a way to operationalize and integrate into brain theory some key cognitive-linguistics insights regarding grammar and meaning.
- Synthetic ERP. From a more methodological point of view, I am interested in the question of how neural and schema models could be more rigorously tested against EEG neuroimaging data. This question is particularly important for neurolinguistics, since EEG recordings, and in particular ERPs, provide a unique window into online brain processes. Although the work is still preliminary, I have started to develop a framework that would allow the coupling of neural network models to a realistic forward model of the EEG signal (including realistic head, cortical surface, and lead-field models). The goal is to build a synthetic EEG tool which, after mapping the model's processing units (neural layers or schemas) onto 3D cortical surface coordinates, can translate the model's processing activity into EEG source magnitudes (electric dipoles) and, from there, simulate the corresponding EEG signal using the forward model. Such methods would open new avenues for testing computational neurolinguistic models against EEG data. Crucially, this also turns out to be a great way to quantitatively and formally analyze some interesting computational and neuroinformatics challenges that the field of neurolinguistics faces.
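The core of such a forward model is a linear projection of source activity through a lead-field matrix. The following is a minimal sketch only, with purely hypothetical dimensions and lead-field values; a realistic pipeline would derive the lead field from head and cortical surface models rather than hand-coding it:

```python
# Minimal sketch of a synthetic-EEG forward model. All numbers here are
# hypothetical stand-ins: a realistic lead field comes from a head model.
# Each model unit (neural layer or schema) is mapped to a cortical dipole;
# scalp potentials are a linear mixture of dipole magnitudes through the
# lead-field matrix.

def simulate_eeg(lead_field, source_activity):
    """lead_field: n_electrodes x n_sources gains.
    source_activity: n_sources x n_times dipole magnitudes.
    Returns n_electrodes x n_times simulated scalp potentials."""
    n_times = len(source_activity[0])
    return [[sum(lead_field[e][s] * source_activity[s][t]
                 for s in range(len(source_activity)))
             for t in range(n_times)]
            for e in range(len(lead_field))]

# Toy example: two model units (dipoles), three electrodes, four time steps.
lead_field = [[0.8, 0.1],
              [0.4, 0.5],
              [0.1, 0.9]]          # gain of each source at each electrode
source_activity = [[0, 1, 2, 1],   # activity of the unit mapped to source 1
                   [1, 0, 1, 2]]   # activity of the unit mapped to source 2
eeg = simulate_eeg(lead_field, source_activity)
```

The design point is the separation of concerns: the model supplies only source magnitudes over time, while all anatomical detail lives in the lead field, so the same simulated activity can be re-projected under different head models.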
Modeling work has focused on the development of Embodied Construction Grammar, a modeling framework for implementing representations of linguistic knowledge and use and their integration with embodied simulation.
Experimental work has tested predictions of theories of simulation and language use: effects of language on visual and auditory perception and on motor control, as well as effects of perception and motor control on language processing. In particular, work has focused on effects of grammatical structures on simulation, and on effects of simulation on perception and motor control while driving a motor vehicle.
Professor Aude Billard is head of the Learning Algorithms and Systems Laboratory (LASA) at the School of Engineering at EPFL. She received an M.Sc. in Physics from EPFL (1995), an M.Sc. in Knowledge-based Systems (1996) and a Ph.D. in Artificial Intelligence (1998) from the University of Edinburgh. She is the recipient of the Intel Corporation Teaching Award, the Swiss National Science Foundation career award (2002), the Outstanding Young Person in Science and Innovation award from the Swiss Chamber of Commerce, and the IEEE-RAS Best Reviewer Award. She served as an elected member of the Administrative Committee of the IEEE Robotics and Automation Society for two terms (2006-2008 and 2009-2011). She was a keynote speaker at the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) in 2005, general chair of the IEEE International Conference on Human-Robot Interaction in 2011, and co-general chair of the IEEE International Conference on Humanoid Robots in 2006. Her research on human-robot interaction and robot programming by demonstration has been featured in numerous premier venues (BBC, IEEE Spectrum) and has received seven best paper awards at major robotics conferences, including ICRA, IROS and RO-MAN.
Our work focuses on the design of algorithms by which a robot can learn new skills from observing skills performed by humans. This is known as “robot programming by demonstration” (PbD) or “robot learning from demonstration”. PbD is fundamentally interdisciplinary and cannot be understood without a thorough reading of the literature on how humans, and in particular children, acquire new skills through imitation. The ability to imitate others, and in particular the ability to learn by imitating, has been a central topic of developmental and cognitive psychology and, more recently, of neuroscience. The phenomenon is complex, and its study touches on fundamental cognitive processes, such as the recognition of self and others, the existence of internal models of human motion, and theory of mind. Although some of these cognitive skills are likely not needed for the next generation of robots, others, such as the ability to recognize and predict human action, are of direct use in ensuring robust and safe robot movement in the presence of humans.
Since 2002, we have pursued the development of computational models of human imitation in parallel with the development of core robotic controllers. These models entail neural models of the cortical pathways behind visuo-motor imitation. For instance, in [1, 2] we exploited information provided by lesion studies to better elucidate the role of specific neural impairments in visuo-motor imitation, such as those displayed by apraxic patients. In [4], we explored the use of dynamic neural fields to model the mechanisms underlying some of the neural processes fundamental to motor imagery and imitation, and in [3] we offered two hypotheses on the pathways underlying the ideomotor principle. One of these hypotheses was confirmed by a follow-up experimental study conducted by Bertenthal and colleagues [Boyer et al., Acta Psychologica, 319, 2012].
These computational models form a core component of the fundamental research in our group that supports the design of similar controllers in robots. For instance, in [2] we introduced the notion of force-based modulation of a primitive dynamical system to explain the curvature of human reaching movements when trajectories are calculated to avoid hitting the body. We recently revisited this concept in the context of obstacle avoidance in robot control [5]. Similarly, the detailed analysis of the dynamics of neural fields, which we developed in [4], inspired the statistical approach to the estimation of analytical dynamical systems in the context of robot control [6].
Our current efforts are directed at better understanding how coordinated patterns of sensori-motor control are performed in humans. In [7], we revisit the well-known visuo-motor coupling in reaching and analyse how this coupling is exploited to avoid moving obstacles. Surprisingly, the effect of unexpected changes, and how humans recover from them, has hardly been studied so far. The literature offers little insight into how humans re-plan a reaching motion on the fly in the face of very fast perturbations, such as when avoiding a fast-moving obstacle. I look forward to interacting with members of this consortium and to initiating collaborations between experimentalists and our group to conduct motion studies that look into such problems.
[1] Petreska, B., Billard, A., Hermsdörfer, J. and Goldenberg, G. (2010). Revisiting callosal apraxia: the right hemisphere can imitate the orientation but not the position of the hand. Neuropsychologia, 48(9), 2509-2516.
[2] Petreska, B. and Billard, A. (2009). Movement Curvature Planning through Force Field Internal Models. Biological Cybernetics, 100, 331-350.
[3] Sauser, E. and Billard, A. (2006). Parallel and Distributed Neural Models of the Ideomotor Principle: An Investigation of Imitative Cortical Pathways. Neural Networks, 19(3), 285-298.
[4] Sauser, E. and Billard, A. (2005). Three dimensional frames of reference transformations using gain modulated populations of neurons. Neurocomputing, 64, 5-24.
[5] Khansari Zadeh, S. M. and Billard, A. (2012). A Dynamical System Approach to Realtime Obstacle Avoidance. Autonomous Robots, 32(4), 433-454.
[6] Khansari, M. and Billard, A. (2011). Learning Stable Non-Linear Dynamical Systems with Gaussian Mixture Models. IEEE Transactions on Robotics, 27(5), 943-957.
[7] Lukic, Santos-Victor and Billard. Learning Robotic Eye-Hand Coordination. Biological Cybernetics, in press.
I’ve developed several models of the mirror system, been heavily involved in the design and implementation of the current version of BODB, and have experience performing fMRI experiments on awake, behaving non-human primates. I’m looking for targets for future modeling of the mirror system, consensus on a common format for experimental data reporting, and ideas for techniques in comparing model performance with experimental data.
There are two aspects of the INSPIRE proposal that my work touches upon:
Developing increasingly realistic neural models of the mirror system
In Richard Andersen’s lab I developed a bihemispheric model of LIP in spatial decision-making (J. Bonaiuto, Kagan, & Andersen, 2011). Some of Lisa Aziz-Zadeh’s prior work (Aziz-Zadeh et al., 2002, 2004, 2006) involved lateralization in the human mirror system. I’d like to ask: what are the differences in lateralization between the monkey and human mirror systems? Some of the early mirror neuron studies in the Parma group found mirror neurons with a preference for one hand or the other – do later studies bear this out? What about actions presented in one visual hemifield?
I’d like to develop a vision for MNS3, realizing (at least) the integrated view of MNS2 (J. Bonaiuto, Rosta, & Arbib, 2007), ACQ (J Bonaiuto & Arbib, 2010), and ILGA. While MNS2 and ILGA build on previous ideas of the role of the mirror system in feedback-based control of manual actions, ACQ proposes that mirror neurons are involved in decision-making for action selection. New work in Sven Bestmann’s lab also involves decision-making, but I’d like to develop experiments looking at the role of action observation in decision-making (i.e. planning a response to an observed action). This would involve consideration of some of Vittorio Caggiano’s data on mirror neuron response modulation by the action workspace (Caggiano et al., 2009) and subjective value of the observed action (Caggiano et al., 2012).
I’ve been involved in extending and applying the technique of synthetic brain imaging to models of decision-making (J. Bonaiuto & Arbib, 2014). I’d like to apply synthetic brain imaging to address Orban and Vanduffel’s linkages, and to develop neural models of mirror neurons that include adaptation in order to interface with Vittorio’s data. A lack of adaptation in single neurons, alongside robust adaptation in human fMRI, has been found in other areas such as the face patches, suggesting that the two may result from different mechanisms. I wonder if SBI could shed some light on this.
ERP studies look at suppression of the μ-rhythm as an indicator of mirror system activity, but no one seems to have a clear idea of what it means. I’d like to extend the MNS models to use synthetic ERP and EEG to model μ-rhythm suppression and suggest a role for it. Perhaps some of Roger Lemon’s data could suggest a mechanism for inhibiting automatic imitative actions. One question this raises is why such a mechanism would be needed if monkeys do not imitate. Perhaps the mechanism is required to suppress automatic imitation, but monkeys do not have cognitive control of it.
Neuroinformatics approaches to representing neurophysiology, connectivity, and neuroimaging data and linking it to models
I’m interested in working with experimentalists at the conference to develop data formats for capturing their results, and in developing a) specifications for specialized databases to store them, and/or b) strategies for summarizing them and federating with BODB. Example: CoCoMac has been resurrected: http://cocomac.g-node.org/drupal/?. The previous version did not allow public data entry, and our federation strategy was therefore to pull summaries of all existing connectivity data from it and store them in our database with links back to the full records in CoCoMac. The new version of CoCoMac will allow data collation from the public. Since their dataset will therefore expand more rapidly, our new strategy involves dynamically searching their site using their published API whenever a BODB user searches for SEDs to link to their models. Connection summaries are still stored in BODB with links back to CoCoMac, but this allows new CoCoMac entries to be imported into BODB on-the-fly.
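The on-the-fly federation strategy can be sketched as follows. This is a toy illustration only: the remote database is simulated in memory, and all record names, fields, and the link URL are hypothetical, not the actual CoCoMac API.

```python
# Toy sketch of on-the-fly federation with a remote connectivity database.
# Everything here is hypothetical: the remote store is an in-memory dict
# standing in for a published web API, and record IDs/URLs are invented.

REMOTE_DB = {  # stand-in for the remote connectivity database
    "F5->PF":  {"source": "F5", "target": "PF",  "strength": "present"},
    "F5->AIP": {"source": "F5", "target": "AIP", "strength": "present"},
}

local_summaries = {}  # local cache: lightweight summary + link to full record

def search_remote(region):
    """Stand-in for querying the remote site's published API."""
    return {rid: rec for rid, rec in REMOTE_DB.items()
            if region in (rec["source"], rec["target"])}

def federated_sed_search(region):
    """Search the remote database dynamically; import new entries on the fly."""
    hits = search_remote(region)
    for rid, rec in hits.items():
        if rid not in local_summaries:  # only new entries are imported
            local_summaries[rid] = {
                "summary": f"{rec['source']} projects to {rec['target']}",
                "link": f"http://remote.example.org/record/{rid}",  # hypothetical
            }
    return [local_summaries[rid] for rid in hits]

results = federated_sed_search("F5")
```

The design point is that the local database never bulk-copies the remote dataset: it caches lightweight summaries on demand, each carrying a link back to the authoritative remote record, so newly contributed remote entries become searchable locally without a scheduled re-import.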
I’m particularly interested in developing summary data formats for neurophysiological data. I’d like to know if experimentalists such as Roger Lemon, Vittorio Caggiano, and Leonardo Fogassi contribute their data to any current neuroinformatics resources such as NeuroDatabase, and if not, why? How can such data be summarized in a format compatible with BrainSurfer for visualization?
Quantification of the contribution of mirror neurons to the encoding of visual (e.g. perspective and features of the effectors) and contextual (e.g. possibility of interaction and reward) information of the observed motor acts. Neural encoding of action semantics. Interaction between observation and execution.
My research explores the acquisition and evolution of language by comparing communicative structures and development in apes and humans. I have worked mainly on the gestures of captive orangutans in European zoos, but I am starting to work on the gestures of other ape species. I focus on two main areas of inquiry: gesture’s potential role in the origin of language, and the dynamic relationship between early parent input and infant communicative development. Comparative studies hold great promise for understanding how parent-offspring interaction shapes the emergence of communicative conventions and for deciphering how we evolved the capacities that support the ability to master language.
In this workshop, I hope to gain a better understanding of how to connect my work to computational modeling and neuroscience. I hope that by increasing my fluency in the constraints and affordances of computational neuroscience, I will be able to design new empirical studies that will provide rich data for modeling, and in turn generate new predictions for empirical work. I also hope that this workshop will provide a framework to address the issues of data coding and integration present within the field of ape gesture research. If the workshop encourages gesture researchers, linguists, and computational modelers to engage in a discussion about best practices for the representation and sharing of data, it may lead to the development of new informatics tools and could potentially have an impact far beyond our individual research programs. My current projects focus on 1) gesture as a representational medium, 2) gesture as part of an integrated communicative system, and 3) the role of non-verbal input in the early learning/rearing environment.
Gesture as a representational medium
In both my ape and human studies I use a combination of observational and experimental approaches to ask how gesture conveys meaning. My work on orangutan gesture has revealed that more than half of their gestures have predictable, stable meanings (Cartmill & Byrne, 2010; 2011). In my human research, I have found that young children use gesture to modify or clarify spoken words, and that they use these gesture-speech combinations to perform communicative tasks beyond the scope of their spoken linguistic structures (Cartmill, Demir, & Goldin-Meadow, 2011; Cartmill, Hunsicker, & Goldin-Meadow, in press). In ongoing work, I assess the relationship between gesture use and social cognition and ask how apes and humans come to understand the relationship between gestures and their referents. I am particularly interested in understanding the mechanisms underlying representational gestures and how gesture differs from both vocal communication and action (Cartmill & Goldin-Meadow, 2012).
Gesture as part of an integrated communicative system
Both apes and humans have multimodal communication systems, but humans have a uniquely rich system characterized by both semantic and temporal integration of speech and gesture. I am interested in the nature of the relationships between gesture and speech in humans and between gesture and vocalization (or between gestures within a sequence) in apes. The distribution of communicative elements across modalities is of particular interest—are certain features routinely conveyed in only a single modality? What can the relationship between gesture and speech tell us about the cognitive state of the signaler? In my work on language development, I found that the gestures and speech of young children are never entirely redundant, even when used simultaneously to reference the same object (Cartmill, Hunsicker, & Goldin-Meadow, in press). In a somewhat similar vein, orangutans use repetitions of gestures in strategic ways, conveying something beyond a simple reiteration of the initial communicative goal (Cartmill & Byrne, 2007; Cartmill, 2008). In ongoing work, I explore changes in the gesture-speech relationship over development as children become proficient speakers as well as changes in juvenile apes’ use of gesture as they become more proficient communicators.
Non-verbal behavior as a source of input in the early learning environment
The human rearing environment provides uniquely rich input to infant development. Yet we know little about how this key intersection of biological and cultural processes emerged over evolutionary history. To address this question, we must compare communicative development between species by studying both the learners and the “learning environment” and by identifying the sources of variation in each. Human children vary greatly in how well and how quickly they acquire language. My recent work has focused on developing rich quantitative and qualitative measures to ascertain how variation in non-verbal parent input affects early word learning in English (Cartmill et al., 2013). Most of my work on human language involves typically-developing children. However, some of my studies include children with unilateral brain lesions, which allows me to ask questions about the impact of environmental and biological differences on early language development. Research suggests that these children may rely more heavily on gestural input than their typically-developing peers.
I’m currently a PhD candidate in cognitive psychology at UC Santa Cruz. Of my current work, what’s most relevant to the BODB collaboration is not my thesis research, but rather a couple of projects I have been working on in collaboration with Marcus Perlman using a video corpus (provided by the Gorilla Foundation) of the gorilla Koko interacting with human caregivers. One project explores Koko’s flexibility in learning and controlling vocal and breathing behaviors. We find that Koko exhibits control and learning across articulators, including her lungs, larynx, tongue, and lips, to a much greater degree than is typically considered possible in gesture-first theories of the phylogeny of language. (Of course, the degree of flexibility across these effectors is not uniform.) Further, these novel vocal and breathing behaviors are almost always coordinated with manual behavior, especially routines involving manipulating objects. This coordination points to a deep integration of the vocal and manual modalities in the great apes.
The second Gorilla Foundation project concerns Koko’s use of deictic and iconic gestures in combination with a stereotyped emblem, the ‘directed scratch’ gesture. We are documenting varying degrees of stability vs. elaboration in these gestures, and plan to analyze them for changes over time along these dimensions. In broader, more theoretical terms, I am also collaborating with Marcus Perlman and Joanne Tanner on an article considering possible sources of apparently iconic great ape behaviors from the perspectives of ontogenetic ritualization, innate repertoires, and iconicity. We will be giving a talk on this topic at Evolang, and plan to expand it for publication.
While most of my theoretical work has been descriptive and analogical, rather than computational, I look forward to interacting with researchers from more computationally grounded theoretical backgrounds. I think that my perspective can offer constructive challenges for these approaches about which behaviors and productive modalities are most important to model, and that the data I am collecting can offer useful comparisons to wild and captive great apes (especially gorillas) to gain a more complete picture of extant species’ behavioral potential. In return, I expect to be challenged constructively to develop more computationally grounded definitions of iconicity and related phenomena.
Perlman, M., Tanner, J., and Clark, N. (in prep). Iconicity and ape gesture.
Perlman, M. and Clark, N. (in prep). Multimodal analysis of novel vocal and breathing behavior of a human-fostered gorilla.
Clark, N. and Perlman, M. (in prep). Stereotypy and elaboration in the directed scratch requests of a human-fostered gorilla.
Gibbs, R. and Clark, N. (2012). No need for instinct: Coordinated communication as an embodied self-organized process. Pragmatics and Cognition, 20(2): 241-262.
My research looks at the structure and cognition of the comprehension of the visual language used in sequential images, as found in comics. This has involved the development of a theoretical model of event structure understanding, built as an extension of Jackendoff’s (Jackendoff, 1983, 1990) Conceptual Semantics, though dealing with events at a higher level of structure than what is typically described at the sentence level.
This approach has also focused on the “narrative grammar” of sequential images (Cohn, 2013b), which provides a structure for presenting event structure, directly analogous to the way that syntactic structure functions to present conceptual/semantic structure. Like syntax, this narrative structure is 1) separate from semantics, and 2) organized into a hierarchic constituent structure. These parallels between narrative structure and syntactic structure are further explored, and appear to be borne out, by experimental research, especially studies using event-related potentials to explore the neurocognition of sequential images. Across several studies, my colleagues and I have now shown that narrative categories have consistent distributional trends in visual sequences (Cohn, In Press), and that violations of semantic and/or narrative structure, using paradigms directly replicated from classic psycholinguistic studies, result in the same ERP effects as found in sentence processing: N400, P600, and LAN effects (Cohn, 2012b; Cohn, Jackendoff, Holcomb, & Kuperberg, Under Review; Cohn, Paczynski, Jackendoff, Holcomb, & Kuperberg, 2012).
These findings raise important questions about the overlap of neurocognitive functions across domains. Furthermore, such results are embedded in my larger research program exploring “visual language theory,” which posits that the entire graphic modality of communication (i.e., drawing, comics) is structured in ways comparable to language (Cohn, 2012a, 2013a). Thus, any theory of the structure, cognition, and evolution of the linguistic system must contend with not just two modalities of expression (verbal-auditory, visual-manual), but must also have the explanatory power to account for expression in the visual-graphic modality (which, I’ll note, is the only manner of expression that is truly human-specific: other animals at least use their bodies or make noises to communicate, but none manipulate the world to create conceptual expression in the way that we do when drawing).
Thus, as part of the INSPIRE project, I hope I can offer a unique viewpoint and learn what I can from the other contributors in aiming towards understanding the human communicative system as a whole.
Cohn, N. (2012a). Explaining “I Can’t Draw”: Parallels between the Structure and Development of Language and Drawing. Human Development, 55(4), 167-192.
Cohn, N. (2012b). Structure, meaning, and constituency in visual narrative comprehension. Doctoral Dissertation, Tufts University, Medford, MA.
Cohn, N. (2013a). The visual language of comics: Introduction to the structure and cognition of sequential images. New York: Bloomsbury.
Cohn, N. (2013b). Visual narrative structure. Cognitive Science, 37(3), 413-452. doi: 10.1111/cogs.12016
Cohn, N. (In Press). You’re a good structure, Charlie Brown: The distribution of narrative categories in comic strips. Cognitive Science.
Cohn, N., Jackendoff, R., Holcomb, P. J., & Kuperberg, G. R. (Under Review). The grammar of comics: Neural evidence for constituent structure in visual narrative comprehension. Psychological Science.
Cohn, N., Paczynski, M., Jackendoff, R., Holcomb, P. J., & Kuperberg, G. R. (2012). (Pea)nuts and bolts of visual narrative: Structure and meaning in sequential image comprehension. Cognitive Psychology, 65(1), 1-38. doi: 10.1016/j.cogpsych.2012.01.003
Jackendoff, R. (1983). Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, R. (1990). Semantic Structures. Cambridge, MA: MIT Press.
Recent and ongoing projects are particularly centered on the language-vision interface, including (i) how people reconcile linguistic constraints, general knowledge, and apprehension of the non-linguistic visual environment, and (ii) the on-line integration of the speech signal with visual communication channels such as gaze and, ultimately, gesture.
Previous and ongoing work combines eye-tracking and ERP methods to examine the time-course and relative priority of influences from the visual environment on comprehension processes. These findings have inspired the development of a connectionist computational model of situated comprehension that qualitatively models relevant eye-tracking and ERP findings from our own studies (Knoeferle et al., 2008; Crocker et al., 2011). Further, studies of the influence of referential speaker gaze on comprehension have revealed both facilitation and disruption of comprehension due to gaze, with evidence further suggesting that gaze is distinct from other attentional cues, likely due to its intentional status (Staudte and Crocker, 2011). Finally, we have also demonstrated the crucial interplay of linguistic and visual information during language learning (Alishahi et al., 2012). Goals within INSPIRE include:
- Determining whether the high priority of both linguistic (speaker gaze) and non-linguistic (scene events) visual information that we have observed is consistent with findings from nearby domains of inquiry.
- More fine-grained linking hypotheses to relate the mechanisms of language-vision integration with a variety of behavioral and neurophysiological measures.
- Better understanding of the origins of our findings, in terms of both the rational emergence of visually-situated communication, and the biological and neural mechanisms that support it.
One of my main interests in primate communication is to reconstruct the evolution of distinctive expressions using phylogenetic analysis based on acoustic and facial (FACS) measures. Laughter plays a special role here, since it is a human expression that shows notable variation in form and function, as well as important links between nonverbal and linguistic communication systems. Our research supports the postulate of gradual phylogenetic change in expressions, with human laugh sounds and faces evolving from great ape multimodal displays. These findings allowed us to develop a model of laughter evolution in which the specific phylogenetic changes in both the form and the occurrence of laughter can be situated along a trajectory of 13 million years, back to the last common ancestor of humans and extant great apes. My other main research interests in primate communication include mimicking, flexibility in production, and gestures. Current field research on orangutan and chimpanzee communication takes place in the Sepilok Orangutan Rehabilitation Centre (Borneo) and Chimfunshi Wildlife Orphanage (Zambia).
Development of neural network models of grammatical construction learning. These models learn mappings from sentence structure to meaning structure, and they currently demonstrate generalization to new grammatical constructions. The main weakness of this approach is the lack of a more embodied representation of meaning, beyond the current predicate-argument representations; we hope to remedy this. (Hinaut & Dominey, 2013)
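The core idea (constructions as form patterns whose slots are filled by content words, so a construction learned with one set of words generalizes to new ones) can be caricatured in a few lines. The lexicon, patterns, and role assignments below are invented for illustration and bear no relation to the actual reservoir-network implementation of Hinaut & Dominey (2013):

```python
# Illustrative sketch only: map a sentence to a predicate-argument meaning by
# abstracting content words into numbered slots, so the same construction
# generalizes to new content words.

CONTENT = {"john", "mary", "ball", "gave", "took", "hit"}  # toy lexicon

def abstract(sentence):
    """Replace content words with numbered slots; keep function words."""
    slots, pattern = [], []
    for w in sentence.split():
        if w in CONTENT:
            slots.append(w)
            pattern.append(f"X{len(slots)}")
        else:
            pattern.append(w)
    return " ".join(pattern), slots

# "Learned" constructions: form pattern -> (predicate, agent, object) slots
CONSTRUCTIONS = {
    "X1 X2 the X3": ("X2", "X1", "X3"),          # active: agent-verb-object
    "the X1 was X2 by X3": ("X2", "X3", "X1"),   # passive
}

def comprehend(sentence):
    pattern, slots = abstract(sentence)
    roles = CONSTRUCTIONS[pattern]               # look up the construction
    bind = {f"X{i+1}": w for i, w in enumerate(slots)}
    pred, agent, obj = (bind[r] for r in roles)
    return (pred, agent, obj)                    # predicate(agent, object)
```

Because the pattern abstracts away from particular content words, `comprehend("mary took the ball")` works even if that exact sentence was never seen, which is the kind of generalization the models above aim at.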
Human neurophysiology of meaning. To address this we are examining human brain function in the comprehension of sentences and images depicting meaningful human interaction. We have characterized a semantic network related to that described by Binder, using fMRI and DTI. We are characterizing the functional aspects of the network using EEG (ERP and time-frequency analyses). This work has been presented over the last 3 years at NLC.
Embodied self. Using the iCub robot, we are developing an embodied representation of self, and self-other relations. We believe that the self-other interactions are at the basis of human communication (including language) and so we are dedicated to understanding the self-other relationship, and thus must understand the self.
Autobiographical memory and narrative self: We are interested in the emergence of the linguistic representation of self (I, me) and its role in the construction of the narrative self. This is work in an ongoing and new EU project (Pointeau, Petit, & Dominey, 2013).
- Cortical areas and mechanisms involved in action organization and in coding motor intention;
- Specific visual features that trigger mirror neuron responses
- Characterization of the properties of F5 motor and visuomotor neurons, also with respect to their location within the cortical depth
- Premotor neuron properties underlying the control of vocalization
I expect that investigation of the first three themes will, in the coming years, provide data for updating current models, in particular the model of the formation and functioning of the mirror neuron system. The first theme will yield new data on prefrontal cortex, and I hope that the chain model and the model of action organization at the cortical level will benefit from these additional findings. Conversely, I believe that further proposals on action sequence organization (learning included) and an updated mirror neuron model can prompt new investigations.
The last theme also seems to me important for the aims of the project. In particular, I am interested in understanding whether, and how, vocalization may have been incorporated in the evolution of a language-ready brain.
I am working on complementary neuro-computational modeling projects, linked by their emphasis on detailing the neural mechanisms coordinating aspects of primate social behaviors: on the one hand, ‘observational learning’ in monkeys, and on the other gestural communication and social interaction in apes. (We see these disparate efforts as important in unraveling clues to the evolution of neural and cognitive mechanisms involved in linguistic skill in humans.) The first project involves implementing a model of ‘list learning’ based on the Simultaneous Chaining Paradigm (SCP) (1). We suggest that, beyond ‘surface level’ associations learned through trial-and-error for each individual item of a list, higher-order learning facilitates the development of ‘task expertise’ in the animals – that is, learning new lists more quickly and efficiently over time. This learning of the ‘structure’ of the task then becomes important in analyzing the results suggesting that monkeys may learn through observation of a ‘teacher’ monkey performing a new list: that is, some knowledge of the ‘structure’ of the task is hypothesized to be necessary to afford learning through observation. Importantly, there is no interaction between the agents in this paradigm, as the ‘observer’ monkey merely observes the teacher monkey, and only later is presented with this list.
We go beyond ‘observation’ and model ‘interaction’ in apes, simulating two agents in dynamic interaction, with learning processes and different motivational states leading, over time, to the emergence of communicative signals. Ontogenetic ritualization posits that the ‘mutual shaping of behavior’ between two individual apes may explain at least a subset of observed gesturing behavior (2). By simulating interactions involving social bonding between a mother and child ape, we can show that simple learning processes may indeed facilitate the emergence of gestural signs in this way. In our simulations, a ‘reach-to-grasp-to-pull’ action – intended to bring another into close proximity to oneself – becomes ritualized into a ‘beckoning’ gesture whose meaning is linked to the end-state of the action sequence. We note that the ape model we have simulated so far is preliminary (as is our choice of gesture, the ‘beckoning’ gesture), but it is based on (primarily) neurophysiological recordings of macaque ‘mirror system’ function, decision-making and motor control – appropriately ‘extended’ to apes. Among our novel claims, we suggest that mirror system responses to others’ behavior may flip one’s own goal states, thus offering a candidate explanation of the function of the downstream targets to which observed mirror neuron responses project.
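The ritualization dynamic described above can be sketched in miniature. The toy model below is our own illustrative assumption, not the published simulation: the mother gradually learns to respond to ever-earlier stages of the child's reach-grasp-pull sequence, so the initial reach alone comes to function as a "beckoning" gesture:

```python
import random

# Toy sketch of ontogenetic ritualization (illustrative only): the child
# performs reach -> grasp -> pull to bring the mother close; the mother
# learns to respond at ever-earlier stages of the sequence.
ACTION = ["reach", "grasp", "pull"]

def simulate(episodes=200, lr=0.2, seed=0):
    random.seed(seed)
    # mother's learned probability of approaching at each stage of the action
    anticipate = {step: 0.0 for step in ACTION}
    completed_at = []                     # stage at which each episode ended
    for _ in range(episodes):
        for step in ACTION:
            if random.random() < anticipate[step]:
                # mother approaches early: this stage alone achieved the goal,
                # so anticipation at this stage is reinforced further
                anticipate[step] += lr * (1 - anticipate[step])
                completed_at.append(step)
                break
        else:
            # full sequence was needed; its success reinforces anticipation
            # of the earliest, most predictive stage
            anticipate["reach"] += lr * (1 - anticipate["reach"])
            completed_at.append("pull")
    return anticipate, completed_at

anticipate, completed_at = simulate()
# anticipate["reach"] converges toward 1: over episodes, most interactions
# end at the initial "reach", which now carries the "come here" meaning.
```

The point of the sketch is only that mutual shaping needs nothing more than incremental reinforcement on both sides for a truncated action to acquire a communicative function.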
- neural encoding of action semantics
- neurodynamics of the representations underlying action encoding
- neural population codes involved in the encoding of action
- coordinate transformations linking action perception and action production, and their neural implementation
- Constructionist approaches to language (representation, acquisition, processing)
- Statistical preemption (learning what not to say)
- Metaphors and embodiment.
Comparative human/nonhuman primate structural and functional neuroimaging related to the evolution of action observation and social learning. Data available for modeling could include relative magnitude of white matter connections between frontal, parietal, and occipitotemporal regions in macaques, chimps, and humans, and relative magnitude of activation in these regions in chimpanzees and humans during the observation of transitive grasping actions.
The majority of my work has focused on the study of gestural communication in wild chimpanzees. In order to address questions of an evolutionary nature, it was key to investigate great ape communication under natural social and environmental circumstances. Our group now works across all four great ape species, and last month I was also able to look at gesture in West African, stone-tool-using chimpanzees.
My initial studies of gestural communication in wild apes aimed to replicate the work done in captivity and, specifically, to test the Ontogenetic Ritualization hypothesis. We found no evidence for ritualization in the repertoires of wild apes, but I feel that much of the difference in findings between the current captive and wild research groups can be accounted for by differences in (1) methodology and (2) the physical and social environment of the apes. Regarding the latter, we know that apes can acquire behavior through ritualization, from biological repertoires, and through other forms of social learning such as imitation. Given that multiple acquisition mechanisms are available to them, variations in environment may promote different mechanisms in the acquisition of gestures. For example, the lack of fission-fusion dynamics in captive chimpanzees decreases the costs of acquiring a behavior through ritualization or social learning, costs that may inhibit these mechanisms in wild apes. I'm interested in discussing which methodologies may best allow us to examine these questions under different conditions.
As is the case for several other people in our group, I have now extended my perspective to include multimodal communication. While I feel that this is a key step forward for the field, I find that the current methodologies for investigating it, including my own, are unsatisfactory, and considerable development is needed before we can examine these complex combinations. I am particularly interested in the capacity of the group to develop a unified approach to coding and modeling multimodal communication, at a level general enough to be compared across individuals and species (including humans), yet detailed enough to examine these combinations.
At a more basic level, in order to pursue the kinds of questions outlined above, large quantities of data will be required, frequently more than any one researcher or group could hope to acquire. Shared access to information collected across groups would vastly expand the quantity of data, across behavioral contexts (avoiding the current play bias) and across individuals of different ages, sexes, dominance ranks, and life histories (e.g., adopted). Inevitably each research group has a particular focus that biases the type of data collected; however, even without significant additional work (i.e., without requiring everyone to code the entire list of variables that everyone needs), it would already represent significant progress to agree on a collective terminology and set of definitions: for example, the definition of a particular behavior or signal, or of terms such as persistence or elaboration. To assist in this, I can provide example coding categorizations and video data from multiple contexts, species, and sites (captive and wild).
Relative importance of non-linguistic cues: Empirical research has shown that comprehenders may have a preference to ground their interpretation in recent (over stereotypical or future) events. However, some have argued that these inputs are on a par unless one is more predictive than the other in a situation. We examine this claim by pitting the recent-event preference against a frequency bias towards future events and against speaker gaze as another situation-immediate cue (Knoeferle et al., 2011; Abashidze et al., 2013).
Effects of emotional facial expressions on sentence comprehension: young vs. older adults: In emotion processing, older adults exhibit a positivity bias while young adults exhibit a negativity bias. We assessed the existence of these biases for sentence comprehension by priming older and younger adults with valenced facial expressions that either matched or mismatched the valence of an ensuing sentence (Carminati & Knoeferle, 2013). Ongoing research examines whether we can find similar effects in children, and with facial emotions of a virtual rather than human speaker.
Including a model of the comprehender (age, cognitive resources) in accounts of situated language comprehension: We have been comparing (within items) how non-linguistic cues affect children's and young adults' language comprehension, with a view to enriching current accounts of situated language comprehension with a model of the comprehender (e.g., age, cognitive resources; Knoeferle, Urbach, & Kutas, 2011; Zhang & Knoeferle, 2012).
Effects of spatial distance on semantic interpretation: Inspired by conceptual metaphors like similarity is closeness, we have examined whether visual perception of spatial distance can affect real-time interpretation of sentences about abstract semantic similarity (or dissimilarity) between nouns (Guerra & Knoeferle, 2012). Our results suggest that such an influence occurs and that it is fast (in first-pass reading times). Ongoing research examines to what extent these results extend from semantic similarity to other domains such as intimacy.
Complementing eye tracking with EEG: At present this line of research has relied on complementing eye tracking with EEG across different studies. Future research aims to extend the Coordinated Interplay Account (which is based on eye-tracking results) with a layer that also models functional brain correlates (Crocker, Knoeferle, & Mayberry, 2010; Knoeferle, Habets, Crocker, & Münte, 2008; Knoeferle, Urbach, & Kutas, 2011, under review).
- Identification, function and properties of mirror neurons in non-human primates and humans.
- Corticospinal function in tool use.
- Comparative biology of hand function.
I have been interested in modeling MNs and the circuitry supporting them. In particular, I have asked how they develop and what mechanisms lead to the MN properties we observe. First, I developed the MNS model (Oztop and Arbib 2002); I then explored, through modeling, how mental state inference ability may be based on the motor circuitry (i.e., internal models) available to an organism (Oztop, Wolpert et al. 2005). I also developed a neural network model of AIP neurons that encode object affordances for grasping (Oztop, Imamizu et al. 2006). After this I switched to robotics and studied how robotics can benefit from neuroscience, focusing on how the adaptability of the body schema may help us on this front (Oztop, Lin et al. 2006; Babic, Hale et al. 2011; Moore and Oztop 2012). In parallel, we produced several review articles critical of the 'standard' interpretation of MNs (Oztop, Kawato et al. 2006; Oztop, Kawato et al. 2013). Finally, I have supervised several robotic studies that take affordance-based object manipulation as the core mode of interaction with the environment (Ugur, Celikkanat et al. 2011; Ugur, Oztop et al. 2011).
I study the gesturing of great apes, with an emphasis on gorillas. Much of my research is based on a large video corpus of Koko (provided by the Gorilla Foundation) — a human-fostered gorilla taught to communicate with conventional gestures. I also work closely with Joanne Tanner using her video of captive gorillas at the San Francisco Zoo.
My ape gesture research focuses on the relationship between gesture and instrumental action and what I refer to as iconicity in gestures. Iconicity turns out to be a controversial term in this context, but I believe it is an important consideration for our project, and I look forward to discussing it. More generally, I am interested in understanding the processes that are involved in the evolution (e.g. Biological Inheritance), development (e.g. Ontogenetic Ritualization), and online production (e.g. Iconicity) of ape gestures.
My research on the origins and cognitive nature of ape gestures has important implications for how we model ape gestural acquisition. My work informs the particular processes we aim to model (e.g. ontogenetic ritualization). It also informs how we represent ape gestures as data and what we determine to be the relevant gestural parameters that we aim to model.
We focus on the question of what the structure of meaning is such that it is visible to and/or "package-able" by language. This question therefore addresses semantic modeling (i.e., the possibility of a two-level semantics model vs. other logical possibilities such as concept decomposition or model-theoretic approaches), the parameters that should support this categorization, the core phenomena involved, and how these are intended to help us link linguistic processing to visual processing.
The question also presupposes a complete understanding of the linguistic system which the field currently does not have, so a lot of our work has to do with modeling basic morphosyntactic infrastructure that is constrained by the larger cognitive and neurological architecture, and that makes minimal assumptions about the nature of the units and principles of combination that constitute a given relevant subsystem (specifically in our case: morphology, syntax and semantics).
This approach connects nicely with the one underlying Construction Grammar, so some discussion will focus on this domain alone.
In terms of the structure of meaning we focus on a family of phenomena sometimes identified with the blanket term “Enriched Composition” (Jackendoff, 1997). We use this term to refer to those phenomena that show meaning composition effects that are not accountable through syntactic or morphological composition. That is, from the point of view of the grammar (specifically syntax and morphology) their properties are trivial. Semantically however, they all present problems for standard views on meaning composition. Consequently they represent windows of opportunity to study the structure of meaning (i.e., conceptual structure), the mechanisms of composition that support meaning generation and the potential interfaces between language and other cognitive capacities.
Some of the phenomena we are currently exploring, and which have the greatest relevance for our meeting, are listed below (each phenomenon is given alongside the specific published or in-preparation work from our lab, the methods used to investigate it, and the language in which it was tested):
- Aspectual Coercion: Jump for an hour vs. Swim for an hour (Piñango, Zurif & Jackendoff, 1999, cross-modal lexical decision, English; Piñango & Zurif, 2001, aphasia, English; Piñango et al., 2006; Deo & Piñango, 2012, linguistic analysis; Lai et al., in prep, self-paced reading and fMRI)
- Light Verb Construction: Give an order vs. Give an orange (Piñango, Mack & Jackendoff, 2006, cross-modal lexical decision, English; Wittenberg & Piñango, 2011, cross-modal lexical decision, German; Sanchez-Alonso et al., in prep, aphasia and fMRI)
- Complement Coercion: Begin a book vs. Read a book (Piñango & Zurif, 2001, aphasia, English; Katsika et al., 2012, eye-tracking; Piñango & Deo, submitted, linguistic analysis; Lai et al., in prep, self-paced reading and fMRI)
- Metonymic Composition: The ham sandwich in the corner wants another cup of coffee vs. The customer in the corner wants another cup of coffee (Piñango, Foster-Hanson et al., 2012 and submitted, fMRI; Foster-Hanson, Zhang, et al., in prep, self-paced reading and ERP)
Besides the shared feature of requiring "enriched" (i.e., non-morphosyntactically supported) composition, the motivation for investigating these specific phenomena arises from the observation that the organizing principles of the conceptual system (as viewed from language) appear to be dimension-independent: the conceptual system seems to organize itself into more general "templates" within which seemingly distinct kinds of meaning (e.g., temporal location vs. spatial location vs. possession) can be captured. Interestingly for current purposes, these principles are expected to organize not only meaning observed through language, but also meaning observed through visual information and meaning informing generalized processes like executive functioning; they would thus represent the architecture through which distinct cognitive domains (e.g., language and vision) are formally linked.
Katsika, A., Braze, D., Deo, A., & Piñango, M.M. (2012) Complement coercion: Type-shifting vs. pragmatic inferencing. The Mental Lexicon, 7(1), 58-76. [This paper shows, contrary to previous proposals, that the so-called coercion effect is restricted to a specific semantic class of predicates: aspectual predicates.]
Lai, Y-Y., Lacadie, C., Constable, R.T., Deo, A., & Piñango, M.M. (CUNY Conference 2014, in prep) Working title: An fMRI and SPR investigation of the language-CS interface: the case of complement coercion. [This paper presents the most recent evidence from our lab on complement coercion, unifying under one analysis previously disparate results regarding the neurocognitive correlates of the phenomenon.]
Piñango, M.M. & Zurif, E. (2001) Semantic combinatorial operations in aphasic comprehension: Implications for the cortical localization of language. Brain and Language, 79, 297-308. [This paper shows that the computation of complement coercion preferentially recruits the LpsTG over the LIFG.]
Piñango, M.M. & Deo, A. (2013) A generalized lexical semantics for aspectual verbs. Yale University manuscript (under review). [This paper presents a semantic analysis of aspectual predicates that captures the coercion effect and explains why the effect is not available for psychological predicates, in contrast to what was previously thought.]
My research focuses on the neurobiology of language, including its phonological, lexical, combinatorial, semantic and pragmatic aspects. Theoretical models are spelt out at the mechanistic level of distributed neuronal assemblies and implemented in neurocomputational simulations using neuronal networks that mimic brain structure and function. Predictions of these models are tested in cognitive and neuroscience experiments using brain activation (EEG, MEG, fMRI) and behavioral measures in healthy people, after brain stimulation (TMS) and in neurological patients. Neurobiological insights into language mechanisms are translated into new methods for language therapy, for example in patients with chronic post stroke aphasia.
Theory and brain-grounded neurocomputational modeling: We use neuroanatomically and neurophysiologically inspired network models to address questions about language and cognition. These networks mimic the structure and connectivity of the language areas and sensorimotor systems of the human cortex to model language learning and breakdown due to lesions. The networks help explain established knowledge about specific neurocognitive and –linguistic brain activations and offer new experimental predictions to be addressed in neuroimaging experiments or patient studies [1, 2].
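As a toy illustration of the Hebbian cell-assembly idea underlying these models (a minimal sketch of our own devising, far simpler than the cited simulations [1, 2]): repeatedly co-activating a pattern that spans two "areas" binds its units into one distributed assembly, which can later be completed from a partial cue:

```python
import numpy as np

# Minimal Hebbian cell-assembly sketch (illustrative only; the models in
# [1, 2] are neuroanatomically far more detailed). Units 0-9 stand in for
# an "auditory" area and units 10-19 for a "motor" area.
n = 20
W = np.zeros((n, n))                    # synaptic weight matrix

pattern = np.zeros(n)                   # one word's distributed assembly
pattern[[1, 4, 7, 12, 15, 18]] = 1.0    # spans both "areas"

for _ in range(50):                     # repeated co-activation ("learning")
    W += 0.1 * np.outer(pattern, pattern)   # Hebb rule: dw_ij ~ a_i * a_j
np.fill_diagonal(W, 0)                  # no self-connections

# Pattern completion: an auditory-only cue reactivates the whole assembly,
# including its motor part -- the signature of a distributed assembly.
cue = pattern.copy()
cue[n // 2:] = 0                        # silence the "motor" half
recalled = (W @ cue > 1.0).astype(float)
```

Here `recalled` reproduces the full pattern from the auditory half alone, the toy analogue of perisylvian circuits binding acoustic and articulatory word information.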
Neurophonology: The brain basis of speech sounds and spoken word forms is modeled and investigated in neurophysiological and patient studies. Network models predict that linguistic neuronal assemblies are distributed over frontal and temporal areas. A current key question is whether the inferior-frontal cortex (Broca’s area) and adjacent premotor cortex play a crucial role in speech perception and language comprehension. Old neurological models (Lichtheim, Wernicke) say “no”, but our data provide evidence for a clear “yes”. The debate is still ongoing.
Neurocombinatorial investigations: In network models, the learning of static combinations of the same linguistic units (syllables, morphemes) leads to the formation of distributed neuronal assemblies (DNAs) for these elements of a “lexicon”. In contrast, the learning of flexible combinations between words from specific lexical-semantic categories sets up indirect links by way of circuits we call combinatorial neuronal assemblies (CNAs), which may realize aspects of a “grammar”. DNAs and CNAs thus motivate the proposal of two distinct neurobiological mechanisms for combining meaningful elements into strings (one for whole-form storage of static complex lexical elements and fixed constructions, one for flexible grammatical combination), each with distinct brain signatures. Consistent with this model, we found different neurophysiological signatures for violations of static-lexical and flexible-grammatical predictions. These signatures may be useful for addressing questions in linguistics and brain science.
Neurosemantics: The question of how meaning is processed and represented in the human mind and brain is addressed using brain theory, network simulations, and experimental research with neurophysiological and neuropsychological methods. One of our key observations is that the meaning of words and constructions is manifest in specific, predictable topographical patterns of brain activation, for example in the motor system. We are currently exploring aspects of the abstract meaning of words and constructions, guided by neurosemantic models.
Neuropragmatics: The meaning of words and constructions is reflected in topographically specific brain activity, but even the same linguistic form, appearing in contexts where it carries different communicative functions, can elicit different brain activity patterns. Brain activations can be mapped for different speech acts, and we explore theory-guided explanations for such neuropragmatic relationships.
Neurorehabilitation of language: Intensive language-action therapy (also known as Constraint-Induced Aphasia Therapy or Constraint-Induced Language Therapy) is a method my colleagues and I developed in the late 1990s, which is successful in the therapy of patients suffering from chronic post-stroke language deficits. This method applies recent insights from neuroscience about functional interactions between the brain systems for language and action. We develop this method further and explore its usefulness for treating language and communication deficits [10, 11].
1 Garagnani, M., et al. (2008) A neuroanatomically-grounded Hebbian learning model of attention-language interactions in the human brain. European Journal of Neuroscience 27, 492-513
2 Garagnani, M. and Pulvermüller, F. (2013) Neuronal correlates of decisions to speak and act: Spontaneous emergence and dynamic topographies in a computational model of frontal and temporal areas. Brain and Language
3 Pulvermüller, F. and Fadiga, L. (2010) Active perception: Sensorimotor circuits as a cortical basis for language. Nature Reviews Neuroscience 11, 351-360
4 Pulvermüller, F. and Shtyrov, Y. (2006) Language outside the focus of attention: the mismatch negativity as a tool for studying higher cognitive processes. Progress in Neurobiology 79, 49-71
5 Pulvermüller, F., et al. (2013) Brain basis of meaning, words, constructions, and grammar. In Oxford Handbook of Construction Grammar (Hoffmann, T. and Trousdale, G., eds), pp. 397-416, Oxford University Press
6 Pulvermüller, F. (2005) Brain mechanisms linking language and action. Nature Reviews Neuroscience 6, 576-582
7 Pulvermüller, F. (2013) How neurons make meaning: Brain mechanisms for embodied and abstract-symbolic semantics. Trends in Cognitive Sciences 17, 458-470
8 Egorova, N., et al. (2013) Early and parallel processing of pragmatic and semantic information in speech acts: neurophysiological evidence. Frontiers in Human Neuroscience 7, 1-13
9 Pulvermüller, F., et al. (2014) Motor cognition – motor semantics: Action-perception theory of cognitive and communicative cortical function. Neuropsychologia, in press
10 Berthier, M.L. and Pulvermüller, F. (2011) Neuroscience insights improve neurorehabilitation of post-stroke aphasia. Nature Reviews Neurology 7, 86-97
11 DiFrancesco, S., et al. (2012) Intensive language action therapy: the methods. Aphasiology 26, 1317-1351
Development of a computational formalism for construction grammar that supports parsing, production, learning, and evolution. The formalism is called Fluid Construction Grammar (FCG). It builds on the state of the art in current language technologies (e.g., by using feature structures and unification), and there are also embryonic ideas on a neural implementation. In-depth case studies have already been carried out to develop grammars for subdomains of different languages, such as a case grammar for German, the tense-aspect system of Russian, verb phrases in Spanish, and English noun phrases, among others.
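The unification operation at the heart of such formalisms can be sketched in a few lines. This is an illustrative toy, not FCG's actual machinery (which adds variables, hierarchy, and bidirectional application); the feature names are invented:

```python
# Minimal feature-structure unification: merge two nested structures,
# failing (returning None) when atomic values clash.

def unify(fs1, fs2):
    """Unify two feature structures (nested dicts); None signals failure."""
    if isinstance(fs1, dict) and isinstance(fs2, dict):
        result = dict(fs1)
        for key, val in fs2.items():
            if key in result:
                sub = unify(result[key], val)
                if sub is None:
                    return None          # feature clash: unification fails
                result[key] = sub
            else:
                result[key] = val        # feature only in fs2: adopt it
        return result
    return fs1 if fs1 == fs2 else None   # atomic values must match exactly

# A lexical entry and a construction's slot constraints combine into one
# enriched structure, or fail when their categories are incompatible:
noun = {"cat": "N", "agr": {"num": "sg"}}
slot = {"cat": "N", "agr": {"num": "sg", "per": 3}}
unified = unify(noun, slot)   # merged: cat N, agr {num sg, per 3}
clash = unify({"cat": "N"}, {"cat": "V"})  # None: category clash
```

Unification is attractive here because the same operation serves both parsing (matching a construction against input) and production (filling its slots), mirroring FCG's bidirectional use of constructions.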
Development of a formalism for embodied grounded semantics, called IRL, which is also fully operational and integrated with FCG. We are using it for all our humanoid robot experiments, particularly experiments in spatial and action language.
Experiments in the cultural evolution of language in the form of agent-based simulations, some using real robots. With my group, we have looked at the emergence of vocabularies (for color, space, body action), the emergence of phrase structure, case grammar, functional grammar, agreement systems, and quantifiers, and the simulation of grammaticalization processes (particularly for the German article system).
Currently I am particularly interested in the question 'why do apes use sequences of gestures?' I believe analyses searching for the 'meaning' of gestures and sequences have been too narrow, and that repetition and variation until a goal is achieved is not always the purpose of strings of gestures. Chest beating and other species-typical display elements in gorillas are understood by researchers to be gestures, yet display goes beyond gesture to resemble the expressive performance of proto-music and dance. Insights from early work on display, for instance that of George Schaller, have been neglected. Display can transform itself and vary in a multitude of ways, incorporating a variety of differing objects and forming complex and repeated combinations. In my videotaping of the gorillas of the SF Zoo over the years, I have found many bouts of interactive display between gorillas of different ages and sexes, and I am currently working toward a more technical analysis. The social interchange and bonding that arise from audible beating movements performed in synchrony constitute a dyadic phenomenon that requires not only perception of the other's behavior and beat production, but also the ability to reproduce it. This seems like the kind of job that mirror neurons do, though perhaps differently from neurons directing functional actions. It would be interesting to discover whether brain activation differs during performance of display sequences vs. usage of action-directive gestures.
Other collaborators (not present at the 2014 workshop):