Research

Full publication list here.


CCIS @ Northeastern University (2017–)

Active Goal Recognition


MLR @ University of Stuttgart (2013–2017)

Decoding the Geometry out of Relational Descriptions of the Environment

application of relational feature density model to a room domain

Human cooperation very often depends on a specific type of interaction, in which one agent Alice defines a task via a high-level description of the goal state, and the other agent Bob interprets the goal state and performs the task. For this to take place, Alice has to be able to encode her desired geometric state and compress it into a predicate-based relational description, while Bob has to be able to take such a description and decode it into a belief over the possible intended geometric states.

Identification of Unmodeled Objects via Relational Descriptions

Successful Human-Robot Interaction (HRI) requires advanced communication algorithms which are able to parse instances of human language and associate the symbolic components with perceptual representations of the environment. The identification problem is one instance where this is required, and is defined as the problem of correctly identifying an object out of many given a relational description.

The identification problem is similar to a standard classification problem, with each object representing a separate class. The key difference is that the number of classes and their associated semantics are not predefined, but differ between problem instances. As a further consequence, the output classes also have features associated with them: for example, a mug that is the object of identification has measurable features such as its color, shape, or position.

tabletop identification scenario with multiple blocks and the respective identification probabilities

To address the identification problem, we propose a logistic-regression-like stochastic model which outputs a likelihood over all objects. The model exploits contextual information by weighting each description predicate by how much other objects in the scenario exhibit that same property. This allows descriptions to be given flexibly, as long as they focus on combinations of properties that make the referent object distinguishable from the rest of the environment.
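The idea of context-weighted predicates can be illustrated with a small sketch. This is not the model from the publication: the weighting rule (one minus the fraction of objects exhibiting a predicate), the softmax normalisation, and the names `identification_probabilities` and `sharpness` are all assumptions chosen for illustration.

```python
import math


def identification_probabilities(objects, description, sharpness=5.0):
    """Return one probability per object of being the described referent.

    objects:     list of dicts mapping predicate name -> degree in [0, 1]
                 (hypothetical feature encoding)
    description: list of predicate names used by the speaker
    sharpness:   hypothetical scale applied to the scores before the softmax
    """
    n = len(objects)
    scores = []
    for obj in objects:
        score = 0.0
        for pred in description:
            # Assumed context weight: a predicate that many objects
            # exhibit says little about which object is meant.
            prevalence = sum(o.get(pred, 0.0) for o in objects) / n
            weight = 1.0 - prevalence
            score += weight * obj.get(pred, 0.0)
        scores.append(sharpness * score)

    # Softmax over objects, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three blocks; the speaker says "the red one on the left".
blocks = [
    {"red": 1.0, "left": 1.0},
    {"red": 1.0, "left": 0.0},
    {"red": 0.0, "left": 0.0},
]
probabilities = identification_probabilities(blocks, ["red", "left"])
```

In this toy scene "left" carries more weight than "red" because fewer blocks exhibit it, and a predicate shared by every object contributes nothing, which leaves the distribution uniform.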

Relevant Publications

  • link pdf mp4 — Identification of Unmodeled Objects from Symbolic Descriptions

Temporal Segmentation of Concurrent Asynchronous Demonstrations

extracted instruction set for the assembly of a tool box

A general human demonstration of a complex task is typically composed of a large number of concurrent and asynchronous interactions with the environment which we call interaction phases. The success of an autonomous system which learns from human demonstrations hinges on its ability to semantically parse such demonstrations and deconstruct them into their atomic components, thus learning a representation for how and why they are performed.

In this work, we use a conditional random field (CRF) to model and infer interactions between objects from their joint motions. The model is applicable both to hand-object pairs, in order to extract purposeful interactions with the environment, and to object-object pairs, to extract changes of state in assembly tasks.
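As a rough illustration of the inference step, the sketch below runs Viterbi (MAP) decoding on a two-label chain for a single object pair. It assumes a linear-chain structure, a single hypothetical feature (the relative speed between the two objects), and hand-picked potentials; the actual features, structure, and learned parameters of the model in the publications are not given here, and the names `segment_interaction`, `speed_scale` and `switch_penalty` are invented for this example.

```python
def segment_interaction(rel_speed, speed_scale=0.05, switch_penalty=2.0):
    """MAP labelling of a binary chain: 1 = interacting, 0 = not interacting.

    rel_speed:      per-frame relative speed of the two objects (assumed feature)
    speed_scale:    hypothetical speed at which both labels are equally likely
    switch_penalty: hypothetical cost of changing label between frames
    """
    if not rel_speed:
        return []

    # Unary scores: objects that move together favour the label 1.
    unary = []
    for v in rel_speed:
        s = 1.0 - v / speed_scale
        unary.append((-s, s))  # (score of label 0, score of label 1)

    # Forward pass: best score of any labelling ending in each label.
    best = [list(unary[0])]
    back = []
    for t in range(1, len(unary)):
        row, ptr = [], []
        for y in (0, 1):
            cands = [
                best[-1][prev] - (switch_penalty if prev != y else 0.0)
                for prev in (0, 1)
            ]
            prev_best = 0 if cands[0] >= cands[1] else 1
            row.append(cands[prev_best] + unary[t][y])
            ptr.append(prev_best)
        best.append(row)
        back.append(ptr)

    # Backward pass: follow the stored pointers from the best final label.
    y = 0 if best[-1][0] >= best[-1][1] else 1
    labels = [y]
    for ptr in reversed(back):
        y = ptr[y]
        labels.append(y)
    labels.reverse()
    return labels


# A hand approaches, carries an object for five frames, then lets go.
phases = segment_interaction([0.2] * 3 + [0.0] * 5 + [0.2] * 3)
```

The switch penalty is what makes the labelling temporally coherent: a single noisy frame inside a long interaction is cheaper to absorb than to split into its own segment.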

Relevant Publications

  • link pdf mp4 — Temporal Segmentation of Pair-Wise Interaction Phases in Sequential Manipulation Demonstrations
  • link pdf mp4 — Robot Programming from Demonstration, Feedback and Transfer


CVAP @ KTH (2012)

Implicit Feature Space Embeddings for Sequential Structures

stretches corresponding to selected paths; binary path matrix denoting similarity at character level; selected paths overlapped on the path matrix

At the Computer Vision and Active Perception (CVAP) lab, I worked on my MSc thesis on the topic of kernel functions for sequential data. The main result of this research has been the development of a novel and theoretically sound kernel function—the Path Kernel—for discrete and finite sequences of arbitrary symbols.

Given a lexicon $\Sigma$ and strings $s, t \in \Sigma^+$, the Path Kernel $k_\texttt{PATH}$ computes a general measure of similarity which is interpretable in terms of the number of matching substrings, their length and their location. $k_\texttt{PATH}$ is agnostic with respect to the nature of the lexicon, and depends numerically on a freely chosen Symbol Kernel $k_\Sigma\colon \Sigma\times\Sigma\to\mathbb{R}$ which is valid for the lexicon $\Sigma$, and through which context and prior knowledge about the application domain can be embedded.
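To make the role of the Symbol Kernel more concrete, the sketch below evaluates $k_\Sigma$ on every pair of symbols from two strings and lists the diagonal runs of matches, which correspond to matching substrings with a length and a location. It does not implement $k_\texttt{PATH}$ itself, whose definition is not given on this page. The Kronecker-delta symbol kernel and the names `delta_kernel`, `symbol_kernel_matrix` and `matching_runs` are assumptions made for illustration.

```python
def delta_kernel(a, b):
    """Assumed symbol kernel: 1 for identical symbols, 0 otherwise."""
    return 1.0 if a == b else 0.0


def symbol_kernel_matrix(s, t, k_sigma=delta_kernel):
    """Matrix M with M[i][j] = k_sigma(s[i], t[j])."""
    return [[k_sigma(a, b) for b in t] for a in s]


def matching_runs(matrix):
    """Maximal diagonal runs of nonzero entries as (start_i, start_j, length)."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    runs = []
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j] == 0:
                continue
            # Skip cells that continue a run started further up the diagonal.
            if i > 0 and j > 0 and matrix[i - 1][j - 1] != 0:
                continue
            length = 0
            while (i + length < rows and j + length < cols
                   and matrix[i + length][j + length] != 0):
                length += 1
            runs.append((i, j, length))
    return runs


matrix = symbol_kernel_matrix("abcd", "xbcy")
runs = matching_runs(matrix)  # the shared substring "bc" is one diagonal run
```

With the delta kernel the matrix is binary; swapping in a graded `k_sigma` (for instance, one that scores similar characters above zero) is where prior knowledge about the domain would enter.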

Relevant Publications