R&D

OVERVIEW

At Cogent Labs, our goal is to build a preeminent team of engineers and researchers in the fields of AI and Deep Learning. Just as the machines of the industrial revolution multiplied humanity's physical power a thousandfold, we believe that, applied to the right domains and problems, Deep Learning has the potential to multiply the cognitive and creative power of humanity a millionfold.

To seize this opportunity, Cogent Labs aims to bridge the gap between abstract theoretical research and practical applications to real-world business needs. The diverse backgrounds of our research team and its open, collaborative working environment not only foster creative approaches to known problems but also help us identify entirely new tasks that AI can solve. Our team is competent across a wide range of domains within Deep Learning, and we are constantly pushing to expand that range.

If you are passionate about research and discovery and share our vision of fundamentally transforming the world with AI technology, please get in touch.

Below is a short summary of some of our existing lines of research:

Image Recognition

Vision is arguably the most important sense we use to take in information from our environment and interact with the world. A disproportionate fraction of our brain's processing power is dedicated to this constant stream of visual information. Yet despite the apparent ease with which humans make sense of even highly complex visual scenes, automating image analysis through machine learning proved remarkably challenging until relatively recently.

Supervised image classification is a prime example of a task that has recently gone from being barely solvable by machines to one where machines can reach super-human performance with a wide range of off-the-shelf algorithms. More challenging problems, such as 3D reconstruction, image segmentation, and handwriting recognition, are now just at the threshold of reaching a similar level of performance, which in turn will enable even more complex downstream tasks.
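
To give a rough sense of how accessible off-the-shelf classification has become, the minimal sketch below loads a pretrained convolutional network from the open source torchvision library and classifies a single image. The model choice and file name are illustrative assumptions only, and the snippet is independent of any Cogent Labs product.

```python
# Minimal sketch: ImageNet classification with an off-the-shelf pretrained model.
# Assumes torch and torchvision (>= 0.13) are installed; "example.jpg" is a placeholder file.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(weights="IMAGENET1K_V1")  # off-the-shelf ImageNet weights
model.eval()

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # add a batch dimension
with torch.no_grad():
    logits = model(image)
print(logits.argmax(dim=1).item())  # index into the 1,000 ImageNet classes
```

Some of our current research efforts in image recognition include: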

Handwriting Recognition

Cogent Labs' handwriting technology Tegaki is the first step towards a framework that can automatically extract information from documents. While relatively simple for humans, (offline) handwriting recognition poses several unique challenges for deep learning systems, such as having to segment images into sequences of unknown length. Low-resource languages such as Japanese, which have previously received less commercial and academic attention, remain open research areas.
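
One common way deep learning systems handle output sequences of unknown length, for instance transcribing a line of handwriting without first segmenting it into characters, is Connectionist Temporal Classification (CTC). The sketch below illustrates that idea in PyTorch with a generic convolutional-recurrent model; the alphabet size, architecture, and tensor shapes are hypothetical, and this is not a description of Tegaki.

```python
# Sketch of CTC training for line-level handwriting recognition (hypothetical model, dummy data).
import torch
import torch.nn as nn

NUM_CLASSES = 80  # placeholder alphabet size, including the blank symbol required by CTC

class LineRecognizer(nn.Module):
    """Convolutional feature extractor followed by a bidirectional LSTM."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # halves height and width
        )
        self.rnn = nn.LSTM(input_size=32 * 16, hidden_size=128,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, NUM_CLASSES)

    def forward(self, images):              # images: (batch, 1, 32, width)
        feats = self.conv(images)           # (batch, 32, 16, width // 2)
        b, c, h, w = feats.shape
        feats = feats.permute(0, 3, 1, 2).reshape(b, w, c * h)  # one time step per image column
        out, _ = self.rnn(feats)
        return self.fc(out)                 # (batch, time, NUM_CLASSES)

model = LineRecognizer()
ctc = nn.CTCLoss(blank=0)

images = torch.randn(4, 1, 32, 128)                        # dummy batch of line images
log_probs = model(images).log_softmax(2).permute(1, 0, 2)  # CTC expects (time, batch, classes)
targets = torch.randint(1, NUM_CLASSES, (4, 10))           # dummy label sequences
input_lengths = torch.full((4,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 10, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)  # no per-character segmentation needed
loss.backward()
```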

Hierarchical Document Understanding

We are developing systems for full hierarchical document understanding that can take in an entire document, segment it into its individual components such as text, figures, and tables, and digitize each of them in an appropriate and useful way. Our current research is particularly focused on semi-supervised and unsupervised learning approaches to this segmentation and labeling of complex documents.
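
One simple way to frame the segmentation step is as semantic segmentation of the page image, where every pixel (or region) is assigned a label such as text, figure, or table. The sketch below shows that framing with a deliberately tiny fully convolutional network; the class set and architecture are illustrative placeholders rather than our production system.

```python
# Sketch: page layout analysis as per-pixel classification (placeholder classes and architecture).
import torch
import torch.nn as nn

CLASSES = ["background", "text", "figure", "table"]

class TinyLayoutNet(nn.Module):
    """A minimal fully convolutional network producing one label per pixel."""
    def __init__(self, num_classes=len(CLASSES)):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, num_classes, kernel_size=1),  # per-pixel class scores
        )

    def forward(self, page):                  # page: (batch, 1, height, width) grayscale scan
        return self.decoder(self.encoder(page))

model = TinyLayoutNet()
page = torch.randn(1, 1, 256, 256)            # dummy page image
scores = model(page)                          # (1, 4, 256, 256)
labels = scores.argmax(dim=1)                 # per-pixel segment labels
```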

Natural Language Processing

The ability to analyze and understand natural language is one of the key capabilities required of any AI that is to be useful to humans and to make sense of human-generated data beyond simple number-crunching. It also gives users an intuitive and easily interpretable way to interact with the system.

Traditional methods in Natural Language Processing (NLP) often employ highly simplified statistical approaches that treat texts as simple "bags of words" and compute average statistics on them, completely ignoring contextual information. Other methods rely on large sets of hand-crafted linguistic rules and features that are completely rigid, time-consuming to define, and do not adapt to changes in the data. In recent years, Deep Learning has shown great potential in overcoming these issues, starting from simple word embeddings that improve on bag-of-words representations, up to fully end-to-end architectures that can summarize large texts, perform sentiment analysis, answer questions posed in natural language, and handle many other language-related tasks.
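
The difference between a bag-of-words representation and a learned word embedding can be seen in a few lines of code. In the minimal sketch below, which uses scikit-learn and PyTorch with toy sentences and placeholder dimensions, two sentences with opposite meanings receive identical bag-of-words vectors, while an embedding layer keeps one dense vector per token, in order, for a downstream sequence model.

```python
# Sketch: bag-of-words discards word order, while embeddings give dense, order-preserving vectors.
from sklearn.feature_extraction.text import CountVectorizer
import torch
import torch.nn as nn

docs = ["the dog bit the man", "the man bit the dog"]

# Bag-of-words: both sentences map to exactly the same count vector.
bow = CountVectorizer().fit(docs)
print(bow.transform(docs).toarray())   # identical rows despite opposite meanings

# Word embeddings: each token index maps to a trainable dense vector, which
# downstream sequence models (RNNs, Transformers) can read in context and in order.
vocab = {w: i for i, w in enumerate(sorted({t for d in docs for t in d.split()}))}
embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)
token_ids = torch.tensor([[vocab[t] for t in docs[0].split()]])
vectors = embed(token_ids)             # shape (1, 5, 8): one vector per token, in order
```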

At Cogent Labs we are interested in the full spectrum of Deep Learning applications in NLP and committed to advancing the state of the art in AI language understanding. Some examples of our current research activities include:

Sequence-to-Sequence Learning

At the core of many deep learning based natural language systems is a sequence-to-sequence architecture that compresses a text into an embedding vector and then decodes that embedding into another text sequence. A wide range of tasks can be cast into this framework, and combined with, for example, attention mechanisms, these models form the backbone of many current systems for neural machine translation, summarization, and more.
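
A minimal sketch of that encoder-decoder pattern is shown below: an encoder compresses the source tokens into vectors and a decoder attends over them while generating the target sequence. The vocabulary sizes and dimensions are placeholders, and a real translation or summarization system would add masking, beam search, and far more capacity.

```python
# Sketch of a sequence-to-sequence model with attention (placeholder sizes, no masking).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=1000, tgt_vocab=1000, dim=128):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.attention = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * dim, tgt_vocab)

    def forward(self, src, tgt):
        enc_states, h = self.encoder(self.src_embed(src))      # compress the source text
        dec_states, _ = self.decoder(self.tgt_embed(tgt), h)   # decode conditioned on its summary
        # Each decoder step attends over all encoder states.
        context, _ = self.attention(dec_states, enc_states, enc_states)
        return self.out(torch.cat([dec_states, context], dim=-1))  # next-token logits

model = Seq2Seq()
src = torch.randint(0, 1000, (2, 12))   # dummy source token ids
tgt = torch.randint(0, 1000, (2, 9))    # dummy (shifted) target token ids
logits = model(src, tgt)                # (2, 9, 1000)
```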

Latent Variable Models

The text embedding vectors produced by sequence-to-sequence models are not only useful as features in more complex downstream tasks such as summarization; they are also valuable in their own right. We are particularly interested in latent variable models that enable unsupervised tasks such as clustering and similarity analysis, as well as semi-supervised tasks such as automatic labeling and categorization, and sentiment analysis. Our research team's background in statistics and in several relevant branches of physics gives us a distinctive perspective on variational approaches and high-dimensional embedding spaces.
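
For readers unfamiliar with variational approaches, the sketch below shows the core ingredient of a variational-autoencoder-style latent variable model: an encoder predicts a mean and a variance, a latent vector is drawn with the reparameterization trick, and a KL term regularizes the latent space. The input is a generic dense feature vector and all dimensions are placeholders.

```python
# Sketch of the reparameterization step at the heart of a variational latent variable model.
import torch
import torch.nn as nn

class VariationalEncoder(nn.Module):
    def __init__(self, input_dim=256, latent_dim=32):
        super().__init__()
        self.hidden = nn.Linear(input_dim, 128)
        self.mu = nn.Linear(128, latent_dim)       # mean of q(z | x)
        self.log_var = nn.Linear(128, latent_dim)  # log-variance of q(z | x)

    def forward(self, x):
        h = torch.relu(self.hidden(x))
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterization trick
        # KL divergence between q(z | x) and a standard normal prior, one value per example.
        kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp(), dim=1)
        return z, kl

encoder = VariationalEncoder()
features = torch.randn(4, 256)   # e.g. sentence embeddings from a sequence model (dummy data)
z, kl = encoder(features)        # z can feed clustering, similarity analysis, or a decoder
```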

Information Extraction

Another active area of research at Cogent Labs is the extraction of information from textual data. Despite the advances of deep learning, most approaches are still merely statistical models of language, lacking a true understanding of meaning and global context. Starting from simple entity extraction and leading up to the automatic construction of knowledge graphs from entire corpora of text, we believe that information extraction will play a key role in future advances in AI, not only assisting other NLP tasks, for example enabling summarization that takes global context into account, but also giving other systems direct access to a constantly growing knowledge base.
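
As a small illustration of the entity-extraction starting point, the sketch below uses the open source spaCy library to pull named entities out of a single sentence. It assumes the `en_core_web_sm` model has been downloaded, and it is unrelated to our internal systems.

```python
# Sketch: named entity extraction with spaCy.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Cogent Labs is developing handwriting recognition technology in Tokyo.")

for ent in doc.ents:
    print(ent.text, ent.label_)   # e.g. something like ("Tokyo", "GPE")

# Entities and relations extracted this way over an entire corpus can then be
# accumulated into a knowledge graph that other systems query directly.
```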

Time-Series

A vast number of real-world processes generate data that can be cast as time-series. Being able to analyze, understand, and predict a given time-series, and potentially its underlying process, therefore opens up an enormous range of potential applications. Examples include time-sensitive event classification (medical diagnostics, speaker recognition), forecasting (financial market prediction), anomaly detection (protecting industrial processes and machinery), and many others.

Time-series are often characterized by long-range temporal dependencies, which can cause two otherwise identical points in time to belong to different classes or to predict different future states. This temporal correlation between data points generally makes time-series analysis and prediction harder. Most current state-of-the-art techniques depend on hand-crafted features that usually require expert knowledge of the field and are expensive to develop. Deep learning, however, now allows for very flexible, data-driven design of adaptive models for time-series problems.
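
As a small example of what such a fully data-driven approach looks like, the sketch below classifies raw time-series windows with a compact 1D convolutional network and no hand-crafted features. The window length, channel count, and number of classes are placeholders.

```python
# Sketch: end-to-end time-series classification from raw windows (placeholder sizes, dummy data).
import torch
import torch.nn as nn

class TimeSeriesClassifier(nn.Module):
    def __init__(self, channels=3, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),              # summarize the whole window
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):                         # x: (batch, channels, time)
        return self.head(self.features(x).squeeze(-1))

model = TimeSeriesClassifier()
windows = torch.randn(8, 3, 500)                  # e.g. 8 sensor windows of 500 samples each
logits = model(windows)                           # (8, 5) class scores, learned end to end
```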

At Cogent Labs, we focus on the development of deep learning techniques for time-series classification, forecasting, and fault detection, with a special emphasis on end-to-end deep learning pipelines that allow for fully data-driven and adaptive methods. We are currently pursuing the following:

Financial Time-Series Forecasting

Being able to automatically predict market trends and directions with high accuracy provides a tremendous advantage in many areas of finance. Cogent Labs' research team is applying its extensive finance expertise and cutting-edge deep learning methodology to this domain. The power of deep learning enables us to effectively fuse and jointly analyze multimodal information streams such as stock prices, FX rates, and real-time news feeds. Integrating a multitude of data feeds into unified, actionable predictions is one of our core research motifs.
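
The fusion step can be pictured as in the sketch below: one branch encodes the recent price series, another encodes a vector representation of news text, and a joint head produces the prediction. Every detail here, including the input dimensions and the single regression output, is a placeholder; this is not a description of a trading model.

```python
# Sketch of multimodal fusion: price history + news embedding -> joint prediction (placeholders only).
import torch
import torch.nn as nn

class FusionForecaster(nn.Module):
    def __init__(self, price_features=4, news_dim=128, hidden=64):
        super().__init__()
        self.price_encoder = nn.GRU(price_features, hidden, batch_first=True)
        self.news_encoder = nn.Sequential(nn.Linear(news_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)    # e.g. a next-period return estimate

    def forward(self, prices, news):
        _, h = self.price_encoder(prices)       # prices: (batch, time, price_features)
        fused = torch.cat([h[-1], self.news_encoder(news)], dim=-1)
        return self.head(fused)

model = FusionForecaster()
prices = torch.randn(16, 60, 4)    # 60 time steps of price-derived features (dummy data)
news = torch.randn(16, 128)        # precomputed text embeddings of recent headlines (dummy data)
prediction = model(prices, news)   # (16, 1)
```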

PROFILES

  • David Cournapeau, Ph.D.

    Principal Machine Learning Engineer

    David Cournapeau has ten years of experience working at the interface between science and software engineering. David is the original author of scikit-learn, one of the most widely used machine learning libraries, and is also a major contributor to the popular scientific libraries SciPy and NumPy. He holds a Ph.D. in Natural Language Processing from Kyoto University.

  • Thierry Sousbie, Ph.D.

    Principal Research Scientist

    Thierry holds a Master's degree in Engineering from INSA as well as a Master's in Theoretical Physics and a Ph.D. in Astrophysics from the École Normale Supérieure de Lyon. After obtaining a permanent position at CNRS, where he pursued his career in astrophysics for six years, he spent a year at the University of Tokyo as an invited researcher and decided to join Cogent Labs out of a growing interest in deep learning and AI. Thierry has contributed to more than 30 scientific articles published in peer-reviewed journals and has developed several open source projects used by numerous researchers around the world.

  • Stefano Peluchetti, Ph.D.

    Principal Research Scientist

    Stefano began his quantitative studies with a Ph.D. in Statistics, specializing in Monte Carlo methods. He is also the sole author of an open source computing framework based on LuaJIT. Prior to joining Cogent Labs, he worked for six years as a quantitative analyst and data scientist at HSBC in London. His current research interests focus on methodological advances at the intersection of statistics and deep learning.

  • David Malkin, Ph.D.

    Artificial Intelligence Architect

    David's Ph.D. in Computer Science at University College London focused on the optimization of complex networks, including neural networks, using genetic algorithms. Following his studies, David worked as a quantitative trader, using machine learning to develop high-frequency trading algorithms. His current work focuses on applying the latest research, especially on non-Euclidean data, to solving business problems.

  • Tiago Ramalho, Ph.D.

    Lead Research Scientist

    Tiago holds an M.Sc. in Theoretical and Mathematical Physics and a Ph.D. in Biophysics from Ludwig-Maximilians-Universität München. After graduating, he moved to DeepMind, where he worked on several high-profile projects such as the differentiable neural computer (DNC) and published papers in international journals including Nature. His research focuses on neural memory, conceptual learning, and uncertainty, capabilities he hopes will lead to systems that generalize beyond the data they have seen in the same way humans do.

A world-class team of researchers and industry professionals
Cogent Labs Team