Interview with Tiago Ramalho, Lead Research Scientist – From Global AI Corporation to Tokyo AI Start-Up
Tiago Ramalho joined Cogent Labs in October 2018. He is currently a Lead Research Scientist investigating meta-learning, grounded language and uncertainty in neural networks.
We sat down with Tiago to learn about his experience and views as a leading AI researcher and one of the newest members of Cogent Labs. This interview series will feature three installments: Tiago’s career to date, especially his time at the world-renowned AI research company DeepMind; the state of AI and machine learning research and how his own research fits; and what it’s like working at Cogent Labs.
- Topic 1: Career – From Global AI Corporation to Tokyo AI Start-Up
- Q: Please tell us about your academic background.
- Q: What was it like working at DeepMind?
- Q: And then you recently joined Cogent Labs. How has that been?
- Q: What are some of the lessons you have learned so far in your education and in your career?
- Q: Do you have any long-term career goals you are working towards?
Topic 1: Career – From Global AI Corporation to Tokyo AI Start-Up
Q: Please tell us about your academic background.
When I had to decide what to study at university, I chose physics. Natural sciences had always appealed to me, and physics was particularly attractive because it combined the search for elegant hidden structures in the physical laws of the universe with the rigor of mathematics. I did my bachelor’s in physics engineering at the University of Lisbon in Portugal. It was a mix of physics as well as programming, chemistry, electronics, and a variety of other subjects. The course was very broad and was designed to prepare students to move on to a variety of fields, from theoretical physics to engineering.
After that, I became more interested in theoretical physics and decided to study for a master’s in that field at Ludwig-Maximilians-Universität in Munich, Germany. I spent two years studying very dense mathematical physics and ultimately ended up pursuing statistical physics. When I started to do research, I looked at pattern formation in biological systems. Specifically I was interested in how stem cells acquire structure and become differentiated tissues in a developing embryo.
After my master’s, I stayed on to earn a PhD under the same advisor. I continued to do research in the field of biological physics, a branch of statistical physics. Biological data is very noisy and the systems are highly complex, making it very difficult to make sense of everything. We had to use a variety of statistical techniques to start to disentangle the experimental data, which is how I began learning about and applying machine learning. I did a lot of work applying machine learning to these biological systems and learning more about them through this process. I spent a few years doing that but eventually decided that pure academia was not for me. That is when I joined DeepMind, which is part of Google.
Q: What was it like working at DeepMind?
I worked at DeepMind for three years and participated in numerous projects in the organization. As a research engineer, I moved between different projects and did a mix of programming and research. First, I worked with a group which investigates how neural networks work. Our goal was to determine whether we can understand how a trained neural network works. We took a rigorous scientific approach, using knowledge from physics and neuroscience, and characterized things very quantitatively and precisely.
Then I moved to the language team. At the time, the focus was on reinforcement learning. We created agents that learn to act in an environment in a way that depends on a language instruction provided to them. I became interested in whether these agents actually understood the world they were operating in, and whether they related the language instructions to objects and actions in the same way humans do. I would give them language descriptions and see if they could recreate them as drawings, and also do the opposite – see if they could generate a language description of a picture. We made some progress there, but the problem remains unsolved in its full generality. That is a research area I am still very interested in.
My last engagement was in meta-learning. The usual way of training machine learning models is to optimize their parameters for optimal performance on a fixed dataset or task. However, when they are deployed in the real world, the data they see might be completely different from what they were trained on. The goal of meta-learning is instead to teach models to learn from what they are currently observing, so that they can adapt to changing patterns in the data.
Recently we wrote a more detailed post on this research.
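The contrast between fixed training and learning-to-adapt can be made concrete with a toy example. This is not the specific setup described in the interview, but a minimal first-order sketch of gradient-based meta-learning (in the spirit of first-order MAML) for one-parameter linear regression, with all names and hyperparameters hypothetical:

```python
import random

random.seed(0)

def loss_grad(w, xs, ys):
    """Gradient of mean squared error for the linear model y_hat = w * x."""
    n = len(xs)
    return 2 * sum((w * x - y) * x for x, y in zip(xs, ys)) / n

def inner_adapt(w, xs, ys, lr=0.1, steps=5):
    """Task-specific adaptation: a few gradient steps from the shared init."""
    for _ in range(steps):
        w -= lr * loss_grad(w, xs, ys)
    return w

# Meta-training: learn an initialisation w0 that adapts quickly to any task,
# rather than optimising w0 for one fixed dataset.
w0, meta_lr = 0.0, 0.05
for _ in range(200):
    slope = random.uniform(-2, 2)            # each task: fit a different slope
    xs = [random.uniform(-1, 1) for _ in range(20)]
    ys = [slope * x for x in xs]
    w_adapted = inner_adapt(w0, xs, ys)
    # First-order meta-update: evaluate the gradient at the adapted weights.
    w0 -= meta_lr * loss_grad(w_adapted, xs, ys)
```

After meta-training, a few inner steps from `w0` move the model toward whatever slope the current task happens to have, which is the "learn from what you are currently observing" idea in miniature.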
Q: And then you recently joined Cogent Labs. How has that been?
I joined Cogent Labs in October 2018 and I am still getting used to the fast pace of a startup. I have been learning a lot and enjoying talking about AI research with everyone around me.
Cogent Labs attracted me with the opportunity to have greater research freedom and the challenge of independently creating a research agenda. Working in a small but dynamic AI startup also provides me with many development opportunities, as we tackle the challenge of adapting state-of-the-art AI research to real-world products that directly impact people.
Q: What are some of the lessons you have learned so far in your education and in your career?
There are a few key lessons I learned during my PhD. One is to be very careful about checking what you are doing. When you have an idea and implement it in code, it is almost guaranteed that you will make a mistake in your code or have a logic bug somewhere. So you need to always check very methodically that what you have done is correct, by carefully testing each part of your implementation.
You also need to be very careful about how you design experiments. This is actually where engineering can help you become a better scientist. Rather than developing a very large program, you want to break things down into modules that you can test. This allows you to write unit tests and confirm which parts are working well and which are not.
Similarly, when designing your experiment, you should avoid the temptation of writing throwaway code. Suppose you write some code to calculate an integral and leave it mixed in with other parts of the code. If you later want to change the integral calculation, you will have to change your whole experiment, potentially introducing new bugs. However, if you separate things out cleanly, it becomes much easier to iterate quickly. There is often an upfront cost to designing your model properly, but once you pay it, it enables you to iterate much faster. If you are methodical in this way, you can speed up your research and avoid costly mistakes and rewrites.
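To illustrate the point, here is a minimal sketch of what separating out the integral calculation might look like; the function name and the choice of the trapezoid rule are hypothetical, not taken from the interview:

```python
import math

def trapezoid_integral(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with the trapezoid rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Because the routine is isolated behind a clear interface, it can be
# unit-tested against known results, independently of the experiment
# that uses it.
assert abs(trapezoid_integral(math.sin, 0.0, math.pi) - 2.0) < 1e-4
assert abs(trapezoid_integral(lambda x: x, 0.0, 1.0) - 0.5) < 1e-9
```

If you later swap the trapezoid rule for a better quadrature scheme, only this one function changes, and its tests immediately tell you whether the replacement is correct.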
Another lesson is to actually understand what the goal of your research idea is. It is very easy to fall into the trap of reading a number of papers, coming up with some ideas that seem appealing, and then spending a lot of time working on those ideas, only to realize that there is no point or story to them. I learned that before actually beginning a piece of research, you should think about what story you are going to tell people and what they are going to learn by reading about your research. This helps prune ideas much more quickly. Will your idea contribute to the larger base of knowledge and allow people to build on it, or will it be a loose end that is interesting but nobody will build on? If it is the latter, then what is the point? A similar principle applies to business. Will people be willing to pay for a product that implements this idea?
I also obviously learned a lot during my time at DeepMind. There is a very rigorous engineering culture so there were a lot of lessons around software quality, which helps inform how you design scalable machine learning systems. I learned just how important it is to have both strong mathematics and engineering skills when doing machine learning work. I think that is why there is such a high barrier to entry in machine learning, and why machine learning researchers are in high demand and short supply.
Fortunately, I happened to have both skillsets because my background is in natural sciences and I have long had a personal interest in computers. People are starting to realize that a strong foundation in mathematics coupled with technical knowhow is a significant advantage in the age of AI and automation, and I hope to see educational efforts reflect this reality more.
Another important lesson is that you cannot work alone. With machine learning, you need to be collaborative and to work as a group. Machine learning is not about some lone genius. You need to work as a team because these are very complex systems that require many moving pieces to be put together. You need to bring them together and find a unified goal.
At DeepMind, I had to learn not only to guide the people working under me, but also convince my peers to work together on a unified project and merge our efforts, so we could build on each other’s work and not just create our own silos. To do that, you need a culture of openness, where you are open to other people’s ideas and to having your own ideas challenged. You need to try things out to see what works and what does not, and let the results speak for themselves. It is also important to empower people to try things out, make mistakes, and not be afraid to own up to them.
Q: Do you have any long-term career goals you are working towards?
My goal is to make significant progress in the field of machine learning. To me, career advancement is secondary to producing good research. I feel that if you are producing great research, then you will make progress in your career. It does not work the other way around. I want to pursue ambitious research goals, and work on what I think is important and pursue ideas that will hopefully contribute to better AI systems. Then the career aspect should manifest itself naturally.