I have six years of experience working as a machine learning engineer (my job title was not exactly “machine learning engineer”, but the responsibilities were), and I started in this role before the deep learning boom. My job is to translate business problems into problems that machine learning can solve and to productionize the models together with backend engineers. Sometimes I also need to develop the data pipeline. In this article, I will share the math that machine learning engineers need.
The Math Used at Work
- Basic statistical concepts, which are helpful for data exploration.
- An understanding of the concepts behind the machine learning model you use, which helps with model and parameter selection.
- In deep learning, at least an understanding of the loss function. Sometimes I combine different losses for training. Sometimes no existing loss fits my problem, so I have to write my own loss and then let TensorFlow or PyTorch optimize it for me.
- An understanding of the metric, because the choice of model and loss function depends on the metric.
Understanding the metric and the loss function is very important, because it tells you what to do when the model performs badly on the metric.
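As a minimal sketch of what combining losses can look like, the example below mixes mean squared error and mean absolute error with a weight. It is written in plain Python with a numerical gradient so that it runs without any framework; in practice, TensorFlow or PyTorch would compute the gradient and do the optimization. The function names, the weight `alpha`, the toy data, and the one-parameter model `y = w * x` are all assumptions made for illustration.

```python
def mse(preds, targets):
    """Mean squared error: penalizes large errors heavily."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def mae(preds, targets):
    """Mean absolute error: less sensitive to outliers than MSE."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)


def combined_loss(preds, targets, alpha=0.5):
    """Weighted sum of two losses; alpha is a hypothetical mixing weight."""
    return alpha * mse(preds, targets) + (1.0 - alpha) * mae(preds, targets)


def fit_slope(xs, ys, alpha=0.5, lr=0.01, steps=500, eps=1e-6):
    """Fit y = w * x by gradient descent on the combined loss.

    The gradient is approximated with a central difference, standing in
    for the automatic differentiation a deep learning framework provides.
    """
    def loss_at(w):
        return combined_loss([w * x for x in xs], ys, alpha)

    w = 0.0
    for _ in range(steps):
        grad = (loss_at(w + eps) - loss_at(w - eps)) / (2 * eps)
        w -= lr * grad
    return w


# Toy data generated from y = 2 * x, so the fitted slope should end up near 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = fit_slope(xs, ys)
print(w)
```

Changing `alpha` shifts the balance between the two losses, which is the kind of decision that understanding each loss makes easier.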
Another opportunity to use math is when reading research papers. When reading a paper, the first thing I look at is how the experiment was conducted and what the results are. If I feel that the method has the potential to perform well on my own data, I will continue reading. The second thing I look for is whether there is existing code that I can use. The last thing I do is try to understand the methods in the paper (usually the loss function). As for the theorems inside, many of their assumptions do not match the real world, or they prove something like error bounds that are not particularly useful to me even if I understand them, so I usually skip over them.
The most important thing at work is business value. The technical or mathematical complexity behind a solution is not that important; talking about sophisticated models only sounds impressive to those who don’t understand them. If rule-based methods work well enough, I’ll use them first. Of course, generally speaking, machine learning methods perform better than rule-based methods, so there are not many opportunities to use rule-based methods.
Math is important in helping you choose metrics and models properly, and in making model tuning more efficient. This means you don’t have to run every combination of parameters, which saves both time and money, two big costs for the company. In this way, a good understanding of math can bring significant business value to a company.
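To make the cost of running every combination concrete, here is a small sketch that counts the training runs an exhaustive grid search would need, before and after narrowing the search space. The parameter names, the candidate values, and the ten minutes per run are all hypothetical numbers chosen for illustration, not measurements from my work.

```python
from itertools import product

# Hypothetical full search space: every value someone might want to try.
full_grid = {
    "learning_rate": [0.001, 0.003, 0.01, 0.03, 0.1],
    "max_depth": [3, 5, 7, 9],
    "subsample": [0.6, 0.8, 1.0],
    "min_child_weight": [1, 5, 10],
}

# Hypothetical narrowed space, after reasoning about which values are
# plausible for the model and the data.
narrowed_grid = {
    "learning_rate": [0.01, 0.03],
    "max_depth": [3, 5],
    "subsample": [0.8],
    "min_child_weight": [1, 5],
}


def count_runs(grid):
    """Number of training runs an exhaustive grid search would need."""
    return sum(1 for _ in product(*grid.values()))


def total_hours(grid, minutes_per_run):
    """Total compute time in hours, assuming every run takes the same time."""
    return count_runs(grid) * minutes_per_run / 60


print(count_runs(full_grid))       # 180
print(count_runs(narrowed_grid))   # 8
print(total_hours(full_grid, 10))  # 30.0
```

Under these assumptions, the narrowed search needs 8 runs instead of 180, and the gap grows multiplicatively with every parameter added to the grid.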
Given these requirements, I think that there is no need for very advanced math. The main areas needed are probability, statistics, linear algebra, and calculus (without integration). Anything more difficult than that is rarely used. The most important thing is to have clear mathematical concepts, because they are applied in the daily work of machine learning engineers.
Finally, coding is a basic skill for engineers, and data structures and algorithms can also be considered mathematics. All ideas need to be coded to be implemented in the real world. Machine learning often deals with big data, and models need to go into production in the real world. I don’t know how strong one’s machine learning skills need to be for a company to be willing to assign someone specifically to write code for them.
My experience is that colleagues usually help with the architecture design, while I have to write the code for model inference myself. As for the data pipeline, my colleagues organize the data and possibly store it in Parquet format on S3. I have to write the code to transform the data into features for model training, and then deploy the model onto the production machine.
The code for an experiment is entirely up to you, and the efficiency of the code will affect how long the experiment runs and the scale it can handle.
The Math Used at Job Interviews
Over the past six years, I have interviewed for over 20 machine learning engineer positions at companies of various sizes in Taiwan, Japan, and Singapore. I have also assisted in interviewing machine learning engineers at two companies where I have worked. Typically, in a machine learning interview you are given a problem, such as designing a recommendation system, and asked how you would use machine learning to solve it, followed by further questions based on your answer. I believe that as long as you understand the basic concepts and have a certain level of understanding of the models used to solve the problem, you should be fine. I have never been asked to prove a theorem. I have not interviewed for a research scientist position, so I am not sure about those interviews.
Almost all companies ask coding questions, except for a few small ones, and it seems that the larger the company, the harder the questions. Therefore, it’s still necessary to practice on LeetCode, especially for larger companies.
Requirements of Job Interviews
Recently, my company had a lot of headcount for machine learning engineers, so I helped interview about ten people, and we ultimately hired three. I participated in the second-round interviews, so I don’t know how the first round was conducted. We were mainly looking for people with NLP experience, but none of the resumes I saw listed papers at top conferences. Only one person had a Kaggle ranking, one had a master’s degree in NLP, and the others were invited because of their relevant work experience. After they joined, I didn’t see any issues with their job performance, and they were able to make business contributions to the company.
The basic requirements for machine learning that I expect in interviews are:
- Ability to choose appropriate metrics based on the problem.
- Ability to select appropriate models based on the chosen metrics.
- Understanding the concepts behind the models and identifying important parameters.
- Knowing how to evaluate model performance.
- Knowing how to debug and tune the model when it does not perform well.
The above has nothing to do with whether one has published a paper or not. Of course, those who have published papers at top conferences should be able to answer these questions well, but those who have not published can do so too. Because these questions are so important, I personally value Kaggle results. Kaggle provides only the metric, so participants need good abilities in all the other areas to achieve good results. For Kaggle, I only recognize results from competitions with cash prizes. I believe that those who have won a bronze medal or higher should have the basic abilities and can be invited for an interview, and those who have won a silver medal or higher should have a good understanding of machine learning. I would consider anyone who has won a gold medal an expert.
Of course, in competitions related to deep learning, participants have an advantage if they have good equipment, but there are also people who have won gold medals using the free Google Colab. If you only aim for a bronze medal, using the free Colab is definitely possible. For some competitions, Google also sponsors $300 for you to use on GCP, so I think equipment won’t be a problem if you don’t aim for top results.
Interview Preparation as a Newcomer to Machine Learning
When interviewing, employers usually look for candidates with relevant work experience, a master’s degree in machine learning, or Kaggle achievements. If you are switching fields and don’t have the first two, you can try to win a Kaggle bronze medal. Even if you can’t achieve it, you will at least learn about machine learning and explore your interests. Your interests are also important in life. I think Kaggle is fun, and I always learn a lot of new things after each competition. It just takes too much time; otherwise, I would like to participate every day.