As the world’s largest machine learning competition platform, Kaggle always has ongoing competitions with prizes. More importantly, if you are looking for a machine learning or data science-related job, achieving good results on Kaggle can significantly enhance your resume.
What is Kaggle
Kaggle is the world’s largest machine learning competition platform. It was founded in Australia and acquired by Google in 2017. The competition page always lists ongoing competitions, and good results in them are recognized worldwide.
It’s a bit like LeetCode for machine learning. Beyond practice, though, good Kaggle results carry weight on a resume, whereas hardly anyone checks how many LeetCode problems you’ve solved. This is especially helpful for people who want to switch fields but can’t go back to school for a degree or gain relevant work experience first. For them, a good result in a Kaggle competition is one of the easiest stepping stones towards a career in machine learning or data science.
Categories of Competition
If you go to the competition page, you can see a dropdown menu called All Categories, which lists the competition categories. Some common ones:
- Featured: usually organized by companies, with higher prize money.
- Research: usually organized by academic institutions or conferences, with less prize money.
- Playground: usually organized by Kaggle, with no prize money, only small rewards, and results are not included in the ranking.
Generally, only competitions with prizes and rankings (Featured or Research) are recognized on resumes, as these competitions are more challenging, and winning them demonstrates high skill.
How a Competition Works
Generally, a competition lasts two to three months and is open to individuals or teams. There are two sets of data, train and test. The answers (labels) are provided only for the train data, which is used to train the model. The trained model then predicts an answer for each sample in the test set, and the score of those predictions determines the ranking.
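The train, predict, and submit loop described above can be sketched with nothing but the Python standard library. This is only a toy illustration: the column names (`id`, `feature`, `target`), the tiny inline datasets, and the threshold "model" are all hypothetical, and a real competition defines its own file layout and submission format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical train data: a feature plus the known answer ("target").
TRAIN_CSV = """id,feature,target
1,0.2,0
2,0.9,1
3,0.8,1
4,0.1,0
"""

# Hypothetical test data: same feature, but no answer column.
TEST_CSV = """id,feature
5,0.7
6,0.3
"""


def fit_threshold(rows):
    """'Train' a toy model: the midpoint between the two class means."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["target"]] += float(row["feature"])
        counts[row["target"]] += 1
    mean0 = sums["0"] / counts["0"]
    mean1 = sums["1"] / counts["1"]
    return (mean0 + mean1) / 2


def make_submission(train_csv, test_csv):
    """Fit on the train rows, predict each test row, return submission CSV text."""
    train_rows = list(csv.DictReader(io.StringIO(train_csv)))
    test_rows = list(csv.DictReader(io.StringIO(test_csv)))
    threshold = fit_threshold(train_rows)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "target"])
    for row in test_rows:
        prediction = 1 if float(row["feature"]) > threshold else 0
        writer.writerow([row["id"], prediction])
    return out.getvalue()


print(make_submission(TRAIN_CSV, TEST_CSV))
```

In practice the toy threshold would be replaced by a real model, but the shape stays the same: fit on the labeled train rows, then write one prediction per test row.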
The test data is divided into two parts, public and private, but how they are divided is not revealed until the competition ends. Each time you upload predictions, you can see the score and ranking for the public portion, but the final ranking is determined by the private portion.
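To make the public/private idea concrete, here is a small simulation. It is only a sketch under assumptions: the 30% public fraction, the random split, the accuracy metric, and the toy labels are all made up for illustration, since each competition chooses its own split and metric and keeps the split hidden.

```python
import random


def split_public_private(n_rows, public_fraction=0.3, seed=0):
    """Randomly partition test row indices into a public and a private part.

    The fraction and the seed are hypothetical; participants never see them.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_public = int(n_rows * public_fraction)
    return sorted(indices[:n_public]), sorted(indices[n_public:])


def accuracy(y_true, y_pred, indices):
    """Fraction of correct predictions, restricted to the given rows."""
    hits = sum(1 for i in indices if y_true[i] == y_pred[i])
    return hits / len(indices)


# Hidden answers held by the organizer, and one uploaded submission (toy data).
y_true = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0]

public_idx, private_idx = split_public_private(len(y_true))

# Shown on the leaderboard after every upload.
public_score = accuracy(y_true, y_pred, public_idx)
# Computed on the remaining rows and used for the final ranking.
private_score = accuracy(y_true, y_pred, private_idx)

print("public rows:", public_idx, "score:", public_score)
print("private rows:", private_idx, "score:", private_score)
```

Because the two scores come from different rows, a submission tuned heavily against the public score can still drop in the final ranking.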
Finally, medals are awarded by final rank, and the cutoffs depend on the number of participants: approximately the top ten win gold medals, the top fifty win silver medals, and the top one hundred win bronze medals. The more participants there are, the more medals there will be, and vice versa. The official website has the exact rules. Generally speaking, a medal-winning performance in a competition is a plus for your resume.
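As a rough illustration of how the cutoffs scale with the number of teams, the sketch below encodes the tiers as an assumption based on Kaggle's published progression rules. They may have changed, so check the official website before relying on them. The function name `medal_cutoffs` is hypothetical, and rounding percentages down is also an assumption.

```python
def medal_cutoffs(n_teams):
    """Return the lowest rank that still earns each medal.

    Tier boundaries are an assumption taken from Kaggle's progression rules
    and may be out of date; percentages are rounded down (also an assumption).
    """
    if n_teams < 1:
        raise ValueError("n_teams must be positive")

    if n_teams < 100:
        gold = n_teams * 10 // 100       # top 10%
        silver = n_teams * 20 // 100     # top 20%
        bronze = n_teams * 40 // 100     # top 40%
    elif n_teams < 250:
        gold = 10                        # top 10
        silver = n_teams * 20 // 100     # top 20%
        bronze = n_teams * 40 // 100     # top 40%
    elif n_teams < 1000:
        gold = 10 + n_teams // 500       # top 10, plus 1 per 500 teams
        silver = 50                      # top 50
        bronze = 100                     # top 100
    else:
        gold = 10 + n_teams // 500       # top 10, plus 1 per 500 teams
        silver = n_teams * 5 // 100      # top 5%
        bronze = n_teams * 10 // 100     # top 10%

    return {"gold": gold, "silver": silver, "bronze": bronze}


for teams in (500, 5000):
    print(teams, medal_cutoffs(teams))
```

Under these assumptions, the "top ten, top fifty, top one hundred" figures correspond to a competition with a few hundred teams, and larger competitions hand out more medals.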
Deep Learning Competition
Deep learning competitions usually need a GPU to provide enough compute. Most people won't buy a computer with a GPU just for this, so Kaggle kindly provides a free GPU for everyone to use. As long as your code is written in the notebook interface on the Kaggle website, it can run on Kaggle’s GPU. The only thing to note is that usage is limited, with only a few dozen hours available per week.
If you have reached the usage limit, there is still Google Colab, which also provides free GPU usage, with the limitation that the browser must be kept open. It is meant for interactive use: if Google detects that you started a program and then went to sleep while it was using too many resources, it will stop your program. However, if your budget is limited, it is a viable option.
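Before starting a long training run in a hosted notebook, it can help to confirm that a GPU is actually attached to the session. The sketch below assumes an NVIDIA GPU whose driver provides the `nvidia-smi` command on the PATH. The helper name `list_nvidia_gpus` is hypothetical, and the check simply returns an empty list on machines without that tool.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one description line per visible NVIDIA GPU, or [] if none.

    Assumes the NVIDIA driver's `nvidia-smi` tool is installed; `-L` asks it
    to list the GPUs it can see.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus:
    print("GPU(s) available:", gpus)
else:
    print("No NVIDIA GPU detected; training would run on the CPU.")
```

If the list comes back empty in a notebook where you expected a GPU, the accelerator is probably not enabled for that session.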
Non-deep Learning Competition
For competitions that do not involve deep learning, there is no need to worry about GPU resources. If the data is not too large, a regular laptop can usually handle training the model.