Record of my interview experience with QuantumBlack (McKinsey) as a data scientist in Singapore in 2019.
Someone asked this question on PTT (in Chinese): he trained a rectal cancer detection model on MRI images with 5-fold cross-validation, but the out-of-fold AUC was below 0.5 in every fold. After some searching on the Internet, he found advice saying: if you reverse the labels (swap class 0 and 1), you can get an AUC above 0.5, so your model still learned something. In my humble opinion, reversing the labels of a worse-than-random model is very dangerous. So, how should this be solved?
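The identity behind that advice can be sketched directly (a minimal pure-Python example, not from the original post; the data and function names are made up for illustration). AUC is the probability that a random positive scores higher than a random negative, so swapping the two classes turns an AUC of a into 1 − a:

```python
def auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative),
    counting ties as 0.5. Labels are 0/1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy model that scores negatives higher than positives: AUC < 0.5.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.9, 0.4, 0.1, 0.5, 0.6, 0.2]

a = auc(labels, scores)
flipped = auc([1 - y for y in labels], scores)
print(a, flipped)  # flipped is exactly 1 - a
```

The flip always "works" arithmetically, which is exactly why it is dangerous: it rewards any model that is consistently wrong on the validation data, instead of prompting a check for a label bug, leakage, or an unstable model.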
This article will introduce how to download a Wikipedia corpus and train word embeddings on it. All the code is on GitHub. Downloading and training both take an extremely long time, so I have also uploaded my pretrained embeddings. You can download them here: Chinese Word2Vec, Chinese FastText, English Word2Vec, English FastText.
I recently bought a new computer, ran into some strange problems, and was stuck for a long time. This post documents the steps for setting up the environment.