interview ds
Contents
Probability/Statistics Questions
1. conditional probability: how do you determine independence?
2. Bayes' rule problem: base rate fallacy
sampling: bootstrap, reservoir
A/B test, p value
Poisson, Binomial, and Exponential distributions
queuing question: The most efficient way to queue is a single line feeding multiple servers. It is efficient because:
1. No server sits idle because its own queue is empty while other queues are full.
2. A single slow customer does not stall service for everyone behind them.
3. All customers wait approximately the same time.
https://en.wikipedia.org/wiki/M/M/c_queue
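The queuing claim can be checked with a small simulation. This is a minimal sketch, not a full M/M/c treatment: it assumes Poisson arrivals and exponential service times, and it assumes that in the "separate lines" layout each customer picks a line at random and never switches. The function name `simulate` and all parameter values are hypothetical choices for illustration.

```python
import heapq
import random


def simulate(num_customers=20000, num_servers=3, arrival_rate=2.4,
             service_rate=1.0, seed=0):
    """Return (mean wait with one shared line, mean wait with one line per server)."""
    rng = random.Random(seed)

    # Draw one set of arrivals and service times so both layouts see the same customers.
    arrivals, services = [], []
    clock = 0.0
    for _ in range(num_customers):
        clock += rng.expovariate(arrival_rate)
        arrivals.append(clock)
        services.append(rng.expovariate(service_rate))
    # Assumption: with separate lines, each customer picks a line uniformly at random.
    choices = [rng.randrange(num_servers) for _ in range(num_customers)]

    # Layout 1: single line; the next customer goes to whichever server frees up first.
    free_at = [0.0] * num_servers  # min-heap of the times at which servers become free
    heapq.heapify(free_at)
    shared_wait = 0.0
    for arrive, service in zip(arrivals, services):
        start = max(arrive, heapq.heappop(free_at))
        shared_wait += start - arrive
        heapq.heappush(free_at, start + service)

    # Layout 2: one line per server; a customer waits only for their own server.
    free_at = [0.0] * num_servers
    separate_wait = 0.0
    for arrive, service, line in zip(arrivals, services, choices):
        start = max(arrive, free_at[line])
        separate_wait += start - arrive
        free_at[line] = start + service

    return shared_wait / num_customers, separate_wait / num_customers


shared, separate = simulate()
print(f"mean wait, single line: {shared:.2f}; separate lines: {separate:.2f}")
```

With these settings the servers are about 80% utilized, and the single shared line should show a clearly lower mean wait than the separate lines.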
Coding Questions
Machine Learning/Modeling Questions
1. explain SVM; why do we need different kernels?
2. explain neural networks and backpropagation
3. decision tree, entropy
4. problems of decision trees, how to solve overfitting
5. bagging, random forest
6. precision, recall, ROC, AUC
7. when sampling
8. prevent overfitting
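For the decision tree and entropy item, a short worked example may help. This is a minimal sketch, assuming Shannon entropy in bits as the split criterion; the helper names `entropy` and `information_gain` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy data: a balanced parent node and two candidate splits.
parent = ["yes"] * 5 + ["no"] * 5
pure_split = [["yes"] * 5, ["no"] * 5]
useless_split = [["yes", "yes", "no", "no"],
                 ["yes", "yes", "yes", "no", "no", "no"]]

gain_pure = information_gain(parent, pure_split)
gain_useless = information_gain(parent, useless_split)
print(entropy(parent), gain_pure, gain_useless)
```

A split that separates the classes perfectly recovers all of the parent's entropy, while a split that leaves each child as mixed as the parent gains nothing. A tree picks the split with the highest gain at each node.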
Problem Solving Questions
1. recommendation system for movie streaming, e-commerce
2. fraud detection model
3. bid for ads, CTR modeling, conversion modeling
4. search relevance, CTR
5. predict churn rate
Hypothesis testing
1. state the hypotheses
2. identify the test statistic and its probability distribution
3. specify the significance level
4. state the decision rule
5. collect data and perform the calculation
6. make the statistical decision
7. make the economic decision
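The steps above can be sketched for an A/B test on conversion rates. This is a minimal sketch, assuming a two-sided two-proportion z-test with a normal approximation; the function name `two_proportion_z_test` and the counts are hypothetical, not data from any real experiment.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided z-test of H0: p_a == p_b against H1: p_a != p_b."""
    # Step 1: hypotheses are H0: p_a == p_b and H1: p_a != p_b.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Step 2: the test statistic is z, approximately standard normal under H0.
    # Under H0 both groups share one rate, so pool the data to estimate it.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    # Step 5: perform the calculation.
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Steps 3, 4 and 6: reject H0 when the p-value falls below the significance level.
    return z, p_value, p_value < alpha


# Hypothetical counts: 200 of 2000 convert in control, 260 of 2000 in the variant.
z, p_value, reject = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Step 7, the economic decision, stays outside the code: a statistically significant lift still has to be large enough to justify the cost of shipping the change.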
causal inference
A/B experiment design
one-sample, two-sample t-test
confidence interval
pitfalls
chi-square test
ANOVA
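As one concrete case from this list, the chi-square test of independence on a 2x2 table can be computed by hand. This is a minimal sketch, assuming Pearson's statistic without a continuity correction; the function name `chi_square_2x2` and the counts are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are groups, columns are (converted, did not convert).
table = [[200, 1800],
         [260, 1740]]
statistic, p_value = chi_square_2x2(table)
print(f"chi-square = {statistic:.3f}, p-value = {p_value:.4f}")
```

For a 2x2 table this statistic is the square of the pooled two-proportion z statistic, so the two tests give the same two-sided p-value.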
profit curve
latent
matrix factorization
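A minimal sketch of matrix factorization for recommendations, assuming stochastic gradient descent on squared error with L2 regularization. The names `factorize` and `rmse`, the hyperparameters, and the toy ratings are all hypothetical; this is one common way to fit latent factors, not the only one.

```python
import random


def rmse(ratings, user_f, item_f):
    """Root mean squared error over the observed (user, item, rating) triples."""
    squared = 0.0
    for u, i, r in ratings:
        pred = sum(a * b for a, b in zip(user_f[u], item_f[i]))
        squared += (r - pred) ** 2
    return (squared / len(ratings)) ** 0.5


def factorize(ratings, num_users, num_items, rank=2, epochs=500,
              lr=0.01, reg=0.02, seed=0):
    """Learn latent user and item factors so their dot product approximates ratings."""
    rng = random.Random(seed)
    user_f = [[rng.random() for _ in range(rank)] for _ in range(num_users)]
    item_f = [[rng.random() for _ in range(rank)] for _ in range(num_items)]
    history = [rmse(ratings, user_f, item_f)]  # error before any training
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(user_f[u], item_f[i]))
            for k in range(rank):
                pu, qi = user_f[u][k], item_f[i][k]
                # Gradient step on squared error plus an L2 penalty.
                user_f[u][k] += lr * (err * qi - reg * pu)
                item_f[i][k] += lr * (err * pu - reg * qi)
        history.append(rmse(ratings, user_f, item_f))
    return user_f, item_f, history


# Toy observed ratings as (user, item, rating); missing pairs are simply absent.
ratings = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 3, 1),
           (2, 0, 1), (2, 1, 1), (2, 3, 5), (3, 0, 1), (3, 3, 4),
           (4, 1, 1), (4, 2, 5), (4, 3, 4)]
user_f, item_f, history = factorize(ratings, num_users=5, num_items=4)
print(f"RMSE before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

A rating for an unseen (user, item) pair is then predicted as the dot product of the corresponding rows of `user_f` and `item_f`.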
Author Chen Tong
LastMod 2017-09-22