- How would you design an ETL pipeline?
- Any DevOps experience?
  - Docker
  - Kubernetes / Docker Swarm
- Any experience with non-relational databases? What was the use case?
- Have you run Spark in standalone mode or cluster mode?
- What was the use case for using a graph database?
- Considering the Customer Churn project that you did, how was it made into a data product? That is, how was it productionised? What was the ROI of this project?
- How do you do EDA (Exploratory Data Analysis)?
- How do you approach feature engineering? For example, in the Customer Churn analytics project, what features did you consider? What was the prediction accuracy?
- What sort of analysis did you do on social media data, and what data product came out of it?
- When would you decide to use logistic regression vs. an SVM?
- When would you use Kafka and Storm?