1. How would you design an ETL pipeline?
  2. Any DevOps experience?
  3. Any experience with non-relational databases? What was the use case?
  4. Have you run Spark in standalone mode or cluster mode?
  5. What was the use case for using a graph database?
  6. Considering the Customer Churn project that you did, how was it made into a data product? That is, how was it productionised? What was the ROI of this project?
  7. How do you do EDA (Exploratory Data Analysis)?
  8. What is your approach to feature engineering? Take Customer Churn analytics, for example: what features did you consider in the project? What was the prediction accuracy?
  9. What sort of analysis did you do on social media data, and what data product came out of it?
  10. When would you decide to use logistic regression vs. an SVM?
  11. When would you use Kafka and Storm?