2020 Full-time Interview CCJ’s Preparation (4): Machine Learning and Computer Vision Questions

see: 2020 Full-time Interview CCJ’s Preparation (1): Commonly Asked C++ Interview Questions, at this link.

see: 2020 Full-time Interview CCJ’s Preparation (2): Commonly Asked C++ Interview Questions in a Table, at this link.


1. Explain the difference between supervised and unsupervised machine learning.

Supervised machine learning algorithms are trained on labelled data: every training example comes with the target we want to predict, for example historical prices when predicting stock market prices. Unsupervised algorithms work on unlabelled data and look for structure on their own, for example grouping customers into segments by their purchasing behaviour.

2. Explain the difference between KNN and k-means clustering.

KNN (k-nearest neighbours) is a supervised machine learning algorithm: it needs labelled training data, and it classifies a new point by a majority vote among the k labelled points closest to it.
K-means clustering, on the other hand, is an unsupervised machine learning algorithm: it takes unlabelled data and partitions it into k clusters by repeatedly assigning each point to the nearest cluster centroid and then moving each centroid to the mean of the points assigned to it.
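
The contrast can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names `knn_predict` and `kmeans`, the toy points, and the choice of initial centroids are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: majority vote among the k labelled points closest to `query`."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def kmeans(points, centroids, iterations=10):
    """Unsupervised: alternate between assigning points and moving centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

# Toy data (hypothetical): two well-separated groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["a", "a", "a", "b", "b", "b"]

# KNN uses the labels; k-means never sees them.
knn_label = knn_predict(points, labels, (2.0, 2.0), k=3)
final_centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(knn_label, clusters, final_centroids)
```

Note that `knn_predict` takes `labels` as input, while `kmeans` only receives the points and returns cluster indices that carry no meaning beyond "same group".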

3. What is the difference between classification and regression?

Classification produces discrete results: it assigns each input to one of a fixed set of categories, for example classifying e-mails as spam or non-spam.
Regression is used when the target is continuous, for example predicting a stock price at a certain point in time.
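
As a rough sketch of the difference, the snippet below fits a least-squares line (regression, continuous output) and a nearest-class-mean rule (classification, discrete output). The helper names, the toy prices, and the "suspicious word count" feature are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def fit_class_means(xs, labels):
    """Classification: remember the mean feature value of each class."""
    means = {}
    for label in set(labels):
        values = [x for x, l in zip(xs, labels) if l == label]
        means[label] = sum(values) / len(values)
    return means

def classify(means, x):
    """Return the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))

# Regression: day index -> price, the output is a real number.
days = [1, 2, 3, 4]
prices = [10.0, 12.0, 14.0, 16.0]
slope, intercept = fit_line(days, prices)
predicted_price = slope * 5 + intercept

# Classification: suspicious-word count -> category, the output is a label.
counts = [0, 1, 2, 7, 8, 9]
kinds = ["ham", "ham", "ham", "spam", "spam", "spam"]
class_means = fit_class_means(counts, kinds)
predicted_kind = classify(class_means, 6)

print(predicted_price, predicted_kind)
```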

4. How to ensure that your model is not overfitting?

5. What is the main advantage of Naive Bayes?

A Naive Bayes classifier converges quickly compared with discriminative models such as logistic regression, so it typically needs less training data. This advantage depends on its conditional independence assumption holding reasonably well.

6. Explain Ensemble learning.

In ensemble learning, many base models such as classifiers and regressors are generated and combined so that together they give better results. It works best when the component classifiers are accurate and make independent errors. There are sequential as well as parallel ensemble methods.

Ensemble learning improves machine learning results by combining several models, which usually gives better predictive performance than a single model. The basic idea is to learn a set of classifiers (experts) and to allow them to vote.
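
The voting idea, and why independence matters, can be sketched as follows. The three keyword rules are hypothetical stand-ins for trained classifiers, and `majority_accuracy` assumes the classifiers err independently, which real models rarely do exactly.

```python
import math
from collections import Counter

def majority_vote(classifiers, x):
    """Return the label predicted by most of the classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

def majority_accuracy(p, n):
    """Chance that more than half of n independent classifiers,
    each correct with probability p, are right (n assumed odd)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

# Hypothetical "experts": each is a crude rule on its own.
experts = [
    lambda email: "spam" if "free" in email else "ham",
    lambda email: "spam" if "winner" in email else "ham",
    lambda email: "spam" if email.isupper() else "ham",
]

first = majority_vote(experts, "free prize for the winner")  # two of three say spam
second = majority_vote(experts, "lunch is free today")       # only one says spam
ensemble_acc = majority_accuracy(0.7, 3)

print(first, second, ensemble_acc)
```

Under the independence assumption, three classifiers that are each right 70% of the time vote correctly about 78.4% of the time, which is the sense in which the ensemble beats its members.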


7. Types of Ensemble Methods:

Ensemble methods predict the class label of unseen data by aggregating the predictions of a set of classifiers learned from the training data.

Types of Ensemble Methods: bagging, random forests and boosting.

1) Bagging
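
Bagging (bootstrap aggregating) trains each base model on a bootstrap sample, drawn with replacement from the training set, and then aggregates the predictions by voting. The sketch below is a minimal illustration under stated assumptions: the base learner `train_threshold` is a hypothetical one-feature rule that assumes class 1 has the larger values, and the data is a toy set.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement."""
    return [rng.choice(data) for _ in data]

def train_threshold(sample):
    """Hypothetical base learner: cut halfway between the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:
        # The bootstrap sample missed a class: predict the one that is present.
        only = 1 if ones else 0
        return lambda x: only
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > cut else 0

def bagging_fit(data, n_models, seed=0):
    """Train n_models base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_threshold(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Aggregate the base learners by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): (feature, label) pairs.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
models = bagging_fit(data, n_models=25)
low = bagging_predict(models, 1.5)
high = bagging_predict(models, 8.5)
print(low, high)
```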


2) Random Forests

How to achieve randomness?
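
In the standard description of random forests, randomness enters in two places: each tree is trained on a bootstrap sample of the rows, and each split considers only a random subset of the features. The sketch below shows just these two sampling steps, not a full tree learner. The function names are hypothetical, and using the square root of the number of features is a common default for classification rather than something stated in this post.

```python
import math
import random

def bootstrap_rows(n_rows, rng):
    """Row indices for one tree: n_rows draws with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def candidate_features(n_features, rng):
    """Feature indices one split may consider: a random subset
    of size about sqrt(n_features), drawn without replacement."""
    k = max(1, int(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(42)
for tree in range(3):
    rows = bootstrap_rows(8, rng)      # some rows repeat, some are left out
    feats = candidate_features(9, rng)  # 3 of the 9 features
    print(tree, rows, feats)
```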
