# NPTEL Introduction to Machine Learning Assignment 7 Answers

## About Introduction To Machine Learning

With the increased availability of data from varied sources, there has been increasing attention paid to the various data-driven disciplines such as analytics and machine learning. In this course, we intend to introduce some of the basic concepts of machine learning from a mathematically well-motivated perspective. We will cover the different learning paradigms and some of the more popular algorithms and architectures used in each of these paradigms.

CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of the average of the best 8 assignments out of the 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

You will be eligible for a certificate only if the average assignment score >= 10/25 and the exam score >= 30/75. If either of the two criteria is not met, you will not get the certificate even if the final score >= 40/100.
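The scoring rule above can be sketched in a few lines. This is only an illustration: the function name `final_score` and the sample marks are hypothetical, and assignment marks are assumed to be percentages between 0 and 100.

```python
def final_score(assignment_scores, exam_score):
    """Return (final score out of 100, eligibility flag).

    assignment_scores: marks (0-100) for the assignments attempted.
    exam_score: proctored exam mark (0-100).
    """
    # Keep only the best 8 assignments.
    best8 = sorted(assignment_scores, reverse=True)[:8]
    assignment_component = 0.25 * sum(best8) / len(best8)  # out of 25
    exam_component = 0.75 * exam_score                      # out of 75
    # Both thresholds must be met; the total alone is not enough.
    eligible = assignment_component >= 10 and exam_component >= 30
    return assignment_component + exam_component, eligible


# Hypothetical student A: eight assignments at 80, four skipped, exam 50.
total_a, ok_a = final_score([80] * 8 + [0] * 4, 50)

# Hypothetical student B: perfect assignments, exam 30.
total_b, ok_b = final_score([100] * 12, 30)
```

Student B ends with a total above 40 but is still not eligible, because the exam component (22.5 out of 75) is below the 30 threshold.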

## NPTEL Introduction to Machine Learning Assignment 7 Answers 2022 {July – Dec}

1. You have 2 binary classifiers A and B. A has accuracy=0% and B has accuracy=50%. Which classifier is more useful?

a. A
b. B
c. Both are good
d. Cannot say

`Answer:- a`
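A small sketch of why a binary classifier with 0% accuracy is useful: flipping every prediction gives a perfect classifier. The labels and the `always_wrong` predictions below are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [0, 1, 1, 0, 1]

# Hypothetical classifier A: wrong on every sample (0% accuracy).
always_wrong = [1 - t for t in y_true]

# Inverting A's output: with only two labels, "not wrong" means "right".
flipped = [1 - p for p in always_wrong]

acc_a = accuracy(y_true, always_wrong)
acc_flipped = accuracy(y_true, flipped)
```

The trick relies on there being exactly two labels, so knowing the wrong label identifies the right one.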

2. You have 2 multi-class classifiers A and B. A has accuracy=0% and B has accuracy=50%. Which classifier is more useful?

a. A
b. B
c. Both are good
d. Cannot say

`Answer:- d`

3. Using the bootstrap approach for sampling, the new dataset will contain _________ of the original samples in expectation.

a. 50.0%
b. 56.8%
c. 63.2%
d. 73.6%

`Answer:- c`
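The 63.2% figure comes from the chance that a given sample is drawn at least once in n draws with replacement, 1 - (1 - 1/n)^n, which approaches 1 - 1/e as n grows. A minimal sketch, with hypothetical function names and an arbitrary seed, compares the formula with a simulation:

```python
import math
import random


def expected_unique_fraction(n):
    """Probability that a given sample appears at least once in n draws."""
    return 1 - (1 - 1 / n) ** n


def simulated_unique_fraction(n, seed=0):
    """Draw n indices with replacement and count the distinct ones."""
    rng = random.Random(seed)
    draws = [rng.randrange(n) for _ in range(n)]
    return len(set(draws)) / n


analytic = expected_unique_fraction(1000)     # about 0.632
limit = 1 - math.exp(-1)                      # 1 - 1/e
simulated = simulated_unique_fraction(10000)  # close to 0.632, varies with seed
```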

4. You have a special case where your data has 10 classes and is sorted according to target labels. You attempt 5-fold cross validation by selecting the folds sequentially. What can you say about your resulting model?

a. It will have 100% accuracy.
b. It will have 0% accuracy.
c. Accuracy will depend on how well the model does.
d. Accuracy will depend on the compute power available for training.

`Answer:- b`
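To see why sequential folds fail here, the sketch below builds a sorted label list and checks which classes land in each test fold. It assumes 20 samples per class (the question does not give class sizes), so each of the 5 folds holds exactly 2 classes.

```python
# Assumption: 10 classes with 20 samples each, sorted by label.
labels = [c for c in range(10) for _ in range(20)]


def sequential_folds(labels, n_folds):
    """Yield (train classes, test classes) for contiguous folds."""
    fold_size = len(labels) // n_folds
    for k in range(n_folds):
        start, stop = k * fold_size, (k + 1) * fold_size
        test_classes = set(labels[start:stop])
        train_classes = set(labels[:start] + labels[stop:])
        yield train_classes, test_classes


folds = list(sequential_folds(labels, 5))

# In every fold the test classes never appear in the training data,
# so the model cannot predict any test label correctly.
never_seen = all(train.isdisjoint(test) for train, test in folds)
```

Shuffling the data, or using stratified folds, before splitting avoids this failure.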

5. Given the following information

What is the precision and recall?

a. 0.5, 0.4375
b. 0.7, 0.636
c. 0.6, 0.636
d. 0.7, 0.4375
e. None of the above

`Answer:- d`
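Precision and recall follow directly from the confusion-matrix counts. The counts used below (TP = 7, FP = 3, FN = 9) are hypothetical: they are not taken from the question, and were chosen only because they are consistent with the listed answer.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Hypothetical counts, not from the original question.
precision, recall = precision_recall(tp=7, fp=3, fn=9)
```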

6. AUC for your newly trained model is 0.5. Is your model prediction completely random?

a. Yes
b. No
c. ROC curve is needed to derive this conclusion
d. Cannot be determined even with ROC

`Answer:- c`
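An AUC of 0.5 does not by itself mean the scores are random. As a rough illustration, the sketch below computes AUC as the fraction of positive/negative pairs ranked correctly, using made-up scores that are highly structured yet still give 0.5.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model: one positive is ranked first, the other last.
labels = [1, 0, 0, 1]
scores = [0.9, 0.6, 0.4, 0.1]

value = auc(labels, scores)
```

Here the ROC curve runs well away from the diagonal on both sides, which is why the curve itself, not just the area, is needed to judge whether the model behaves randomly.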

7. What is the effect of using bagging on weak classifiers for variance?

a. Increases variance
b. Reduces variance
c. Does not change

`Answer:- b`
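The variance reduction can be illustrated with a toy simulation. It assumes each weak learner's output is the true value plus independent Gaussian noise, which is a simplification: real bagged learners are correlated, so the reduction in practice is smaller. All names and numbers below are hypothetical.

```python
import random
import statistics

rng = random.Random(0)


def noisy_estimate():
    """Hypothetical weak learner: true value 1.0 plus unit Gaussian noise."""
    return 1.0 + rng.gauss(0, 1)


def bagged_estimate(n_estimators=25):
    """Average the outputs of several weak learners."""
    return statistics.mean(noisy_estimate() for _ in range(n_estimators))


single = [noisy_estimate() for _ in range(2000)]
bagged = [bagged_estimate() for _ in range(2000)]

var_single = statistics.variance(single)  # close to 1
var_bagged = statistics.variance(bagged)  # close to 1/25 under independence
```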

8. You are building a model to detect cancer. Which metric will you prefer for evaluating your model?

a. Accuracy
b. Sensitivity
c. Specificity
d. MSE

`Answer:- b`

9. You are building a model to detect a mild medical condition for which further testing costs are extremely expensive. Which metric will you prefer for evaluating your model?

a. Accuracy
b. Sensitivity
c. Specificity
d. MSE

`Answer:- c`
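Questions 8 and 9 turn on the difference between sensitivity and specificity. A minimal sketch of both, with hypothetical confusion-matrix counts:

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that are cleared."""
    return tn / (tn + fp)


# Hypothetical screening results.
tp, fn, tn, fp = 45, 5, 900, 50

sens = sensitivity(tp, fn)  # 45 / 50
spec = specificity(tn, fp)  # 900 / 950
```

High sensitivity matters when a missed case is costly (cancer detection). High specificity matters when a false alarm is costly (expensive follow-up tests for a mild condition).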

10. A: Boosting takes many weak learners and combines them into a strong learner.
B: Boosting determines the proportion of importance each weak learner should be assigned and weighs its prediction by it and combines them to make the final prediction.

a. A is True. B is True. B is the correct explanation for A.
b. A is True. B is True. B is not the correct explanation for A.
c. A is True. B is False.
d. Both A and B are False.

`Answer:- a`
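Statement B can be made concrete with an AdaBoost-style weighting. The question does not name a specific algorithm, so the formula below (weight = 0.5 * ln((1 - error) / error)) is an assumption, and the predictions and error rates are hypothetical. Labels are taken to be -1 or +1.

```python
import math


def learner_weight(error):
    """AdaBoost-style importance: lower error gives a larger weight."""
    return 0.5 * math.log((1 - error) / error)


def weighted_vote(predictions, errors):
    """Combine +1/-1 predictions, weighting each learner by its importance."""
    score = sum(learner_weight(e) * p for p, e in zip(predictions, errors))
    return 1 if score >= 0 else -1


# Hypothetical weak learners: one fairly accurate, two barely better than chance.
predictions = [+1, -1, -1]
errors = [0.10, 0.40, 0.45]

final = weighted_vote(predictions, errors)
```

The single accurate learner outweighs the two weaker ones, so the combined prediction is +1 even though a simple majority vote would give -1.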

## NPTEL Introduction to Machine Learning Assignment 7 Answers 2022 {Jan – June}

Q1. In leave-one-out (LOO) cross-validation, you get K estimators (excluding the final estimator, which may be an ensemble of these K estimators).
If the size of the dataset is N, K = ?

• N/2
• N-1
• None of the above
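As a quick check of how many estimators leave-one-out produces, the sketch below enumerates the splits for a small dataset. The function name and the dataset size of 6 are hypothetical.

```python
def leave_one_out_splits(n):
    """Yield (train indices, test indices), holding out one sample at a time."""
    for i in range(n):
        train = [j for j in range(n) if j != i]
        yield train, [i]


splits = list(leave_one_out_splits(6))
n_estimators = len(splits)        # one model per held-out sample
train_size = len(splits[0][0])    # each model trains on the other samples
```

Each sample is held out exactly once, so a dataset of size N yields N splits, each training on N - 1 samples.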

Q2. Given the following information

What is the precision and recall?

• 0.5, 0.6
• 0.3, 0.8
• 0.3, 0.6
• 0.5, 0.8
• None of the above

Q3. To plot ROC curve, you first order the data points in _______ order of their likelihood of being positive.

• Descending
• Ascending
• Random
• Doesn’t matter
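The construction behind this question can be sketched as follows: sort the points from most to least likely positive, then sweep down the list, stepping up for each positive and right for each negative. The labels and scores are made up, and ties between scores are ignored for simplicity.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from sweeping scores in descending order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i] == 1:
            tp += 1   # a positive moves the curve up
        else:
            fp += 1   # a negative moves the curve right
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical labels (1 = positive) and predicted likelihoods.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]

points = roc_points(labels, scores)
```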

Q4. Which of the following are true? TP – True Positive, TN – True Negative, FP – False Positive, FN – False Negative

Q5. Consider the following two statements:

A: In bagging, the estimators can be trained in parallel.
B: Each estimator in bagging uses the same algorithm.

• A is True. B is True. B is the correct explanation for A.
• A is True. B is True. B is not the correct explanation for A.
• A is True. B is False.
• Both A and B are False.

Q6. For a binary classification problem, consider the two statements below:

A: A classifier with AUC=0 is the least useful classifier.
B: A classifier with AUC=0.5 is the least useful classifier.

{Hint: For A, what if the labels were reversed?}

• A is True. B is False.
• A is False. B is True.
• Both are False.
• Their ensemble will be the worst classifier.

Q7. The relationship between their recall is:

• Recall(A) > Recall(B)
• Recall(A) < Recall(B)
• Recall(A) = Recall(B)
• Cannot be determined

Q8. True/False: Model A is equivalent to a random model based on its confusion matrix.

• True
• False

Q9. Consider the following two statements:

A: The estimators in Boosting can be trained in parallel.
B: Boosting is simply Bagging with a different sample distribution.

• A is True. B is True. B is the correct explanation for A.
• A is True. B is True. B is not the correct explanation for A.
• A is True. B is False.
• Both A and B are False.