# NPTEL Data Analytics With Python Assignment 11 Answers 2022

NPTEL Data Analytics With Python Assignment 11 Answers 2022: The answers below are provided as a reference to help students. You must submit your assignment based on your own knowledge.

## What is Data Analytics with Python?

Data Analytics with Python is a fun-filled whirlwind tour of 30 hours, covering everything you need to know to fall in love with the most sought-after skill of the 21st century. The course includes examples of analytics in a wide variety of industries, and we hope that students will learn how they can use analytics in their careers and lives. One of the most important aspects of this course is that you, the student, get hands-on experience creating analytics models. We, the course team, urge you to participate in the discussion forums and to use all the tools available to you while you are in the course.

## CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of the average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF THE AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.

## NPTEL Data Analytics with Python Assignment 11 Answers 2022

Q1. ________ is used for calculating distance measures in clustering using Python.

a. distance_matrix
b. spatial_matrix
c. scipy_matrix
d. distance.matrix
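
For context on Q1, SciPy does ship a function named `distance_matrix` in its `scipy.spatial` module. The snippet below is a minimal sketch of calling it. The three sample points are made up for illustration and are not part of the assignment.

```python
import numpy as np
from scipy.spatial import distance_matrix

# Hypothetical 2-D observations, chosen so the distances are easy to check by hand
points = np.array([
    [0.0, 0.0],
    [3.0, 4.0],
    [6.0, 8.0],
])

# Pairwise Minkowski distances between every row of the first and second argument;
# the default p=2 gives Euclidean distance
dist = distance_matrix(points, points)

print(dist)
```

The result is a square array with zeros on the diagonal, which is the usual starting point for distance-based clustering.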

Q2. The formula for computing the dissimilarity between two objects described by categorical variables is:
Here p is the total number of categorical variables and m denotes the number of matches.

• D(i,j) = (p - m) / p
• D(i,j) = (p - m) / m
• D(i,j) = (m - p) / p
• D(i,j) = (m - p) / m
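
To make the quantities in Q2 concrete, here is a small sketch of the simple matching dissimilarity, assuming the common textbook definition in which p is the total number of categorical variables and m is the number of variables on which the two objects match. The function name and the sample objects are hypothetical.

```python
def categorical_dissimilarity(obj_i, obj_j):
    """Simple matching dissimilarity: share of variables on which two objects differ."""
    if len(obj_i) != len(obj_j):
        raise ValueError("both objects must have the same number of variables")
    p = len(obj_i)                                    # total number of variables
    m = sum(a == b for a, b in zip(obj_i, obj_j))     # number of matching variables
    return (p - m) / p


# Hypothetical objects described by three categorical variables
item_a = ("red", "small", "round")
item_b = ("red", "large", "round")

print(categorical_dissimilarity(item_a, item_b))
```

Two identical objects give 0 and two objects with no matches give 1, so the value behaves like a normalised distance.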

Q3. Select the correct option. For a data set with 7 objects and an interval-scaled variable ‘f’, we have the following measurements: f = (1, 2, 3, 4, 5, 8, 50), containing one outlying value.

• Std deviation (std_f) and mean absolute deviation (s_f) are equally affected
• Mean absolute deviation (s_f) is more affected by the outlier
• Std deviation (std_f) is more affected by the outlier
• None of these
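
The comparison in Q3 can be checked numerically. The sketch below computes both measures for f with and without the outlying value 50, assuming the population standard deviation (`statistics.pstdev`) and a mean absolute deviation taken around the mean. The helper name is hypothetical.

```python
from statistics import fmean, pstdev


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    center = fmean(values)
    return fmean([abs(v - center) for v in values])


f = [1, 2, 3, 4, 5, 8, 50]
f_without_outlier = f[:-1]          # same data with the value 50 dropped

std_f = pstdev(f)                   # squares the deviations, so large ones dominate
s_f = mean_absolute_deviation(f)    # uses absolute deviations only

# How much each measure grows once the outlier is included
std_ratio = std_f / pstdev(f_without_outlier)
s_ratio = s_f / mean_absolute_deviation(f_without_outlier)

print(f"std_f = {std_f:.2f}, s_f = {s_f:.2f}")
print(f"growth due to outlier: std x{std_ratio:.2f}, mean abs dev x{s_ratio:.2f}")
```

Because the standard deviation squares each deviation before averaging, a single distant value inflates it more than it inflates the mean absolute deviation.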

Q4. Which of the following is true for K-means clustering?

• It comes under the partitioning method
• The number of clusters is predefined for this method
• Cluster similarity is measured in regard to the mean value of the objects in a cluster
• All of the above

Q5. Which of the following can act as possible termination conditions in K-Means?

1. For a fixed number of iterations.
2. Assignment of observations to clusters does not change between iterations. Except for cases with a bad local minimum.
3. Centroids do not change between successive iterations.
4. Terminate when Residual Sum of Squares (RSS) falls below a threshold.

• 1, 3 and 4
• 1,2,3 and 4
• 2 and 3
• None of these
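
The stopping rules listed in Q5 can be seen side by side in a toy implementation. This is a minimal one-dimensional sketch written for illustration, not the course's code or any library's API. The function name, its parameters, and the sample data are all assumptions.

```python
def kmeans_1d(values, centroids, max_iter=100, rss_threshold=0.0):
    """Tiny 1-D K-means that reports which stopping rule fired."""
    centroids = list(centroids)
    assignments = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: index of the nearest centroid for every value
        new_assignments = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, new_assignments) if a == k]
            new_centroids.append(sum(members) / len(members) if members else centroids[k])
        # Residual sum of squares around the updated centroids
        rss = sum((v - new_centroids[a]) ** 2 for v, a in zip(values, new_assignments))

        if new_assignments == assignments:
            reason = "assignments unchanged"
        elif new_centroids == centroids:
            reason = "centroids unchanged"
        elif rss < rss_threshold:
            reason = "RSS below threshold"
        else:
            reason = None

        assignments, centroids = new_assignments, new_centroids
        if reason is not None:
            return centroids, assignments, reason, iteration
    # Fixed iteration budget exhausted
    return centroids, assignments, "max iterations reached", max_iter


data = [1, 2, 3, 10, 11, 12]
final_centroids, labels, reason, n_iter = kmeans_1d(data, centroids=[1.0, 2.0])
print(final_centroids, labels, reason, n_iter)
```

Passing a small `max_iter` or a large `rss_threshold` makes the other stopping rules fire instead, which is a quick way to see how each condition behaves.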

Q6. In the figure below, if you draw a horizontal line at y = 2 on the y-axis, what will be the number of clusters formed?

Q7. Which of the following clustering methods requires a merging approach?

Q8. State True or False: Hierarchical clustering should primarily be used for exploration

Q9. State True or False: For finding dissimilarity between two clusters in hierarchical clustering, average-link is the only metric used
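
As background for Q9, several linkage criteria are commonly described for measuring the dissimilarity between two clusters. The sketch below computes three of them (single, complete and average link) for one-dimensional clusters. The function names and the sample clusters are hypothetical.

```python
from itertools import product


def pairwise_distances(cluster_a, cluster_b):
    """Absolute distance between every point in one cluster and every point in the other."""
    return [abs(x - y) for x, y in product(cluster_a, cluster_b)]


def single_link(cluster_a, cluster_b):
    # Distance between the two closest members
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_link(cluster_a, cluster_b):
    # Distance between the two farthest members
    return max(pairwise_distances(cluster_a, cluster_b))


def average_link(cluster_a, cluster_b):
    # Mean of all cross-cluster distances
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


left, right = [1, 2], [5, 9]
print(single_link(left, right), complete_link(left, right), average_link(left, right))
```

The three criteria give different values for the same pair of clusters, which is why the choice of linkage can change the shape of the resulting hierarchy.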

Q10. Hierarchical clustering can either be an agglomerative or divisive algorithm
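
To illustrate the agglomerative (bottom-up, merging) style mentioned in Q10, here is a minimal sketch for one-dimensional data. It assumes single-link distance and stops at a requested number of clusters. The function name and the data are made up for illustration. A divisive method would work in the opposite direction, starting from one cluster and splitting it.

```python
def agglomerative_1d(values, n_clusters):
    """Bottom-up clustering: start with singletons and repeatedly merge the closest pair."""
    clusters = [[v] for v in values]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single-link distance: closest pair of members across the two clusters
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


print(agglomerative_1d([1, 2, 9, 10, 25], n_clusters=3))
print(agglomerative_1d([1, 2, 9, 10, 25], n_clusters=2))
```

Stopping at different values of `n_clusters` corresponds to cutting the merge hierarchy at different heights.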