Multiple Choice Questions on Data Science
- Which of the following languages is most widely used in data science?
(A) PHP
(B) HTML
(C) Java
(D) Python
- Which of the following are stages in the data science lifecycle?
(A) Data preparation
(B) Model building
(C) Data visualization
(D) All of the above
- The main purpose of data science is
(A) Writing books
(B) Analyzing and interpreting complex data
(C) Preventing virus attacks
(D) Building websites
- In statistics, the mean of a dataset is the
(A) Range
(B) Most frequent value
(C) Average value
(D) Mid value
- A confusion matrix measures
(A) Prediction accuracy
(B) Dataset shape
(C) Data cleaning level
(D) Algorithm speed
- A systematic error in a dataset or model that skews results and leads to inaccurate or unfair conclusions is called
(A) Anomaly
(B) Bias
(C) Error
(D) Variance
- Data wrangling means
(A) Cleaning and transforming data
(B) Collecting data
(C) Visualizing data
(D) Modelling data
- Data duplication means
(A) Regression
(B) Sampling
(C) Redundancy
(D) Overfitting
- SVM stands for
(A) Sample Validation Model
(B) Statistical Vector Model
(C) Support Vector Machine
(D) Structured Value Matrix
- The function of the SELECT statement in SQL is to
(A) Delete data
(B) Retrieve data
(C) Update data
(D) Insert data
- A DataFrame is
(A) Unstructured data
(B) A 2D labelled data structure
(C) A series of operations
(D) A single value
- EDA stands for
(A) Engineered Data Analysis
(B) Exploratory Data Analysis
(C) Enhanced Data Attributes
(D) External Data Analysis
- Data science ethics include
(A) Accountability
(B) Fairness
(C) Privacy
(D) All of the above
- Data anonymization is
(A) Encrypting data
(B) Removing identity from data
(C) Replacing data with null
(D) Deleting sensitive data
- Data leakage is
(A) Data sharing
(B) Missing values
(C) Test data influencing model training
(D) Data corruption
ANSWERS:
1-(D), 2-(D), 3-(B), 4-(C), 5-(A), 6-(B), 7-(A), 8-(C), 9-(C), 10-(B), 11-(B), 12-(B), 13-(D), 14-(B), 15-(C)
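
Note on question 4: the four options (range, most frequent value, average value, mid value) correspond to four different summary statistics. The short sketch below computes each one with Python's standard `statistics` module so the difference is visible. The sample list `values` is hypothetical and chosen only for illustration; it is not part of the quiz.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
values = [2, 3, 3, 5, 12]

mean_value = statistics.mean(values)      # average value: sum divided by count
median_value = statistics.median(values)  # mid value of the sorted data
mode_value = statistics.mode(values)      # most frequent value
range_value = max(values) - min(values)   # largest value minus smallest value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
print("range:", range_value)
```

For this sample the mean differs from the median and the mode because the single large value pulls the average upward, which is why the four options in question 4 are not interchangeable.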