#1
Which of the following measures is affected by extreme values?
Range
Explanation: The range is sensitive to extreme values because it is the difference between the maximum and minimum values in a dataset, so a single outlier changes it directly.
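A minimal sketch of this sensitivity, using made-up numbers (the data and the helper name `value_range` are illustrative assumptions, not part of the quiz):

```python
def value_range(values):
    """Range = maximum minus minimum."""
    return max(values) - min(values)

# Hypothetical sample, then the same sample with one extreme value added.
data = [10, 12, 13, 15, 16]
with_outlier = data + [100]

print(value_range(data))          # 6
print(value_range(with_outlier))  # 90
```

One added value moves the range from 6 to 90, even though the other five observations are unchanged.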
#2
What is the primary purpose of a scatter plot?
To compare two variables
Explanation: Scatter plots show the relationship between two quantitative variables, making patterns such as correlation visible.
#3
Which of the following is a measure of central tendency?
Mean
Explanation: The mean is a measure of central tendency that represents the arithmetic average of a dataset.
#4
What does the interquartile range measure in a dataset?
The spread of the middle 50% of the data
Explanation: The interquartile range (IQR) measures the spread of the central 50% of the data, computed as the third quartile minus the first quartile (Q3 - Q1).
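As a rough illustration, the IQR can be computed with Python's standard `statistics.quantiles`. The data below are invented, and the helper name `iqr` is an assumption. Quartile conventions differ between tools, so other software may report slightly different values for the same data.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: Q3 - Q1, using the module's default quartile method."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical sample, then the same sample with one extreme value added.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 13]
with_outlier = data + [500]

print("IQR:", iqr(data), "range:", max(data) - min(data))
print("IQR:", iqr(with_outlier), "range:", max(with_outlier) - min(with_outlier))
```

The IQR shifts only slightly when the extreme value is added, while the range grows by hundreds, which is why the IQR is often preferred for skewed data.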
#5
Which graph is most effective for displaying the relationship between two categorical variables?
Bar chart
Explanation: Bar charts, typically in grouped or stacked form, visualize the relationship between categorical variables by displaying the frequency or proportion of each combination of categories.
#6
What is the null hypothesis in a statistical test?
There is no significant difference between groups
Explanation: The null hypothesis states that there is no difference or relationship between the groups or variables being compared; the test assesses whether the data provide enough evidence to reject it.
#7
What does a p-value less than 0.05 typically indicate?
There is a significant difference
Explanation: A p-value less than 0.05 indicates that the result is statistically significant at the conventional 5% level: data this extreme would be unlikely if the null hypothesis were true, so the null hypothesis is typically rejected.
#8
Which test would you use to compare means from two related samples?
Paired samples t-test
Explanation: The paired samples t-test assesses the difference between two means from related samples.
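A minimal sketch of the test statistic, assuming invented before/after measurements on the same six subjects. The function name `paired_t_statistic` is hypothetical. The sketch stops at the t statistic; turning it into a p-value needs the t distribution with n - 1 degrees of freedom, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second):
    """t = mean(d) / (sd(d) / sqrt(n)), where d are the within-pair differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical measurements on the same subjects at two time points.
before = [72, 68, 75, 80, 66, 71]
after = [70, 65, 74, 76, 65, 68]

t = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {len(before) - 1} degrees of freedom")
```

Because the test works on the per-subject differences, it removes between-subject variation that an independent two-sample test would treat as noise.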
#9
What is the purpose of using a box plot in data analysis?
To show the distribution of data
Explanation: Box plots display the distribution of a dataset through its median and quartiles and help identify outliers.
#10
In hypothesis testing, what is Type I error?
Rejecting the null hypothesis when it is true
Explanation: A Type I error (false positive) occurs when a true null hypothesis is rejected, so the test reports a significant difference when there is none.
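One way to see this is a small simulation in which the null hypothesis is true by construction, so every rejection is a Type I error. The sketch below uses a one-sample z-test with known standard deviation purely for convenience; that test, the sample sizes, and the function names are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of trials that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are drawn with mean 0, so the null hypothesis (mu0 = 0) is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if z_test_p_value(sample, mu0=0, sigma=1) < alpha:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen significance level of 0.05, which is exactly the Type I error probability the threshold is meant to control.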
#11
What statistical test is used to compare the means of more than two groups?
ANOVA
Explanation: Analysis of Variance (ANOVA) compares means across three or more groups by testing whether the variation between group means is large relative to the variation within groups.
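A minimal sketch of the one-way ANOVA F statistic, computed by hand on invented data. The function name `one_way_anova_f` is hypothetical, and the sketch does not compute a p-value, which would require the F distribution.

```python
def one_way_anova_f(*groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Three hypothetical groups with clearly different means.
f_stat = one_way_anova_f([4, 5, 6], [6, 7, 8], [9, 10, 11])
print(f"F = {f_stat:.2f}")  # F = 19.00
```

A large F means the group means differ by more than the within-group scatter would suggest.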
#12
Which of the following correlation coefficients represents the strongest relationship between two variables?
-0.8
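Explanation: The strength of a correlation is given by the absolute value of the coefficient, and the sign only gives the direction. A coefficient of -0.8 therefore indicates a strong negative linear relationship between two variables.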
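The sketch below computes Pearson's r by hand and picks the strongest coefficient from a list by absolute value. The candidate list is hypothetical (the quiz's answer options are not reproduced here), and the helper names are assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def strongest(coefficients):
    """Strength depends on magnitude, not sign."""
    return max(coefficients, key=abs)

# y falls by exactly 2 for each unit increase in x: a perfect negative relationship.
r = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r)                             # -1.0
print(strongest([0.5, -0.8, 0.3]))   # -0.8
```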
#13
In a linear regression model, what does R-squared represent?
The proportion of the variance in the dependent variable that is predictable from the independent variable
Explanation: R-squared measures the proportion of the variance in the dependent variable that is explained by the independent variable(s) in a regression model.
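As a sketch, R-squared for a simple least-squares line can be computed as 1 minus the ratio of residual to total sum of squares. The data are invented and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def r_squared(xs, ys):
    """R^2 = 1 - SS_residual / SS_total."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(f"R-squared = {r2:.2f}")  # R-squared = 0.60
```

Here the fitted line accounts for 60% of the variation in `ys`; the remaining 40% is left in the residuals.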
#14
Which of the following is not an assumption of linear regression?
All variables are categorical
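Explanation: Linear regression does not require all variables to be categorical; predictors can be continuous, and the dependent variable typically is. Its usual assumptions concern linearity, independence of errors, constant error variance, and normally distributed residuals.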
#15
What does the term 'multicollinearity' refer to in multiple regression analysis?
A strong correlation between two or more predictor variables
Explanation: Multicollinearity occurs when two or more predictor variables in a regression model are highly correlated.
#16
In a regression analysis, what does the beta coefficient represent?
The slope of the regression line, indicating the change in the dependent variable for a one-unit change in the independent variable
Explanation: The beta coefficient is the slope of the regression line: the change in the dependent variable for a one-unit change in the independent variable, holding any other predictors constant.
#17
What is the main purpose of principal component analysis (PCA)?
To reduce the dimensionality of the data while preserving as much variability as possible
Explanation: Principal Component Analysis (PCA) is used to reduce the dimensionality of data while retaining most of its variability by transforming variables into linearly uncorrelated components.
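For two variables, the idea can be sketched without any library: the principal components are the eigenvectors of the 2x2 covariance matrix, and the share of variance kept by the first component is its eigenvalue divided by the total. The data and function names below are assumptions for illustration; real analyses would normally use a linear algebra library and often standardize the variables first.

```python
from math import sqrt

def covariance_matrix_2d(xs, ys):
    """Sample covariance matrix entries (var_x, cov_xy, var_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y

def first_component_share(xs, ys):
    """Fraction of total variance captured by the first principal component."""
    a, b, c = covariance_matrix_2d(xs, ys)
    # Closed-form eigenvalues of the symmetric matrix [[a, b], [b, c]].
    half_trace = (a + c) / 2
    spread = sqrt(((a - c) / 2) ** 2 + b ** 2)
    largest, smallest = half_trace + spread, half_trace - spread
    return largest / (largest + smallest)

# Two hypothetical, strongly related variables (y is roughly 2 * x).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
share = first_component_share(xs, ys)
print(f"First component keeps {share:.1%} of the variance")
```

Because the two variables move almost in lockstep, one component carries nearly all of the variability, so the second dimension can be dropped with little loss.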
#18
In time series analysis, what does seasonality refer to?
Fluctuations in the data that are dependent on seasonal factors
Explanation: Seasonality refers to recurring patterns or fluctuations in data that occur at regular intervals and are influenced by seasonal factors such as the time of year.
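A simple way to expose such a pattern is to average the observations that fall at the same position in each cycle. The sketch below assumes an invented quarterly series covering three years and a known period of 4; the function name `seasonal_means` is hypothetical.

```python
def seasonal_means(series, period):
    """Average of the values at each position within the repeating cycle."""
    return [
        sum(series[pos::period]) / len(series[pos::period])
        for pos in range(period)
    ]

# Hypothetical quarterly values for three years: Q3 peaks and Q1 dips every year.
series = [10, 20, 30, 15,
          12, 22, 32, 17,
          14, 24, 34, 19]

by_quarter = seasonal_means(series, period=4)
overall = sum(series) / len(series)

print(by_quarter)                         # [12.0, 22.0, 32.0, 17.0]
print([m - overall for m in by_quarter])  # [-8.75, 1.25, 11.25, -3.75]
```

The differences from the overall mean act as rough seasonal effects: the third quarter sits well above average and the first well below it in every year.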