#1
Which of the following measures is affected by extreme values?
Median
Mode
Range
All of the above
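Note: a minimal sketch for checking how each listed measure reacts to an extreme value. The two data sets and the helper name `summarize` are made up for illustration; only the Python standard library is assumed.

```python
from statistics import median, mode

def summarize(values):
    """Return (median, mode, range) for a list of numbers."""
    return median(values), mode(values), max(values) - min(values)

baseline = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 500]  # same data, largest value replaced by an extreme one

base_summary = summarize(baseline)
outlier_summary = summarize(with_outlier)

print("baseline     (median, mode, range):", base_summary)
print("with outlier (median, mode, range):", outlier_summary)
```

Comparing the two printed tuples shows which of the three measures moves when a single value becomes extreme.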
#2
What is the primary purpose of a scatter plot?
To compare two variables
To track changes over time
To display distribution of a single variable
To show how a variable is affected by categories
#3
Which of the following is a measure of central tendency?
Variance
Standard deviation
Mean
Range
#4
What does the interquartile range measure in a dataset?
Variability around the median
The range between the minimum and maximum values
The average value of the dataset
The spread of the middle 50% of the data
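Note: a minimal sketch of computing an interquartile range with the standard library. The sample data is hypothetical, and the "inclusive" quartile method is one convention among several; other conventions can give slightly different quartiles.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

iqr = q3 - q1                       # distance between the first and third quartiles
full_range = max(data) - min(data)  # distance between the minimum and maximum

print("Q1:", q1, "Q3:", q3, "IQR:", iqr, "range:", full_range)
```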
#5
Which graph is most effective for displaying the relationship between two categorical variables?
Histogram
Scatter plot
Bar chart
Line graph
#6
What is the null hypothesis in a statistical test?
There is a significant difference between groups
There is no significant difference between groups
The sample data does not represent the population
The observed data is due to chance
#7
What does a p-value less than 0.05 typically indicate?
The null hypothesis is true
There is no significant difference
There is a significant difference
The sample size is too large
#8
Which test would you use to compare means from two related samples?
Independent samples t-test
ANOVA
Paired samples t-test
Chi-square test
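Note: a minimal sketch of the test statistic for two related samples, computed from the per-pair differences. The before/after values and the function name `paired_t_statistic` are hypothetical; a library routine such as SciPy's `ttest_rel` would normally also report a p-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic for related samples: mean difference over its standard error."""
    if len(before) != len(after):
        raise ValueError("related samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(diffs) / (stdev(diffs) / sqrt(n))

before = [10, 12, 9, 11]   # hypothetical measurements on four subjects
after = [12, 15, 10, 14]   # the same four subjects measured again

t_stat = paired_t_statistic(before, after)
print("t =", round(t_stat, 3), "with", len(before) - 1, "degrees of freedom")
```

The key design point is that the two samples are reduced to one list of differences, so the pairing between observations is preserved.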
#9
What is the purpose of using a box plot in data analysis?
To show the distribution of data
To display the mean of the data
To plot the correlation between two variables
To represent categorical data
#10
In hypothesis testing, what is a Type I error?
Accepting the null hypothesis when it is false
Rejecting the null hypothesis when it is true
Failing to reject the null hypothesis when it is false
Both the first and third options
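Note: a minimal simulation sketch of how often a test rejects a null hypothesis that is actually true. It assumes a two-sided z-test with known standard deviation 1; the function name `false_positive_rate`, the trial count, the sample size, and the seed are all illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def false_positive_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Estimate how often a two-sided z-test rejects a null hypothesis that is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Draw from N(0, 1), so the null hypothesis "population mean is 0" is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) * sqrt(n)  # sigma is known to be 1, so SE = 1 / sqrt(n)
        if abs(z) > critical:
            rejections += 1  # the test rejected even though the null is true
    return rejections / trials

rate = false_positive_rate()
print("estimated rejection rate under a true null:", rate)
```

Under these assumptions the estimated rate should land close to the chosen `alpha`.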
#11
What statistical test is used to compare the means of more than two groups?
ANOVA
Independent samples t-test
Paired samples t-test
Chi-square test
#12
Which of the following correlation coefficients represents the strongest relationship between two variables?
#13
In a linear regression model, what does R-squared represent?
The proportion of the variance in the dependent variable that is predictable from the independent variable
The correlation between the dependent and independent variables
The average distance of the data points from the regression line
The slope of the regression line
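Note: a minimal sketch of fitting a one-predictor least-squares line and computing R-squared from the residual and total sums of squares. The data and the helper names `fit_line` and `r_squared` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(x, y):
    """R-squared = 1 - SS_res / SS_tot for the fitted line."""
    slope, intercept = fit_line(x, y)
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
r2 = r_squared(x, y)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3), "R^2:", round(r2, 3))
```

For this made-up data, SS_res is 2.4 and SS_tot is 6, so R-squared is 1 - 2.4/6 = 0.6.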
#14
Which of the following is not an assumption of linear regression?
Homoscedasticity
Normality of residuals
Independence of observations
All variables are categorical
#15
What does the term 'multicollinearity' refer to in multiple regression analysis?
A strong correlation between two or more predictor variables
A linear relationship between the predictor and outcome variables
Multiple regression models used in parallel
Collinearity within a single predictor variable
#16
In a regression analysis, what does the beta coefficient represent?
The intercept of the regression line
The slope of the regression line, indicating the change in the dependent variable for a one-unit change in the independent variable
The degree of spread in the data
The correlation between the dependent and independent variables
#17
What is the main purpose of principal component analysis (PCA)?
To classify the data into distinct groups
To reduce the dimensionality of the data while preserving as much variability as possible
To predict the outcome of a dependent variable based on independent variables
To test the hypothesis of no association between two variables
#18
In time series analysis, what does seasonality refer to?
The trend of the data over time
Cyclic patterns that repeat at irregular intervals
Fluctuations in the data that are dependent on seasonal factors
Random variations in the data