#1
What is the mean of the following data set: 2, 4, 6, 8, 10?
6
Explanation: The mean is the sum of the values divided by the number of values: (2 + 4 + 6 + 8 + 10) / 5 = 30 / 5 = 6.
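The calculation can be sketched in a few lines of Python. The function name `mean` is chosen here for illustration only.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


print(mean([2, 4, 6, 8, 10]))  # 6.0
```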
#2
What is the median of the following data set: 4, 7, 2, 9, 5?
5
Explanation: The median is the middle value of the sorted data set. Sorted, the values are 2, 4, 5, 7, 9, so the median is 5.
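A minimal Python sketch of the same procedure follows. The function name `median` is illustrative, and averaging the two middle values for an even count is the usual convention rather than something stated in the question.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([4, 7, 2, 9, 5]))  # 5
```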
#3
What does 'p-value' represent in hypothesis testing?
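The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true
Explanation: The p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. It is not the probability of rejecting a true null hypothesis; that is the significance level (α), the Type I error rate.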
#4
What is the formula to calculate standard deviation?
√(Σ(x - μ)² / n)
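Explanation: Standard deviation measures the amount of variation or dispersion in a set of values. The formula shown is the population standard deviation, where μ is the mean and n is the number of values; the sample standard deviation divides by n - 1 instead.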
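As a rough illustration, the population formula √(Σ(x - μ)² / n) can be written directly in Python. The function name and the sample data are illustrative only.

```python
import math


def population_std(values):
    """Population standard deviation: sqrt(sum((x - mu)^2) / n)."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of an empty data set is undefined")
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


# Squared deviations from the mean 6 are 16, 4, 0, 4, 16, so the variance is 40 / 5 = 8.
print(round(population_std([2, 4, 6, 8, 10]), 3))  # 2.828
```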
#5
What does a confidence interval represent in statistics?
The range of values within which the population parameter is likely to fall
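Explanation: A confidence interval gives a range of values expected to contain the true population parameter at a stated confidence level. For example, a 95% interval is built by a procedure that captures the true parameter in about 95% of repeated samples.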
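One common case can be sketched as follows: an interval for a mean, assuming the population standard deviation is known and the sampling distribution is normal. Those assumptions, the function name, and the numbers are illustrative and not part of the question.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval for a mean, assuming a known population standard deviation sigma."""
    # Critical value z* such that the central `confidence` mass lies within +/- z*.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 50, known sigma 10, n = 100.
low, high = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(round(low, 2), round(high, 2))  # 48.04 51.96
```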
#6
What is the purpose of a z-test in statistics?
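To test whether a sample mean differs from a known population mean (or whether two means differ) when the population variance is known or the sample is large
Explanation: A z-test assesses whether the difference between a sample mean and a population mean is statistically significant, using the standard normal distribution.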
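A one-sample z-test can be sketched as below, assuming the population standard deviation is known. The function name and the input numbers are hypothetical; the two-sided p-value uses the identity 2(1 - Φ(|z|)) = erfc(|z| / √2).

```python
import math


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: population mean equals mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical data: sample mean 52 from n = 100, tested against mu0 = 50 with sigma = 10.
z, p = one_sample_z_test(52.0, 50.0, 10.0, 100)
print(z, round(p, 4))  # 2.0 0.0455
```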
#7
What is skewness in statistics?
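A measure of the degree of asymmetry in a distribution
Explanation: Skewness measures the asymmetry of a probability distribution. A symmetric distribution has zero skewness; a positive value indicates a longer right tail and a negative value a longer left tail.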
#8
In a chi-squared test, if the calculated chi-squared value is greater than the critical chi-squared value, what does it indicate?
Null hypothesis is rejected
Explanation: A calculated chi-squared value greater than the critical value indicates a statistically significant difference at the chosen significance level, so the null hypothesis is rejected.
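The decision rule can be illustrated with a small goodness-of-fit sketch. The observed and expected counts are made up for the example; 3.841 is the standard critical value for 1 degree of freedom at α = 0.05.

```python
def chi_squared_statistic(observed, expected):
    """Chi-squared statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_DF1_ALPHA_05 = 3.841  # critical value for df = 1, alpha = 0.05

# Hypothetical counts for two categories, with equal counts expected.
stat = chi_squared_statistic([30, 20], [25, 25])
reject_null = stat > CRITICAL_DF1_ALPHA_05
print(stat, reject_null)  # 2.0 False
```

Here the statistic falls below the critical value, so the null hypothesis is not rejected.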
#9
What is the formula to calculate correlation coefficient (Pearson's r)?
Σ(x - μ)(y - ν) / √(Σ(x - μ)²Σ(y - ν)²)
Explanation: The correlation coefficient measures the strength and direction of a linear relationship between two variables. Here μ and ν are the means of x and y, and r ranges from -1 to 1.
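The formula translates almost term for term into Python. In this sketch `mu` and `nu` stand for the means of x and y; the function name and data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: sum((x - mu)(y - nu)) / sqrt(sum((x - mu)^2) * sum((y - nu)^2))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mu = sum(xs) / len(xs)
    nu = sum(ys) / len(ys)
    numerator = sum((x - mu) * (y - nu) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - mu) ** 2 for x in xs) * sum((y - nu) ** 2 for y in ys)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# A perfectly linear increasing relationship gives r close to 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```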
#10
What is the purpose of a one-way ANOVA test?
To compare means of more than two independent groups
Explanation: One-way ANOVA assesses whether there are statistically significant differences among the means of three or more independent groups.
#11
What is the null hypothesis in a t-test for independent samples?
There is no difference between the means of the two groups
Explanation: The null hypothesis in an independent-samples t-test states that the means of the two groups are equal.
#12
What is the purpose of a Mann-Whitney U test?
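To compare the distributions (often summarized as medians) of two independent groups
Explanation: The Mann-Whitney U test assesses whether two independent groups differ, often interpreted as a difference in medians. For more than two groups, the Kruskal-Wallis test is used instead.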
#13
What is the F-statistic used for in ANOVA?
To test for differences between group means
Explanation: The F-statistic in ANOVA is the ratio of between-group variance to within-group variance; a large value suggests that the group means differ significantly.
#14
What is the coefficient of determination (R-squared) in linear regression?
A measure of the proportion of the variance in the dependent variable that is predictable from the independent variable
Explanation: R-squared indicates the proportion of the variance in the dependent variable explained by the independent variable.
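For a simple linear regression, R-squared can be sketched as 1 - SS_res / SS_tot after a least-squares fit. The function name and the data points are illustrative only.

```python
def r_squared(xs, ys):
    """R^2 of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares (unexplained) and total sum of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Fitted line is y = 2.2 + 0.6x; SS_res = 2.4 and SS_tot = 6, so R^2 is about 0.6.
print(round(r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 3))  # 0.6
```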
#15
What is the purpose of a Box-Cox transformation?
To reduce skewness and make data more symmetric
Explanation: The Box-Cox transformation is used to stabilize variance and make the data more closely approximate a normal distribution. It requires strictly positive data.
#16
What is the difference between Type I and Type II errors in hypothesis testing?
Type I error is rejecting a true null hypothesis, while Type II error is failing to reject a false null hypothesis.
Explanation: A Type I error (false positive) occurs when we incorrectly reject a true null hypothesis, while a Type II error (false negative) occurs when we fail to reject a false null hypothesis.
#17
What is the difference between correlation and causation?
Correlation indicates a relationship between two variables, while causation indicates that one variable directly influences the other.
Explanation: Correlation implies a statistical association, while causation indicates a direct cause-and-effect relationship. Correlation alone does not establish causation.
#18
What is multicollinearity in regression analysis?
It occurs when the independent variables in a regression model are highly correlated.
Explanation: Multicollinearity exists when independent variables in a regression model are strongly correlated, making it difficult to separate their individual effects.
#19
What is the Akaike Information Criterion (AIC) used for in model selection?
To compare the goodness of fit of different statistical models while penalizing model complexity.
Explanation: AIC quantifies the trade-off between goodness of fit and model complexity; among candidate models fitted to the same data, the one with the lowest AIC is preferred.
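The standard definition is AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. A short sketch of the comparison follows; the log-likelihood values and parameter counts are hypothetical.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


# Hypothetical fits to the same data: the larger model fits slightly better
# but pays a penalty for its two extra parameters.
aic_small = aic(log_likelihood=-100.0, num_params=3)
aic_large = aic(log_likelihood=-99.5, num_params=5)
print(aic_small, aic_large)  # 206.0 209.0
print("preferred:", "small" if aic_small < aic_large else "large")  # preferred: small
```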
#20
What is the purpose of a Kaplan-Meier survival analysis?
To estimate the survival function from time-to-event data, including censored observations.
Explanation: Kaplan-Meier survival analysis estimates the survival function over time. The resulting curves for different groups can then be compared, typically with a log-rank test.
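The Kaplan-Meier estimate is the running product of (1 - d_i / n_i) over event times, where d_i is the number of events at time t_i and n_i the number still at risk. A minimal sketch, with an illustrative function name and made-up data:

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  observed durations for each subject.
    events: True if the event occurred at that time, False if censored.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        occurred = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1 - occurred / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: five subjects, two of them censored (at times 2 and 4).
# The estimate steps down to roughly 0.8, 0.6, and 0.3 at times 1, 2, and 3.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]):
    print(t, round(s, 3))
```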
#21
What is the difference between a parametric and non-parametric statistical test?
Parametric tests assume a specific population distribution (such as normality), while non-parametric tests do not make such assumptions.
Explanation: Parametric tests assume that the data follow a distribution of a particular form, while non-parametric tests make fewer assumptions about the underlying population distribution.
#22
What is the purpose of a receiver operating characteristic (ROC) curve?
To evaluate and compare the performance of classification models across decision thresholds.
Explanation: An ROC curve evaluates a classification model by plotting the true positive rate against the false positive rate as the decision threshold varies.
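Each point on an ROC curve comes from one decision threshold. The sketch below computes a single (false positive rate, true positive rate) pair; the function name, labels, and scores are hypothetical.

```python
def roc_point(labels, scores, threshold):
    """Return (false positive rate, true positive rate) at one threshold.

    labels: 1 for positive, 0 for negative.
    scores: model scores; a score >= threshold is predicted positive.
    """
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    return fp / (fp + tn), tp / (tp + fn)


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
# Sweeping the threshold traces the curve from (0, 0) towards (1, 1).
for threshold in (1.0, 0.5, 0.0):
    print(threshold, roc_point(labels, scores, threshold))
```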
#23
What is the difference between a one-tailed and two-tailed hypothesis test?
A one-tailed test examines whether the sample mean is greater than (or, alternatively, less than) a specified value, while a two-tailed test examines whether the sample mean differs from a specified value in either direction.
Explanation: One-tailed tests focus on a single direction (greater than or less than), while two-tailed tests assess whether the sample mean is significantly different from a specified value in either direction.
#24
What is the purpose of a log-rank test?
To compare survival curves between two or more groups.
Explanation: The log-rank test is a non-parametric test used to compare survival curves between different groups.
#25
What is the difference between a population parameter and a sample statistic?
A population parameter describes the entire population, while a sample statistic describes a subset of the population.
Explanation: Population parameters characterize the entire population, while sample statistics describe characteristics of a subset of the population.