What would be a "reasonable" minimal number of observations to look for a trend over time with a linear regression? What about fitting a quadratic model? I work with composite indices of inequality in health (SII, RII), and have only 4 waves of the survey, so 4 points (1997, 2001, 2004, 2008).
In the probability model underlying linear regression, X and Y are random variables. If so, as an example, if Y = obesity and X = age, and we take the conditional expectation E(Y | X = 35), meaning the expected value of being obese for an individual aged 35 in the sample, would we just take the average (arithmetic mean) of Y for those observations where X = 35? That's right. In general, you ...
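A minimal sketch of that sample estimate, assuming a hypothetical toy sample of (age, obese) pairs where obese is coded 1 or 0. The data and the helper name `conditional_mean` are invented for illustration only; the point is that the estimate of E(Y | X = 35) is simply the mean of Y over the rows with X = 35.

```python
from statistics import mean

# Hypothetical toy sample (not real data): (age, obese) pairs, obese coded 1/0.
sample = [(35, 1), (35, 0), (35, 0), (35, 1), (40, 1), (40, 1), (28, 0)]


def conditional_mean(pairs, x_value):
    """Sample estimate of E(Y | X = x_value): mean of y over rows with x == x_value."""
    ys = [y for x, y in pairs if x == x_value]
    if not ys:
        raise ValueError(f"no observations with X = {x_value}")
    return mean(ys)


estimate_35 = conditional_mean(sample, 35)
print(estimate_35)  # share of the 35-year-olds in the toy sample who are obese
```

With a continuous X there may be few or no observations at exactly X = 35, which is one reason a regression model is used to pool information across nearby values.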
For your first question, I don't think that a linear regression model assumes that your dependent and independent variables have to be normal. However, there is an assumption about the normality of the residuals. For your second question, there are two different things you could consider: Check different kinds of models.
With linear regression with no constraints, $R^2$ must be positive (or zero) and equals the square of the correlation coefficient, $r$. A negative $R^2$ is only possible with linear regression when either the intercept or the slope is constrained so that the "best-fit" line (given the constraint) fits worse than a horizontal line.
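The two cases can be checked numerically. The sketch below uses made-up numbers and hypothetical helper names, and it assumes $R^2$ is defined as 1 - SS_res / SS_tot with SS_tot measured around the mean of y (that is, relative to a horizontal line). Some software reports a different, uncentered $R^2$ for models without an intercept, so this is one convention, not the only one.

```python
def fit_free(x, y):
    """Ordinary least squares with both intercept and slope unconstrained."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_through_origin(x, y):
    """Least squares with the intercept constrained to zero."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return 0.0, slope


def r_squared(x, y, intercept, slope):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of y."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def pearson_r(x, y):
    """Sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up data: y hovers near 10, so a line forced through (0, 0) fits badly.
x = [1, 2, 3, 4, 5]
y = [10.0, 9.5, 9.9, 10.4, 10.1]

r = pearson_r(x, y)
r2_free = r_squared(x, y, *fit_free(x, y))
r2_origin = r_squared(x, y, *fit_through_origin(x, y))

print(r2_free, r ** 2)  # these agree up to floating-point rounding
print(r2_origin)        # negative: worse than the horizontal line at mean(y)
```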
Oftentimes a statistical analyst is handed a dataset and asked to fit a model using a technique such as linear regression. Very frequently the dataset is accompanied by a disclaimer similar...
I'm not used to using variables in the date format in R. I'm just wondering if it is possible to add a date variable as an explanatory variable in a linear regression model. If it's possible, how c...
Taking logarithms allows these models to be estimated by linear regression. Good examples of this include the Cobb-Douglas production function in economics and the Mincer Equation in education.
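As a rough illustration of the log-linearization, the sketch below assumes a Cobb-Douglas function with constant returns to scale, Y = A * K^alpha * L^(1 - alpha), so that log(Y/L) = log(A) + alpha * log(K/L) is a straight line in log(K/L). The constant-returns restriction, the parameter values, and the noise-free synthetic data are assumptions chosen to keep the example to a single regressor; they are not taken from the text above.

```python
import math

# Assumed "true" parameters used only to generate synthetic, noise-free data.
A_TRUE, ALPHA_TRUE = 2.0, 0.3

capital = [10.0, 20.0, 40.0, 80.0, 50.0]
labour = [5.0, 5.0, 8.0, 10.0, 25.0]
output = [
    A_TRUE * k ** ALPHA_TRUE * l ** (1 - ALPHA_TRUE)
    for k, l in zip(capital, labour)
]

# Taking logs turns the multiplicative model into a linear one:
#   log(Y/L) = log(A) + alpha * log(K/L)
log_k_per_l = [math.log(k / l) for k, l in zip(capital, labour)]
log_y_per_l = [math.log(y / l) for y, l in zip(output, labour)]


def simple_ols(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


intercept, alpha_hat = simple_ols(log_k_per_l, log_y_per_l)
a_hat = math.exp(intercept)  # back-transform the intercept to recover A

print(alpha_hat, a_hat)
```

Because the synthetic data contain no noise, the fit recovers the assumed parameters up to rounding. With real data, the error term is assumed multiplicative on the original scale, and back-transformed predictions need care.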
The assignment is to calculate the linear regression analysis/regression equation for a data set containing years and the percentage of unemployment in the population at that time.
Therefore, the second and third plots, which seem to indicate dependency between the residuals and the fitted values, suggest a different model. But why does the second plot suggest, as Faraway notes, a heteroscedastic linear model, while the third plot suggests a non-linear model?