Multicollinearity-Econometrics

 

 Multicollinearity and its detection

 

Multicollinearity is a state of very high intercorrelation among the independent variables in a regression model: it occurs when two or more independent variables are highly correlated with one another. It is therefore a type of disturbance in the data, and when it is present, the statistical inferences drawn from the model may not be reliable.

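Y = c + X1 + X2 + X3 + X4 + X5 + X6

Crop production = c + amount of water + rainfall + pesticides + soil quality

Crop production = c + rainfall + pesticides + soil quality

In the first crop equation, amount of water and rainfall are likely to move together, so the second equation drops amount of water.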

 

There are certain reasons why multicollinearity occurs:

  • It is caused by an inaccurate use of dummy variables.
  • It is caused by the inclusion of a variable which is computed from other variables in the data set.
  • Multicollinearity can also result from the repetition of the same kind of variable.
  • It generally occurs when the variables are highly correlated with each other.
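The dummy-variable cause listed above can be illustrated with a minimal sketch. It assumes a hypothetical three-level categorical variable (called `season` here purely for illustration) and uses NumPy to show that an intercept plus a dummy for every level gives a design matrix whose columns are perfectly collinear, while dropping one level as the reference category removes the problem.

```python
import numpy as np

# Hypothetical categorical variable with three levels (0, 1, 2), six observations.
season = np.array([0, 1, 2, 0, 1, 2])

# One-hot encode: one 0/1 column per level.
dummies = np.eye(3)[season]              # shape (6, 3)
intercept = np.ones((6, 1))

# Intercept plus ALL three dummies: the dummy columns sum to the intercept column.
X_trap = np.hstack([intercept, dummies])
# Intercept plus two dummies: level 0 is dropped and acts as the reference category.
X_ok = np.hstack([intercept, dummies[:, 1:]])

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)

# A rank below the number of columns means perfect multicollinearity.
print("all dummies:    rank", rank_trap, "of", X_trap.shape[1], "columns")
print("one level dropped: rank", rank_ok, "of", X_ok.shape[1], "columns")
```

The same rank check would flag a variable that is computed exactly from other variables in the data set, since it is also a linear combination of existing columns when the computation is linear.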

 

 

When to fix Multicollinearity

The good news is that it is not always mandatory to fix multicollinearity. It all depends on the primary goal of the regression model.

 

The degree of multicollinearity strongly affects the coefficient estimates and their p-values, but not the predictions or the goodness-of-fit statistics. If your goal is prediction and you do not need to interpret the significance of the individual independent variables, fixing multicollinearity is not mandatory.
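One way to see why the p-values suffer is the standard expression for the sampling variance of an ordinary least squares coefficient in a model with an intercept. The notation is an assumption introduced here, not part of the original text: σ² is the error variance, x_ij is the i-th observation of the j-th independent variable, and R_j² is the R² obtained by regressing the j-th independent variable on all the other independent variables.

```latex
\operatorname{Var}(\hat{\beta}_j)
  = \frac{\sigma^2}{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}
    \cdot \frac{1}{1 - R_j^2}
```

As R_j² approaches 1, the second factor grows without bound, so the standard error of the coefficient inflates and its p-value becomes unreliable, even though the fitted values of the model as a whole are largely unaffected.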

a) Find the correlation between each pair of independent variables.

b) If the correlation between any two independent variables is 0.9 (90%) or more, drop one of the two from the regression equation.
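Steps a) and b) can be sketched in a few lines of NumPy. The data values and the helper name `drop_highly_correlated` are made up for illustration, and keeping the first variable of each highly correlated pair is just one possible convention, not a rule from the text.

```python
import numpy as np

def drop_highly_correlated(X, names, threshold=0.9):
    """Keep a column only if its absolute Pearson correlation with every
    column already kept is below `threshold`; return the kept names."""
    corr = np.abs(np.corrcoef(X, rowvar=False))   # columns are variables
    kept = []                                     # indices of retained columns
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Hypothetical crop data (illustrative numbers only).
rainfall = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
amount_of_water = np.array([12.0, 19.0, 33.0, 41.0, 48.0, 62.0])  # tracks rainfall closely
pesticides = np.array([5.0, 3.0, 6.0, 2.0, 7.0, 4.0])
soil_quality = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])

names = ["rainfall", "amount_of_water", "pesticides", "soil_quality"]
X = np.column_stack([rainfall, amount_of_water, pesticides, soil_quality])

kept = drop_highly_correlated(X, names)
print(kept)
```

Because rainfall comes first in the column order, amount of water is the variable that gets dropped, which mirrors the crop production equations earlier in the post. Which variable of a pair to drop is ultimately a subject-matter decision.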

Detection of Multicollinearity

The researcher might get a mix of significant and insignificant results, which can signal the presence of multicollinearity. Suppose the researcher, after dividing the sample into two parts, finds that the coefficients estimated from the two parts differ drastically. This indicates that the coefficients are unstable due to multicollinearity. Similarly, if the model changes drastically when a variable is simply added or dropped, this also indicates that multicollinearity is present in the data.

Multicollinearity can also be detected with the help of tolerance and its reciprocal, the variance inflation factor (VIF). If the tolerance is below 0.1 (equivalently, the VIF is above 10), the multicollinearity is problematic; a stricter rule of thumb flags a tolerance below 0.2 (a VIF above 5).
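For reference, the usual definitions can be written as follows. The symbol R_j² is notation assumed here: it denotes the R² from regressing the j-th independent variable on all the other independent variables.

```latex
\text{Tolerance}_j = 1 - R_j^2,
\qquad
\mathrm{VIF}_j = \frac{1}{\text{Tolerance}_j} = \frac{1}{1 - R_j^2}
```

Since the two are reciprocals, a tolerance of 0.1 corresponds to a VIF of 10 and a tolerance of 0.2 to a VIF of 5, so either measure can be used to state the same cutoff.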

 

Variance Inflation Factor (VIF)

A correlation plot identifies the bivariate relationship between two independent variables, whereas VIF measures how strongly one independent variable is correlated with the group of all the other independent variables. Hence, VIF is preferred for a fuller picture.

VIF = 1 → No correlation

VIF = 1 to 5 → Moderate correlation

VIF >10 → High correlation

 

Following is an example of detecting multicollinearity:
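As a minimal sketch of such a check, the code below computes the VIF of each independent variable with NumPy by regressing it on the others. The data are synthetic and generated only for illustration (amount of water is constructed to follow rainfall closely), and the function name `vif` is a hypothetical helper, not a library call.

```python
import numpy as np

def vif(X):
    """VIF of each column of X (rows are observations, columns are variables)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Auxiliary regression of variable j on the other variables, with intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
n = 200
rainfall = rng.normal(100.0, 20.0, n)
amount_of_water = 0.8 * rainfall + rng.normal(0.0, 3.0, n)   # nearly a function of rainfall
pesticides = rng.normal(5.0, 1.0, n)                          # unrelated to the other two

X = np.column_stack([rainfall, amount_of_water, pesticides])
vifs = vif(X)
for name, value in zip(["rainfall", "amount_of_water", "pesticides"], vifs):
    print(f"{name}: VIF = {value:.2f}")
```

With data built this way, rainfall and amount of water should both show a VIF well above 10, while pesticides should stay close to 1. In practice a statistics package would normally be used for this calculation instead of a hand-written loop.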




 

 


Thank you, readers. If you have any doubts, comment or mail at anchusharma3030@gmail.com

 

 

 
