What Does Variance Inflation Factor Mean?

The variance inflation factor (VIF) measures how much multicollinearity there is among the independent variables of a multiple regression model. Mathematically, the VIF for a variable is the ratio of the variance of its coefficient in the full model to the variance that coefficient would have in a model that includes only that single independent variable. This ratio is computed for each independent variable. A high VIF shows that the associated independent variable is highly collinear with the model’s other variables.
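One common way to compute this ratio in practice is to regress each predictor on all the others and take 1 / (1 - R²) from that auxiliary regression. The sketch below assumes NumPy is available; the function name and the simulated data are made up for illustration and are not part of any particular library.

```python
import numpy as np

def vif_by_auxiliary_regression(X):
    """Return one VIF per column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress predictor j on all the other predictors plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs

# Simulated data (an assumption for this example):
# x3 is nearly a copy of x1, while x2 is unrelated to both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)

vifs = vif_by_auxiliary_regression(np.column_stack([x1, x2, x3]))
print(vifs)  # x1 and x3 get large VIFs, x2 stays close to 1
```

Note that the response variable never appears: the VIF depends only on how the predictors relate to one another.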

How should you interpret the variance inflation factor?

VIFs start at 1 and have no upper limit. The numerical value of the VIF tells you, in decimal form, how much the variance (i.e. the standard error squared) of each coefficient is inflated. A VIF of 1.9, for example, indicates that the variance of a specific coefficient is 90% higher than what you’d expect if there were no multicollinearity, that is, if the variable were uncorrelated with the other variables.
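The arithmetic behind that reading is small enough to sketch. Because the VIF applies to the variance, the standard error itself grows by the square root of the VIF. The helper names below are made up for this example.

```python
import math

def variance_inflation_percent(vif):
    """Percentage by which the coefficient variance is inflated."""
    return (vif - 1.0) * 100.0

def std_error_multiplier(vif):
    """Factor by which the standard error grows (square root of the VIF)."""
    return math.sqrt(vif)

print(round(variance_inflation_percent(1.9), 1))  # variance inflation in percent
print(round(std_error_multiplier(1.9), 3))        # standard error multiplier
```

So a VIF of 1.9 means 90% more variance, but only about 38% more standard error.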

The exact size of a VIF that causes problems is a point of contention. What is known is that as your VIF increases, your regression results become less dependable. In general, a VIF greater than 10 indicates high correlation and is cause for concern. Some authors recommend a more conservative threshold of 2.5.

A high VIF is not always a cause for concern. For example, if your regression includes products or powers of other variables, such as x and x², you can get a high VIF. Large VIFs are also usually not a problem for dummy variables representing nominal variables with three or more categories.
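To see why a power term produces a high VIF, here is a small sketch assuming NumPy and a made-up predictor that takes only positive values. With just two predictors, the VIF of each reduces to 1 / (1 - r²), where r is their Pearson correlation.

```python
import numpy as np

# Hypothetical predictor with positive values only, and its square.
x = np.linspace(1.0, 10.0, 50)
x_squared = x ** 2

# With two predictors, both share the same VIF: 1 / (1 - r^2).
r = np.corrcoef(x, x_squared)[0, 1]
vif = 1.0 / (1.0 - r ** 2)
print(f"correlation = {r:.3f}, VIF = {vif:.1f}")
```

The high value here comes from the way the model was built, not from two separately measured variables carrying redundant information.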

What is an appropriate variance inflation factor?

Most research studies use VIF > 10 as the criterion for multicollinearity; however, some use a lower threshold of 5 or even 2.5.
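As an illustration of how much the choice of cutoff matters, the sketch below applies the three thresholds mentioned above to the same set of VIFs. The predictor names, the VIF values, and the labels attached to each cutoff are all invented for this example.

```python
# Cutoffs mentioned in the text; the labels are informal, not standard terms.
THRESHOLDS = {"lenient": 10.0, "moderate": 5.0, "conservative": 2.5}

def flag_high_vifs(vifs, threshold):
    """Return the names of predictors whose VIF exceeds the threshold."""
    return [name for name, value in vifs.items() if value > threshold]

# Hypothetical VIFs for four predictors.
example_vifs = {"age": 1.3, "income": 3.1, "spending": 7.8, "debt": 12.4}

for label, cutoff in THRESHOLDS.items():
    print(label, cutoff, flag_high_vifs(example_vifs, cutoff))
```

The same model goes from one flagged predictor to three depending on which convention you follow, so it is worth stating the threshold you used.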

When deciding on a VIF threshold, keep in mind that multicollinearity is less of an issue with a big sample size than it is with a small one.

In summary, the VIF thresholds recommended for detecting collinearity in a multivariable (linear or logistic) model range from 2.5 (most conservative) through 5 to 10.

What does a VIF of one indicate?

A VIF of 1 indicates that the jth predictor is uncorrelated with the remaining predictor variables, and hence the variance of its coefficient, b_j, is not inflated at all.
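A quick way to check this is to use the fact that the VIFs are the diagonal entries of the inverse of the predictors’ correlation matrix. The sketch below assumes NumPy; the function name and both small data sets are made up for illustration.

```python
import numpy as np

def vif_from_correlation(X):
    """VIFs as the diagonal of the inverse correlation matrix of X's columns."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Two predictors laid out so that they are exactly uncorrelated.
orthogonal = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
# Two predictors that move together.
correlated = np.array([[1, 1.2], [2, 1.9], [3, 3.2], [4, 3.9]])

vif_orthogonal = vif_from_correlation(orthogonal)
vif_correlated = vif_from_correlation(correlated)
print(vif_orthogonal)  # both VIFs equal 1
print(vif_correlated)  # both VIFs are well above 1
```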

What if VIF is too high?

The higher the VIF, the stronger the linear relationship between one variable and the others. If the VIF is greater than 10, it is usually assumed that the independent variables are highly correlated.

What can I do about a high VIF?

If multicollinearity is an issue in your model (if a factor’s VIF is at or above 5, for example), the solution may be straightforward. Consider one of the following:

  • Remove highly correlated predictors from the model. If two or more factors have a high VIF, remove one of them. Because they provide redundant information, removing one of the correlated factors seldom reduces the R-squared significantly. To choose which variables to remove, consider using stepwise regression, best subsets regression, or specialized knowledge of the data set, and select the model with the highest R-squared value.
  • Use Partial Least Squares Regression (PLS) or Principal Components Analysis, regression methods that reduce the predictors to a smaller set of uncorrelated components.
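The first option can be sketched as a simple loop: compute the VIFs, drop the predictor with the largest one, and repeat until every VIF is below the cutoff. This is only one possible rule, not the stepwise or best subsets procedures named above, and it is no substitute for knowledge of the data. The function names, the cutoff of 5, and the simulated data are assumptions for this example, which uses NumPy.

```python
import numpy as np

def vifs_from_correlation(X):
    """VIFs as the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def drop_high_vif(X, names, threshold=5.0):
    """Drop the highest-VIF column until all VIFs fall below the threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        vifs = vifs_from_correlation(X)
        worst = int(np.argmax(vifs))
        if vifs[worst] < threshold:
            break
        # Remove the most collinear predictor and re-check the rest.
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names

# Simulated data: x3 is almost exactly x1 + x2, so it is redundant.
rng = np.random.default_rng(42)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.05 * rng.normal(size=200)

kept, kept_names = drop_high_vif(np.column_stack([x1, x2, x3]), ["x1", "x2", "x3"])
print(kept_names, vifs_from_correlation(kept))
```

Recomputing the VIFs after each removal matters, because dropping one predictor changes the VIFs of all the others.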

In Minitab Statistical Software, the tools in the Stat > Regression menu make it simple to test several regression models quickly and identify the best one.

What is the purpose of VIF?

In an ordinary least squares (OLS) regression analysis, the variance inflation factor (VIF) is used to detect the degree of multicollinearity. Multicollinearity inflates the variance of the coefficient estimates and the type II error rate. The coefficient estimates remain consistent, but they become unreliable.
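The inflation of the coefficient variance can be checked directly. Under the usual OLS assumptions, the variance of a coefficient is the error variance times a diagonal entry of the inverse of X'X for the centered predictors. The sketch below, which assumes NumPy, simulated data, and the same error variance in both models, compares that entry with the value it would have if the predictor were alone in the model.

```python
import numpy as np

# Simulated predictors with a correlation of roughly 0.9 (an assumption).
rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=n)

# Center the predictors, which is equivalent to fitting an intercept.
Xc = np.column_stack([x1, x2])
Xc = Xc - Xc.mean(axis=0)

# Var(b_j) / sigma^2 with both predictors in the model.
var_full = np.diag(np.linalg.inv(Xc.T @ Xc))
# Var(b_j) / sigma^2 if predictor j were the only one in the model.
var_alone = 1.0 / np.sum(Xc ** 2, axis=0)

inflation = var_full / var_alone
r = np.corrcoef(x1, x2)[0, 1]
print(inflation, 1.0 / (1.0 - r ** 2))  # the ratio matches 1 / (1 - r^2)
```

A larger coefficient variance means wider confidence intervals and less power to detect a real effect, which is where the higher type II error rate comes from.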

What does a low VIF score mean?

Small VIF values suggest low correlation among the variables. VIF is the reciprocal of the tolerance value. A VIF below ten, though, is generally considered acceptable.
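Since tolerance is just 1 / VIF, and the VIF itself is 1 / (1 - R²) for the regression of one predictor on the others, you can convert between the three quantities. The helper names below are made up for this short sketch.

```python
def tolerance_from_vif(vif):
    """Tolerance is the reciprocal of the VIF."""
    return 1.0 / vif

def r_squared_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    return 1.0 - 1.0 / vif

for vif in (1.0, 2.5, 5.0, 10.0):
    print(vif, round(tolerance_from_vif(vif), 2), round(r_squared_from_vif(vif), 2))
```

A VIF of 10 therefore corresponds to a tolerance of 0.1, meaning 90% of that predictor’s variance is explained by the other predictors.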

Is VIF suitable for logistic regression?

There is no formal cutoff value to use with VIF to determine the presence of multicollinearity, just as there is no formal cutoff value to use with tolerance. VIF values greater than 10 are often taken to indicate multicollinearity, although in weaker models, such as logistic regression, values above 2.5 may be cause for concern.

Is a VIF of 1 a good number?

The lower the VIF, the better. The minimum possible VIF is 1, which means that there is no collinearity at all. VIFs of 1 to 5 indicate that the association is not severe enough to justify corrective action.