Interaction terms in linear REGRESSION

Technote (troubleshooting)

Problem (Abstract)

How can I include interaction terms in a multiple regression analysis with the REGRESSION procedure? Can I ask for the predictors to be centered?

Resolving the problem

In the REGRESSION procedure, the interaction between two predictors must be represented as a variable to be included in the list of predictors. This variable can be created with the COMPUTE command. A common interaction term is a simple product of the predictors in question. For example, a product interaction between VARX and VARY can be computed and called INTXY with the following command.

COMPUTE INTXY = VARX * VARY.

The new predictors are then included in a REGRESSION procedure. In these examples, the dependent variable is called RESPONSE.

REGRESSION
  /VARIABLES = RESPONSE VARX VARY INTXY
  /STATISTICS = ALL
  /DEPENDENT = RESPONSE
  /METHOD = ENTER .

Suppose you have a five-level categorical variable which is represented in your data set as a set of indicator (or dummy) variables called D1 to D4. An interaction between a continuous variable, called VARX, and the categorical variable would be represented as the set of products of VARX with each of D1, D2, D3, and D4. Each of these products would be created with a separate COMPUTE statement, as follows.

COMPUTE INTXD1 = VARX * D1.
COMPUTE INTXD2 = VARX * D2.
COMPUTE INTXD3 = VARX * D3.
COMPUTE INTXD4 = VARX * D4.
REGRESSION
  /VARIABLES = RESPONSE VARX D1 TO D4 INTXD1 TO INTXD4
  /STATISTICS = ALL
  /DEPENDENT = RESPONSE
  /METHOD = ENTER .

Centering predictors around their mean (so that the mean of the new predictor is 0) is one way of reducing the multicollinearity problems that may arise as a result of including predictors plus their product terms in a regression. The individual predictor variables would be centered before their product term was computed. The interaction term would be the product of the centered predictors.
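Whether the product term is in fact collinear with its components in a given data set can be checked before deciding to center. The following is a minimal sketch, assuming the variables VARX, VARY, and INTXY from the first example already exist in the active file. It requests the pairwise correlations among the predictors and then reruns the regression with the /STATISTICS list narrowed to the default statistics plus collinearity diagnostics (COLLIN) and tolerance values (TOL). The choice of keywords here is an illustration, not part of the original technote.

```spss
* Sketch only: assumes INTXY was created with COMPUTE INTXY = VARX * VARY.
* Pairwise correlations among the predictors and their product.
CORRELATIONS
  /VARIABLES = VARX VARY INTXY.

* Same model as in the first example, with collinearity output requested by name.
REGRESSION
  /VARIABLES = RESPONSE VARX VARY INTXY
  /STATISTICS = COEFF R ANOVA COLLIN TOL
  /DEPENDENT = RESPONSE
  /METHOD = ENTER.
```

High correlations between INTXY and VARX or VARY, or low tolerance values for those terms, would suggest that centering is worth trying. Running the same two commands on the centered variables then gives a direct before-and-after comparison.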
The AGGREGATE procedure can be used to save the means of the predictors as new variables in the active file. These mean variables can then be used in COMPUTE commands to create the centered variables. In the example below, AGGREGATE saves the means of varx and vary as the new variables x_mean and y_mean, respectively. The centered variables CX and CY are then created and multiplied to form the interaction term, IXY.

The variable BRK is created to serve as the BREAK variable for the AGGREGATE run. BRK is computed as a constant across all cases in the file, so the means are computed across all cases. If you wanted a variable to be centered within levels of a group variable, you would use that group variable as the BREAK variable in AGGREGATE instead.

compute brk = 1.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=brk
  /x_mean y_mean = MEAN(varx vary) .
COMPUTE CX = varx - x_mean .
COMPUTE CY = vary - y_mean .
COMPUTE IXY = CX * CY.
REGRESSION
  /VARIABLES = RESPONSE CX CY IXY
  /STATISTICS = ALL
  /DEPENDENT = RESPONSE
  /METHOD = ENTER.

Note that an alternative approach for entering interactions of predictors into a regression model is to use the UNIANOVA command (Analyze->General Linear Model->Univariate). Continuous predictors are entered into the model as covariates. Interactions between covariates (or between covariates and group factors, or between group factors) can be specified in the Model dialog box (the /DESIGN subcommand) without the need to create interaction variables. UNIANOVA will not center the predictors for you.

In the UNIANOVA command below, the centered predictors CX and CY are entered as predictors for RESPONSE, and the interaction between CX and CY is added to the model in the /DESIGN subcommand. (Note that the parameter estimates must be requested in UNIANOVA, using /PRINT PARAMETER, whereas they are part of the default output in REGRESSION.)
UNIANOVA response WITH CX CY
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER DESCRIPTIVE
  /CRITERIA=ALPHA(.05)
  /DESIGN=CX CY CX*CY .
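
The earlier remark about centering within levels of a group variable can be sketched in the same way as the grand-mean example. The variable GROUP and the new names x_gmean and CXG are hypothetical and are used here only for illustration. GROUP takes the place of the constant BRK as the BREAK variable, so AGGREGATE attaches each case's own group mean rather than the overall mean.

```spss
* Sketch only: GROUP, x_gmean and CXG are hypothetical names.
* Save the mean of varx within each level of GROUP as x_gmean.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=GROUP
  /x_gmean = MEAN(varx).

* Center varx around the mean of its own group.
COMPUTE CXG = varx - x_gmean.

* Check: the mean of CXG should be 0, up to rounding, within every group.
MEANS TABLES = CXG BY GROUP
  /CELLS = MEAN COUNT.
```

A product term built from group-centered predictors would then be computed from CXG in the same way that IXY is computed from CX and CY.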