Module 11 Assignment

12.1: Set up an additive model for the ashina data, found in the ISwR package. These data contain additive effects of subject, period, and treatment. Compare the results with those obtained from a t test.

> # 12.1
> library(ISwR)
> data(ashina)
> str(ashina)
'data.frame':	16 obs. of  3 variables:
 $ vas.active: int  -167 -127 -58 -103 -35 -164 -3 25 -61 -45 ...
 $ vas.plac  : int  -102 -39 32 28 16 -42 -27 -30 -47 8 ...
 $ grp       : int  1 1 1 1 1 1 1 1 1 1 ...
> # Add subject as a factor
> ashina$subject <- factor(1:16)
> # Create two data frames: active and placebo
> attach(ashina)
> act <- data.frame(vas = vas.active, subject, treat = 1, period = grp)
> plac <- data.frame(vas = vas.plac, subject, treat = 0, period = grp)
> combined <- rbind(act, plac)
> combined$treat <- factor(combined$treat)
> combined$period <- factor(combined$period)
> # Fit the additive model
> model <- lm(vas ~ subject + period + treat, data = combined)
> summary(model)

Call:
lm(formula = vas ~ subject + period + treat, data = combined)

Residuals:
   Min     1Q Median     3Q    Max 
-48.94 -18.44   0.00  18.44  48.94 

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -113.06      27.39  -4.128 0.000895 ***
subject2       51.50      37.58   1.370 0.190721    
subject3      121.50      37.58   3.233 0.005573 ** 
subject4       97.00      37.58   2.581 0.020867 *  
subject5      125.00      37.58   3.326 0.004604 ** 
subject6       31.50      37.58   0.838 0.415070    
subject7      119.50      37.58   3.180 0.006215 ** 
subject8      132.00      37.58   3.513 0.003142 ** 
subject9       80.50      37.58   2.142 0.049003 *  
subject10     116.00      37.58   3.087 0.007518 ** 
subject11     121.50      37.58   3.233 0.005573 ** 
subject12     154.50      37.58   4.111 0.000925 ***
subject13     131.00      37.58   3.486 0.003318 ** 
subject14     125.00      37.58   3.326 0.004604 ** 
subject15      99.00      37.58   2.634 0.018768 *  
subject16      80.50      37.58   2.142 0.049003 *  
period2           NA         NA      NA       NA    
treat1        -42.87      13.29  -3.227 0.005644 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 37.58 on 15 degrees of freedom
Multiple R-squared:  0.7566,	Adjusted R-squared:  0.4969 
F-statistic: 2.914 on 16 and 15 DF,  p-value: 0.02229

> # Compare with paired t-test
> t.test(vas.active, vas.plac, paired = TRUE)

	Paired t-test

data:  vas.active and vas.plac
t = -3.2269, df = 15, p-value = 0.005644
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 -71.1946 -14.5554
sample estimates:
mean difference 
        -42.875

In this section, I used the ashina data set from the ISwR package to build an additive model examining the effects of treatment, subject, and period. I stacked the active and placebo measurements into one data frame and fit lm(vas ~ subject + period + treat). The results showed that the active treatment significantly reduced pain (estimate = -42.87, p = 0.005644). Many of the subject effects were also significant, which makes sense since pain responses vary across individuals. The period2 coefficient is NA because period (grp) is constant within each subject, so once the subject factor is in the model the period effect is completely aliased with it. A paired t-test produced identical results (mean difference = -42.875, p = 0.005644), confirming the treatment effect; the additive model reaches the same conclusion while making the subject and period structure explicit.
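The match between the lm coefficient and the paired t-test is not a coincidence: in a balanced paired design, the treatment coefficient from the additive model equals the mean of the within-subject differences. A minimal sketch with simulated data (the seed, subject count, and effect size of -3 are my own choices, not from the assignment):

```r
set.seed(1)  # arbitrary seed, purely for reproducibility
subject <- factor(rep(1:8, times = 2))       # each subject measured twice
treat   <- factor(rep(c(0, 1), each = 8))    # placebo then active
y <- rnorm(16) - 3 * (treat == "1")          # treatment lowers the response

fit <- lm(y ~ subject + treat)
d   <- y[treat == "1"] - y[treat == "0"]     # within-subject differences

# The treat1 estimate equals the mean paired difference exactly
stopifnot(all.equal(unname(coef(fit)["treat1"]), mean(d)))
```

The t statistic and p-value for treat1 likewise reproduce those of t.test(..., paired = TRUE), which is why the two approaches agree above.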

12.3: Consider the following definitions
Note:
rnorm() is a built-in R function that generates a vector of normally distributed random numbers. It takes the sample size as its first argument and returns that many random draws.

Your assignment:
Generate the model matrices for models z~a*b, z~a:b, etc. In your blog posting, discuss the implications. Carry out the model fits and notice which models contain singularities.
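As a quick illustration of the rnorm() note above (the seed is my own addition so the draws are reproducible; the transcript below does not set one):

```r
set.seed(2023)                    # my own choice, purely for reproducibility
z  <- rnorm(8)                    # eight draws from the standard normal N(0, 1)
z2 <- rnorm(8, mean = 5, sd = 2)  # mean and sd are optional, defaulting to 0 and 1
stopifnot(length(z) == 8, length(z2) == 8)
```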

> # 12.3
> # Define the g1 function and generate a , b
> g1 <- function(from, to, length.out)
+   factor(rep(from:to, length.out = length.out))
> a <- g1(2, 3, 8)
> b <- g1(2, 4, 8)
> x <- 1:8
> y <- c(1:4, 8:5)
> z <- rnorm(8)
> # Generate model matrices
> model.matrix(~ a * b)
  (Intercept) a3 b3 b4 a3:b3 a3:b4
1           1  0  0  0     0     0
2           1  1  1  0     1     0
3           1  0  0  1     0     0
4           1  1  0  0     0     0
5           1  0  1  0     0     0
6           1  1  0  1     0     1
7           1  0  0  0     0     0
8           1  1  1  0     1     0
attr(,"assign")
[1] 0 1 2 2 3 3
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"
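The contr.treatment attributes mean each factor's first level is absorbed into the intercept, which is why no a2 or b2 column appears. A sketch of inspecting that coding directly (factor definitions repeated from above):

```r
a <- factor(rep(2:3, length.out = 8))
contrasts(a)  # treatment coding: one dummy column named "3"; level 2 is the baseline
stopifnot(identical(colnames(contrasts(a)), "3"))
```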

> model.matrix(~ a : b)  
  (Intercept) a2:b2 a3:b2 a2:b3 a3:b3 a2:b4 a3:b4
1           1     1     0     0     0     0     0
2           1     0     0     0     1     0     0
3           1     0     0     0     0     1     0
4           1     0     1     0     0     0     0
5           1     0     0     1     0     0     0
6           1     0     0     0     0     0     1
7           1     1     0     0     0     0     0
8           1     0     0     0     1     0     0
attr(,"assign")
[1] 0 1 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"

> model.matrix(~ a + b)
  (Intercept) a3 b3 b4
1           1  0  0  0
2           1  1  1  0
3           1  0  0  1
4           1  1  0  0
5           1  0  1  0
6           1  1  0  1
7           1  0  0  0
8           1  1  1  0
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"

> # Fit models and check for singularities
> summary(lm(z ~ a * b))

Call:
lm(formula = z ~ a * b)

Residuals:
         1          2          3          4 
-9.956e-01  5.472e-01  0.000e+00 -5.551e-17 
         5          6          7          8 
 4.996e-16  5.551e-17  9.956e-01 -5.472e-01 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.1023     0.8033  -0.127    0.910
a3            0.4713     1.3913   0.339    0.767
b3           -1.2353     1.3913  -0.888    0.468
b4           -0.5064     1.3913  -0.364    0.751
a3:b3         0.2805     1.9676   0.143    0.900
a3:b4         1.9707     2.1253   0.927    0.452

Residual standard error: 1.136 on 2 degrees of freedom
Multiple R-squared:  0.7063,	Adjusted R-squared:  -0.0279 
F-statistic: 0.962 on 5 and 2 DF,  p-value: 0.5807

> summary(lm(z ~ a : b))

Call:
lm(formula = z ~ a:b)

Residuals:
         1          2          3          4 
-9.956e-01  5.472e-01  0.000e+00  0.000e+00 
         5          6          7          8 
 0.000e+00  1.665e-16  9.956e-01 -5.472e-01 

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.833      1.136   1.614    0.248
a2:b2         -1.936      1.391  -1.391    0.299
a3:b2         -1.464      1.607  -0.911    0.458
a2:b3         -3.171      1.607  -1.974    0.187
a3:b3         -2.419      1.391  -1.739    0.224
a2:b4         -2.442      1.607  -1.520    0.268
a3:b4             NA         NA      NA       NA

Residual standard error: 1.136 on 2 degrees of freedom
Multiple R-squared:  0.7063,	Adjusted R-squared:  -0.0279 
F-statistic: 0.962 on 5 and 2 DF,  p-value: 0.5807
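The aliased a3:b4 row can be explained from the model matrix itself: with an intercept present, the six cell-indicator columns of ~ a:b sum to the intercept column, so the matrix is rank-deficient by one and lm drops the last coefficient. A sketch (rebuilding a and b as above):

```r
a <- factor(rep(2:3, length.out = 8))
b <- factor(rep(2:4, length.out = 8))
X <- model.matrix(~ a:b)
# 7 columns, but the 6 cell indicators sum to the intercept -> rank 6
stopifnot(ncol(X) == 7, qr(X)$rank == 6)
```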

> summary(lm(z ~ a + b))

Call:
lm(formula = z ~ a + b)

Residuals:
      1       2       3       4       5       6 
-0.7824  0.4275 -0.6656 -0.4263  0.2393  0.6656 
      7       8 
 1.2087 -0.6668 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.3155     0.6138  -0.514    0.634
a3            1.1108     0.7223   1.538    0.199
b3           -1.2614     0.8340  -1.513    0.205
b4            0.3724     0.9008   0.413    0.701

Residual standard error: 0.978 on 4 degrees of freedom
Multiple R-squared:  0.5647,	Adjusted R-squared:  0.2382 
F-statistic:  1.73 on 3 and 4 DF,  p-value: 0.2986

This part of the assignment helped me understand how interactions work in model matrices and what happens when they are included in or excluded from a linear model. I used a custom function to create two factors (a and b) and generated a numeric response z with random normal values. Once I fixed an issue where one of the factors had only one level, I tested three models: a full factorial (z ~ a*b), interaction-only (z ~ a:b), and additive (z ~ a + b). The full model included both main and interaction effects but had only two residual degrees of freedom because of the small sample. The interaction-only model fit the same six cell means but contained a singularity: with an intercept in the model, the six cell indicators are linearly dependent, so one coefficient (a3:b4) is aliased and reported as NA. The additive model was simpler, though with this random z none of its effects were significant. Overall, this assignment showed how R codes factors, why singularities occur, and the importance of choosing a model that fits your data instead of automatically using the most complex one.
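Although z ~ a*b and z ~ a:b parameterize the cells differently (and the latter aliases one coefficient), both span the same six cell means, which is why the residuals and R² in the two summaries above are identical. A sketch confirming this (my own seed, since the transcript's z is unseeded):

```r
set.seed(7)  # arbitrary seed of my own; the transcript did not set one
a <- factor(rep(2:3, length.out = 8))
b <- factor(rep(2:4, length.out = 8))
z <- rnorm(8)
fit_star  <- lm(z ~ a * b)   # full-factorial parameterization
fit_colon <- lm(z ~ a : b)   # cell-means style, one aliased coefficient
stopifnot(all.equal(fitted(fit_star), fitted(fit_colon)))
```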
