RetiredWebsite/tutorial_Horger.Rmd at master · mhorger/RetiredWebsite · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
---
title: "lme tutorial"
author: "mhorger"
date: "May 4, 2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```


# nlme is a package for fitting and comparing linear and nonlinear mixed effects models
It let's you specify variance-covariance structures for the residuals.
It is well suited for repeated measure or longitudinal designs.

###Similar packages
One similar package is lme4.
It allows you to fit outcomes whose distribution is not Gaussian and allows for crossed random effects.
It stores data more effiently due to the use of sparse matrices.
It is more suited for clustered data sets.

##What's included?
nlme includes sample data, statistics, matrices, and a lattice framework.


#Using the nlme package

## Begin by installing the nlme package

Found on the CRAN repository


Website: https://svn.r-project.org/R-packages/trunk/nlme

```{r, eval=FALSE}
install.packages("nlme")
```


## Load the package (and other relevant packages)
```{r}
library(ggplot2)
library(nlme)
library(dplyr)
library(knitr)
```


## Try the lme function
This generic function fits a linear mixed-effects model in the formulation described in Laird and Ware (1982) but allowing for nested random effects. The within-group errors are allowed to be correlated and/or have unequal variances.

### Some important considerations

1. Need repeated measures from a single subject The data may be longitudinal, but they also may not.

2. Can account for correlations within individuals within the random effects

3. Uses maximum likelihood estimates

###The arguments for this function

lme(model, data, fixed, random, groups, start, correlation, weights,
subset, method, na.action, naPattern, control, verbose)

### **An example: Does the number of daily naps impact infant performance on a thing?**

```{r}

#creating a data set

Subs <- rep(c(seq(1:10)), 4)

Month <- c(rep(c(1), 10), rep(c(5), 10), rep(c(10), 10), rep(c(15), 10))

Naps <- c(rep(c(3), 10), 2, 3, 2, 1, 2, 3, 2, 3, 2, 3, 2, 2, 2 ,2, 3, 3, 2, 2, 1, 2, 3, 1, 2, 2, 1, 1, 2, 1, 2, 1 )

Napsfactor <- as.factor(Naps)

#Let's assume that infants' performance will get better with time. I altered the possible sampling distributions to reflect this.

scores <- c(runif(10, 1, 7), runif(10, 8, 15), runif(10, 16, 22), runif(10, 23, 30))


dataset <- data.frame(Subs, Month, Naps, scores, Napsfactor)

#Data should be set up in long format and look similar to this.
print(dataset)


demos <- dataset %>%
    group_by(Month, Naps) %>%
  summarise(mean_score = mean(scores, na.rm=TRUE))

```


####This is a 4x3 within subject design.
Infants are assessed at 4 time points - 1 month, 5 months, 10 months, and 15 months.
There are 3 levels of napping - 1, 2, or 3 naps per day.


#### We will run a conditional growth model because we are including predictors. Subsequent fixed and random effects are now “conditioned on” the predictors (age and number of naps).

```{r}
#Conditional growth model
tutorial<-lme(scores ~ Month * Naps, random = ~ Month | Subs, data=dataset)


#Because we are using a random sample, may need to rerun the scores several times for this piece of code to run effectively
```

**lme(model, random, data)**


**model** - scores ~ Month * Naps

We expect scores will be influenced by how old infants are (Month) and the number of Naps they take per day. There may be an interaction between these predictors.


**random** - random = ~ Month | Subs

Random error comes from the fact that this is a within subjects design. The same infants are assessed at 1 month, 5 months, 10 months, and 15 months.


**data** - data=dataset

Using the data set we created previously.


####We can move the results to a nicer table using the summary function

```{r}
#summarize an lme object - our solution
tut<- summary(tutorial)
tabl = tut$tTable
tabl

```


From this analysis, we would conclude that there is a main effect of age, infants performance improved with age, but there is no effect of number of naps. It was trending in the correct direction as the only negative slope.

#### We can also graph our results.

```{r}

plot<-
ggplot2::ggplot(dataset, aes(x=Month, y=scores,  color=Napsfactor, shape = Napsfactor, group = Subs), xlim(1, 15), ylim(0, 25), xlab(Month) ) +
  geom_point()+
  geom_line(color="grey")

plot + scale_x_continuous(name="Age (in months)", limits=c(1, 15), breaks = Month) +
  scale_y_continuous(name="Scores", limits=c(0, 30))

```


This kind of graph allows us to see the developmental trajectory of individual infants. It highlights the fact that there are 2 developmental trends occuring here- Infants' performance on the assessment is improving with time and the average number of naps they take is decreasing with time.

### **Single main effect VS. two main effects or an interaction**

When making the original data set, I intentionally biased the data to show a developmental curve by increasing the sampling distribution for each age range. I could do something similar to bias our data to support the impact of taking fewer naps

```{r}

#Create a new data set
Subs <- rep(c(seq(1:10)), 4)

Month <- c(rep(c(1), 10), rep(c(5), 10), rep(c(10), 10), rep(c(15), 10))

Naps <- c(rep(c(3), 10), 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 3, 3, 3 ,2,2,  2, 2,2,1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1)
Napsfactor <- as.factor(Naps)

secondscores <- c(runif(10, 1, 10), runif(5, 5, 10),runif(5, 9, 17), runif(3, 10, 15), runif(5, 14, 22), runif(2, 20, 25), runif(5, 18, 23), runif(4,22, 27), runif(1, 27, 30) )


```


```{r}
seconddataset <- data.frame(Subs, Month, Naps, secondscores, Napsfactor)

print(seconddataset)
```


### Did the manipulation work?

```{r}
#Summary stats from our first dataset
demos <- dataset %>%
    group_by(Month, Naps) %>%
  summarise(mean_score = mean(scores, na.rm=TRUE))


#Summary stats from our second dataset
seconddemos <- seconddataset %>%
    group_by(Month, Naps) %>%
  summarise(mean_secondscore = mean(secondscores, na.rm=TRUE))


print(demos)
print(seconddemos)

```

It may or may not because we're still drawing a random sample, but the trend should be clearer than during the first example.


###Now run the analysis again

```{r}
#Run the analysis again

secondtutorial<-lme(secondscores ~ Month * Naps, random = ~ Month | Subs, data=seconddataset)
```


```{r}
#Create a table

secondtut<- summary(secondtutorial)
secondtabl = secondtut$tTable
secondtabl
```


```{r}

#Graph the results

secondplot<-
ggplot2::ggplot(seconddataset, aes(x=Month, y=secondscores,  color=Napsfactor, shape = Napsfactor, group=Subs), xlim(1, 15), ylim(0, 25), xlab(Month) ) +
  geom_point()+
  geom_line( color="grey")

secondplot + scale_x_continuous(name="Age (in months)", limits=c(1, 15), breaks = Month) +
  scale_y_continuous(name="Scores", limits=c(0, 30))
```


##References
Curran, P. J., Obeidat, K., & Losardo, D. (2010). Twelve frequently asked questions about growth curve modeling. Journal of Cognition and Development : Official Journal of the Cognitive Development Society, 11(2), 121–136. doi:10.1080/15248371003699969

Magnusson, K. (2015). Uing R and lme/lmer to fit different two- and three- level longitudinal models. R Psychologist. Retrieved from https://rpsychologist.com/r-guide-longitudinal-lme-lmer#longitudinal-two-level-model

Maindonald, J. (2007). Chapter 10: Multi-level models and repeated measures. In J. Maindonald & J. Braun (Eds.), Data analysis and graphics using R: An example-based approach. Cambridge: Cambridge University Press.

Pinheiro, J. & Bates, D. (2019). Package 'nlme'. Retrieved from https://cran.r-project.org/web/packages/nlme/nlme.pdf