This package implements the algorithms proposed in Zhou et al. (2021), which aims to estimates the microbial interaction network based on Gaussian graphical model and irregularly-spaced longitudinal data. There are two related models in Zhou et al. (2021) which can handle two different questions. The first model is called homogeneous stationary Gaussian graphical model (SGGM) in which all the subjects share a same auto-correlation parameter \(\tau\). For homogeneous SGGM, an iterative graphical lasso algorithm is proposed to compute the \(l_1\) penalized maximum likelihood estimate (MLE) of the network. The second model is called heterogeneous SGGM in which each subject possesses his/her own auto-correlation parameter \(\tau_i\). For heterogeneous SGGM, given the complex form of its likelihood function, an expectation-maximization (EM) type algorithm is devised to compute the \(l_1\) penalized MLE of the network and \(\tau_i\)’s, which integrates the EM algorithm with conventional graphical lasso algorithm (Friedman (2019)) and therefore can be implemented very efficiently. For the details of the algorithms and the output of the packages, please see Zhou et al. (2021).

There are two main functions in this package. The first one is \(lglasso\) which can output a sparse network representing the precision matrix by maximizing the \(l_1\) penalized likelihood function. The second function if \(mle\), which can compute the maximum likelihood estimates of precision matrix and the correlation parameter \(\tau\) for a given network structure produced from function \(lglasso\).

##The sample data are subset of a larger longitudinal data set  from a clinical project. 
##There are 13 clusters  are involved in the sample data.
sample_data[1:5,1:5]
#>   subject age Bifidobacterium Escherichia.Shigella Phocaeicola
#> 1     103  14      -2.1892985            -4.279322   -8.907273
#> 2     103  15       0.1344729            -1.731091   -7.606770
#> 3     103  16       3.4309103             5.274568   -5.501496
#> 4     103  17      -1.3676941            -2.221970   -8.463670
#> 5     104   1      -5.7846815            -6.242466   -7.040644
dim(sample_data)
#> [1] 100  22
sample_data=as.matrix(sample_data)

Example 1: Heterogeneous model with dampening correlation.

a=lglasso(data = sample_data, rho = 0.7, heter=T, ty=1)
#> [1] "Iteration 1: Mean value of tau's=0"
#> [1] "Iteration 2: Mean value of tau's=3.35045525902669"
#> [1] "Iteration 3: Mean value of tau's=2.48789638932496"
#> [1] "Iteration 4: Mean value of tau's=2.26833594976452"
##Number of edge in the selected network
(length(which(a$omega!=0))-ncol(a$omega))/2##A sub-network for first five variables
#> [1] 107
##A sub-network
a$omega[1:5,1:5]
#>             [,1]        [,2]        [,3]        [,4]        [,5]
#> [1,]  0.21081752 -0.02224875  0.00000000 -0.05166254 -0.01086864
#> [2,] -0.02224659  0.15323374 -0.01275027  0.00000000 -0.02743247
#> [3,]  0.00000000 -0.01275300  0.15824442  0.00000000 -0.01388505
#> [4,] -0.05167309  0.00000000  0.00000000  0.27793039  0.00000000
#> [5,] -0.01088620 -0.02743711 -0.01388423  0.00000000  0.21909828
##Compute the MLE
estimates=mle(data = sample_data,network = a$omega,heter = TRUE,ty = 1)
##MLE of precision matrix
estimates$network[1:5,1:5]
#>              [,1]        [,2]        [,3]        [,4]         [,5]
#> [1,]  0.165480054 -0.01879124  0.00000000 -0.06497258  0.006096931
#> [2,] -0.018791237  0.13758231 -0.02432155  0.00000000 -0.017709792
#> [3,]  0.000000000 -0.02432155  0.12179165  0.00000000  0.005571500
#> [4,] -0.064972582  0.00000000  0.00000000  0.26943178  0.000000000
#> [5,]  0.006096931 -0.01770979  0.00557150  0.00000000  0.187811869
##MLE of correlation parameters tau
estimates$tau
#>  [1] 3.068163 1.233265 3.475918 3.068163 1.844898 2.660408 1.029388 3.883673
#>  [9] 1.233265 1.844898 1.437143 1.029388 2.660408
##MLE of popular parameter alpha 
estimates$alpha
#> [1] 0.4566374

Example 2: Homogeneous model with dampening correlation.

b=lglasso(data = sample_data, rho = 0.7, heter=F,ty=1)
#> [1] "Iteration  1 : tau= 2.25265306122449"
#> [1] "Iteration  2 : tau= 2.04877551020408"
#> [1] "Iteration  3 : tau= 2.04877551020408"
##Final estimate of correlation parameters
b$tau
#> [1] 2.048776
##Number of edge in the estimated network
(length(which(b$omega!=0))-ncol(b$omega))/2
#> [1] 104
##A sub-network 
b$omega[1:5,1:5]
#>             [,1]        [,2]        [,3]        [,4]        [,5]
#> [1,]  0.21583518 -0.02498340  0.00000000 -0.05197318 -0.01353468
#> [2,] -0.02498119  0.15300657 -0.01411958  0.00000000 -0.02596700
#> [3,]  0.00000000 -0.01412028  0.15965027  0.00000000 -0.01232379
#> [4,] -0.05198484  0.00000000  0.00000000  0.28403683  0.00000000
#> [5,] -0.01356253 -0.02597896 -0.01232004  0.00000000  0.22190005
##Compute the MLE 
estimates=mle(data = sample_data,network = a$omega,heter = FALSE,ty = 1)
##MLE of precision matrix
estimates$network[1:5,1:5]
#>              [,1]        [,2]        [,3]        [,4]         [,5]
#> [1,]  0.165480054 -0.01879124  0.00000000 -0.06497258  0.006096931
#> [2,] -0.018791237  0.13758231 -0.02432155  0.00000000 -0.017709792
#> [3,]  0.000000000 -0.02432155  0.12179165  0.00000000  0.005571500
#> [4,] -0.064972582  0.00000000  0.00000000  0.26943178  0.000000000
#> [5,]  0.006096931 -0.01770979  0.00557150  0.00000000  0.187811869
##MLE of correlation parameter
estimates$tau
#> [1] 1.844898

Example 3: Heterogeneous model with uniform correlation.

c=lglasso(data = sample_data, rho = 0.7, heter=T, ty=0)
#> [1] "Iteration 1: Mean value of tau's=0"
#> [1] "Iteration 2: Mean value of tau's=6.59681318681319"
#> [1] "Iteration 3: Mean value of tau's=5.90676609105181"
#> [1] "Iteration 4: Mean value of tau's=5.81266875981162"
##Individual level estimate of correlation parameters tau
c$tau
#>  [1] 5.106939 5.106939 4.903061 5.922449 6.126327 5.922449 5.106939 5.922449
#>  [9] 6.534082 5.310816 6.941837 6.534082 6.126327
##Number of edge in the estimated network
(length(which(c$omega!=0))-ncol(c$omega))/2##A sub-network for first five variables
#> [1] 108
##A sub-network
c$omega[1:5,1:5]
#>             [,1]        [,2]         [,3]       [,4]         [,5]
#> [1,]  0.21733766 -0.02888988  0.000000000 -0.0571558 -0.012669001
#> [2,] -0.02888514  0.15929251 -0.016394147  0.0000000 -0.025299453
#> [3,]  0.00000000 -0.01639361  0.156477859  0.0000000 -0.009106644
#> [4,] -0.05715627  0.00000000  0.000000000  0.2889248  0.000000000
#> [5,] -0.01269282 -0.02531099 -0.009103976  0.0000000  0.218785891
##Compute the MLE
estimates=mle(data = sample_data,network = a$omega,heter = TRUE,ty = 0)
##MLE of precision matrix
estimates$network[1:5,1:5]
#>              [,1]        [,2]        [,3]        [,4]         [,5]
#> [1,]  0.165480054 -0.01879124  0.00000000 -0.06497258  0.006096931
#> [2,] -0.018791237  0.13758231 -0.02432155  0.00000000 -0.017709792
#> [3,]  0.000000000 -0.02432155  0.12179165  0.00000000  0.005571500
#> [4,] -0.064972582  0.00000000  0.00000000  0.26943178  0.000000000
#> [5,]  0.006096931 -0.01770979  0.00557150  0.00000000  0.187811869
##MLE of  popular parameter alpha 
estimates$alpha
#> [1] 0.1758333
##MLE of correlation parameters tau
estimates$tau
#>  [1] 5.106939 4.903061 4.699184 5.718571 5.922449 5.718571 4.699184 5.922449
#>  [9] 6.737959 5.106939 6.941837 6.534082 5.922449

[1] Zhou, J., Gui, J., Viles, W.D., Chen, H., Madan, J.C., Coker, M.O., Hoen, A.G., 2021a. Identifying Microbial Interaction Networks Based on Irregularly Spaced Longitudinal 16S rRNA sequence data. https://doi.org/10.1101/2021.11.26.470159

[2] Friedman J., Hastie T., Tibshirani R. (2019) Graphical Lasso: Estimation of Gaussian Graphical Models, Version: 1.11. https://CRAN.R-project.org/package=glasso.