Introduction

The R package qtlpoly (v. 0.2.4) (Pereira et al. 2020) is an under development software to map multiple quantitative trait loci (QTL) in full-sib families of outcrossing autopolyploid species. It is based on the following random-effect model:

\[\boldsymbol{y} = \boldsymbol{1}\mu + \sum_{q=1}^Q \boldsymbol{g}_q + \boldsymbol{e}\]

where the vector of phenotypic values from a specific trait \(\boldsymbol{y}\) is a function of the fixed intercept \(\mu\), the \(q = 1, \dots, Q\) random QTL effects \(\boldsymbol{g}_q \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{G}_q\sigma^2_q)\), and the random environmental error \(\boldsymbol{e} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{I}\sigma^2)\). \(\boldsymbol{G}_q\) is the additive relationship matrix between full-sibs derived from the genotype conditional probabilities of QTL \(q\). See model details in Pereira et al. (2020).

Variance components associated with putative QTL (\(\sigma^2_q\)) are tested using score statistics, as proposed in Qu, Guennel, and Marshall (2013). Final models are fitted using residual maximum likelihood (REML) from the R package sommer (v. 4.1) (Covarrubias-Pazaran 2016). Plots for visualizing the results are based on ggplot2 (v. 3.1.0) (Wickham 2016).

This tutorial used R version 4.2.2 Patched (2022-11-10 r83330) running on Ubuntu 20.04 LTS (64-bit).

Install and load the qtlpoly package and data

qtlpoly package is available in its stable version on CRAN and an under-development version at GitHub. You can install the stable version from CRAN by running:

> install.packages("qtlpoly")

To use the development version, please run the following commands to install qtlpoly and all of its dependencies:

> ## install.packages('devtools')
> devtools::install_github("gabrielgesteira/qtlpoly")

Then, use the function library() – or require() – to load the package:

> library(qtlpoly)

qtlpoly includes preloaded datasets that are simpler, so one can run the functions along with this tutorial using a regular personal computer (with 4 cores and 6 GB of RAM, minimum). For real data analyses, one may need to run an R script in a cluster with more cores and RAM. In general, the computational needs depend on ploidy level, population size, and the number of markers.

The dataset used in this tutorial was based on a full-sib family (\(N = 156\)) from a biparental potato population. You can view the maps with mappoly after installing it:

> ## install.packages('mappoly')
> library(mappoly)
## Warning in fun(libname, pkgname): couldn't connect to display ":0"
## =====================================================
## MAPpoly Package [Version 0.3.22022-12-02 08:00:02 UTC]
## More information: https://github.com/mmollina/MAPpoly
## =====================================================
> plot_map_list(maps4x)

> plot(maps4x[[1]])

> summary_maps(maps4x)
## 
## Your dataset contains removed (redundant) markers. Once finished the maps, remember to add them back with the function 'update_map'.
##       LG Genomic sequence Map length (cM) Markers/cM Simplex Double-simplex
## 1      1                1          205.88       2.31     102            122
## 2      2                2          125.11       3.44     131            139
## 3      3                3          134.07       2.85     151             21
## 4      4                4           165.9       2.56     114             87
## 5      5                5           106.2       2.93     124             53
## 6      6                6           142.7       2.78      73             75
## 7      7                7          126.09       3.28     136             94
## 8      8                8          136.68        2.5      61             98
## 9      9                9          144.49       2.46      76            101
## 10    10               10          114.67       1.81      47             36
## 11    11               11          110.66       2.89      95             52
## 12    12               12          117.54       1.91     115             32
## 13 Total             <NA>         1629.99       2.64    1225            910
##    Multiplex Total Max gap
## 1        252   476    9.01
## 2        160   430     4.9
## 3        210   382    9.09
## 4        224   425    5.74
## 5        134   311    4.22
## 6        249   397    3.83
## 7        184   414    5.65
## 8        183   342    7.32
## 9        179   356    7.79
## 10       124   207    5.27
## 11       173   320    5.41
## 12        78   225    6.09
## 13      2150  4285    6.19

The resulting linkage map consists of 12 linkage groups (LGs) with 4,285 markers, spanning 1629.99 centiMorgans (cM) of the potato genome. Please see details on the map construction for this simulated cross using mappoly here. The phenotypic data contains three traits in a simple table, where traits are distributed in columns and individuals in rows. For example:

> head(pheno4x)
##            FM07  FM08  FM14
## B2721.001 75.13 53.63 91.50
## B2721.002 62.25 88.63 89.50
## B2721.003 86.00 82.88 91.50
## B2721.004    NA 61.88 86.63
## B2721.005 51.75 49.25 88.63
## B2721.006 80.63 88.88 88.63

Conditional probabilities

Our qtlpoly package is fully integrated with mappoly (Mollinari and Garcia 2019), so if you have a map built using another software, you will need to convert or re-estimate it using functions from mappoly. Please refer to its tutorial in order to learn more about the software. mappoly uses a hidden Markov model (HMM) adapted from Lander and Green (1987) to autopolyploids to estimate the genotype conditional probabilities of putative QTL given an ordered set of markers comprising a linkage map (Mollinari and Garcia 2019). These conditional probabilities are ultimately used for QTL mapping by qtlpoly.

To compute the conditional probabilities for each linkage group, one should use the function calc_genoprob() from mappoly:

> library(mappoly)
> genoprob4x = lapply(maps4x, calc_genoprob)
##  Ploidy level: 4
##  Number of markers: 476
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 430
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 382
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 425
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 311
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 397
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 414
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 342
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 356
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 207
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 320
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......
##  Ploidy level: 4
##  Number of markers: 225
##  Number of individuals: 156
##  ..................................................
##  ..................................................
##  ..................................................
##  ......

Prepare data

Once the conditional probabilities have been calculated, you can use the function read_data() to read both genoprob4x and pheno4x objects. Row names (individuals) on pheno4x must match the individual names from the genoprob4x object. The argument ploidy = 4 informs the ploidy level of the cross, i.e., a tetraploid cross in this case. Finally, step = 1 provides the step size so that only positions at every 1 cM (approximately) will be tested:

> data = read_data(ploidy = 4, geno.prob = genoprob4x, pheno = pheno4x, step = 1)
## Reading the following data: 
##   Ploidy level:       4
##   No. individuals:    156
##   No. linkage groups: 12
##   Step size:          1 cM 
##   Map size:           1630.01 cM (1102 positions) 
##   No. phenotypes:     3

If you want to see detailed information on the map (number of positions per linkage group) and the phenotypes (number of individuals with non-missing values per trait), use a print() function with detailed = TRUE:

> print(data, detailed = TRUE)
## This is an object of class 'qtlpoly.data'
##   Ploidy level:       4
##   No. individuals:    156
##   No. linkage groups: 12
##   Step size:          1 cM 
##   Map size:           1630.01 cM (1102 positions) 
##     LG 1: 205.88 cM (123 positions) 
##     LG 2: 125.11 cM (101 positions) 
##     LG 3: 134.07 cM (90 positions) 
##     LG 4: 165.9 cM (106 positions) 
##     LG 5: 106.2 cM (71 positions) 
##     LG 6: 142.7 cM (105 positions) 
##     LG 7: 126.09 cM (96 positions) 
##     LG 8: 136.68 cM (95 positions) 
##     LG 9: 144.49 cM (100 positions) 
##     LG 10: 114.67 cM (67 positions) 
##     LG 11: 110.66 cM (82 positions) 
##     LG 12: 117.54 cM (66 positions) 
##   No. phenotypes:     3
##     Trait 1: 'FM07' (141 individuals) 
##     Trait 2: 'FM08' (143 individuals) 
##     Trait 3: 'FM14' (140 individuals)

Notice that regardless of the number of markers per linkage group, we will rather evaluate only a limited number of positions, given the step argument (in our example, every 1 cM). Because of the quite large linkage disequilibrium blocks in mapping populations, especially in full-sib families with relatively small sample sizes (let us say, \(N < 300\)), there is no need to test every single marker in the map. Moreover, with such marker density (e.g., 5+ markers/cM), a lot of markers will contain redundant information for QTL mapping purposes anyway. step = NULL will allow to test every single marker in a high density map if one wants to do so, but be sure that computational time (and needed RAM for loading the qtlpoly.data object) will be hugely increased.

Score-based resampling to assess genome-wide significance

Rather than guessing pointwise significance levels for declaring QTL, you can use the score-based resampling method to assess the genome-wide significance level (Zou et al. 2004). This method is relatively intensive, because it involves score statistics computation for every position in the map repeated 1,000 times (resampling):

> ## data.sim = simulate_qtl(data = data, mu = 0, h2.qtl = NULL, var.error =
+ ## 1, n.sim = 1000, missing = TRUE, seed = 123) score.null =
+ ## null_model(data = data.sim$results, n.clusters = 6, plot = NULL)
+ ## min.pvl = unlist(lapply(score.null$results, function(x)
+ ## return(x$pval[which.max(x$stat)]))) quantile(sort(min.pvl), c(0.2,
+ ## 0.05)) 20% 5% 0.0011493379 0.0002284465

Perform QTL detection

Building a multiple QTL model is considered a model selection problem, and there are several ways to approach it. Here, we adapted the algorithm proposed by Kao, Zeng, and Teasdale (1999) for fixed-effect multiple interval mapping (MIM) for diploids to our random-effect MIM (REMIM) for polyploids, which is summarized as follows:

  1. Null model: for each trait, a model starts with no QTL \[\boldsymbol{y} = \boldsymbol{1}\mu + \boldsymbol{e}\]
  2. Forward search: QTL (\(q = 1, \dots, Q\)) are added one at a time, conditional to the one(s) (if any) already in the model, under a less stringent threshold (e.g., \(P < 0.01\)) \[\boldsymbol{y} = \boldsymbol{1}\mu + \sum_{q=1}^Q \boldsymbol{g}_q + \boldsymbol{e}\]
  3. Model optimization: each QTL \(r\) is tested again conditional to the remaining one(s) in the model under a more stringent threshold (e.g., \(P < 10^{-4}\)) \[\boldsymbol{y} = \boldsymbol{1}\mu + \boldsymbol{g}_r + \sum_{q \neq r} \boldsymbol{g}_q + \boldsymbol{e}\] Steps 1 and 2 are repeated until no more QTL can be added to or dropped from the model, and positions of the remaining QTL do not change. After the first model optimization, the following forward searches use the more stringent threshold (e.g., \(P < 10^{-4}\)) as the detection power is expected to increase once QTL have already been added to the model.
  4. QTL profiling: score statistics for the whole genome are updated conditional to the final set of selected QTL.

It is worth to mention that there is no definitive way to perform model selection. One may try to come with alternative ways of building a multiple QTL model, some of which will be mentioned later. qtlpoly tries to provide functions flexible enough so that users can build a multiple QTL model on their own, manually. One strategy we have been testing and using implements the above-mentioned algorithm of building a multiple QTL model, which will be demonstrated below.

Plot profiles

Given the final profiled models, you can plot either individual or joint \(LOP\) profiles using the function plot_profile(). For individual plots, one need to specify the pheno.col argument:

> for (p in remim.mod$pheno.col) plot_profile(data = data, model = remim.mod,
+     pheno.col = p, ylim = c(0, 10))

Triangles indicate the estimated position of the QTL peaks. By providing values to the y-axis limits using ylim, one guarantees that all plots will be in the same scale.

If one wants to plot some or all QTL profiles at once for comparison, pheno.col needs to be specified as a vector of numbers representing the phenotype columns. As default, pheno.col = NULL includes all evaluated traits:

> plot_profile(data = data, model = remim.mod, grid = TRUE)

> plot_profile(data = data, model = remim.mod, grid = FALSE)

The argument grid organizes the multiple plots as a grid if TRUE, or superimposed profiles if FALSE.

Support intervals

You can visualize the QTL distributed along the linkage map, together with their supporting intervals with the function plot_sint():

> plot_sint(data = data, model = remim.mod)

Notice that for highly significant QTL (\(P < 2.22 \times 10^{-16}\)), computing support intervals may need adjustment due to the floating point precision limit.

Fit multiple QTL models

Once final models have been defined, one may use REML to estimate their parameters with the function fit_model():

> fitted.mod = fit_model(data = data, model = remim.mod, probs = "joint", polygenes = "none")
## There are 2 QTL in the model for trait 1 'FM07'. Fitting model... Done! 
## 
## There is 1 QTL in the model for trait 2 'FM08'. Fitting model... Done! 
## 
## There are 3 QTL in the model for trait 3 'FM14'. Fitting model... Done!

Here, we used the joint probabilities of the putative QTL genotypes (probs= "joint"), instead of the parent marginal ones (probs = "marginal"). The latter may be interesting if one of the parents seems to not contribute much for a specific QTL (because all QTL allele effects are about the same, for example). Also, if polygenes = "none", all individual QTL have their variance component estimated. One may want to define polygenes = "most" if all but one QTL have to be combined as a polygenic effect, or polygenes = "all" if all QTL have to be combined as a polygenic effect. Finally, keep = TRUE (default) saves matrices used on the calculations and additional results from sommer. This is useful in case one wants to assess additional metrics there, such as standard errors from parameter estimates, for example.

A summary() function shows parameter estimates, together with the QTL heritabilities, computed from the variance component estimates:

> summary(fitted.mod)
## This is an object of class 'qtlpoly.fitted'
## 
## * Trait 1 'FM07' 
##   LG   Pos Nmrk Intercept   Var(g)  Var(e)        h2
## 1  5 26.19  438        NA 344.8765      NA 0.5680876
## 2  7 60.68  641        NA 86.57081      NA 0.1426012
## 3 NA    NA   NA  72.92765       NA 175.636 0.7106888
## 
## * Trait 2 'FM08' 
##   LG   Pos Nmrk Intercept   Var(g)   Var(e)       h2
## 1  5 26.19  438  78.77196 1001.341 490.0927 0.671395
## 
## * Trait 3 'FM14' 
##   LG   Pos Nmrk Intercept   Var(g)   Var(e)        h2
## 1  5  7.42  427        NA   8.6028       NA 0.3129561
## 2  5 44.03  449        NA 2.917533       NA 0.1061352
## 3  7 66.12  645        NA 4.766193       NA 0.1733865
## 4 NA    NA   NA  90.49077       NA 11.20231 0.5924778

Plot QTL

After models have been fitted, you can visualize a summary of QTL metrics with the function plot_qtl():

> plot_qtl(data = data, model = remim.mod, fitted = fitted.mod, drop.pheno = FALSE)

Dots are located on the respective chromosome positions of the QTL peaks. The dots’ sizes correspond to the specific QTL heritability, while their colors correspond to the \(P\)-values, which may help to identify the most or the less significant QTL. drop = FALSE makes sure all traits are displayed, even if they do not had any QTL detected.

Estimate allele effects

Additive effects contributing to the overall mean by each allele individually, as well as their combinations within each parent, may be computed using the function qtl_effects():

> est.effects = qtl_effects(ploidy = 4, fitted = fitted.mod)
## There are 2 QTL in the model for trait 1 'FM07'. Computing effects for QTL ... 438 ... 641. Done! 
## 
## There is 1 QTL in the model for trait 2 'FM08'. Computing effects for QTL ... 438. Done! 
## 
## There are 3 QTL in the model for trait 3 'FM14'. Computing effects for QTL ... 427 ... 449 ... 645. Done!

A plot() function allows the user to visualize these contributions graphically:

> plot(est.effects)

For each QTL, one can see which alleles contribute the most to increase or decrease the phenotypic mean. This is interesting to plan marker-assisted selection strategies, for example, as specific alleles can be selected based on the marker dosage at or around the QTL peak and with the support of the population linkage map from mappoly.

Predict breeding values

Finally, with the estimates of the final models in hands, one can use them to perform predictions of the breeding values as follows:

> y.hat = breeding_values(data = data, fitted = fitted.mod)

A plot() function shows the distribution of the genotypic values for the population:

> plot(y.hat)

This may be interesting for those populations whose individuals have been genotyped, but not phenotyped, and you still want to consider them when selecting the best genotypes.

Compare with a fixed-effect approach

A previous fixed-effect interval mapping (here named FEIM) model has been proposed as a first approach to map QTL in autopolyploid species (Hackett, Bradshaw, and Bryan 2014). It consists of a single-QTL model, where every position is tested according to the model:

\[Y = \mu_C + \sum_{i=2}^{m} \alpha_i X_i + \sum_{i=m+2}^{2m} \alpha_i X_i\]

where \(\mu_C\) is the intercept, and \(\alpha_i\) and \(X_i\) are the main effects and indicator variables for allele \(i\), respectively, and \(n\) is the ploidy level. The constraints \(\alpha_{1} = 0\) and \(\alpha_{m+1} = 0\) are imposed to satisfy the condition \(\sum_{i=1}^m X_i = m/2\) and \(\sum_{i=m+1}^{2m} X_i = m/2\), so that \(\mu_C\) is a constant hard to interpret due to these constraints. Notice that the higher the ploidy level, the more effects have to be estimated, i.e., tetraploid models have six main effects, hexaploid models have 10 effects, octoploid models will have 14 effects (i.e. \(2m-2\)), and so on.

Different from REMIM (where we test the variances), here the interest is to know if the average allele effects are different from zero (the null hypothesis) using likelihood-ratio tests (LRT). Commonly, the tests are presented as “logarithm of the odds” (LOD scores), where \(LOD = \frac{LRT}{2 \times \log_e(10)}\).

In order to evaluate significance (declare a QTL), empirical LOD thresholds are computed for each trait using permutations as proposed by Churchill and Doerge (1994).

LOD threshold permutations

Using the same object data from REMIM analyses, one can first compute the thresholds for all or specific traits. The number of simulations is given by n.sim, and 1,000 permutations are generally used:

> perm = permutations(data = data, n.sim = 1000, n.clusters = 4)
## INFO: Using 4 CPUs for calculation
## 
## Permutations for trait 1 'FM07' 
##   95% LOD threshold = 5.9
##   Calculation took 333.23 seconds
## 
## Permutations for trait 2 'FM08' 
##   95% LOD threshold = 6
##   Calculation took 277.44 seconds
## 
## Permutations for trait 3 'FM14' 
##   95% LOD threshold = 5.99
##   Calculation took 287.11 seconds

Once the analyses are done, one may print the results by specifying a vector with the probabilities of interest. By default, probs = c(0.95, 0.90), so that 95% and 90% quantiles are shown:

> print(perm)
## This is an object of class 'qtlpoly.perm'
## 
## * Trait 1 'FM07' 
##  90%  95% 
## 5.55 5.90 
## 
## * Trait 2 'FM08' 
##  90%  95% 
## 5.63 6.00 
## 
## * Trait 3 'FM14' 
##  90%  95% 
## 5.54 5.99

As probs = c(0.95, 0.90) were also the default in permutations() function, we already stored the genome-wide significance LOD threshold for 0.05 and 0.10 levels, respectively. In order to use the 95% quantiles in the subsequent FEIM analyses, one can do:

> (sig.lod = perm$sig.lod$`0.95`)
## FM07.95% FM08.95% FM14.95% 
## 5.903135 6.004204 5.985607

Interval mapping

feim function tests every position from the specified step size from data - here, every 1 cM. Besides the sig.lod vector containing the thresholds for each trait, one needs to provide a window size (e.g., w.size = 15), which will be used to select QTL peaks within the same linkage group with a minimum distance of the given window size:

> feim.mod = feim(data = data, w.size = 15, sig.lod = sig.lod)
## FEIM for trait 1 'FM07' 
##   QTL was found on LG 5 at 26.19 cM (position number 438)
##   QTL was found on LG 7 at 56.07 cM (position number 637)
## 
## FEIM for trait 2 'FM08' 
##   QTL was found on LG 5 at 26.19 cM (position number 438)
## 
## FEIM for trait 3 'FM14' 
##   QTL was found on LG 5 at 26.19 cM (position number 438)
## 
## Calculation took 5.19 seconds

A print function shows detailed information on the detected QTL:

> print(feim.mod)
## This is an object of class 'qtlpoly.feim'
## 
## * Trait 1 'FM07' 
##   LG   Pos Nmrk      Mrk   LRT   LOD     AdjR2
## 1  5 26.19  438 c2_22986 88.75 19.27 0.4432602
## 2  7 56.07  637 c2_23097 35.67  7.75 0.1887396
## 
## * Trait 2 'FM08' 
##   LG   Pos Nmrk      Mrk   LRT   LOD     AdjR2
## 1  5 26.19  438 c2_22986 96.57 20.97 0.4685352
## 
## * Trait 3 'FM14' 
##   LG   Pos Nmrk      Mrk   LRT   LOD     AdjR2
## 1  5 26.19  438 c2_22986 49.62 10.78 0.2667912

Remember that in this case, one should not sum adjusted \(R^2\) for the same trait, as each one was obtained from a single-QTL model (i.e., not estimated conditional to the others).

Finally, one may want to plot the profiles and compare to the plot profiles from REMIM:

> plot_profile(data = data, model = feim.mod, grid = TRUE)

Exporting to VIEWpoly

One can export the results from MAPpoly and QTLpoly to visualize them in the R package VIEWpoly with the commands below:

> save(maps4x, file = "mappoly.maps.RData")
> save(data, file = "qtlpoly.data.RData")
> save(remim.mod, file = "qtlpoly.remim.mod.RData")
> save(fitted.mod, file = "qtlpoly.fitted.mod.RData")
> save(est.effects, file = "qtlpoly.est.effects.RData")

Acknowledgments

This package has been developed as part of the Genomic Tools for Sweetpotato Improvement (GT4SP) and SweetGAINS projects, both funded by Bill & Melinda Gates Foundation.

References

Churchill, G A, and R W Doerge. 1994. Empirical threshold values for quantitative trait mapping. Genetics 138 (3): 963–71.
Covarrubias-Pazaran, Giovanny. 2016. Genome-assisted prediction of quantitative traits using the R package sommer.” PLoS ONE 11 (6): 1–15. https://doi.org/10.1371/journal.pone.0156744.
Hackett, Christine A., John E. Bradshaw, and Glenn J. Bryan. 2014. QTL mapping in autotetraploids using SNP dosage information.” Theoretical and Applied Genetics 127 (9): 1885–1904. https://doi.org/10.1007/s00122-014-2347-2.
Kao, C-H, Z-B Zeng, and R D Teasdale. 1999. Multiple interval mapping for quantitative trait loci.” Genetics 152 (3): 1203–16.
Lander, E S, and P Green. 1987. Construction of multilocus genetic linkage maps in humans.” Proceedings of the National Academy of Sciences of the United States of America 84: 2363–67.
Mollinari, Marcelo, and Antonio Augusto Franco Garcia. 2019. Linkage Analysis and Haplotype Phasing in Experimental Autopolyploid Populations with High Ploidy Level Using Hidden Markov Models.” G3: Genes|Genomes|Genetics 9 (10): 3297–3314. https://doi.org/10.1534/g3.119.400378.
Pereira, Guilherme da Silva, Dorcus C. Gemenet, Marcelo Mollinari, Bode A. Olukolu, Joshua C. Wood, Federico Diaz, Veronica Mosquera, et al. 2020. Multiple QTL Mapping in Autopolyploids: A Random-Effect Model Approach with Application in a Hexaploid Sweetpotato Full-Sib Population.” Genetics 215 (3): 579–95. https://doi.org/10.1534/genetics.120.303080.
Qu, Long, Tobias Guennel, and Scott L. Marshall. 2013. Linear score tests for variance components in linear mixed models and applications to genetic association studies.” Biometrics 69 (4): 883–92. https://doi.org/10.1111/biom.12095.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer.