Estimation of Genetic Parameters of Fat and Protein Contents in Tunisian Holstein Dairy Cows Using Gibbs Sampling


Houcine Ilahi*

Citation: Estimation of Genetic Parameters of Fat and Protein Contents in Tunisian Holstein Dairy Cows Using Gibbs Sampling. American Research Journal of Genetics; vol 1, no. 1: 1-10

Copyright This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Data comprising 15 343 records of 5 881 first-parity Holstein cows, collected from 2005 to 2017 in 118 herds located in the north of Tunisia, were analysed for fat and protein contents. Bayesian segregation analyses using a Monte Carlo Markov chain (MCMC) method were used to investigate the genetic determinism of both traits and to estimate their genetic parameters. The postulated major gene for both traits was assumed to be an additive biallelic locus with Mendelian transmission probabilities, and the priors used for variance components were uniform. Gibbs sampling was used to generate a chain of 600 000 samples, which were used to obtain posterior means of genetic parameters. Estimated marginal posterior means ± posterior standard deviations of the variance components for fat and protein contents, respectively, were 1.63 ± 0.37 and 0.60 ± 0.57 for polygenic variance (σu²), 0.29 ± 0.32 and 0.03 ± 0.01 for permanent environmental variance (σpe²), 0.77 ± 0.83 and 0.33 ± 0.25 for major gene variance (σG²), and 9.30 ± 0.20 and 3.32 ± 0.05 for error variance (σe²). The postulated major locus was not significant for either trait, since the 95% highest posterior density regions (HPDs95%) of most major gene parameters, and of the major gene variances in particular, included 0. The HPDs95% of the estimated transmission probabilities overlapped for both traits. Estimates of genetic parameters for fat and protein contents were very similar under the mixed inheritance and polygenic models. These results indicated that the genetic determinism of fat and protein contents in Tunisian Holstein dairy cows is purely polygenic. Based on 30 000 Gibbs samples, polygenic heritability estimates for fat and protein contents were 0.18 and 0.20, respectively. The corresponding repeatability estimates were 0.24 and 0.22, respectively.

Keywords: Holstein cows, fat content, protein content, Bayesian segregation analysis, major gene, genetic parameters.



Genetic improvement of dairy cattle would increase the benefits of dairy farmers. The size of the Holstein cow population in Tunisia has increased substantially in recent years through the importation of cattle and semen from developed countries (USA, the Netherlands, and Germany). Hammami et al. (2007) and Ben Zaabza et al. (2016) reported that 60% of all inseminations of dairy cows in Tunisia used Holstein semen. Recently, more attention has been given to milk quality traits in breeding programmes. Estimates of genetic parameters for milk yield in dairy cows are abundant in the literature (Ben Gara et al., 2006, 2012; Hammami et al., 2008a, 2009a; Ilahi et al., 2012). However, investigations of the genetic parameters of milk quality traits are lacking. Cows enrolled in the A4 official milk recording system, in place since the 1960s, represented about 10% of the total Holstein population in 2000 (Rekik et al., 2003). Alternate and owner-farm recording systems are being strongly encouraged to increase the number of Holstein cows enrolled in the national milk recording system. The data generated by the milk recording system are not sufficiently or adequately used and valorized, mainly because of the lack of genetic evaluation (Hammami et al., 2008b).

In farm animal populations, large numbers of phenotypic records are often available at low cost, and it is worthwhile to use them to look for statistical evidence of major genes or quantitative trait loci (QTL).

Segregation analysis is the most powerful statistical method to identify a single gene when DNA marker information is unavailable. With segregation analysis, it is possible to determine, using only phenotypic data sets, whether the inheritance of a certain trait is controlled, at least in part, by a single gene with a large effect. Therefore, the existence of major genes has been investigated in several different studies in livestock species: Janss et al. (1995) for various traits of Dutch Meishan crossbreds; Ilahi (1999) and Ilahi et al. (2000) for milking speed in dairy goats; Pan et al. (2001) for somatic cell scores in dairy cattle; Hagger et al. (2004) for selection response in laying hens; Ilahi and Kadarmideen (2004) for milk flow in dairy cattle; Ilahi and Othmane (2011a, b) for milk yield in dairy sheep; Ilahi et al. (2012) for milk yield in dairy cattle.

Segregation analysis in pedigreed animal populations is not feasible with analytical approaches because of the many (inbreeding) loops and the family sizes, which make it impossible to sum and integrate out genotypes and polygenic effects from the likelihood or posterior density. This problem has been simplified by the development of Gibbs sampling, a Monte Carlo Markov chain (MCMC) methodology (Guo and Thompson, 1992), and its application to livestock populations by Sorensen et al. (1994), Janss et al. (1995), Janss et al. (1997), and Ilahi and Kadarmideen (2004).

The aim of this paper was to investigate whether a segregating major gene affects fat and protein contents and to estimate the genetic parameters of fat and protein contents in Tunisian Holstein dairy cows using a Bayesian approach.



Data were provided by the Tunisian Genetic Improvement Center, Livestock and Pasture Office, Tunis. The original data from the official recording database were recorded between 2005 and 2017 in 118 dairy herds located in the north of Tunisia. Data were edited on herd size (≥ 5 records), and only records that included both fat and protein contents were retained. Cows without pedigree information were discarded. The final data set used for genetic analysis included 15 343 records of 5 881 first parity from 5 042 cows for fat (35.38 ± 3.80 g/kg) and protein (36.00 ± 3.98 g/kg) contents. The pedigree was traced as far back as possible and all available pedigree information was included in the analyses. Thus, the pedigree included 15 171 animals with 1 048 different sires.

Genetic Models

Mixed Inheritance Model

To estimate variance components and to investigate the presence of a major gene for fat and protein contents in Holstein dairy cows, the following mixed inheritance single-trait model was used:

$$y = X\beta + Zu + Q\,pe + ZWm + e$$

where y is the vector of observations (fat or protein content); β is a vector of non-genetic fixed effects including herd, year of calving, and lactation number; u is a random vector of individual polygenic effects; pe is a random vector of permanent environmental effects; W is a design matrix that contains the genotype of each individual (i.e., AA, AB, BB); m is the vector of genotype means (i.e., −a, 0, a); e is a random vector of residual effects; and X, Z and Q are incidence matrices relating the observations to their respective effects. In the term modelling the single gene, both W and m are unknown and have to be estimated from the data by segregation analysis.
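The role of W and m in the major gene term can be illustrated with a toy example. The following is a minimal sketch only: the genotypes and the value of a are hypothetical, and in the actual analysis both are unknown and sampled rather than given. The helper names are introduced here for illustration and are not part of any software used in the study.

```python
# Toy illustration of the major gene term W m (all inputs are hypothetical).
GENOTYPES = ("AA", "AB", "BB")


def genotype_design_matrix(genotypes):
    """Build W: one row per animal, one 0/1 indicator column per genotype."""
    rows = []
    for g in genotypes:
        g = "AB" if g == "BA" else g  # AB and BA are the same heterozygote
        rows.append([1 if g == ref else 0 for ref in GENOTYPES])
    return rows


def major_gene_effects(W, a):
    """Return W m for genotype means m = (-a, 0, a)."""
    m = (-a, 0.0, a)
    return [sum(w * mk for w, mk in zip(row, m)) for row in W]


animals = ["AA", "AB", "BB", "BA"]  # assumed genotypes, unknown in practice
W = genotype_design_matrix(animals)
effects = major_gene_effects(W, a=1.5)  # a = 1.5 is an arbitrary value
print(W)
print(effects)
```

Each row of W selects exactly one entry of m, so the major gene contribution of an animal is −a, 0 or a depending on its genotype.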

The number of levels of all effects included in the model and the number of animals in the pedigree are shown in Table 1.

The major gene was modelled as an additive autosomal biallelic (A and B) locus with Mendelian transmission probabilities. Allele A is defined as the allele that decreases the phenotypic value and allele B as the allele that increases it (the favourable allele). The alleles A and B have frequencies p and q = 1 − p, where p is the frequency of allele A in the founder population, in which Hardy-Weinberg equilibrium was assumed. Three genotypes can be encountered, AA, AB (or BA) and BB, with genotype means m = (−a, 0, a), where a is the additive major gene effect.
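Under these assumptions (Hardy-Weinberg proportions and genotype means −a, 0, a), the variance among genotype means reduces to 2pqa². The sketch below checks this numerically; the values of p and a are arbitrary illustrations, not estimates from this study.

```python
def hwe_genotype_frequencies(p):
    """Genotype frequencies under Hardy-Weinberg equilibrium, p = freq(A)."""
    q = 1.0 - p
    return {"AA": p * p, "AB": 2.0 * p * q, "BB": q * q}


def major_gene_variance(p, a):
    """Variance of genotype means (-a, 0, a) weighted by HWE frequencies."""
    freqs = hwe_genotype_frequencies(p)
    means = {"AA": -a, "AB": 0.0, "BB": a}
    mu = sum(freqs[g] * means[g] for g in freqs)
    return sum(freqs[g] * (means[g] - mu) ** 2 for g in freqs)


p, a = 0.3, 1.2  # hypothetical allele frequency and additive effect
var_g = major_gene_variance(p, a)
closed_form = 2.0 * p * (1.0 - p) * a * a
print(var_g, closed_form)
```

The two printed values agree up to floating-point rounding, which is why the major gene variance can be computed in each Gibbs cycle from the sampled p and a alone.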

Polygenic effects were assumed to be distributed as $u \sim N(0, A\sigma_u^2)$, where A is the numerator relationship matrix; permanent environmental effects as $pe \sim N(0, I\sigma_{pe}^2)$; and residual effects as $e \sim N(0, I\sigma_e^2)$, where I is an identity matrix and $\sigma_u^2$, $\sigma_{pe}^2$ and $\sigma_e^2$ are the polygenic, permanent environmental and residual variances, respectively. The relationship matrix A of the full pedigree was used in the analyses. The variance attributable to the major gene ($\sigma_G^2$) was calculated as:
$$\sigma_G^2 = 2p(1-p)a^2$$
Uniform prior distributions were assumed in the range (−∞, +∞) for non-genetic effects and effects at the major locus, in the range (0, +∞) for variance components, and in the range [0, 1] for allele frequencies (Janss et al., 1995).
A Gibbs sampling algorithm with blocked sampling of the genotypes W was used for inference in the mixed inheritance model, implemented in the ‘iBay’ software package version 1.46 developed by Janss (2008).
A single run of the Monte Carlo Markov chain (MCMC) consisted of 620 000 Gibbs samples, with the first 20 000 samples discarded as a burn-in period to allow the Gibbs chain to reach equilibrium. Thereafter, every 20th sample was retained, giving 30 000 Gibbs samples in total.
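The chain bookkeeping described here can be checked with a few lines of arithmetic. This is a sketch only; which cycle within each block of 20 is retained is an assumption made for illustration, and the function names are hypothetical.

```python
def retained_samples(total_cycles, burn_in, thin):
    """Number of samples kept after burn-in when every `thin`-th cycle is kept."""
    return (total_cycles - burn_in) // thin


def retained_cycle_numbers(total_cycles, burn_in, thin):
    """1-based cycle numbers kept, assuming the last cycle of each block is used."""
    return range(burn_in + thin, total_cycles + 1, thin)


n_kept = retained_samples(620_000, 20_000, 20)
cycles = retained_cycle_numbers(620_000, 20_000, 20)
print(n_kept, len(cycles), cycles[0], cycles[-1])
```

With 620 000 cycles, a burn-in of 20 000 and a thinning interval of 20, the post-burn-in chain has 600 000 cycles, of which 30 000 are retained.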
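From the mixed inheritance model, marginal posterior densities of the following parameters were estimated directly in each Gibbs cycle: the variance components ($\sigma_u^2$, $\sigma_{pe}^2$ and $\sigma_e^2$), the additive effect at the major gene a, the allele frequency p, and the Mendelian transmission probabilities. Using the variance components for polygenes and the major gene, following Janss (2008), the heritabilities and repeatabilities were computed as:

For heritability and repeatability:
$$h^2 = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_{pe}^2 + \sigma_e^2}, \qquad r = \frac{\sigma_u^2 + \sigma_{pe}^2}{\sigma_u^2 + \sigma_{pe}^2 + \sigma_e^2}$$
For total heritability and repeatability:
$$h_T^2 = \frac{\sigma_u^2 + \sigma_G^2}{\sigma_u^2 + \sigma_G^2 + \sigma_{pe}^2 + \sigma_e^2}, \qquad r_T = \frac{\sigma_u^2 + \sigma_G^2 + \sigma_{pe}^2}{\sigma_u^2 + \sigma_G^2 + \sigma_{pe}^2 + \sigma_e^2}$$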
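As a sketch of how such ratios are formed from variance components, the code below assumes the conventional definitions: heritability as the genetic share of the phenotypic variance, and repeatability as the genetic plus permanent environmental share. These definitions are an assumption here, not a confirmed reproduction of the formulas in Janss (2008). The inputs are the posterior means reported for fat content under the mixed inheritance model; because a ratio of posterior means is not the posterior mean of the ratio, the outputs are illustrative and need not match the reported estimates.

```python
def polygenic_ratios(var_u, var_pe, var_e):
    """Assumed definitions: h2 = u / (u + pe + e), r = (u + pe) / (u + pe + e)."""
    total = var_u + var_pe + var_e
    return var_u / total, (var_u + var_pe) / total


def total_ratios(var_u, var_pe, var_g, var_e):
    """Same ratios with the major gene variance added to the genetic part."""
    total = var_u + var_g + var_pe + var_e
    return (var_u + var_g) / total, (var_u + var_g + var_pe) / total


# Posterior means for fat content taken from the text (mixed inheritance model).
var_u, var_pe, var_g, var_e = 1.63, 0.29, 0.77, 9.30

h2, rep = polygenic_ratios(var_u, var_pe, var_e)
h2_total, rep_total = total_ratios(var_u, var_pe, var_g, var_e)
print(round(h2, 3), round(rep, 3), round(h2_total, 3), round(rep_total, 3))
```

In a Gibbs sampling analysis these ratios would normally be evaluated within each retained cycle, so that their posterior means and standard deviations come from the sampled ratios themselves.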
Polygenic Model

The objective of fitting a polygenic model to the same data was to obtain genetic parameter estimates for fat and protein contents in Tunisian Holstein dairy cows and to compare them with those obtained under the mixed inheritance model, in order to check the mode of inheritance (genetic determinism) of both traits.
Based on the same statistical model as in the mixed inheritance analysis described above, but without the major gene term, the variance components of fat and protein contents under the polygenic model were estimated by Bayesian analysis, also using the ‘iBay’ software package version 1.46 (Janss, 2008).


Marginal posterior means and standard deviations of the parameter estimates for fat and protein contents, obtained from Bayesian segregation analyses implemented by Gibbs sampling, are shown in Tables 2 and 3. These estimates are based on 600 000 Gibbs samples. Posterior marginal distributions of all variance components of both traits are presented in Figures 1 and 2.

* Transmission probabilities are presented as the probabilities of inheriting a B allele from BB, BA, and AA genotypes (Elston and Stewart, 1971).

According to Box and Tiao (1973), highest posterior density (HPD) regions, based on a non-parametric density estimate using the averaged shifted histogram technique (Scott, 1992), were obtained for all model parameters and for both traits. These regions were constructed as the smallest possible regions containing the stated posterior probability for each sampled parameter. The highest posterior density regions at 95% (HPDs95%) of the additive gene effect (a) and of the variance at the major locus (σG²) of fat and protein contents included zero (Table 2). The estimated polygenic variances were 1.63 and 0.60 for fat and protein contents, respectively; these estimates were significantly higher than the major gene variances of 0.77 and 0.33 for fat and protein contents, respectively. Janss et al. (1997) and Miyake et al. (1999) also suggested using the magnitude of the major gene variance as an indicator of the existence of a segregating major gene. However, following Elston (1980), evidence of a significant segregating major gene in a quantitative trait requires three conditions: statistical significance of the major gene component in the model, statistical differences among the transmission probabilities, and transmission probabilities that are significantly different from those of an environmental model.
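The idea of an HPD region and the "does it include zero" check can be sketched as follows. Note that this sketch swaps in a simpler technique than the one used in the study: it takes the shortest interval covering 95% of the sorted samples, rather than a density estimate based on averaged shifted histograms. The draws are synthetic and stand in for Gibbs samples of the additive effect a; the function name is hypothetical.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples.

    Empirical shortcut that is reasonable for unimodal posteriors; it is
    not the averaged shifted histogram estimate used in the study.
    """
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of points the interval must cover
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


random.seed(1)
# Synthetic stand-in for posterior draws of the additive effect a.
draws = [random.gauss(0.3, 0.5) for _ in range(5000)]

lower, upper = hpd_interval(draws, mass=0.95)
includes_zero = lower <= 0.0 <= upper
print(lower, upper, includes_zero)
```

For a variance component with a prior restricted to (0, +∞), sampled values are strictly positive, so in practice "includes zero" means that the lower HPD bound sits at the boundary of the parameter space.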

To check the statistical significance of the major gene component in the model, Janss (1998) proposed examining the 95% highest posterior density region (HPD95%) of the postulated major gene variance: if the HPD95% does not include zero, the postulated major gene is statistically significant; if it includes zero, it is not. Mendelian transmission (probabilities 1, 1/2, and 0) was tested for both traits by checking whether the HPDs95% of the transmission probabilities overlapped. Mendelian transmission probabilities for the three genotypes were estimated (Table 3) as suggested by Elston and Stewart (1971). These probabilities are parameterised to indicate the Mendelian transmission of the favourable allele, with probabilities of B allele transmission of 1, 1/2, and 0 for genotypes BB, BA, and AA, respectively.

Table 3 shows that the three estimated posterior means of the Mendelian transmission probabilities were not significantly different, and that their HPDs95% for the three genotypes overlapped. Furthermore, the marginal posterior densities of the major gene variances shown in Figures 1 and 2 were unimodal with mode = 0, which suggests the absence of a major gene for both analysed traits (Janss et al., 1995; Pan et al., 2001).

From these results, obtained by segregation analysis via Gibbs sampling based only on phenotypic data, we can conclude that the postulated major gene was not significant and that the inheritance of both fat and protein contents in Holstein dairy cows is polygenic.

Tables 2 and 4 show that the estimates of genetic parameters are consistent across the mixed inheritance and polygenic models. This finding confirms that the postulated major gene has no significant effect on fat and protein contents in Holstein dairy cows.

The variance components of fat and protein contents under the polygenic model were estimated using a Bayesian approach. These estimates were based on 30 000 Gibbs samples and are shown in Table 4.
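To make the role of the transmission probabilities concrete, the sketch below computes offspring genotype probabilities from the parental genotypes under the Elston and Stewart parameterisation. The function name and the non-Mendelian value 0.4 are hypothetical and used only for illustration.

```python
MENDELIAN_TAU = {"BB": 1.0, "AB": 0.5, "AA": 0.0}


def offspring_genotype_probs(sire, dam, tau=MENDELIAN_TAU):
    """Offspring genotype probabilities given parental genotypes.

    tau[g] is the probability that a parent of genotype g transmits allele B.
    """
    ts, td = tau[sire], tau[dam]
    return {
        "AA": (1.0 - ts) * (1.0 - td),
        "AB": ts * (1.0 - td) + (1.0 - ts) * td,
        "BB": ts * td,
    }


# Mendelian transmission: heterozygous parents give the classic 1:2:1 ratio.
mendel = offspring_genotype_probs("AB", "AB")

# Environmental model (assumed here as equal tau for all genotypes): the
# offspring distribution no longer depends on the parental genotypes.
flat_tau = {"BB": 0.4, "AB": 0.4, "AA": 0.4}
env_1 = offspring_genotype_probs("AA", "AA", flat_tau)
env_2 = offspring_genotype_probs("BB", "AB", flat_tau)
print(mendel, env_1 == env_2)
```

When the three estimated transmission probabilities cannot be distinguished from one another, as reported in Table 3, the data do not separate Mendelian segregation from this kind of environmental model.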

Heritability estimates for fat and protein contents under the polygenic model were 0.18 ± 0.02 and 0.20 ± 0.01, respectively, in agreement with the estimates obtained by Abdullahpour et al. (2010, 2013). However, these estimates were lower than others reported in the literature (Rzewuska and Strabel, 2013; Missanjo et al., 2013; Sneddon et al., 2015). The repeatability estimates were 0.24 ± 0.01 and 0.22 ± 0.01 for fat and protein contents, respectively. These estimates were lower than those reported by Boujenane (2002), Missanjo et al. (2013) and Sneddon et al. (2015). The low estimates of genetic parameters might be explained by the limited production levels in the Tunisian dairy cattle population and by incomplete and/or inaccurate pedigree information on the imported semen of some sires (Rekik et al., 2003).


The mode of inheritance and the genetic parameters of fat and protein contents in Tunisian Holstein dairy cows were investigated and estimated under mixed inheritance and polygenic single-trait models via Gibbs sampling. The findings showed no evidence of a major gene; the genetic determinism of fat and protein contents is purely polygenic.

Estimates of genetic parameters of both traits were generally lower than those commonly reported in the literature, which may be due to the limited production level of the Tunisian dairy cattle population and to the incomplete pedigree information in the analysed data.


The authors thank Dr. L.L.G. Janss for supplying the iBay software used in the analysis. The authors also thank the Tunisian Livestock and Pasture Office for providing the data.


1. Abdullahpour, R., Shahrbabak, M.M., Nejati-Javaremi, A., and Torshizi, R.V., 2010. Genetic analysis of daily milk, fat percentage and protein percentage of Iranian first lactation Holstein cattle. World Appl. Sci. J. 10 (9), 1042-1046.

2. Abdullahpour, R., Shahrbabak, M.M., Nejati-Javaremi, A., Torshizi, R.V., and Mrode, R., 2013. Genetic analysis of milk yield, fat and protein contents in Holstein dairy cows in Iran: Legendre polynomial random regression model applied. Arch. Tierz. 56 (48), 407-508.

3. Ben Zaabza, H., Ben Gara, A., Hammami, H., Ferchichi, M.A., and Rekik, B., 2016. Estimation of variance components of milk, fat and protein yields of Tunisian Holstein dairy cattle using Bayesian and REML methods. Arch. Anim. Breed. 59, 243-248.

4. Ben Gara, A., Rekik, B., Bouallègue, M., 2006. Genetic parameters and evaluation of Tunisian dairy cattle population for milk yield by Bayesian and BLUP analyses. Livest. Prod. Sci. 100, 142-149.

5. Ben Gara, A., Jemmali, B., Hammami, H., Rouissi, H., Bouallegue, M., and Rekik, B., 2012. Milk production of Holsteins under Mediterranean conditions: case of the Tunisian population, in: Milk Production. Nova Science Publishers, NY, USA.

6. Boujenane, I., 2002. Estimates of genetic and phenotypic parameters for milk production in Moroccan Holstein-Friesian cows. Revue Elev. Med. Vet. Pays Trop. 55 (1), 63-67.

7. Box, G.E.P., Tiao, G., 1973. Bayesian Inference in Statistical Analysis. Addison-Wesley, Reading.

8. Elston, R.C., 1980. Segregation analysis, in: Mielke, J.H., Crawford, M.H. (Eds.), Current Developments in Anthropological Genetics, vol. 1. Plenum Publishing Corporation, pp. 327-354.

9. Elston, R.C., Stewart, J.M., 1971. A general model for the genetic analysis of pedigree data. Human Heredity. 21, 523-542.

10. Guo, S.W., Thompson, E.A., 1992. Monte Carlo method for combined segregation and linkage analysis. Am. J. Hum. Genet. 51, 1111-1126.

11. Hagger, C., Janss, L.L.G., Kadarmideen, H.N., Stranzinger, G., 2004. Bayesian inference on major loci in related multigeneration selection lines of laying hens. Poultry Sci. 83, 1932–1939.

12. Hammami, H., Croquet, C., Stoll, J., Rekik, B., and Gengler, N., 2007. Genetic diversity and joint-pedigree analysis of two importing Holstein populations. J. Dairy Sci. 90, 3530-3541.

13. Hammami, H., Rekik, B., Soyeurt, H., Bastin, C., Stoll, J., and Gengler, N., 2008a. Genotype x environment interaction for milk yield in Holsteins using Luxembourg and Tunisian populations. J. Dairy Sci. 91, 3661–3671.

14. Hammami, H., Rekik, B., Soyeurt, H., Ben Gara, A., and Gengler, N., 2008b. Genetic parameters for Tunisian Holsteins using a test-day random regression model. J. Dairy Sci. 91, 2118–2126.

15. Hammami, H., Rekik, B., Bastin, C., Soyeurt, H., Bormann, J., Stoll, J., and Gengler, N., 2009a. Environmental sensitivity for milk yield in Luxembourg and Tunisian Holsteins by herd management level. J. Dairy Sci. 92, 4604–4612.

16. Ilahi, H., 1999. Variabilité génétique de débit de traite chez les caprins laitiers. Thèse de doctorat. ENSA Rennes, France.

17. Ilahi, H., Manfredi, E., Chastin, P., Monod, F., Elsen, J.M., Le Roy, P., 2000. Genetic variability in milking speed of dairy goats. Genet. Res. 75, 315-319.

18. Ilahi, H., Kadarmideen, H.N., 2004. Bayesian segregation analysis of milk flow in Swiss dairy cattle using Gibbs sampling. Genet. Sel. Evol. 36, 563–576.

19. Ilahi, H., Othmane, M.H., 2011a. Complex segregation analysis of total milk yield in Churra dairy ewes. Asian-Aust. J. Anim. Sci. 24, 330-335.

20. Ilahi, H., Othmane, M.H., 2011b. Bayesian segregation analysis of test-day milk yield in Tunisian Sicilo-Sarde dairy sheep. J. Animal and Feed Science. 20, 161-170.

21. Ilahi, H., Ben Hammouda, M., Othmane, M.H., 2012. Bayesian genetic analysis of milk yield in Tunisian Holstein dairy cattle population. Open Journal of Genetics. 2, 103-105.

22. Janss, L.L.G., Thompson, R., Van Arendonk, J.A.M., 1995. Application of Gibbs sampling for inference in a mixed major gene-polygenic inheritance model in animal populations. Theor. Appl. Genet. 91, 1137-1147.

23. Janss, L.L.G., Van Arendonk, J.A.M., Brascamp, E.W., 1997. Bayesian statistical analyses for presence of single major genes affecting meat quality traits in crossed pig population. Genetics. 145, 395-408.

24. Janss, L.L.G., 1998. “MAGGIC” a package of subroutines for genetic analyses with Gibbs sampling, in: Proc. 6th World Congr. Genet. Appl. Livest. Prod., 11–16 January 1998, Vol. 27, University of New England, Armidale, Australia, pp. 459–460.

25. Janss, L.L.G., 2008. “iBay manual version 1.46”. Janss Bioinformatics, Leiden, Netherlands.

26. Missanjo, E., Imbayarwo–Chikosi, V., and Halimani, T., 2013. Estimation of genetic and phenotypic parameters for production traits and somatic cell count for Jersey dairy cattle in Zimbabwe, Veterinary Sciences, 1–5.

27. Misztal, I., Lawlor, T.J., Short, T.H., VanRaden, P.M., 1992. Multiple trait estimation of variance components of milk yield and type traits using an animal model. J. Dairy Sci. 75, 544-551.

28. Miyake, T., Gaillard, C., Moriya, K., Sasaki, Y., 1999. Accuracy of detection of major genes segregating in outbred population by Gibbs sampling using phenotypic values of quantitative traits. J. Anim. Breed. Genet. 116, 281–288.

29. Pan, Y., Boettcher, J., Gibson, J., 2001. Bayesian segregation analyses of somatic cell scores of Ontario Holstein cattle. J. Dairy Sci. 84, 2796–2802.

30. Rzewuska, K., and Strabel, T., 2013. Genetic parameters for milk urea concentrations and milk traits in Polish Holstein–Friesian cows. J. Appl. Genetics, 54, 473–482.

31. Rekik, B., Ben Gara, A., Ben Hammouda, M., and Hammami, H., 2003. Fitting lactation curves of dairy cattle in different types of herds in Tunisia. Livest. Prod. Sci. 83, 309-315.

32. Scott, D.W., 1992. Multivariate Density Estimation, Wiley and Sons, New York.

33. Sneddon, N.W., Lopez–Villalobos, N., Davis, S.R., Hickson, R.E., and Shalloo, L., 2015. Genetic components including lactose from test day records in New Zealand dairy herd. New Zealand Journal of Agricultural Research. 58 (2), 97-107.

34. Sorensen, D., Andersen, S., Jensen, J., Wang, C.S., Gianola, D., 1994. Inferences about genetic parameters using the Gibbs sampler, in: Proceedings of the 5th World Congress on Genetics Applied to Livestock Production, Guelph, Canada, Vol. 18, pp. 321-328.