kurpav00
kurpav00

Reputation: 188

Plotting mixture of univariate normal distributions using the R package EMCluster

I would like to visualize a mixture of univariate normal distributions fitted by the EMCluster R package. I would like to get a set of Gaussian curves plotted for each group separately onto a histogram of my data like this: Desired output plot produced by commercial software Statgraphics

However, there does not seem to be any plotting function in EMCluster doing that, and the EMCluster help files do not cover this topic. I am aware of other R packages having this functionality (e.g., mixtools or mclust), but these do not do the fitting job right unlike EMCluster, which as the only one fits the distribution correctly.

This is is the code used to fit the model:

# input data
ratio <- c(2.3082202,2.2682008,2.3430013,2.3131582,2.3442648,2.2866254,2.4438874,2.4360583,2.4907970,2.4785809,2.4368449,2.4589041,2.2404580,2.2568378,2.2305135,2.2316156,2.1975250,2.2003426,2.3995671,2.2367229,2.2380535,2.2695250,2.2795190,2.2804133,2.2873157,2.3025447,2.2981834,2.2774566,2.1899404,2.2675393,2.2863328,2.2702749,2.2173223,2.1615549,2.3052489,1.4739972,1.4703164,1.8989637,1.6309663,1.4742799,1.6040551,1.5125876,2.3090968,2.4032732,2.2939723,2.4932422,2.4572377,2.2887470,2.2541456,2.3912709,2.3709839,2.2961881,2.3252021,2.4311603,2.4526981,2.2712559,2.4556190,2.4480402,2.2345277,2.2832188,2.3259353,2.3486292,2.3477749,2.3160682,2.3025502,2.3645101,2.2972784,2.3914385,2.4182051,2.3144094,2.3182206,2.3032635,2.3304741,2.4386540,2.4750668,1.5558920,1.5520053,1.5679544,1.5674089,1.5318896,1.5378909,1.5780276,1.5884973,1.5535807,2.4912484,1.5725149,1.6011670,1.5669198,1.5581934,1.5688439,1.5578162,1.5593121,1.5563160,1.5468341,1.5433628,1.5141012,1.5842708,1.5632946,1.5233117,1.5867471,1.5010637,1.5865281,1.5836973,1.6140125,1.6285195,1.5592994,1.5584742,1.6113194,1.6074361,1.5850861,1.5612799,1.5905453,2.3874244,2.4045643,2.3996815,2.3605345,2.3346210,2.3865179,2.4406780,2.3688209,2.3649233,2.3668617,2.4675781,2.4767129,2.4664701,2.3560843,2.5487013,2.4604951,2.5133258,2.4387729,1.4955564,1.5082731,1.5440476,1.5911176,2.4114691,2.4188795,2.4320730,2.4896641,2.4439351,2.4726592,2.4819837,2.4053318,2.4694447,2.3929463,2.3948703,2.3993741,2.4535933,2.4556870,2.3877090,2.4466891,2.4426443,2.3879938,2.3402072,2.3448416,2.4782167,2.5233350,1.5917363,1.6355997,1.5816622,1.6164543,1.5466306,1.5081628,1.4992875,1.6306420,1.6762845,1.6866838,2.4210023,2.4406259,2.3720587,2.4113856,2.3190864,2.4250728,2.4457677,2.4429783,2.4122941,2.4424428,2.4527037,2.4966437,2.4035152,2.4137089,2.3954934,2.4212645,2.4087689,2.4105695,2.5525013,1.5290941,1.5728092,1.5543364,1.5641066,1.6006943,1.5889370,1.5321614,1.5586370,1.5415335,1.5491241,1.5541842,1.5852345,1.5902462,1.5801461,1.5930383,1.5523744,1.5575027,1.5546879,1.5726514,1.5938686,1.6222843,1.5881920,1.5889394,1.6246987,1.5791531,1.6059873,1.5860296,1.5701796,1.5763713,1.6315066,1.6206847,1.5685662,1.5892450,1.6186198,1.6242586,1.5481500,2.5555498,2.5258320,1.6016010,1.6078024,1.7627993,2.5282046,2.5283876,1.6267890,1.6268478,1.6069182,1.5574029,1.5672845,1.6389830,1.5810995,1.5809019,1.5621111,1.6275267,1.5630592,1.5361265,1.5750796,1.5447555,1.5755329,1.5092036,1.5607702,1.5812295,1.6181003,1.6067561,1.6846355,1.6251237,1.7229672,1.6319477,2.0122061,1.6008821,1.4929003,1.5977780,1.6336924,1.7014357,1.7084902,1.5106016,1.6115451,1.5218875,1.5394796,1.5841880,1.6021178,1.6286143,1.6322306,1.6479547,1.6051730,1.6903781,1.5605062,1.6159015,1.5648148,1.6410461,1.5526851,1.5951099,1.6702494,1.6863064,1.5488679,1.5722265,1.5789145,2.3835058)
ratio <- c(ratio,2.4312322,2.4034768,2.4384596,2.4436314,2.4303576,2.4266296,2.3727400,2.3769620,1.4916100,2.0513010,2.0537072,1.6218816,1.6315519,1.6242272,1.5328026,2.4791800,2.4943584,1.5233619,1.5520066,1.5211411,1.5946195,1.5954327,1.5099963,1.6025225,1.4824897,1.5543129,1.6041963,1.9676651,1.8662227,1.9669943,1.5584472,1.5664382,1.5551137,1.5563330,1.5637281,1.6058321,2.3605461,1.4467948,2.4481076,2.2974519,2.4029441,2.4499736,2.4637541,2.4147364,2.4483680,2.4730025,2.4344831,2.4246536,2.3520329,2.3551388,2.4119589,2.4621961,2.3978363,2.4289587,2.4244460,2.3955454,2.4106017,2.4766059,2.5505131,2.5904964,2.4405065,2.5502677,2.5418027,2.5946520,2.3462233,2.3910095,2.3808582,2.4900000,2.4929675,2.5957481,2.6473263,2.6821399,2.5863157,2.5706163,2.5873251,2.3470114,2.3549401,2.4447632,2.3895783,2.4598023,1.9194875,1.9098258,1.5563858,1.9655190,1.9843544,1.5075750,1.9197088,1.8606831,1.5003864,1.5420225,1.5832182,1.4812740,1.5309472,1.5425323,1.6757806,1.6147225,1.5449271,1.6736108,1.6062912,1.5309035,1.5375653,1.5230470,1.4537051,1.5208274,1.6269986,1.6722362,1.5627355,1.5736269,1.5597532,1.6075006,1.5912113,1.4857339,1.4865164,1.5915908,1.5626763,1.6001816,1.5838017,1.6547792,1.6521739,1.5541260,1.5589220,1.6041114,1.6052877,1.6188839,1.5851005,1.5210468,1.5914289,1.5612923,1.5543678,1.6405511,1.5491183,1.5760101,1.5988843,1.5453074,1.5629486,2.3506363,2.3900338,2.3692994,2.3774440,2.3778159,1.4801504,1.5272571,1.5429716,1.5863007,1.5595683,1.5373692,1.5547804,1.6523496,1.6553747,2.4441022,2.3405286,2.4331678,2.4992773,2.4313793,2.3579883,2.4041657,2.3764722,2.3899039,2.4933925,2.4881635,2.3814493,2.4341179,2.5575816,2.4556496,2.5024020,2.4669698,2.4933790,2.5047439,2.4975195,2.4015712,2.4469022,2.4567287,2.5280128,2.4781728,2.5028472,2.5189290,2.5043568,2.4434804,2.4769850,2.4082988,2.4613414,2.3871942,2.4679938,2.4711610,2.3716282,2.4665808,2.4719631,2.4671665,2.3450332,2.3711461,2.3567956,2.4531954,2.4627118,2.4688380,2.4264698,2.4634535,2.4754286,2.4152280,2.4200143,2.4366610,2.4748473,2.3998817,2.4435630,2.4505969,2.4825692,2.4722832,2.3908582,2.5265103,2.4205017,2.4975109,1.8273926,1.8191600,1.7649152,1.8441525,1.8362518,1.7798777,1.7749614,1.8514624,1.8231416,1.7980675,1.8139444,1.7813007,1.8112133,1.8506497,1.7888465,1.7981612,1.7877846,1.8448715,1.8428863,1.7778798,1.8004113,1.8293173,1.8581288,0.7568285,0.7681167,0.7569872,0.7478473,0.7586729,0.7574590,0.7707119,0.7654296,0.7560802,0.7223322,0.7634792,0.7577954,0.7630233,0.7439842,0.7501500,0.7666750,0.7503895,0.7622286,0.7635520,0.7534987,0.7599475,0.7748956,0.7590139,0.7568013,0.7586510,0.7564873,0.7576587,0.7756744,0.7637029,0.7579558,0.7735586,0.7692345,0.7611547,0.7551704,0.7715118,0.7736083,0.7635049,0.7559560,0.7667778,0.7618454,0.7648308,0.7693512,0.7653699,0.7589299,2.3590999,2.3619440,2.3519471,2.3453709,2.3486040,2.4295357,2.4219361,2.4488210,2.4213269,2.3923218,1.5402144,0.7652187,0.7650076,0.7486516,0.7643569,0.7677992,0.7586368,0.7584026,0.7662775,0.7740569,0.7713377,0.7740677,0.7686109,0.7490695,0.7681402,0.7639513,0.7896797,0.7660664,0.7720809,2.4936417,2.3834268,0.7626155,0.7566950,0.7824697,0.7735270,2.4219853,2.5196019,0.7636966,0.7570312,2.4539218,2.4338622,2.4396830,0.7733773,0.7632390,0.7697633,0.7677199,0.7563686,0.7569366,0.7785563,2.3883388,2.4508850,2.5454545,2.4078138,2.4295884,2.5461211,2.3826057,2.4070722,2.4136287,2.4216196,2.3854115,2.4173648,2.4553045,2.4356992,2.3863270,2.4346182,2.3953523,2.4102911,2.4179386,2.4251441,2.4217263,2.4586260,2.4059818,2.4069762,2.4099267,2.4387039,2.4458400,2.4573490,2.4759171,2.4772782,2.4684424,2.4434910,2.4804417,2.4192554,2.4454277,2.3955812)
ratio <- c(ratio,2.4749573,2.4724740,2.4209932,2.4249042,0.7610100,2.3671184,2.3697960,2.4023920,2.4053188,2.4351452,2.4386339,2.4094596,2.3468609,2.3155974,2.3104149,2.3627904,2.2658038,2.3885135,2.3038569,2.3350929,2.3612206,2.2781386,2.2077955,2.2325030,2.2679324,2.2823724,2.4542063,2.1998876,2.3031394,2.2881462,2.3198411,2.1884115,2.2710756,2.1827032,2.3180018,2.2993929,2.2874717,2.2829849,2.3455578,2.2977960,2.3410225,2.4081727,2.3638018,2.2795117,2.3976837,2.3932511,2.4075917,2.4015616,2.3418264,2.3349013,2.4240838,2.4173924,2.3960085,2.4031925,2.3945173,2.3856273,2.4690719,2.4686576,2.6263546,2.4718199,2.3463103,2.3652738,2.4379702,2.4796148,2.4287476,2.2620077,2.3805959,2.3458551,2.3461876,2.3171412,2.3897433,2.3912464,2.3614724,2.5442086,1.7773526,1.7178727,2.3689272,2.3640495,2.2916104,2.3356112,2.3964407,2.4015147,1.7590361,2.3414859,2.2694388,2.3567392,2.2708003,2.2956377,2.3101355,2.3125120,2.3004763,2.2705803,2.2875544,2.2712565,2.3089819,2.2945688,2.3205069,2.2732516,1.7467849,1.7259182,1.7686619,1.7095978,2.2485174,2.4754472,2.4495703,2.4074395,2.4539152,2.3435356,2.3811995,2.4362843,2.4403479,2.5180754,0.7635700,0.7702974,0.7722231,0.7511177,0.7551106,0.7601832,0.7361963,0.7794874,0.7934396,0.7709755,0.7844490,0.7596845,0.7930613,0.7504008,0.7627609,0.7532478,0.7581377,0.7705301,0.7721196,0.7567074,0.7828257,0.7780374,0.7669904,0.7812995,0.7834886,0.7889141,0.7859603,0.7682853,0.7478378,0.7673373,0.7629394,0.7715725,0.7791851,0.7666474,0.7714516,0.7496962,0.7533639,0.7579321,0.7737278,0.7853168,0.7645964,0.7725103,0.7677113,1.6156064,1.5390375,1.5211357,1.5385035,1.4920242,1.6010156,1.5269279,1.5407954,1.4766077,1.5165483,2.4101888,2.4461818,2.4832620,2.4234102,2.3891139,2.4782895,2.4514298,2.4655178,1.5269570,1.5543206,1.5397302,1.5121552,1.5089835,1.5259734,1.5593889,1.5634023,1.5243902,1.5291230,1.5649990,2.3870746,2.4165985,2.4108952,2.5178706,2.4626444,2.4056250,2.4241015,2.2767663,2.1800944,2.2449665,2.2155462,2.2102232,2.1888782,2.1780130,2.1797660,2.2164432,2.1815838,2.2745605,2.2695021,2.1916002,2.1620086,2.1604276,2.1606576,2.1582508,2.1380764,2.1869745,2.1967953,2.1466257,2.1625341,2.2056076,2.1491951,2.2865496,2.2079291,2.1408957,2.1638916,2.5895003,2.1258483,2.1444175,2.2055520,2.1909330,2.2072956,2.2051386,2.4054279,2.4282109,2.4143596,2.4389768,2.4716971,2.4614070,2.4076957,2.4289880,2.4710598,2.4395144,2.4251923,2.4528039,2.4597420,2.4047160,2.4214961,2.4052775,2.4002135,2.3944374,2.4051049,2.4450801,1.5737999,1.5879028,1.4740943,1.5251975,1.5084994,1.5176669,1.5827066,1.8177106,2.4441547,2.4143742,2.4699203,2.3762657,2.3688625,2.4603598,2.4191691,2.4546725,2.4275861,2.4220888,2.4021738,2.4979091,2.4092485,2.4321795,2.4349427,2.4462136,2.4662561,2.3542403,2.4312644,2.4911342,2.4228929,2.4431390,2.4214492,2.4497284,2.4559866,2.4404771,2.4329174,2.4118930,2.4634206,2.4609880)
ratio <- c(ratio,2.4827362,2.3897923,2.4143036,2.4375662,2.4641043,2.4222062,2.4024445,2.4035411,2.4564304,2.4355113,2.3968869,2.4147049,2.4618251,2.4281002,2.3793209,2.4370193,2.3959920,2.3882394,2.4338975,2.4221606,2.4012736,2.4163148,2.4014352,2.3909576,2.4497283,2.3379209,2.2898380,2.3337816,2.2496425,2.3140063,2.2480368,2.3382981,2.4370101,2.3976064,1.5515221,1.6095634,1.5416116,1.5449938,1.5357950,1.6159637,2.5099319,2.4895688,2.5469330,2.4921635,2.4849558,2.5936576,2.4149410,2.3313097,2.3891210,2.3958461,2.4481496,2.3889537,2.3672844,2.3638062,2.3213211,2.2783695,2.3841916,2.3915916,2.4455549,2.3920387,2.3327502,2.3814912,2.3998684,2.5413619,2.4137805,2.4694112,2.2786062,2.3302104,2.3343673,2.3937296,2.3613661,2.4505636,2.4254633,2.5044136,2.4344560,2.3947676,2.3527630,2.3999480,2.3888254,2.4736389,2.4171134,2.4438647,2.2072640,2.4066542,2.4867753,2.4228284,2.4383648,2.4733934,2.3856568,2.3750662,2.2850762,2.3953039,2.4272293,2.4479172,2.4583161,2.4372287,2.4391820,2.4068650,2.3923539,2.4295108,2.4074787,2.4039005,2.4916317,2.4349603,2.4786877,2.4942217,2.2279876,2.4491045,2.2402399,2.3116072,2.3713333,2.3000641,2.2848940,2.4239510,2.4702209,2.4698362,2.4101398,2.3967287,2.4586776,2.4489123,2.4533943,2.4532398,2.4235131,2.4177043,2.4007880,2.4442298,2.4343659,2.4388599,2.3495992,2.4192377,2.3928401)
library(EMCluster)
ret <- init.EM(as.data.frame(ratio),nclass=6,method="Rnd.EM") # fitting model
# now how to plot the distribution curves?

Thank you in advance for your advice.

EDIT: The structure of the resulting object is this:

> str(ret,vec.len = 10)
List of 13
 $ pi       : num [1:6] 0.0537 0.234 0.299 0.1124 0.2225 0.0784
 $ Mu       : num [1:6, 1] 1.794 1.568 2.429 0.765 2.316 2.474
 $ LTSigma  : num [1:6, 1] 0.011002 0.001946 0.001355 0.000124 0.008215 0.006486
 $ llhdval  : num 435
 $ nc       : int [1:6] 52 252 406 119 195 35
 $ class    : num [1:1059] 5 5 5 5 5 5 3 3 3 3 3 3 5 5 5 5 5 5 3 5 5 5 5 5 5 ...
 $ conv.iter: int 93
 $ conv.eps : num 9.81e-07
 $ flag     : num 0
 $ n        : int 1059
 $ p        : int 1
 $ nclass   : num 6
 $ method   : chr "Rnd.EM"
 - attr(*, "class")= chr "emret"

Upvotes: 2

Views: 399

Answers (1)

gung - Reinstate Monica
gung - Reinstate Monica

Reputation: 11903

This is pretty straightforward. There's a little bit of calculation involved in getting the heights of the curves to line up reasonably, but otherwise it's pretty basic. First plot a histogram. If you want a box around it, like your example, do that. Then you'll need to call lines() 6 times to plot the 6 normals. In R, lines are just a sequence of interpolated points—(x,y) coordinates—so make a sufficiently fine-grained set of x coordinates, then compute the normal density for each component using dnorm() and the fitted parameters. You'll need to multiply those y-values by the appropriate proportion and a height adjustment factor to get the heights of the curves right. It turns out that the highest bin in your histogram is 82, which is approximately the peak of your third component, but since that represents only 30%, you need to rescale the adjustment factor by that. You may want to choose your own colors. Consider:

xs  <- seq(min(ratio), max(ratio), length.out=1000)

windows()
  h <- hist(ratio, breaks=seq(0, 3, by=0.02)); box()
  # max(h$counts)  # [1] 82
  height <- 82/dnorm(ret$Mu[3,1], mean=ret$Mu[3,1], sd=sqrt(ret$LTSigma[3,1]))
  height <- height/ret$pi[3]
  for(i in 1:6){
    lines(xs, dnorm(xs, mean=ret$Mu[i,1], sd=sqrt(ret$LTSigma[i,1]))*height*ret$pi[i], 
          lwd=2, col=i)
  }

enter image description here

Upvotes: 3

Related Questions