
Wednesday, December 12, 2018

The Relationship Between Life Expectancy at Birth and GDP per Capita

The relationship between life expectancy at birth and gross domestic product per capita (PPP)

Candidate:
Teacher:
Candidate number:
Date of submission:
Word count: 2907

Section 1: Introduction

In a given country, Life Expectancy at birth is the expected number of years of life from birth. Gross domestic product (GDP) per capita is defined as the market value of all final goods and services produced within a country in one year, divided by the size of the population of that country. The main objective of the present work is to establish the existence of a statistical relation between Life Expectancy at birth (y) and GDP per capita (x).

First, in Section 2 we present the data, taken from an official governmental source, containing Life Expectancy at birth and GDP per capita of 48 countries in the year 2003. We put this data in a table ordered alphabetically, and at the end of the section we perform some basic statistical analysis of these data. These statistics include the mean, median, modal class and standard deviation, for both Life Expectancy and GDP per capita. In Section 3 we find the regression line which best fits our data and the corresponding correlation coefficient r.

It is natural to ask if there is a non-linear model which better describes the statistical relation between GDP per capita and Life Expectancy. This question is studied in Section 4, where we examine whether a logarithmic relation of type $y = A\ln(x + C) + B$ is a better model. In Section 5 we perform a chi-square test to get evidence of the existence of a statistical relation between the variables x and y.
In the last section of the project, other than summarizing the obtained results, we present some possible directions for further investigation.

Section 2: Data collection

The following table shows the GDP per capita (PPP) (in US Dollars), denoted $x_i$, and the mean Life Expectancy at birth (in years), denoted $y_i$, in 48 countries in the year 2003. The data has been collected through an online website [2]. According to this website it represents official world records.

Country | GDP per capita ($x_i$) | Life Expectancy at birth ($y_i$)
1. Argentina | 11200 | 75.48
2. Australia | 29000 | 80.13
3. Austria | 30000 | 78.17
4. Bahamas, The | 16700 | 65.71
5. Bangladesh | 1900 | 61.33
6. Belgium | 29100 | 78.29
7. Brazil | 7600 | 71.13
8. Bulgaria | 7600 | 71.08
9. Burundi | 600 | 43.02
10. Canada | 29800 | 79.83
11. Central African Republic | 1100 | 41.71
12. Chile | 9900 | 76.35
13. China | 5000 | 72.22
14. Colombia | 6300 | 71.14
15. Congo, Republic of the | 700 | 50.02
16. Costa Rica | 9100 | 76.43
17. Croatia | 10600 | 74.37
18. Cuba | 2900 | 76.08
19. Czech Republic | 15700 | 75.18
20. Denmark | 31100 | 77.01
21. Dominican Republic | 6000 | 67.96
22. Ecuador | 3300 | 71.89
23. Egypt | 4000 | 70.41
24. El Salvador | 4800 | 70.62
25. Estonia | 12300 | 70.31
26. Finland | 27400 | 77.92
27. France | 27600 | 79.28
28. Georgia | 2500 | 64.76
29. Germany | 27600 | 78.42
30. Ghana | 2200 | 56.53
31. Greece | 20000 | 78.89
32. Guatemala | 4100 | 65.23
33. Guinea | 2100 | 49.54
34. Haiti | 1600 | 51.61
35. Hong Kong | 28800 | 79.93
36. Hungary | 13900 | 72.17
37. India | 2900 | 63.62
38. Indonesia | 3200 | 68.94
39. Iraq | 1500 | 67.81
40. Israel | 19800 | 79.02
41. Italy | 26700 | 79.04
42. Jamaica | 3900 | 75.85
43. Japan | 28200 | 80.93
44. Jordan | 4300 | 77.88
45. South Africa | 10700 | 46.56
46. Turkey | 6700 | 71.08
47. United Kingdom | 27700 | 78.16
48. United States | 37800 | 77.14

Table 1: GDP per capita and Life Expectancy at birth in 48 countries in 2003 (source: reference [2])

Statistical analysis: First we compute some basic statistics of the data collected in the above table.

Basic statistics for the GDP per capita:

Mean: $\bar{x} = \frac{1}{48}\sum_{i=1}^{48} x_i \approx 12900$

In order to compute the median, we need to order the GDP values: 600, 700, 1100, 1500, 1600, 1900, 2100, 2200, 2500, 2900, 2900, 3200, 3300, 3900, 4000, 4100, 4300, 4800, 5000, 6000, 6300, 6700, 7600, 7600, 9100, 9900, 10600, 10700, 11200, 12300, 13900, 15700, 16700, 19800, 20000, 26700, 27400, 27600, 27600, 27700, 28200, 28800, 29000, 29100, 29800, 30000, 31100, 37800.

The median is obtained as the mean of the two central values (the 24th and the 25th):

Median $= \frac{7600 + 9100}{2} = 8350$

In order to compute the modal class, we need to split the data into classes. If we take classes of USD 1000 (0-999, 1000-1999, …) we get the following table of frequencies:

Class | Frequency
0-999 | 2
1000-1999 | 4
2000-2999 | 5
3000-3999 | 3
4000-4999 | 4
5000-5999 | 1
6000-6999 | 3
7000-7999 | 2
8000-8999 | 0
9000-9999 | 2
10000-10999 | 2
11000-11999 | 1
12000-12999 | 1
13000-13999 | 1
14000-14999 | 0
15000-15999 | 1
16000-16999 | 1
17000-17999 | 0
18000-18999 | 0
19000-19999 | 1
20000-20999 | 1
21000-21999 | 0
22000-22999 | 0
23000-23999 | 0
24000-24999 | 0
25000-25999 | 0
26000-26999 | 1
27000-27999 | 4
28000-28999 | 2
29000-29999 | 3
30000-30999 | 1
31000-31999 | 1
32000-32999 | 0
33000-33999 | 0
34000-34999 | 0
35000-35999 | 0
36000-36999 | 0
37000-38000 | 1

Table 2: Frequencies of GDP per capita with classes of USD 1000

With this choice of classes, the modal class is 2000-2999 (with a frequency of 5). If instead we take classes of USD 5000 (0-4999, 5000-9999, …) the modal class is the first: 0-4999 (with a frequency of 18).
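As a minimal sketch (not part of the original project, which used Excel), the mean, the median and the modal classes of the GDP values in Table 1 can be checked in Python. The helper names `class_frequencies` and `modal_classes` are hypothetical, and classes are assumed to be half-open intervals [k·width, (k+1)·width), which coincides with 0-999, 1000-1999, … for these data.

```python
from collections import Counter
from statistics import mean, median

# GDP per capita (PPP, USD) of the 48 countries, in the order of Table 1.
GDP = [
    11200, 29000, 30000, 16700, 1900, 29100, 7600, 7600, 600, 29800,
    1100, 9900, 5000, 6300, 700, 9100, 10600, 2900, 15700, 31100,
    6000, 3300, 4000, 4800, 12300, 27400, 27600, 2500, 27600, 2200,
    20000, 4100, 2100, 1600, 28800, 13900, 2900, 3200, 1500, 19800,
    26700, 3900, 28200, 4300, 10700, 6700, 27700, 37800,
]


def class_frequencies(values, width):
    """Count the values in each class, keyed by the lower bound of the class."""
    return Counter((int(v) // width) * width for v in values)


def modal_classes(values, width):
    """Return (lower bounds of the most frequent classes, their frequency)."""
    freq = class_frequencies(values, width)
    top = max(freq.values())
    return sorted(lo for lo, n in freq.items() if n == top), top


if __name__ == "__main__":
    print(f"mean   = {mean(GDP):.1f}")
    print(f"median = {median(GDP)}")
    for width in (1000, 5000):
        lows, top = modal_classes(GDP, width)
        print(f"class width {width}: modal class(es) start at {lows}, frequency {top}")
```

Changing `width` shows how strongly the modal class depends on the choice of classes, which is the point made in the text.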
Class | Frequency
0-4999 | 18
5000-9999 | 8
10000-14999 | 5
15000-19999 | 3
20000-24999 | 1
25000-29999 | 10
30000-34999 | 2
35000-40000 | 1

Table 3: Frequencies of GDP per capita with classes of USD 5000

Standard deviation: $S_x = \sqrt{\frac{1}{48}\sum_{i=1}^{48}(x_i - \bar{x})^2} \approx 11100$

Basic statistics for the Life Expectancy:

Mean: $\bar{y} = \frac{1}{48}\sum_{i=1}^{48} y_i = 70.13$

As before, in order to compute the median, we need to order the Life Expectancies: 41.71, 43.02, 46.56, 49.54, 50.02, 51.61, 56.53, 61.33, 63.62, 64.76, 65.23, 65.71, 67.81, 67.96, 68.94, 70.31, 70.41, 70.62, 71.08, 71.08, 71.13, 71.14, 71.89, 72.17, 72.22, 74.37, 75.18, 75.48, 75.85, 76.08, 76.35, 76.43, 77.01, 77.14, 77.88, 77.92, 78.16, 78.17, 78.29, 78.42, 78.89, 79.02, 79.04, 79.28, 79.83, 79.93, 80.13, 80.93.

The median is obtained as the mean of the two central values:

Median $= \frac{72.17 + 72.22}{2} = 72.195$

To find the modal class of Life Expectancy we take classes of one year. The table of frequencies is the following:

Class | Frequency
41 | 1
42 | 0
43 | 1
44 | 0
45 | 0
46 | 1
47 | 0
48 | 0
49 | 1
50 | 1
51 | 1
52 | 0
53 | 0
54 | 0
55 | 0
56 | 1
57 | 0
58 | 0
59 | 0
60 | 0
61 | 1
62 | 0
63 | 1
64 | 1
65 | 2
66 | 0
67 | 2
68 | 1
69 | 0
70 | 3
71 | 5
72 | 2
73 | 0
74 | 1
75 | 3
76 | 3
77 | 4
78 | 5
79 | 5
80 | 2

Table 4: Frequencies of Life Expectancy at birth with classes of 1 year

It appears from the table above that there are three modal classes: 71, 78 and 79 (each with a frequency of 5).

Standard deviation: $S_y = \sqrt{\frac{1}{48}\sum_{i=1}^{48}(y_i - \bar{y})^2} = 10.31$

The standard deviations $S_x$ and $S_y$ have been found using the following table of data:

Country | GDP | Life exp. | $x - \bar{x}$ | $(x - \bar{x})^2$ | $y - \bar{y}$ | $(y - \bar{y})^2$ | $(x - \bar{x})(y - \bar{y})$
Argentina | 11200 | 75.48 | -1665 | 2770838 | 5.35 | 28.64 | -8907.60
Australia | 29000 | 80.13 | 16135 | 260351671 | 10.00 | 100.03 | 161374.34
Austria | 30000 | 78.17 | 17135 | 293622504 | 8.04 | 64.66 | 137790.17
Bahamas, The | 16700 | 65.71 | 3835 | 14710421 | -4.42 | 19.53 | -16947.75
Bangladesh | 1900 | 61.33 | -10965 | 120222088 | -8.80 | 77.42 | 96474.63
Belgium | 29100 | 78.29 | 16235 | 263588754 | 8.16 | 66.61 | 132501.29
Brazil | 7600 | 71.13 | -5265 | 27715838 | 1.00 | 1.00 | -5271.16
Bulgaria | 7600 | 71.08 | -5265 | 27715838 | 0.95 | 0.90 | -5007.93
Burundi | 600 | 43.02 | -12265 | 150420004 | -27.11 | 734.88 | 332477.52
Canada | 29800 | 79.83 | 16935 | 286808338 | 9.70 | 94.11 | 164294.71
Central African Republic | 1100 | 41.71 | -11765 | 138405421 | -28.42 | 807.63 | 334334.75
Chile | 9900 | 76.35 | -2965 | 8788754 | 6.22 | 38.70 | -18443.41
China | 5000 | 72.22 | -7865 | 61851671 | 2.09 | 4.37 | -16446.81
Colombia | 6300 | 71.14 | -6565 | 43093754 | 1.01 | 1.02 | -6638.43
Congo, Republic of the | 700 | 50.02 | -12165 | 147977088 | -20.11 | 404.36 | 244614.57
Costa Rica | 9100 | 76.43 | -3765 | 14172088 | 6.30 | 39.71 | -23721.58
Croatia | 10600 | 74.37 | -2265 | 5128338 | 4.24 | 17.99 | -9604.66
Cuba | 2900 | 76.08 | -9965 | 99292921 | 5.95 | 35.42 | -59301.73
Czech Republic | 15700 | 75.18 | 2835 | 8039588 | 5.05 | 25.52 | 14322.40
Denmark | 31100 | 77.01 | 18235 | 332530421 | 6.88 | 47.35 | 125482.46
Dominican Republic | 6000 | 67.96 | -6865 | 47122504 | -2.17 | 4.70 | 14887.57
Ecuador | 3300 | 71.89 | -9565 | 91481254 | 1.76 | 3.10 | -16845.62
Egypt | 4000 | 70.41 | -8865 | 78580838 | 0.28 | 0.08 | -2493.16
El Salvador | 4800 | 70.62 | -8065 | 65037504 | 0.49 | 0.24 | -3961.73
Estonia | 12300 | 70.31 | -565 | 318754 | 0.18 | 0.03 | -102.33
Finland | 27400 | 77.92 | 14535 | 211278338 | 7.79 | 60.70 | 113249.07
France | 27600 | 79.28 | 14735 | 217132504 | 9.15 | 83.75 | 134847.48
Georgia | 2500 | 64.76 | -10365 | 107424588 | -5.37 | 28.82 | 55644.86
Germany | 27600 | 78.42 | 14735 | 217132504 | 8.29 | 68.74 | 122175.02
Ghana | 2200 | 56.53 | -10665 | 113733338 | -13.60 | 184.93 | 145025.00
Greece | 20000 | 78.89 | 7135 | 50914171 | 8.76 | 76.76 | 62515.17
Guatemala | 4100 | 65.23 | -8765 | 76817921 | -4.90 | 24.00 | 42935.50
Guinea | 2100 | 49.54 | -10765 | 115876254 | -20.59 | 423.90 | 221629.32
Haiti | 1600 | 51.61 | -11265 | 126890838 | -18.52 | 342.94 | 208606.00
Hong Kong | 28800 | 79.93 | 15935 | 253937504 | 9.80 | 96.06 | 156187.00
Hungary | 13900 | 72.17 | 1035 | 1072088 | 2.04 | 4.17 | 2113.54
India | 2900 | 63.62 | -9965 | 99292921 | -6.51 | 42.36 | 64856.98
Indonesia | 3200 | 68.94 | -9665 | 93404171 | -1.19 | 1.41 | 11488.77
Iraq | 1500 | 67.81 | -11365 | 129153754 | -2.32 | 5.38 | 26351.63
Israel | 19800 | 79.02 | 6935 | 48100004 | 8.89 | 79.05 | 61664.52
Italy | 26700 | 79.04 | 13835 | 191418754 | 8.91 | 79.41 | 123290.86
Jamaica | 3900 | 75.85 | -8965 | 80363754 | 5.72 | 32.73 | -51288.62
Japan | 28200 | 80.93 | 15335 | 235175004 | 10.80 | 116.67 | 165641.67
Jordan | 4300 | 77.88 | -8565 | 73352088 | 7.75 | 60.08 | -66386.23
South Africa | 10700 | 46.56 | -2165 | 4685421 | -23.57 | 555.49 | 51016.52
Turkey | 6700 | 71.08 | -6165 | 38002088 | 0.95 | 0.90 | -5864.06
United Kingdom | 27700 | 78.16 | 14835 | 220089588 | 8.03 | 64.50 | 119146.94
United States | 37800 | 77.14 | 24935 | 621775004 | 7.01 | 49.16 | 174828.44

Table 5: Statistical analysis of the data collected in Table 1

From the last column we can compute the covariance of the GDP and Life Expectancy:

$S_{xy} = \frac{1}{48}\sum_{i=1}^{48}(x_i - \bar{x})(y_i - \bar{y}) = 73011.16$

Section 3: Linear regression

We start our investigation by studying the line of best fit of the data in Table 1. This will allow us to see whether there is a relation of linear dependence between GDP and Life Expectancy. The regression line for the variables x and y is given by the following formula:

$y - \bar{y} = \frac{S_{xy}}{S_x^2}(x - \bar{x})$

By using the values found above we get:

$y = 62.51 + 0.5926 \times 10^{-3}\, x$

Pearson's correlation coefficient is:

$r = \frac{S_{xy}}{S_x S_y} = 0.6380$

The following graph shows the data of Table 1 together with the line of best fit computed above.

Figure 1: Linear regression.

The value of the correlation coefficient, r ≈ 0.6, is evidence of a moderate positive linear correlation between the variables x and y.
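The quantities of this section can be reproduced with a short script. What follows is a sketch under the same conventions as the text (population formulas, dividing by n = 48); the function name `linear_fit` is an assumption of this example and not something used in the project. Because it keeps unrounded intermediate values, the slope it returns can differ in the last digit from the one quoted above.

```python
from math import sqrt

# GDP per capita (USD) and life expectancy (years), in the order of Table 1.
GDP = [
    11200, 29000, 30000, 16700, 1900, 29100, 7600, 7600, 600, 29800,
    1100, 9900, 5000, 6300, 700, 9100, 10600, 2900, 15700, 31100,
    6000, 3300, 4000, 4800, 12300, 27400, 27600, 2500, 27600, 2200,
    20000, 4100, 2100, 1600, 28800, 13900, 2900, 3200, 1500, 19800,
    26700, 3900, 28200, 4300, 10700, 6700, 27700, 37800,
]
LIFE = [
    75.48, 80.13, 78.17, 65.71, 61.33, 78.29, 71.13, 71.08, 43.02, 79.83,
    41.71, 76.35, 72.22, 71.14, 50.02, 76.43, 74.37, 76.08, 75.18, 77.01,
    67.96, 71.89, 70.41, 70.62, 70.31, 77.92, 79.28, 64.76, 78.42, 56.53,
    78.89, 65.23, 49.54, 51.61, 79.93, 72.17, 63.62, 68.94, 67.81, 79.02,
    79.04, 75.85, 80.93, 77.88, 46.56, 71.08, 78.16, 77.14,
]


def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope*x and Pearson's r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n            # S_x^2
    var_y = sum((y - mean_y) ** 2 for y in ys) / n            # S_y^2
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n  # S_xy
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    r = cov / sqrt(var_x * var_y)
    return slope, intercept, r


if __name__ == "__main__":
    slope, intercept, r = linear_fit(GDP, LIFE)
    print(f"y = {intercept:.2f} + {slope * 1e3:.4f}e-3 * x,  r = {r:.4f}")
```

The line always passes through the point of means, which is why the intercept is computed from the slope rather than fitted separately.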
On the other hand it is clear from the graph above that the relation between the variables is not exactly linear. In the next section we will try to speculate on the reason for this non-linear relation and on what type of statistical relation can exist between GDP per capita and Life Expectancy.

Section 4: Logarithmic regression

As explained in reference [3], “the main reason for this non-linear relationship [between GDP per capita and Life Expectancy] is because people consume both needs and wants. People consume needs in order to survive. Once a person’s needs are satisfied, they could then spend the rest of their money on non-necessities. If everyone’s needs are satisfied, then any increase in GDP per capita would barely affect Life Expectancy.”

There are various other reasons that one can think of to explain the non-linear relationship between GDP per capita and Life Expectancy. For example, the GDP per capita is the average wealth, while one should also consider how the global wealth is distributed among the population of a given country. With this in mind, to have a more complete picture of the statistical relation between the economy of a country and Life Expectancy, one should also take into consideration other economic parameters, such as the inequality index, that describe the distribution of wealth among the population. Moreover, the wealth of the population is not the only factor affecting Life Expectancy: one should also take into account, for example, the governmental policies of a nation towards health and poverty. For example Cuba, a country with a very low GDP per capita ($ 2900), has a relatively high Life Expectancy (76.08 years), mostly due to the fact that the government provides basic needs and health assistance to the population. Some of these aspects will be discussed in the next section.
Let’s try to guess what could be a reasonable relation between the variables x (GDP per capita) and y (Life Expectancy). According to the above observations we can consider the total GDP as formed by two parts: $x = x_n + x_w$, where $x_n$ denotes the part of wealth spent on necessities, and $x_w$ denotes the part spent on wants. It is reasonable to make the following assumptions:

1. The Life Expectancy depends linearly on the part of wealth spent on necessities: $y = a x_n + b$. (1)
2. The fraction $x_n/x$ of wealth spent on necessities is close to 1 when x is close to 0 (if one has a small amount of money he/she will spend most of it on necessities), and is close to 0 when x is very large (if one has a very large amount of money he/she will spend only a small fraction of it on necessities).
3. We make the following choice for the function $x_n = f(x)$ satisfying the above requirements: $x_n = \frac{\ln(cx + 1)}{c}$, (2) where c is some positive parameter.

This function is chosen mainly for two reasons. On one hand it satisfies the requirements described in 2, as shown by the following graph of $x_n/x = f(x)/x = \frac{\ln(cx + 1)}{cx}$:

Figure 2: Graph of the function $y = \ln(cx + 1)/(cx)$, for c = 0.5 (blue), 1 (black) and 10 (red).

The blue, black and red lines correspond respectively to the choice of parameter c = 0.5, 1 and 10. As it appears from the graph, in all cases the ratio $f(x)/x$ tends to 1 as x approaches 0 and is small for large values of x. On the other hand the function chosen allows us to use the statistical tools at our disposal in the Excel software to derive some interesting conclusions about the statistical relation between x and y. This is what we are going to do next.

First we want to find the relation between x and y under the above assumptions. Putting together equations (1) and (2) we get:

$y = \frac{a}{c}\ln(cx + 1) + b$, (3)

which shows that there is a logarithmic dependence between x and y. Since $\ln(cx + 1) = \ln c + \ln(x + 1/c)$, equation (3) can be rewritten in the following equivalent form: if we denote $A = a/c$, $B = b + (a/c)\ln(c)$, $C = 1/c$, then

$y = A\ln(x + C) + B$. (4)

We can now study the curve of type (4) which best fits the data in Table 1, using the statistical tools of the Excel spreadsheet. Unfortunately Excel allows us to plot only a curve of type $y = A\ln(x) + B$ (i.e. an equation of type (4) where C is equal to 0). For this choice of C, we get the following logarithmic curve of best fit together with the corresponding value of the correlation coefficient r².

Figure 3: Logarithmic regression.

To find the analogous curve of best fit for a given value of C (positive, arbitrarily chosen) we can simply add C to all the x values and redo the same plot as for C = 0 with the new independent variable $x_1 = x + C$. We omit the graphs containing the curve of best fit for all the possible values of C and we simply report, in the following table, the correlation coefficient r for some appropriately chosen values of C.

C | r
0.00 | 0.77029
0.01 | 0.77029
0.1 | 0.77028
1 | 0.77025
10 | 0.76991
100 | 0.76666

Table 8: Correlation coefficient r² for the curve of best fit $y = A\ln(x + C) + B$, for some values of C.

The above data indicate that the optimal choice of C is between 0.00 and 0.01, since in this case r is the closest to 1. Comparing the results obtained with the linear regression (r ≈ 0.6) and the logarithmic regression (r ≈ 0.8) we can conclude that the latter appears to be a better model to describe the relation between GDP per capita and Life Expectancy, since the value of the correlation coefficient is significantly bigger.

In Figure 3, one data point is very far from the curve of best fit and so we may decide to discuss it separately and do the regression without it. This data point corresponds to South Africa, with a GDP per capita of 10700 and a Life Expectancy at birth of 46.56 (much lower than any other country with a comparable GDP). It is reasonable to think that this anomaly is due to the peculiar history of South Africa which, after the end of apartheid, had to face uncontrolled violence.
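The procedure described in this section (shift x by C, then fit a straight line of y against ln(x + C)) can be sketched in Python, together with the same fit after leaving out one point. This is an illustration and not the Excel workflow used in the project: the names `log_fit` and `SOUTH_AFRICA` are assumptions of this example, and since it is not clear whether the values tabulated above are r or r², the script prints both.

```python
from math import log, sqrt

# (GDP per capita in USD, life expectancy in years), in the order of Table 1.
DATA = [
    (11200, 75.48), (29000, 80.13), (30000, 78.17), (16700, 65.71),
    (1900, 61.33), (29100, 78.29), (7600, 71.13), (7600, 71.08),
    (600, 43.02), (29800, 79.83), (1100, 41.71), (9900, 76.35),
    (5000, 72.22), (6300, 71.14), (700, 50.02), (9100, 76.43),
    (10600, 74.37), (2900, 76.08), (15700, 75.18), (31100, 77.01),
    (6000, 67.96), (3300, 71.89), (4000, 70.41), (4800, 70.62),
    (12300, 70.31), (27400, 77.92), (27600, 79.28), (2500, 64.76),
    (27600, 78.42), (2200, 56.53), (20000, 78.89), (4100, 65.23),
    (2100, 49.54), (1600, 51.61), (28800, 79.93), (13900, 72.17),
    (2900, 63.62), (3200, 68.94), (1500, 67.81), (19800, 79.02),
    (26700, 79.04), (3900, 75.85), (28200, 80.93), (4300, 77.88),
    (10700, 46.56), (6700, 71.08), (27700, 78.16), (37800, 77.14),
]
SOUTH_AFRICA = (10700, 46.56)  # the point discussed as anomalous in the text


def log_fit(points, c=0.0):
    """Least-squares fit of y = A*ln(x + c) + B; returns (A, B, r)."""
    us = [log(x + c) for x, _ in points]   # transformed variable u = ln(x + c)
    ys = [y for _, y in points]
    n = len(points)
    mean_u, mean_y = sum(us) / n, sum(ys) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    a = s_uy / s_uu
    return a, mean_y - a * mean_u, s_uy / sqrt(s_uu * s_yy)


if __name__ == "__main__":
    for c in (0.0, 0.01, 0.1, 1, 10, 100):
        a, b, r = log_fit(DATA, c)
        print(f"C={c:<6} A={a:.3f} B={b:.3f} r={r:.5f} r^2={r * r:.5f}")
    trimmed = [p for p in DATA if p != SOUTH_AFRICA]
    a, b, r = log_fit(trimmed)
    print(f"without South Africa: A={a:.3f} B={b:.3f} r={r:.5f} r^2={r * r:.5f}")
```

Fitting a line to (ln(x + C), y) is exactly the substitution x₁ = x + C described above, so no non-linear optimiser is needed for a fixed C.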
It is therefore difficult to fit this country in a statistical model and we can decide to remove it from our data. Doing so, we get the following new plot.

Figure 4: Logarithmic regression for the data in Table 1 excluding South Africa.

The new value of the correlation coefficient, r ~ 0. 3, indicates that, excluding the anomalous data of South Africa, there is a strong positive logarithmic correlation between GDP per capita and Life Expectancy at birth.

Section 5: Chi-square test ($\chi^2$ test)

We conclude our investigation by making a chi-square test. This will allow us to confirm the existence of a relation between the variables x and y. For this purpose we formulate the following null and alternative hypotheses.

H0: GDP and Life Expectancy are not correlated.
H1: GDP and Life Expectancy are correlated.

* Observed frequency: The observed frequencies are obtained by counting, in Table 5, the countries that lie below or above the mean GDP $\bar{x}$ and below or above the mean Life Expectancy $\bar{y}$:

 | Below $\bar{x}$ | Above $\bar{x}$ | Total
Below $\bar{y}$ | 14 | 1 | 15
Above $\bar{y}$ | 16 | 17 | 33
Total | 30 | 18 | 48

Table 6: Observed frequencies for the chi-square test

* Expected frequency: The expected frequencies are obtained by the formula: $f_e = \frac{(\text{column total}) \times (\text{row total})}{\text{total sum}}$

 | Below $\bar{x}$ | Above $\bar{x}$ | Total
Below $\bar{y}$ | 9.375 | 5.625 | 15
Above $\bar{y}$ | 20.625 | 12.375 | 33
Total | 30 | 18 | 48

Table 7: Expected frequencies for the chi-square test.

We can now calculate the chi-square variable:

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 8.85$

In order to decide whether or not we accept the alternative hypothesis H1, we need to find the number of degrees of freedom (df) and to fix a level of confidence α. The number of degrees of freedom is:

df = (number of rows - 1)(number of columns - 1) = 1

The corresponding critical values of chi-square, depending on the choice of the level of confidence α, are given in the following table (see reference [4]):

df | 0.10 | 0.05 | 0.025 | 0.01 | 0.005
1 | 2.706 | 3.841 | 5.024 | 6.635 | 7.879

Table 7: Critical values of chi-square with one degree of freedom.
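The arithmetic of this test can be checked with a few lines of Python. The sketch below implements the plain Pearson chi-square statistic for a contingency table, without any continuity correction (an assumption made here because it reproduces the 8.85 above); the function name `chi_square` is hypothetical, and the critical values are the ones quoted in the table above.

```python
def chi_square(observed):
    """Pearson chi-square for a table of counts.

    Returns (statistic, degrees of freedom, table of expected frequencies).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected frequency of a cell = row total * column total / grand total.
    expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Observed counts from Table 6 (without the "Total" row and column).
OBSERVED = [[14, 1], [16, 17]]

# Critical values for df = 1, copied from the table above.
CRITICAL_DF1 = {0.10: 2.706, 0.05: 3.841, 0.025: 5.024, 0.01: 6.635, 0.005: 7.879}

if __name__ == "__main__":
    stat, df, expected = chi_square(OBSERVED)
    print(f"chi-square = {stat:.2f}, df = {df}")
    print("expected frequencies:", expected)
    for alpha, critical in CRITICAL_DF1.items():
        verdict = "reject H0" if stat > critical else "do not reject H0"
        print(f"alpha = {alpha}: critical value {critical} -> {verdict}")
```

Because the statistic is symmetric in rows and columns, transposing the observed table leaves the result unchanged.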
Since the value of chi-square is greater than any of the above critical values, we conclude that even with a level of confidence α = 0.005 we can accept the alternative hypothesis H1: GDP and Life Expectancy are related. The above test shows that there is some relation between the two variables x (GDP per capita) and y (Life Expectancy at birth). Our goal is to further investigate this relation.

Section 6: Conclusions

Interpretation of results

Our study of the statistical relation between GDP per capita and Life Expectancy brings us to the following conclusions. As the chi-square test shows, there is strong evidence of a statistical relation between the two variables (with a confidence level α = 0.005). The study of linear regression shows that there is a moderate positive linear correlation between the two variables, with a correlation coefficient r ≈ 0.6. This linear model can be greatly improved by replacing the linear dependence with a different type of relation. In particular we considered a logarithmic relation between the variables x (GDP) and y (Life Expectancy). With this new relation we get a correlation coefficient r ≈ 0.77. In fact, if we remove the data related to the anomalous country of South Africa (which should be discussed separately and does not fit well in our statistical analysis), we get an even higher correlation coefficient (see Figure 4). This is evidence of a strong positive logarithmic dependence between x and y.

Validity and areas of improvement

Of course one possible improvement of this project would be to consider a much more extended collection of data on which to do the statistical analysis.
For example one could consider a larger number of countries, data related to different years (other than 2003), and one could even think of studying data referring to local regions within a single country. All this can be found in the literature, but we decided to restrict ourselves to the data presented in this project because we considered it sufficient as an application of the mathematical and statistical tools used in the project.

A second, probably more interesting, possible improvement of the project would be to consider other economic factors that can affect the Life Expectancy at birth of a country. Indeed the GDP per capita is just a measure of the average wealth of a country and it does not take into account the distribution of the wealth. There are however several economic indices that measure the dispersion of wealth in the population and could be considered, together with the GDP per capita, as a factor influencing Life Expectancy. For example, it would be interesting to study a linear regression model in which the dependent variable y is the Life Expectancy, with two (or more) independent variables $x_i$, one of which should be the GDP per capita and another could be, for example, the Gini Inequality Index (measuring the dispersion of wealth in a country). This would have been very interesting but, perhaps, it would have been out of context in a project studying GDP per capita and Life Expectancy.

Probably the most important direction of improvement of the present project is related to the somewhat arbitrary choice of the logarithmic model used to describe the relation between GDP and Life Expectancy. Our choice of the function $y = A\ln(x + C) + B$ was mainly dictated by the statistical package at our disposal in the Excel software used in this project. Nevertheless we could have considered different, and probably more appropriate, choices of functional relations between the variables x and y.
For example we could have considered a mixed linear and hyperbolic regression model of type $y = A + Bx + C/(x + D)$, as is sometimes considered in the literature (see reference [4]).

Bibliography:

1. Gapminder World. Web. 4 Jan. 2012. <http://www.gapminder.org>.
2. “GDP - per Capita (PPP) vs. Infant Mortality Rate.” Index Mundi - Country Facts. Web. 4 Jan. 2012. <http://www.indexmundi.com/g/correlation.aspx?v1=67>.
3. “Life Expectancy at Birth versus GDP per Capita (PPP).” Statistical Consultants Ltd. Web. 4 Jan. 2012. <http://www.statisticalconsultants.co.nz/weeklyfeatures/WF6.html>.
4. “Table: Chi-Square Probabilities.” Faculty & Staff Webpages. Web. 4 Jan. 2012. <http://people.richland.edu/james/lecture/m170/tbl-chi.html>.
