Statistical Error

Welcome! Our blog was inspired by Joe Schwarcz's essay "Lies, Damned Lies, and Statistics." Here we will present our thoughts on statistics, their validity, and the ways people interpret them. Enjoy!

"There are three kinds of lies: lies, damned lies, and statistics." -Mark Twain

Tuesday, November 08, 2005

Statistics, Murder, and Sudden Infant Death Syndrome

Statistics is a very powerful tool when used correctly. It has far-reaching applications, including one that people don't normally associate with statistics: the courtroom. An article on Scotsman.com, Baby death expert 'distorted statistics', describes a situation that occurred only a few months ago in which improperly used statistics led to the ruin of several families.

In one particular case, 40-year-old Sally Clark was accused of killing her two sons, Christopher and Harry, both of whom were still infants. At the trial, Roy Meadow, a paediatrician testifying as an expert witness for the prosecution, stated that the chance of two babies from the same affluent family dying of cot death, or Sudden Infant Death Syndrome (SIDS), is 1 in 73 million. This statistic led the jury to believe that Mrs. Clark had indeed killed her two sons, and she was sentenced to prison.

Three years later, Mrs. Clark's father, Frank Lockyer, challenged Meadow's statistic after finding that he had incorrectly assumed that two SIDS deaths in one family are independent events. In reality, having one baby die from SIDS greatly increases the chance that another baby in the same family will also die from SIDS. Meadow reached his 1 in 73 million figure simply by squaring the probability of one baby dying from SIDS, 1 in 8,543.
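
The arithmetic behind the 1 in 73 million figure is easy to reproduce. The sketch below squares the 1 in 8,543 probability exactly as described above, then contrasts it with a dependent case. The conditional probability used for that contrast (1 in 100) is a made-up placeholder for illustration only, not a figure from the article.

```python
from fractions import Fraction

# Probability of one SIDS death in an affluent family, as cited at the trial.
p_single = Fraction(1, 8543)

# The flawed calculation: treat the two deaths as independent and multiply.
p_independent = p_single * p_single
odds_independent = 1 / p_independent  # equals 8543 squared

# HYPOTHETICAL value, for illustration only: if a first SIDS death raised the
# chance of a second one to 1 in 100, the joint probability changes a lot.
p_second_given_first = Fraction(1, 100)
p_dependent = p_single * p_second_given_first
odds_dependent = 1 / p_dependent

print(f"Assuming independence: 1 in {int(odds_independent):,}")
print(f"With the hypothetical dependence: 1 in {int(odds_dependent):,}")
```

Whatever the true conditional probability is, the point stands: multiplying the two probabilities is only valid when the events are independent.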

The Clark family was not the only family affected by Meadow's hokey statistics. Meadow also gave evidence at the trials of Donna Anthony in 1998 and Angela Cannings in 2002. Both were convicted of murdering their children.

This story does have a happy ending, however. In January 2003, the Court of Appeal ruled Clark's conviction unsafe, and she was released after three years in prison. Similarly, Anthony and Cannings were both cleared. Meanwhile, Meadow's reputation was shredded during a 19-day hearing in central London, at which he was charged with serious professional misconduct.

George Bush and Abortions

One of the more popular statistics thrown around by Bush bashers is the claim that, despite Bush's anti-abortion policy, abortions have actually increased since he first took office in 2001. However, an article by FactCheck.org, The Biography of a Bad Statistic, analyzes this claim and finds it to be false.

The claim that abortions have increased nationwide since 2001 was propagated by Glen Harold Stassen, an ethics professor at Fuller Theological Seminary. Stassen claimed that the decade-long trend of declining abortion rates had reversed since President Bush took office and that 52,000 more abortions occurred in the US in 2002 than would have been expected had the earlier decline continued. Stassen published his "findings" in several publications, including Sojourners, a Christian magazine, and the Houston Chronicle.

However, Stassen's data is sketchy and dubious at best. He stated that he looked at data from only sixteen of the fifty states, and did not even say which sixteen. Stassen had no data whatsoever on the remaining thirty-four states. In addition, much of Stassen's data is contradicted by the very sources he cited. He claims that from 2001 to 2002, "five states saw a decrease [in abortion] (4.3 percent average.)" However, according to the Houston Chronicle, only four states saw a decrease with a 4.3 percent average.

The main (and only) source Stassen cites is the Alan Guttmacher Institute, which keeps detailed statistics on abortions and is the leading authority on abortion statistics (other than the Centers for Disease Control). However, according to the Alan Guttmacher Institute, abortion rates have actually decreased since Bush took office, a finding based on 43 states, far more than Stassen's 16.

The problem with bad statistics is that most people take them as fact and spend little time verifying them. Even big-name politicians make this mistake. For example, Hillary Clinton was quoted as saying:

"But unfortunately, in the last few years, while we are engaged in an ideological debate instead of one that uses facts and evidence and common sense, the rate of abortion is on the rise in some states . In the three years since President Bush took office, 8 states saw an increase in abortion rates (14.6% average increase), and four saw a decrease (4.3% average), so we have a lot of work still ahead of us."

Here she is clearly citing statistics used by Stassen. She also avoids the problem of only using sixteen states by not explicitly stating that abortions had increased nationally, only that the abortion rate is increasing in "some states."

Other politicians such as John Kerry and Howard Dean also cited statistics from the Stassen article.

How to Use Statistics to Prove What Needs to Be Proven

It is a common belief among certain skeptics that one can prove nearly anything with ‘statistics.’ Unfortunately, such skeptics voice a real and pertinent concern: journalists, public speakers and government officials daily use altered, misinterpreted or simply flawed statistical figures to prove their side of an argument. In “Your Analysis is Faulty (How to lie with drug statistics),” John Horgan shows how in 1990 Michael Walsh, the “director of the Division of Applied Research and the Office of Workplace Initiatives at the National Institute on Drug Abuse (NIDA), the chief federal drug research agency,” used faulty statistical studies to justify and promote “programs in which the urine of working people is searched for signs of illegal drugs. Walsh designed the drug-testing program for federal employees mandated by Ronald Reagan … [in the late 1980s and advised] business leaders on how to test their workers. He has argued in favor of testing before Congress and federal judges, on national radio and TV shows, and in countless other public forums.”

According to Horgan, Walsh’s argument was simple and straightforward: “[1] drug users, from crack addicts to weekend marijuana smokers, make less productive workers than non-users. So [2] employers are justified in using drug tests … to root out all users from the work force.” The only problem was that the studies supporting Walsh’s ideas were not abundant, and their results were usually not nearly as strong or convincing as Walsh hoped. But he found a way to make the statistics advocate his methods. Horgan explains how he did it.

In 1987, Walsh stated before the Supreme Court that drug abuse cost US industry nearly $50 billion per year. Two years later, President Bush used this figure in a cavalier manner, saying that the costs ranged from $60 to $100 billion. Horgan explains why this statistic is flawed.

The statistic was taken from a 1982 study by Research Triangle Institute (RTI), a NIDA contractor in North Carolina, which analyzed the incomes of 3700 households in the US. The results showed that “the household income of adults who had ever smoked marijuana daily for a month (or at least twenty out of thirty days) was twenty-eight percent less than the income of those who hadn't. The RTI analysts called this difference ‘reduced productivity due to daily marijuana use.’ They calculated the total ‘loss,’ when extrapolated to the general population, at $26 billion. Adding the estimated costs of drug-related crimes, accidents, and medical care produced a grand total of $47 billion for costs to society of drug abuse.” According to Horgan, the researchers interpreted a correlation between drug use and decreased productivity as a cause-and-effect relationship. Horgan argues that the behavior of poorer individuals differs from that of richer individuals in many areas: possibly, the poor watch “The Wheel of Fortune” much more frequently than the rich, but that fact does not provide sufficient evidence to state that watching “The Wheel of Fortune” decreases the viewers’ productivity or makes them poorer.

The fact that the study based its findings only on loss of productivity of people who have for at least one month in their lifetime smoked marijuana daily also raises Horgan’s suspicions. Such a strict and strange criterion was chosen, as Horgan points out, because the data supplied for less frequent use of other drugs (among which were heroin, LSD and cocaine) showed no correlation between drug use and lowered income. In order to produce the desired statistic, NIDA simply had to alter the variables under consideration.

Horgan cites other examples of statistically flawed studies that Walsh used in his drug-testing campaigns. One study analyzed the effect of marijuana on the work productivity of 500 Navy officers who were employed by the Navy even though their marijuana tests were positive. Over a period of 30 months, 43% of those 500 were discharged, compared to only 13% of the officers who tested negative when they were recruited. Although the numbers look decisive, their significance needs to be carefully assessed. For example, the ‘positive’ discharged officers were twice as likely as the ‘negative’ officers not to have a high school diploma. “Why not blame poor education … for their high discharge rate?” Horgan asks. An even more serious problem with the study is that the group of 500 officers who had tested positive was heavily tested for drug use during the 30 months of the study, and one third of the discharged officers were discharged only because they failed another drug test, without exhibiting any loss of productivity at the workplace. Horgan notes that from these results it is reasonable to conclude that those who have already used drugs are more likely to continue using drugs, but not that drug use affects work efficiency. However, Walsh thought otherwise.

Another study Walsh cited was performed by the Utah Power and Light Company. The results showed, as Walsh said, a “significant difference between drug users and nonusers in terms of being involved in accidents, being absent from work, and overutilization of health benefits.” The study, however, involved a minuscule sample of drug abusers – twelve users in total – and eight of them were tested for drug use after a car accident, after which some were injured and needed to take time off. Walsh’s study ignored these factors; Horgan claims that based on these factors the study itself should be ignored.

However, through similar misuse of statistics, Walsh succeeded in getting employee drug-testing adopted by many big companies, including The New York Times. So maybe skeptics are right – it is possible to support almost anything with statistics; it just depends on the kind of statistics one uses.

Not Only the Scientific Publications Contain Misleading Statistics

Misinterpreted and misleading statistics do not pollute only scientific publications. As a United Nations Refugee Agency (UNHCR) report states, the Italian Central Eligibility Commission (CEC) also published misleading statistics in its annual report in 2004. Ron Redmond, a UNHCR spokesperson, commented on the faulty statistic, which claimed that Italy rejected 92% of the 8701 asylum applications it considered during 2004.

Redmond explains that CEC really granted the so-called 1951 Conventional Refugee Status (CRS) to only 780 people, or about 8.96% of those who applied. The rest of the applicants were simply dubbed ‘rejected.’

However, 2352 of the 7921 “rejected cases,” or about 27.03% of the applicants, were granted subsidiary forms of international protection – in other words, “people who were judged to be in a refugee-like situation, including people fleeing war and widespread violence etc.” Therefore, in total, Italy granted some type of international protection to almost 36% of asylum-seekers. Moreover, over three thousand of the other ‘rejected’ applicants (excluding the 36% of applicants who received some form of protection) have not completed the so-called “first instance procedure,” which means that there is a chance that they may still qualify for asylum or subsidiary protection. “There is a colossal difference between the figure of 8 percent recognition currently being reported in the Italian media, [and the 36% that were actually granted protection],” Redmond remarks. (And even the 36% does not include those who were granted some form of protection after an appeal.)
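
The percentages quoted above can be checked directly from the three counts in the UNHCR statement. This is only a small sketch; the helper name `share` is our own.

```python
# Counts reported in the UNHCR statement on Italy's 2004 asylum decisions.
applications = 8701
convention_status = 780
subsidiary_protection = 2352


def share(count, total=applications):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Everyone without Convention status was lumped together as 'rejected'.
labelled_rejected = applications - convention_status
# Applicants who actually received some form of protection.
protected = convention_status + subsidiary_protection

print("Labelled 'rejected':", labelled_rejected)
print("Convention status: ", share(convention_status), "%")
print("Subsidiary forms:  ", share(subsidiary_protection), "%")
print("Any protection:    ", share(protected), "%")
```

The same counts support both the "92% rejected" headline and the "36% protected" reading; the difference lies entirely in how the categories are grouped.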

Such a statistical misinterpretation resulted from misleading statistics published by the CEC. So one must always be careful with statistics – regardless of whether the data analyzed is scientific or sociological.

Misleading Statistics in Defense of President Bush

In “Lies, Damned Lies, and Statistics,” Dr. Joe Schwarcz attaches significance to “the way data are communicated” (17). This concern is well justified: misleading statistics are found not only in science and medicine but also in the political sphere. Specifically, an analysis by Media Matters for America reported that Bill O’Reilly, the host of Fox News' The O'Reilly Factor and Westwood One's The Radio Factor, used misleading fiscal statistics to rebut President Clinton’s criticism of Bush’s spending.

For example, in response to Clinton’s claim that Bush was not mitigating poverty, O’Reilly compared the poverty rates at the midpoints of the two presidencies: the poverty rate in 2004 was lower than it was in 1996. While this standalone fact is true, it fails to account for the direction of change during Clinton’s and Bush’s presidencies. The poverty rate continually decreased during Clinton’s terms, while it continually increased during Bush’s. In 1996 the poverty rate was 15.1 percent. This went down to 11.3 percent in 2000 and then up to 12.7 percent in 2004.

Furthermore, O’Reilly asserted that the Clinton administration caused more economic damage than the Bush administration. To support his point, O’Reilly highlighted OBRA (the Omnibus Budget Reconciliation Act of 1993), asserting that while “the tax rate climbed higher than at any time in history except in World War II,” the people subject to this increase were those in the highest wealth bracket (O’Reilly). While this distortion of facts is bad enough, O’Reilly’s assertion that Clinton’s tax rates were the highest since World War II is simply false. Under Clinton’s tax plan, the wealthiest Americans were taxed at 39.6%. However, in 1975, a portion of Americans were taxed 60% of their annual income.

Additionally, O’Reilly predicted that “Federal tax revenues will be more this year than at any time during the Clinton administration” (O’Reilly). In nominal terms, this does match the projected 2005 revenues and Clinton’s yearly revenues. However, the projected $2,142,000 million, compared with the $2,025,200 million collected in 2000 under Clinton, is actually less once inflation is taken into account: using the Bureau of Labor Statistics inflation calculator, the projected 2005 figure comes to about $1,878,070 million in 2000 dollars.
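
The comparison can be sketched in a few lines. Note the assumptions: we read the $1,878,070 figure as the 2005 projection expressed in 2000 dollars, and we back the cumulative inflation factor out of the two quoted figures rather than taking it from the Bureau of Labor Statistics directly. The function name `deflate` is our own.

```python
# Figures as quoted in the post (same units throughout).
projected_2005 = 2_142_000            # projected 2005 revenue, nominal
revenue_2000 = 2_025_200              # revenue collected in 2000
projected_2005_in_2000_dollars = 1_878_070  # inflation-calculator result

# ASSUMPTION: cumulative 2000-2005 inflation implied by the quoted figures,
# not an official BLS number.
implied_inflation = projected_2005 / projected_2005_in_2000_dollars - 1


def deflate(amount, cumulative_inflation):
    """Convert a later-year amount into base-year dollars."""
    return amount / (1 + cumulative_inflation)


real_2005 = deflate(projected_2005, implied_inflation)

print(f"Implied cumulative inflation: {implied_inflation:.1%}")
print("Nominal comparison: 2005 > 2000 is", projected_2005 > revenue_2000)
print("Real comparison:    2005 > 2000 is", real_2005 > revenue_2000)
```

The nominal comparison favours 2005 while the inflation-adjusted one favours 2000, which is exactly why the unadjusted claim is misleading.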

Lastly, O’Reilly also claimed that the Clinton administration had raised taxes annually. He was referring to the average tax burden (total revenue divided by gross domestic product). While this average tax burden did increase every year during Clinton’s presidency, the increase was driven largely by the vigorous economic growth of the 1990s, a growth precipitated by the “low inflation and low interest rates, an improved international picture with the collapse of the Soviet Union, and the advent of a qualitatively and quantitatively new information technologies” (August 2003 Treasury Department).

Quotations Source: http://mediamatters.org/items/200509210010

Thursday, November 03, 2005

Misleading Statistics Found in Scientific Journals

When people talk about misleading statistics and "doctored" statistics, most people envision fast-talking politicians or a tobacco spokesperson trying to make their argument seem more convincing. However, bad statistics can creep into even the most rigorous fields of science, as shown by Sloppy stats shame science, an article published in The Economist.

The article cites a study done by two researchers at the University of Girona in Spain, Emili García-Berthou and Carles Alcaraz. Together they studied 32 papers from Nature and the British Medical Journal, two highly renowned and respected journals, and found that 38% of the papers in Nature and 25% of the papers in the British Medical Journal contained statistical errors. They did this by analyzing the numbers presented in each paper, checking whether any numbers looked fishy, and seeing whether the numbers given lead to the conclusion drawn. One technique they used was to look at the last digit of every recorded number. Normally, the last digit would be any digit from 0 to 9 with equal probability. However, if the data were carelessly rounded, 4s and 9s would be much rarer, since they would be rounded down or up. The two researchers discovered that in several papers, 4s and 9s indeed appeared much less frequently than chance would predict.
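
A last-digit check of this kind can be sketched as follows. This is our own illustration, not the researchers' actual procedure: we assume a Pearson chi-square goodness-of-fit test against a uniform distribution of final digits, and the list of "reported" numbers is invented so that 4s and 9s are scarce.

```python
from collections import Counter

# Upper 5% critical value of the chi-square distribution with 9 degrees
# of freedom (ten digit categories minus one).
CHI2_CRITICAL_DF9_5PCT = 16.919


def last_digit_counts(values):
    """Count the final digit of each reported number (given as strings)."""
    digits = (v.strip()[-1] for v in values if v.strip()[-1:].isdigit())
    counts = Counter(digits)
    return [counts.get(str(d), 0) for d in range(10)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# HYPOTHETICAL data: 100 reported values in which 4s and 9s are rare.
reported = (
    ["1.0"] * 12 + ["2.1"] * 11 + ["0.2"] * 13 + ["3.3"] * 12 + ["5.4"] * 2
    + ["0.5"] * 14 + ["7.6"] * 12 + ["1.7"] * 11 + ["4.8"] * 12 + ["6.9"] * 1
)

counts = last_digit_counts(reported)
statistic = chi_square_uniform(counts)
print("Digit counts 0-9:", counts)
print("Chi-square statistic:", round(statistic, 2))
print("Suspicious at the 5% level:", statistic > CHI2_CRITICAL_DF9_5PCT)
```

A large statistic only flags a paper for a closer look; it does not by itself show what went wrong.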

Although most of these statistics were not radically altered and still led to the same conclusions, several non-significant findings may have been misrepresented as significant. All of the errors seem to have been unintentional, yet despite the peer-review system utilised by these journals, errors still creep in. Kamran Abbasi, deputy editor of the BMJ, states: "We certainly do not spend our time recalculating all these numbers, and our whole review process would likely grind to a halt if we tried to do so."

Although statistical errors could be weeded out if scientific journals enforced stricter control over the accuracy of the data, several of the problems lie with the researchers themselves. Many scientists have a limited grasp of statistical techniques and simply "plug and chug" their numbers into complex statistics formulas or programs, receiving numbers back without really knowing how the process works.

Methods of Obtaining and Analyzing Data

Statistical analysis is employed by virtually every scientist and researcher seeking to find out the significance of the data obtained. Often, however, the results of statistical analysis seem flawed. In "The How and Why of Statistical Sampling" Keith Calkins explains several methods of statistical analysis and shows how incorrect methodology may bring about statistical errors.

Calkins warns that it is most important to make sure that the data is collected properly, since data collection is usually the hardest, longest and most expensive part of the project. Calkins writes: “Many years of labor and even careers have been virtually wasted because of fundamental flaws in the data collection step. The statistical analysis will only likely be a minor part of the total expense of a properly conducted experiment, so time, effort, and money spent ensuring the data are collected appropriately is certainly well spent.”

When collecting data, Calkins warns researchers to make sure that the sample size is not too small. Calkins adds that it is always best to take the measurements yourself instead of asking others for them. For example, if a study is concerned with the heights of people, it is better if one person using the same methodology measures all the heights. Otherwise, it is likely that various measuring tools will be used and the measurements will be rounded differently by different people, resulting in statistical errors and imprecision.

Calkins also notes that there are two types of studies: observational studies and experiments. Whereas observational studies “observe individuals and measure variables of interest but do not attempt to influence the responses, … [experiments] deliberately impose some treatment on individuals in order to observe their responses.” Therefore, “[observational] studies are then a poor way to gauge the effect of an intervention,” and Calkins notes: “when … [the] goal is to understand cause and effect, experiments are the only source of fully convincing data.”

Calkins further distinguishes between five different types of sampling, or “the fundamental methods of inferring information about an entire population without … measuring every member of the population,” and the errors that are often made while carrying out these methods.

In random sampling (also called representative or proportionate sampling), the members of the population are chosen so that they “all have an equal chance to be measured.” Systematic sampling takes only every k-th element of the population (e.g., every tenth person in a population).

Stratified sampling is often the most efficient method of sampling. In stratified sampling, the population is first broken up into several strata, or classes (such that all members of a stratum share a common characteristic; age and yearly income could be used as types of strata), and each group is then analyzed separately.

Another type of sampling is cluster sampling, which involves separating a population into several clusters and then sampling exhaustively (sampling every single member of the cluster) several randomly selected clusters. This type of sampling is most often used by private researchers and government organizations.

The fifth type of sampling discussed by Calkins is convenience sampling. Convenience sampling “is done as convenient, often allowing the element to choose whether or not it is sampled.” Convenience sampling is easiest to perform, but it is also “potentially most dangerous [and may lead to biased data].” For example, it may be most convenient to sample the grades of only those students who participate in athletics. Unfortunately for the statistician, the data points obtained will probably be unrepresentative of the whole student population, and even a properly carried out analysis of the data will yield flawed results.
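
The first four sampling methods described above can be sketched in a few lines of Python. This is our own minimal illustration, not code from Calkins: the function names, the population of 100 numbered members, and the even/odd strata are all hypothetical.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has the same chance of being chosen."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Take every k-th member, starting from a random offset below k."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Split the population into strata, then sample each one separately."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and measure every member in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
population = list(range(100))   # HYPOTHETICAL population of member IDs

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(
    population, lambda i: "even" if i % 2 == 0 else "odd", 5, rng
)
clusters = {f"c{j}": population[j * 20:(j + 1) * 20] for j in range(5)}
clustered = cluster_sample(clusters, 2, rng)

print(len(srs), len(systematic), len(clustered))
print({name: len(sample) for name, sample in stratified.items()})
```

Convenience sampling is deliberately left out: it has no selection rule to implement, which is precisely the problem with it.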

Wednesday, November 02, 2005

Vitamin E and Jumping to Conclusions

In Lies, Damned Lies, and Statistics, Dr. Joe Schwarcz reflects that “Life often comes down to analyzing risks. But most people do not realize how difficult it is to perform this analysis in a meaningful way” (15-16). This musing is borne out by a poorly founded study claiming that Vitamin E supplements increase the risk of heart failure or death when taken daily in amounts exceeding 400 IU. A recent study clarified the findings, stating that the effect of Vitamin E depends heavily on the age and condition of the subjects.

Although multivitamins usually contain about 30 IU of vitamin E, supplements often contain as much as 200, 400 or 1000 IU. The first study explored the healthy limits of vitamin E intake. To do so, 135,967 adults were studied. Vitamin E was administered in 19 randomized trials with subjects receiving anywhere from 16.5 to 2000 IU daily. This regular supplement intake continued for more than a year in every case. The results of the study showed that more than a daily amount of 400 IU of vitamin E placed subjects at a greater risk for death than those treated with less than 400 IU. From this rather inconclusive result, the study concluded that:

“Adults should avoid taking vitamin E preparations in amounts of 400 IU or more. Experts should reconsider the stated upper tolerable intake level of vitamin E. Sellers should consider removing vitamin preparations that contain 400 IU or more per dose from stores” (Meta-Analysis: High-Dosage Vitamin E Supplementation May Increase All-Cause Mortality)

This rash supposition is highly unjustified. The study admits that many of the subjects were elderly (over 60 years old) and approximately 60% of the 135,967 subjects had some variant of heart ailment. Thus, it would seem as though this precaution of 400 IU or less should only be applied to adults within this bracket. Indeed, Jeffrey Blumberg, an antioxidant authority at Tufts University sees no justification in this study. Maret Traber of Oregon State University also mentioned that based on her examination of the study, the study’s conclusion is only justified for vitamin E amounts over 2000 IU. The parochial nature of the study was also evinced by another study performed on 40,000 healthy women which showed that vitamin E intake reduced the risk of cardiac-associated death by 24%--a percentage which Julie Buring from Harvard calls “surprising.” Thus, this review concludes that:

“400 to 800 IU daily is unlikely to harm anyone. The Institute of Medicine puts the safety limit at 1,500 IU daily” (USA Weekend Magazine)

Thursday, October 27, 2005

Misleading Statistics

Statistics, when used correctly, can be a good tool for looking at trends in large numbers and making correlations between different events. However, sometimes these statistics are misinterpreted, misanalyzed, or just plain wrong.

Examples of different ways to manipulate statistics are given in Misleading Statistics: faulty statistics, bad sampling, unfair poll questions, statistics that are true but misleading, ranking statistics, qualifiers on statistics, and percentages. These are outlined below:
  • Faulty statistics: Statistics can often be fabricated out of thin air. Fabricated statistics are harder to see through than fabricated statements, since statistics command more authority than simple statements. For example, saying that "61% of all Americans are obese" seems less suspicious than saying "most Americans are obese."
  • Bad sampling: Bad sampling is simply sampling too few people, or sampling people who are atypical of the general population. If a television show asks viewers to call in and give a response to a poll, the people who bother to telephone in, and more generally the people who are watching the show might be inclined to answer one way or the other.
  • Unfair poll questions: Poll questions may be worded differently in order to create an impression on the voter. An example from Misleading Statistics claims that there could be two poll questions, "Do you feel you should be taxed so some people can get paid for staying home and doing nothing?" and "Do you think the government should help people who are unable to find work?" Both questions deal with taxes, but the first question is more likely to get more "no" answers than the second question.
  • Statistics that are true but misleading: Statisticians can always "select" the data they wish to present in order to mislead the readers. For example, in an election one candidate's supporters claimed that employment was up when their candidate was in office. The opposing party claimed that unemployment was up when that candidate was in office. Both statements were true, since the population had increased during the candidate's time in office, meaning that the number of both employed and unemployed people had risen.
  • Ranking statistics: The problem with ranking statistics is that they depend on how you split up the items you are ranking. In an example from Misleading Statistics diabetes is listed as the third leading cause of death in the United States -- but is cancer considered one disease, or many diseases based on its nature (lung cancer, breast cancer, colon cancer, etc.)?
  • Qualifiers on Statistics: Adding qualifiers to statistics can make them seem to be something they are not. The brown bear is certainly a large animal, but not the largest in the world. However, if we say that the brown bear is the largest land predator in the world, then that statement is true, and it makes the brown bear's size seem more impressive.
  • Percentages: Statisticians can switch between numbers and percentages based on which looks more impressive: if a factory of 100,000 people fires 10,000 people, a newspaper might report that 10,000 people were fired. However, if a factory of 100 people fires 10 people, that same newspaper might report that 10% of all workers were fired.
Misleading Statistics also discusses how to avoid being misled by statistics. One strategy is to take a step back, and look at how the statistic could have been reworded or changed to make it seem misleading. Another strategy is to consider the group who is presenting the statistics: do they have a reason to be biased one way or another, or are they neutral?
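
The "true but misleading" employment example from the list above is easy to reproduce with numbers. All figures below are invented for illustration; the only point is that a growing population lets both head counts rise at once.

```python
# HYPOTHETICAL labour-force figures before and after a candidate's term.
before = {"labor_force": 1_000_000, "employed": 940_000}
after = {"labor_force": 1_200_000, "employed": 1_120_000}


def summarize(period):
    """Return (number unemployed, unemployment rate in percent)."""
    unemployed = period["labor_force"] - period["employed"]
    rate = round(100 * unemployed / period["labor_force"], 2)
    return unemployed, rate


unemployed_before, rate_before = summarize(before)
unemployed_after, rate_after = summarize(after)

print("Employment up:  ", after["employed"] > before["employed"])
print("Unemployment up:", unemployed_after > unemployed_before)
print("Unemployment rate:", rate_before, "->", rate_after)
```

Both campaign claims come out true, so the rate (a ratio) is the number worth asking for rather than either raw count.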

What If Scientists Do Not Suspect That They Are Wrong?

In “Lies, Damned Lies, and Statistics” Joe Schwarcz encourages readers to base their decision-making, when possible, on scientifically proven and statistically verified information. Schwarcz also warns of the dangers of misinterpreting the results of scientific studies and statistical analyses. Robert Matthews, a visiting reader at the Department of Information Engineering at Aston University, urges readers to be even more careful.

In February of 2003, The New England Journal of Medicine was forced to withdraw one of its published papers because the data provided was flawed. The author of another paper, in the British Journal of Medicine, underwent disciplinary action for the same reason. Such worrisome incidents prompt Matthews to ask: “… what if the authors have no idea that what they have done is questionable or misleading? What if they routinely use unreliable methods, with top research journals cheerfully publishing results based on them?” He soon answers his own question: “[The use of unreliable methods in studies published in top scientific journals] is the stark prospect highlighted by the epidemic of flawed statistical analysis plaguing today's research literature. From medical "breakthroughs" that prove to be mirages to grand conclusions drawn from tiny samples, the journals are awash with unreliable findings based on faulty statistics.”

Matthews explains that more than forty years ago statisticians were already warning of the dangers of misusing statistics. They realized that the methods used to determine so-called “statistically significant results,” which are crucial in all statistical testing, are very inaccurate and often exaggerate the true results. Many of these methods involve calculating a P-value and deeming results insignificant if the obtained P-value is greater than five percent. But because the methods do not take into account the plausibility of the hypothesis, it turns out that the results are four times as likely to be flawed as most methods predict. Matthews concludes, “[in] other words, a substantial proportion of "statistically significant" findings are meaningless flukes -- scientific fool's gold that lures others to waste time and money in attempts to replicate findings.”
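
To see why ignoring the plausibility of a hypothesis matters, consider a rough sketch based on Bayes' rule. This is our own illustration, not Matthews's calculation: the prior probabilities and the 80% power figure are hypothetical inputs.

```python
def false_positive_share(prior_true, power, alpha=0.05):
    """Fraction of 'significant' results that are actually flukes.

    prior_true: assumed share of tested hypotheses that are really true
    power:      chance a real effect yields a significant result
    alpha:      significance threshold (chance a null effect passes it)
    """
    true_positives = prior_true * power
    false_positives = (1 - prior_true) * alpha
    return false_positives / (true_positives + false_positives)


# HYPOTHETICAL scenarios: a plausible hypothesis versus a long shot.
plausible = false_positive_share(prior_true=0.5, power=0.8)
long_shot = false_positive_share(prior_true=0.1, power=0.8)

print(f"Plausible hypothesis: {plausible:.1%} of significant results are flukes")
print(f"Long-shot hypothesis: {long_shot:.1%} of significant results are flukes")
```

Under these assumed inputs, the same five percent threshold produces a far larger share of flukes when the hypotheses being tested are implausible to begin with.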

Small studies are another potential source of statistically flawed results. The problem with such studies lies in the meager amount of collected data, which often does not yield representative results. One study published in the Journal of the Royal Society of Medicine tested the effects of a homeopathic treatment versus those of a placebo on only sixty-two patients. According to Matthews, the small number of patients observed gave the study only a one in three chance of confirming the efficacy of the treatment to begin with. Similar mistakes were made in numerous other studies. Matthews cites a study showing that none of twenty-five examined orthopedic studies “had the statistical power to detect a potentially worthwhile benefit” of the treatment.
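
Statistical power can be estimated before a study is run. The sketch below uses a normal approximation for a two-sided comparison of two equal groups. The inputs are assumptions of ours, not figures from Matthews: we split the sixty-two patients into two groups of 31 and pick a standardized effect size of 0.4 purely for illustration.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    effect_size is the difference in means divided by the common standard
    deviation; the test statistic is treated as normally distributed.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# ASSUMED inputs: 62 patients split evenly, moderate effect size of 0.4.
small_study = approx_power(effect_size=0.4, n_per_group=31)
larger_study = approx_power(effect_size=0.4, n_per_group=200)

print(f"Power with 31 per group:  {small_study:.2f}")
print(f"Power with 200 per group: {larger_study:.2f}")
```

With these assumed inputs the small study detects a real effect only about a third of the time, while the larger one almost always does.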

Concerns about statistically flawed data being published in respected scientific journals have been voiced for a long time. Unfortunately, not much has been done to address the problem. The best remedy is probably to make sure that scientists know the limitations of their statistical methods, and to establish an agency charged with checking the data in all published scientific material.

Global Warming--Statistical Justification?

Dr. Joe Schwarcz asserts in his essay, "Lies, Damned Lies, and Statistics," that people “don’t think statistically—they think emotionally” (16). This tendency shows itself in the fear of global warming. While the threat of global warming is present, the International Association for Official Statistics (IAOS) found that the statistical basis for this fear is weak. In particular, the global warming scenarios of the United Nations’ Intergovernmental Panel on Climate Change (IPCC) have been undermined by a study of the unrealistic economic growth assumed in these projections. Among the flaws in the IPCC’s study are an overstated wealth gap, an overstated GHG intensity, and an error in empirical data.

By exaggerating the baseline income disparity between the richest and poorest nations, the IPCC effectively inflated the projected amount of greenhouse gas emissions. The IPCC’s assumption is that underdeveloped countries will gradually develop more infrastructure and industry, thus approaching the industrial levels of the wealthier nations. Hence, a greater gap in income entails a greater amount of industrialization that must occur. This industrialization is taken to be directly proportional to the amount of greenhouse gas emissions produced. Therefore, an exaggerated difference between the baseline incomes of different countries translates into an exaggerated prospect of global warming. Furthermore, the IPCC’s predicted rates of economic improvement far exceed reason. According to the IPCC, in the year 2100 the economic output of Asia will total 70-140 times the current output. Castle states:

Such dramatic economic growth by even a single country--let alone an entire continent--would be unprecedented

Additionally, the IPCC’s model assumes that the amount of greenhouse gases emitted per unit of global economic output (GHG intensity) will remain constant. This assumption makes increased greenhouse gas emissions as inevitable as continued economic development. To show the flaw in this assumption, the IAOS points to the marked decrease in GHG intensity over the past 120 years: since 1880, GHG intensity has decreased by a factor of 5.
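To put the factor-of-5 figure in yearly terms, the sketch below converts it into an average annual rate. It assumes the decline compounded at a steady rate over the 120 years, a simplification made here for illustration; the source states only the overall factor.

```python
def annual_decline(total_factor, years):
    """Constant yearly rate of decline that compounds to total_factor."""
    return 1 - (1 / total_factor) ** (1 / years)


if __name__ == "__main__":
    rate = annual_decline(5, 120)
    print(f"Average decline in GHG intensity: {rate:.2%} per year")
```

Under that assumption the decline works out to a little over one percent per year, which is far from the constant intensity the model is said to assume.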

Furthermore, the IPCC’s study failed to factor the GHG readings from 2000 into its predictions. The 40 scenarios produced by the IPCC rely on empirical data from before 1990. However, the disparity between the actual and predicted values of GHG in the year 2000 suggests that the entire model is flawed.

Source:
http://www.heartland.org/Article.cfm?artId=12088

Friday, October 21, 2005

Analyzing Risks

In The Fly in the Ointment, Joe Schwarcz mentions that "Life often comes down to analyzing risks. But most people do not realize how difficult it is to perform this analysis in a meaningful way." An example is flying vs. driving: most people prefer to drive, under the misconception that flying presents a greater risk. This issue was discussed in detail in a previous post. Another example is the risk of breast cancer associated with estrogen. Dr. Schwarcz mentions that although there is a 30% increase in the risk of breast cancer among women taking estrogen supplements, the chance of having breast cancer is initially 3 to 4%, meaning that a 30% increase raises it to roughly 4 to 5%.

This issue is discussed by the Breast Cancer Screening Center, where they mention that

"The best guess is that any potential increased risk of breast cancer from short-term use of HRT is so small that there is no way to distinguish it from 'no difference at all.' If there were a large and significant increase in the risk of breast cancer from short-term use of HRT, it would certainly have been noticed and measured by now."

The article also states "[Post menopausal women] need to know if the possible long-term risk from HRT outweighs the definite short-term benefit from control of the symptoms of menopause." This is where risk evaluation comes in. Even though taking estrogen may increase the chance of breast cancer very slightly, does that risk outweigh the benefit of taking estrogen after menopause? In general, most women believe that the benefits of taking estrogen outweigh the risks involved. However, some women firmly refuse to take estrogen supplements out of fear of breast cancer, an example of inaccurately analyzing risks.
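The difference between a relative increase and the resulting absolute risk is easy to check. This minimal sketch uses only the figures quoted from Schwarcz (a 3 to 4% baseline and a 30% relative increase); the function name is made up for the example.

```python
def risk_after_increase(baseline, relative_increase):
    """Absolute risk after applying a relative increase to a baseline."""
    return baseline * (1 + relative_increase)


if __name__ == "__main__":
    for baseline in (0.03, 0.04):
        new = risk_after_increase(baseline, 0.30)
        change = new - baseline
        print(f"{baseline:.1%} -> {new:.1%} (absolute change {change:.1%})")
```

A "30% increase" sounds alarming, but on a baseline this small it amounts to an absolute change of about one percentage point.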

Statistics Urge Not To Put Blind Faith In Statistics

In his essay “Lies, Damned Lies, and Statistics,” Joe Schwarcz urges the reader to think statistically and to make decisions according to statistically verified scientific results. He presents various statistical findings that favor choices opposite to what seems emotionally plausible (e.g., taking a plane rather than driving a car, because plane travel is safer). However, Schwarcz never questions the statistics’ validity and always assumes that the statistics found in scientific studies are true. Instead, he only clarifies misinterpreted statistical figures; for example, he explains that a 30 percent increase in breast cancer risk for women taking estrogen supplements really means that a post-menopausal woman’s risk rises from 3 to 4 percent to roughly 4 to 5 percent if she chooses to take the supplements. But several other researchers do not put such blind faith in statistics.

Two Spanish scientists from the University of Girona, Emili García-Berthou and Carles Alcaraz, studied papers published in 2001 in the internationally acclaimed scientific journals Nature and British Medical Journal (BMJ). They found that of 32 papers examined in Nature, twelve contained at least one statistical error; similarly, four of twelve papers published in BMJ contained at least one error. Overall, eleven percent of the statistical results reported in the two journals contained inconsistencies. García-Berthou and Alcaraz fear that the same problem may be affecting many more journals worldwide. The researchers note that many of the errors might result from typographical errors and from unnecessary rounding up (the latter is suggested by a much smaller number of 4’s and 9’s in the data than expected).

Jud W. Gurney, M.D., FACR, of the Department of Radiology at the Nebraska Medical Center states that “up to 50% of articles [in radiological publications] are statistically flawed.” In his opinion, the difference between radiological and statistical jargon might be the cause of the mistakes. He notes that another reason for statistical error is that analysts often choose the wrong statistical methods for a given type of data. Gurney explains that there are three types of data: nominal data (there is no arithmetic relationship between categories, and the categories are unordered, that is, no category is better than another), ordinal data (there is no arithmetic relationship between the categories, but the data are ordered), and interval data (measurements of length, weight, volume, etc., where a mathematical relationship exists between the values). Each type of data should be analyzed using specific statistical techniques and tests in order to get correct, meaningful results.

Unfortunately, some analysts are not always aware of that, producing statistically flawed findings.

Thursday, October 20, 2005

Obesity Statistics - Just a fat bunch of lies

In “Lies, Damned Lies, and Statistics,” Dr. Schwarcz asserts that steaks should be avoided “because of their fat content, not because ‘they cause cancer’” (Schwarcz 18). An article from the Center for Consumer Freedom explores the “statistics” of obesity and argues that some of the most widely cited figures are flawed. Specifically, many organizations (including the FDA in some cases) tout three faulty statistics: that obesity causes over 300,000 deaths annually, that over 61% of America’s population is obese, and that the fiscal cost of obesity is $117 billion a year.

The first statistic, that 300,000 people die annually due to obesity, was based on “‘weak [and] incomplete data’” according to the respected New England Journal of Medicine. David Allison, the man who introduced this statistic, incorporated irrelevant data. Perhaps the most dubious aspect of his study is its use of data from 1948 and its disregard for the past 50 years of improved medical treatment.

The second statistic, that 61% of Americans are obese, deserves as little credence as the first. The 1998 redefinition of obesity placed men and women in the same boat, judging everyone by the same standard of obesity. This seemingly innocuous change moved 30 million Americans into the “obese” category. These “obese” Americans included famous figures such as Will Smith, Pierce Brosnan, and President George W. Bush.

As for the last statistic, the authors of the Obesity Research study that published the $117 billion figure admitted that they summed the expenses of anyone with a BMI (body mass index) of 29 or greater rather than 30 or greater, a difference of a “mere” 10 million people. In addition, the study counted expenses due to interdependent effects separately, thereby doubling or tripling the calculated expense. Furthermore, Obesity Research seems to have a motive for inflating this sum: the journal is published by the North American Association for the Study of Obesity (NAASO), which is predominantly funded by pharmaceutical companies. By overstating the costs that obesity incurs, the NAASO could hope for larger donations. This motive introduces a bias into the study.

Sources:
"Lies, Damn Lies, and Statistics." Dr. Joe Schwarcz. The Fly in the Ointment.
Obesity Statistics Seriously Flawed:
http://www.consumerfreedom.com/news_detail.cfm/headline/2185
Obesity Statistics are as bogus as weight-loss scams:
http://www.news-medical.net/?id=6223

Thursday, October 13, 2005

Driving or Flying?

In The Fly in the Ointment, Dr. Joe Schwarcz touches on the choice between driving and flying. He mentions that although people have a preconceived notion that flying is more dangerous than driving, the opposite is true. He states, "Since the advent of commercial air transport around 1914, some 15,000 people have perished in airplane crashes. In North America alone more than three times that many people die in automobile accidents every year! You are far more likely to arrive at your destination if you fly than if you drive."

He goes on to mention that people feel more "in control" if they are driving with their own hands. This idea of relinquishing control when they step on an airplane scares many people. He claims that this is a case of people thinking emotionally, rather than statistically.

This idea is expanded in a Boeing Report on the Fear of Flying, in which 48% of adults who avoid flying cited fear as the major reason they did not fly. However, only 6% considered flying unsafe. This suggests that although most people realize that flying is safe, they are still afraid to "relinquish control" of their lives. Among these people, the highest levels of anxiety occur during segments of air travel that involve heights and life-threatening situations, even though Definitive Statistics comparing Driving with Flying claims that 95% of accidents occur during takeoff and landing. Once the plane is in the air, it has a significantly smaller chance of having an accident, yet for many people the height makes them even more fearful, another example of thinking emotionally rather than statistically.

Among all people who fear flying, death and heights are dominant themes, despite the overwhelming evidence that flying is safer than driving. Since 95% of accidents occur during takeoff or landing, only the number of flights involved in the trip makes a significant impact on your probability of being in an accident, as opposed to driving, where the probability of being in an accident increases over the distance driven. This is summed up by the quote from Statistics, "Let's consider a trip from New York to Los Angeles: it is 261 times safer to fly than to drive the 2821 miles." Despite this, there are still people who would rather drive than fly.
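The contrast between per-flight and per-mile risk can be sketched as follows. This is a simplified model that treats all flying risk as tied to takeoff and landing, and both accident rates are made-up placeholders, not real statistics; the point is only how each risk scales with the trip.

```python
def driving_risk(miles, risk_per_mile):
    """Chance of at least one accident over a drive of the given length."""
    return 1 - (1 - risk_per_mile) ** miles


def flying_risk(flights, risk_per_flight):
    """Chance of at least one accident over the given number of flights."""
    return 1 - (1 - risk_per_flight) ** flights


if __name__ == "__main__":
    # Placeholder rates for illustration only, NOT real accident data.
    RISK_PER_MILE = 1e-8
    RISK_PER_FLIGHT = 1e-7
    for miles in (500, 2821):
        drive = driving_risk(miles, RISK_PER_MILE)
        fly = flying_risk(1, RISK_PER_FLIGHT)
        print(f"{miles} miles: driving {drive:.2e}, one flight {fly:.2e}")
```

In this model the driving risk grows with every extra mile, while the risk of a single direct flight stays the same however long the trip is.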

There Is a Difference Between Positive and Good

In “Lies, Damned Lies and Statistics,” Joe Schwarcz points out that many people choose to rely on their emotions rather than on statistically examined scientific findings. As is often the case, what seems emotionally plausible turns out to be scientifically (and statistically) wrong. However, as Peter Sandman, a columnist for The Synergist, shows, the problem is not always one of distrusting statistics; oftentimes, people simply misinterpret them. In “Risk Words You Can’t Use” he examines the distinction between emotional and statistical definitions and uses of certain widely used words. Serious misunderstandings may easily result if people misuse the terms Sandman mentions.

“Conservative” is one of the often misused words. To statisticians, a “conservative risk estimate” is one that overstates the risk (just to be sure that the risk is not underestimated). However, when the term “conservative risk estimate” is used in the news, the audience interprets it as an underestimate of the actual risk – the direct opposite of the statement’s statistical meaning.

Similarly, the words “significant” and “insignificant” carry vastly different meanings for statisticians and the public. Statistically, a relationship (e.g., between headaches and an industrial pollutant, the example Sandman uses) is “insignificant” if there seems to be no correlation between the variables. But when people read that the relationship is “insignificant,” they often take it to mean that their headaches are simply unimportant in the face of some other problem.

In a statistical report, the adjectives “positive” and “negative” indicate solely the existence or nonexistence, respectively, of a correlation between two variables. Emotionally, it seems plausible that a “positive” finding about harmful effects of plant A on disease B means that plant A does not cause disease B. The statistical meaning, however, is exactly the opposite.

Sandman presents several other examples of contrasting statistical and ‘common,’ emotionally plausible definitions of words. In statistics, “biased” data are data that were not obtained randomly, not data that were obtained through cheating. When a scientist calls a study “anecdotal,” it is not because the study tells short, amusing stories, but because the samples were not taken randomly. The word “risk,” to a risk assessor, “is the magnitude of a bad outcome times its frequency.” Very often, however, risk is interpreted as just the frequency or uncertainty of something. Sandman also warns that the terms “prepared” and “safe” can only be used relatively, because nothing can be absolutely “safe” and it is impossible to be completely “prepared” for anything.
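The risk assessor's definition quoted from Sandman can be written out directly. The two hazards here are hypothetical, with made-up units (harm per event and events per year), and serve only to show how the definition differs from thinking about frequency alone.

```python
def risk(magnitude, frequency):
    """Risk as the magnitude of a bad outcome times its frequency."""
    return magnitude * frequency


if __name__ == "__main__":
    # Hypothetical hazards: (harm per event, events per year).
    rare_but_severe = risk(magnitude=1000, frequency=0.001)
    common_but_mild = risk(magnitude=1, frequency=1)
    print(rare_but_severe, common_but_mild)
```

Under this definition the two hazards carry essentially the same risk, even though one occurs a thousand times more often than the other.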

The proper understanding and use of these terms is crucial for successful communication between the scientific community and the public. Both media reporters and television viewers should familiarize themselves with the differences between the statistical and common meanings of these terms. And then, maybe, once people understand what the statistical jargon really means, they will begin to put more trust in statistics.


*All quotations were taken from http://www.blogger.com/post-edit.g?blogID=17199710&postID=112926062239788998.

Airplane Air Quality

While Dr. Schwarcz mentioned that deep vein thrombosis was not a significant threat to airplane passengers, he did acknowledge one aspect of air travel that may prove worrisome: air quality (Schwarcz 17). In a report for the UK cross-departmental Aviation Health Working Group (AHWG), the air quality of older aircraft was measured and assessed. A primary objective was to determine whether there was a significant difference between the air quality of newer and older aircraft. The two aircraft used in the study were the BAe 146 and the Boeing 737-300. During the fourteen flights assessed, air quality was measured at three points: take-off, cruise, and descent.

The measurements taken were of cabin pressure, air and globe temperature, relative humidity, air speed, carbon monoxide, carbon dioxide, nitrogen dioxide, volatile organic compounds, carbonyls, semi-volatile organic compounds, bacteria and fungi, dust, and ultrafine particles. The results affirmed that the air quality was roughly the same as that of newer aircraft; the amounts of all air pollutants met WHO (World Health Organization) standards. Interestingly, some particles, such as volatile organic compounds, ultrafine particles, and bacteria and fungi, were found at higher levels when the aircraft was on the ground.

However, going beyond recommended standards, the Air Travel Health News addresses the worry of cabin air quality. It points to several precautions and steps to follow if the air quality becomes a noticeable problem. Several signs that indicate a low oxygen level are: difficulty concentrating, aching lungs, clammy skin, nausea (not due to turbulence), and headaches. If any of these symptoms develop, a flight attendant should be contacted and requested to ask the pilot for “less recirculated air and more fresh air.” Another viable option would be to ask for an oxygen bottle—every 747 carries approximately 25 portable oxygen bottles.

In order to counter the pathogenic threats onboard, several precautions can be taken. These measures include washing hands with soap and hot water before touching the eyes or nose, and covering the nose and mouth with a water-saturated handkerchief (which helps block the spread of germs and provides humidity for the lungs). A slightly more awkward prevention technique is to carry a disposable surgical mask to present to a sick person onboard the aircraft. Perhaps Diana Fairechild points out the poor oxygen intake best by revealing that pilots receive ten times more oxygen than passengers.