Wednesday, May 15, 2013

Statistics: Truth and Dare.

Recently I have noticed people misusing and misrepresenting statistical information. This is nothing new, and I guess my current awareness is just God’s way of telling me it is time to offer a brief course in statistics for the everyman (or woman). The numbers in and of themselves have very little meaning… they are like a quote taken out of context. So what do you, as a reader, need to consider before assessing a given situation from statistics?
a. Whose interests are being served? The people paying for the study (data collection) have an agenda; try to figure out what it is and read the results with that in mind. The agenda does not affect the numbers themselves; it affects the way the numbers are presented. Example: Agency X offers an addiction recovery program. Say the actual numbers are that for every 100 people who enter the program, 50 complete it, 40 stay clean for 1 year, and eventually 20 of those relapse. What is their success rate? They will tell you they have an 80% success rate. How can that be? They started with 100 and only 20 stayed clean; isn’t that an 80% failure rate? First, they don’t count anyone who doesn’t complete (graduate from) the full program. Then they count as a success anyone who stays clean for a specified follow-up period: 3 months, 6 months or 1 year. So 40 of the 50 graduates were clean at 1 year, and that is the 80%. Actually, from these numbers our Agency X is having a good deal of success with their recovery program. The dropout rate is lower than average and they are using a longer recovery baseline than most. I would support this program. Unless you are informed about industry standards, it is hard to judge what these numbers mean. This is true of any study you read, for any industry or project. If a study produces results that cannot be spun to favour the outcome the funding parties want, it will simply be scrapped and you will never see it. An example of spin is “Labrador’s seal hunt culls 20% of seal population” versus “Labrador sanctions the slaughter of 1000 baby seals.” The statistic may have been provided by an impartial 3rd party (like the department of fisheries), but the report you are reading carries an agenda. The truth usually falls somewhere in the middle. If you care about the issue, get informed before blindly repeating a statistic like it is written in stone.
b. The illusory correlation is one of my favourite analytical problems. The example my professor used was this: babies have no teeth, and babies cannot talk; this is true 100% of the time; therefore we can conclude that without teeth one cannot talk. This is an obvious and easily refuted example, but let’s look at a real-world statistic. In my work it shows up as things like “80% of homeless people suffer from mental health disorders.” The statistic is not a lie, but it is misleading. The illusion is that mental health (MH) issues cause homelessness. What the statistic hides is that homelessness will almost certainly cause (situational) depression, which is a mental health issue. Another good example of the illusory correlation, long ago dispelled by science and statisticians, was that Black Americans consistently scored more than 10% below Whites on standardized IQ tests… this statistic was held for a long time as proof that Blacks were genetically inferior to Whites. But one day someone looked at the actual test questions: a test written by white people for white cultural norms. The test was rewritten from Black cultural references and white people failed miserably. This crosses into the next problem with statistical information.
c. The size and nature of the sample being used for the study. Whether it is people or micro-organisms, a small or narrow sample leads to less reliable results. For example, a survey on the desirability of a particular city-funded project should reflect the ethnic diversity of that city’s population. A large sample of 10,000 white people in Thunder Bay is not going to give an accurate picture of public opinion when 30% of that city’s population is Native. And a small sample of 7 White people and 3 Natives will not truly reflect the community’s position either. The larger the sample, and the more closely it reflects the population’s demographics, the more accurate the result.
d. Once you have examined the above three factors, you may still have to question why you want this to be true or false. Using the “80% of homeless are mentally ill” stat as an example… it is comforting; it makes us feel like homelessness can never happen to us. Statistics can give us scapegoats and absolve us of responsibility… often for our own lives and our own happiness.
e. Lastly, we need to consider the issue of missing data, or the inability to follow up with subjects, leading to misperceptions. You will find sources that say the vast majority of child abusers were abused themselves as children. This is not saying that the majority of abused children grow up to be abusers. Just as most Muslims are not terrorists, most middle-aged white males are not pedophiles and most young people are not gang bangers, most abuse survivors are not abusers. We as a culture do not collect statistics on what is right, good and working well. So maybe we have to ask, in our own minds, “What about the rest?” What went right (but that is another blog)?
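If it helps to see the Agency X arithmetic from point a laid out, here is a rough sketch in Python. The figures are the ones from that example; the variable names and the little `rate` helper are mine, made up for illustration, and this is only one way of slicing the same numbers.

```python
# Figures from the Agency X example in point a.
# All names here are illustrative, not anyone's official method.
entered = 100            # people who start the program
completed = 50           # people who graduate
clean_at_one_year = 40   # graduates still clean at the 1-year follow-up
later_relapsed = 20      # of those 40, how many eventually relapse

still_clean = clean_at_one_year - later_relapsed


def rate(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# The advertised figure: only graduates count, measured at 1 year.
advertised = rate(clean_at_one_year, completed)

# The skeptic's figure: everyone who entered, measured over the long run.
of_all_entrants = rate(still_clean, entered)

# The dropout rate the advertised figure leaves out.
dropout = rate(entered - completed, entered)

print(f"Advertised success rate: {advertised}%")
print(f"Long-run success among all entrants: {of_all_entrants}%")
print(f"Dropout rate: {dropout}%")
```

Same 100 people, same outcomes; the only thing that changes between 80% and 20% is who gets counted and when the counting stops.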
The only thing I can do is urge you to be cautious when reading (or embracing) a statistical fact… and if the issue really matters to you, learn more. Find different sources on both sides of the argument. Try looking objectively from outside the issue and be honest with yourself. Does this feel right to you?… REALLY.

Have a joyous day.