How (not) to use statistics

Statistics can provide validity to what you are trying to write (or say, if presenting orally), but you can’t just drop a number in and – voila! – have validity. Even if that statistic is related to what you are writing about, it isn’t always enough.

I recently read an opinion piece in the NY Times (yes, that piece, the one I wrote about on my other blog here: https://taramoeller.wordpress.com/?p=935 ) that used “statistics” to support it, and on the surface, it worked. But as a technical editor with experience editing articles and reports that use statistics, I found that these numbers didn’t work for me. While the numbers seemed to support the opinion presented, they weren’t actually cited, so I couldn’t go check them. Which meant I couldn’t easily verify that they actually supported the opinion presented.

You see, statistics are fussy things. If you’ve heard that a person can skew any data into statistics that support their view, you know what I am on about. In my career (25-plus years), I learned how to properly collect data and analyze it because I was editing analysis and results. I had to know where the numbers I was editing came from so that I could understand them enough to edit the results extrapolated from them.

Data need (data is plural, datum is singular) to be collected for a purpose; you need to be trying to answer a specific question, i.e., the scientific method. Once you have that question, before you start collecting any data, you need to know how you are going to analyze the data you collect. This will help you to collect the right kind of data.

As good analysts say: crap data in, crap info out.

[If you want to know more about recognizing good data/results as an editor, let me know and I’ll write a future post about it.]

While most people think that data and information are the same thing, in the world of data analysis, they are very different. In this world, analysts take raw data (all those numbers and data points collected), examine and analyze them, and turn them into information (or, in some cases, the answer to the question asked). It is only good information if the data collected were relevant to the answer (and if the question was formulated in a way that it COULD be answered).

[Formulating questions might be another post, especially since as an editor, an analyst might ask for help.]

For instance, if I wanted to know the average age of first menses, I would not collect data from biological young men. They have no relevant data to add to the data set; their answer would almost always be never. [If you are wondering about that “almost”, intersex people exist, and an intersex person can be both biologically male and biologically female.] Similarly, if I wanted to know how many young parents felt safe taking their children to the emergency department, I would not ask grandparents (their answers would either be second-hand or out of date).

This particular essay mentioned an NBC News poll that “asked young people to name the life goals that were important to their personal definition of success.” The essay states that in this survey, young men were more likely to prioritize family goals (marriage and children) than young women.

Hmm. I have questions.

How young is “young”? How many “young” men participated in the poll? How many “young” women? Were they provided a list to choose from, or did they have to write in a response that was then binned by the polling entity? How much is the “more” stated in the essay? 1% more or 10% more?

I performed a Google search and found the original NBC News article (the poll was conducted from Aug. 23 to Sept. 1, 2025, with 2,970 respondents, and it asked a LOT more questions than just about life goals and success) (https://www.nbcnews.com/politics/politics-news/poll-gen-zs-gender-divide-reaches-politics-views-marriage-children-suc-rcna229255).

I found a couple of the answers I needed. “Young” means 18- to 29-year-olds (Gen Z, in the article), and the poll listed 13 items from which each participant (there were 2,970) could choose their “top 3.” The article provides a ranked-list result by gender (as identified by the participant? Hmmm.) The article also provides a margin of error of +/- 2.2 percentage points (um…since we’re provided a ranked list, it’s hard to determine what this actually means with regard to the results.)
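For context on where that margin-of-error figure comes from: the standard formula assumes a simple random sample and the worst-case proportion of 50% at 95% confidence. Here is a quick back-of-the-envelope check in Python; the sample size is from the NBC article, but everything else (the formula choice, the confidence level) is my assumption about how the pollster computed it:

```python
import math

n = 2970   # respondents, per the NBC News article
p = 0.5    # worst-case proportion (maximizes the margin of error)
z = 1.96   # z-score for roughly 95% confidence

# Simple-random-sample margin of error for a single proportion
moe = z * math.sqrt(p * (1 - p) / n)
print(f"simple-random-sample MOE: +/- {moe * 100:.1f} percentage points")
```

That this comes out a bit below the reported +/- 2.2 points is typical: pollsters usually inflate the simple-sample figure with a “design effect” to account for weighting, which is yet another detail the essay never mentions.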

The article did not specify how many women participated versus how many men; the respondents were all lumped into a single group (2,970 adults aged 18 to 29). If significantly more men answered than women (or vice versa), the conclusions are not as relevant as claimed, and since the article does not provide this information, I find its results suspect.

Such answers can impact the statistical significance of the results. Statistical significance is important in analysis (see https://www.statology.org/understanding-statistical-significance/.) If there were not enough participants in the poll, and the number of young men participants was a lot larger (or a lot smaller) than the number of young women participants, these results (vague as they are) may mean NOTHING.
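To make that concrete, here is a rough sketch of how the very same percentage gap can be significant or meaningless depending on group sizes, using a standard two-proportion z-test. The counts below are invented for illustration; they are not from the poll:

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """Two-proportion z-test: is the gap between p1 and p2
    bigger than sampling noise alone would explain?"""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# The same 5-point gap (45% vs. 40%), with very different group sizes:
z_big = two_prop_z(0.45, 1500, 0.40, 1400)   # large, balanced groups
z_small = two_prop_z(0.45, 150, 0.40, 50)    # small, lopsided groups
print(f"balanced groups:  z = {z_big:.2f}")
print(f"lopsided groups:  z = {z_small:.2f}")
```

With 1,500 men and 1,400 women, the gap clears the usual |z| > 1.96 bar for significance; with 150 and 50, it does not, even though the percentages are identical. Without the group counts, a reader cannot tell which of these two worlds the essay’s numbers live in.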

Let’s look at another example from this essay.

The essay refers to a General Social Survey (who administered it and for what purpose?) that found that in the 1980s there was little gap (what does this mean exactly?) in the number of children conservative women had versus liberal women (they were asking 25- to 35-year-olds; this information was in the essay). By the 2020s (per this essay), 71% of conservative women in this age range had children versus only 40% of liberal women.

So what is wrong with these numbers? While we know a specific age range, we still do not know how many 25- to 35-year-old women participated in the survey. Nor do we know the ratio of conservative women to liberal women. If 80 conservative women responded but only 20 liberal women, I don’t think anyone could legitimately claim these numbers as valid (significant), unless we knew for certain that 80% of the general population of 25- to 35-year-old women identified themselves as conservative (or were identified as conservative by the surveying organization), that we had strict definitions to apply to determine whether a woman was conservative or liberal, and that those definitions hadn’t changed in 45 years.
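Here is what that hypothetical 80-versus-20 split does to the uncertainty around each group’s estimate. The percentages are the essay’s (71% and 40%); the counts are my made-up example from the paragraph above, and the interval is a rough normal approximation:

```python
import math

def ci95(p, n):
    """Rough 95% normal-approximation confidence interval for a proportion."""
    half = 1.96 * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

# Hypothetical counts, with the essay's percentages:
for label, p, n in [("conservative", 0.71, 80), ("liberal", 0.40, 20)]:
    lo, hi = ci95(p, n)
    print(f"{label}: {p:.0%} of n={n} -> 95% CI roughly {lo:.0%} to {hi:.0%}")
```

With only 20 liberal respondents, that 40% estimate comes with an interval more than 40 percentage points wide, so the dramatic-looking gap could be far smaller than the essay implies. This is exactly why the raw counts matter.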

[Deep breath.]

After my quick Google search for General Social Survey, I found that it has been conducted for 50 years by NORC at the University of Chicago. What is the NORC? It is the National Opinion Research Center. From their website: “NORC at the University of Chicago is an objective, nonpartisan research organization that delivers insights and analysis decision-makers trust.”

This group provides access to the data it has been collecting, along with information about how those data have been collected and analyzed. It even provides a way to extract the data into a statistical analysis tool, such as R.

So, back to what the essay provides from this survey: two sets of numbers (almost) in contrast, a set from the 1980s and a set from the 2020s. These numbers are not immediately accessible from the website (the website only provides access to the data, not analysis of the data), and the essay provides nothing about how the author determined these numbers or about a source that analyzed the data and published a conclusion (ratios of women with children to those without). Unless I extracted the data from the site into a tool and analyzed it myself, I can’t be certain how valid these numbers are. And the vagueness of that first set (from the 1980s) calls their veracity into doubt.

Let’s go back to the Scientific Method and needing the data you use to be collected for a specific purpose. What NORC holds is a database of semi-random data collected about societal opinion over the last 50 years. Unless we are absolutely certain that the definitions of conservative and liberal have not changed over the last 50 years (hint: while their denotative definitions may not have changed, societal (connotative) definitions have, leading folks to possibly self-identify differently), we cannot compare or contrast these results.

[Compare is to look for similarities between 2 objects or groups; contrast is to look for differences between 2 objects or groups. And yes, in analysis, it matters which one you are doing.]

It’s not just about the validity of the collected data, but also about how the data are analyzed and any conclusions made. Data sets need to be collected the same way, with the same parameters, to even be used in the same analysis. And identical analyses must be performed for any two results to be used to answer a single question about the two groups.

The essay provides the reader with absolutely no information about how anything was analyzed, and since I have questions about that (either about how the source analyzed the data or about how the author analyzed a data set), as a technical editor I find the whole essay (and thus, the opinion) suspect.

A misrepresentation of a single statistic in an otherwise well-written essay or article can invalidate everything else, no matter how well-presented everything else is.