In today’s data-driven society, students from nearly every discipline need to be able to understand and use statistics. Luckily, you do not need a formal statistics background in order to become proficient.
Any basic statistics course that covers the fundamentals will include instruction on four topics: measures of central tendency, measures of dispersion, measures of relationship, and organizing and displaying data.
Measures of Central Tendency: Even a statistics novice will feel comfortable with this portion of the class, since this complicated phrase really just refers to the mean (average), median (the middle value when the data are sorted) and mode (the most frequently occurring value) of a data set. The trick is to determine which of these three measures of the center is the most useful for the data you have. For instance, a few extreme values can pull the mean well away from the median, so the median often describes skewed data, such as incomes, better than the mean does.
These measures of central tendency are important in a number of fields. Educators, for example, often aim to keep the mean grade of a class close to a C, the nominal average (depending on how leniently they grade).
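To make the three measures concrete, here is a minimal Python sketch using the standard library's statistics module. The exam scores and salaries are made-up illustrative data, and the helper name center is hypothetical; neither comes from any particular course.

```python
from statistics import mean, median, mode

def center(values):
    """Return the three common measures of central tendency."""
    return {
        "mean": mean(values),      # arithmetic average
        "median": median(values),  # middle value of the sorted data
        "mode": mode(values),      # most frequently occurring value
    }

# Hypothetical exam scores (illustrative only).
scores = [55, 62, 70, 70, 74, 78, 81, 90, 95]
score_summary = center(scores)

# Hypothetical salaries in thousands, with one extreme value.
salaries = [30, 32, 35, 35, 38, 40, 250]
salary_summary = center(salaries)

print(score_summary)
print(salary_summary)
```

For the exam scores, the three measures land close together. For the salaries, the single value of 250 drags the mean far above the median of 35, which is why the median is often the better summary for skewed data.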
Measures of Dispersion: In order to understand a data set, it is important to know how close together or spread apart the individual values are. For example, if a class of nine earns 1 A, 2 Bs, 3 Cs, 2 Ds and 1 F, the distribution would be called wide, or spread out, but if the grades were 1 B, 7 Cs and 1 D, the distribution would be called narrow. The standard deviation, which summarizes how far scores typically fall from the mean (it is the square root of the average squared deviation from the mean), is the most commonly used measure of dispersion.
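The two nine-student classes described above can be compared with a short Python sketch. It assumes the usual 4-point mapping of letter grades (A = 4 down to F = 0), which is an assumption made here for illustration, and it uses the population standard deviation since each class is treated as the whole group of interest.

```python
from statistics import mean, pstdev

# Assumed mapping from letter grades to points (illustrative).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

def to_points(counts):
    """Expand a {letter: count} mapping into a list of grade points."""
    points = []
    for letter, count in counts.items():
        points.extend([GRADE_POINTS[letter]] * count)
    return points

# The spread-out class and the tightly clustered class.
wide = to_points({"A": 1, "B": 2, "C": 3, "D": 2, "F": 1})
narrow = to_points({"B": 1, "C": 7, "D": 1})

# Population standard deviation: square root of the mean squared
# deviation from the mean.
wide_sd = pstdev(wide)
narrow_sd = pstdev(narrow)

print(f"wide:   mean={mean(wide)}, sd={wide_sd:.2f}")
print(f"narrow: mean={mean(narrow)}, sd={narrow_sd:.2f}")
```

Both classes have the same mean (a C), so the mean alone cannot tell them apart. The standard deviation can: it is about 1.15 grade points for the first class and about 0.47 for the second.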
Under this heading, students learn about the bell curve, or normal curve, which roughly describes the way many things are distributed in this world, including weight, height, IQ, and, often, grades in a class. Psychology, which bases its diagnoses on observations, is keenly interested in gauging how far from the norm a particular individual’s symptoms or test scores fall.
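One common way to express "how far from the norm" a value sits on a bell curve is the z-score, the number of standard deviations a value lies above or below the mean. The sketch below assumes an IQ scale with a mean of 100 and a standard deviation of 15, a widely used convention that is an assumption here rather than something stated above. The function name z_score is hypothetical.

```python
from statistics import NormalDist

def z_score(value, mu, sigma):
    """Number of standard deviations that value lies from the mean."""
    return (value - mu) / sigma

# Assumed IQ scale parameters (illustrative convention).
IQ_MEAN, IQ_SD = 100, 15

z = z_score(130, IQ_MEAN, IQ_SD)

# Share of a standard normal distribution that falls below this z-score.
share_below = NormalDist().cdf(z)

print(f"z = {z:.1f}, share below = {share_below:.3f}")
```

Under these assumptions, a score of 130 sits two standard deviations above the mean, higher than roughly 97.7 percent of the population.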
Measures of Relationship: These measures describe how values relate to each other. The two primary ideas are correlation (two things tend to occur or vary together) and causation (one thing causes another). There are a number of ways to measure how two variables correlate, but the most commonly used in statistics, the Pearson correlation coefficient, ranges from -1.00, where the two variables are perfectly negatively correlated, to +1.00, where the two variables are perfectly positively correlated. The closer the number is to 0, whether positive or negative, the weaker the correlation, or relationship.
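As a rough illustration of a correlation coefficient on the -1.00 to +1.00 scale, the following Python sketch computes the Pearson coefficient by hand. The reading and math scores are invented for the example, and the function name pearson_r is a hypothetical helper, not a library call.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of paired deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations of each variable.
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return co_dev / (dev_x * dev_y)

# Hypothetical reading and math scores for five students.
reading = [60, 65, 70, 80, 90]
math_scores = [58, 70, 68, 85, 88]

r = pearson_r(reading, math_scores)
print(f"r = {r:.2f}")
```

A value of r near +1 means students who score high in one subject tend to score high in the other. It says nothing about whether one score causes the other.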
Just because two variables are correlated does not mean one causes the other. For example, reading and math grades are usually positively correlated, but neither typically causes the other. In order to move from a strong correlation to a claim of causation, statisticians have to use a method, such as a controlled experiment with a control group, to test the hypothesis.
Perhaps surprisingly, historians frequently rely on statistical relationships in order to explain past events. For example, using statistical data on the causes of death among large numbers of soldiers, historians have recently hypothesized that Napoleon was defeated not by the Russian winter, but by the spread of typhus through lice infestation.
Organizing and Displaying Data: Statistics students learn to create graphs and charts that explain the patterns and trends in a data set. Commonly, these include line and bar graphs, histograms, scatter plots, pie charts and stem-and-leaf plots. Geographers, for example, frequently create maps that include pie charts of demographic data (race, gender, income and location of people) to describe how people relate to their environments.
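Of the displays listed above, the stem-and-leaf plot is simple enough to build as plain text. The sketch below is one possible approach, assuming non-negative whole-number values where the last digit is the leaf and the remaining digits are the stem. The scores are made-up data and stem_and_leaf is a hypothetical helper name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a text stem-and-leaf plot.

    Assumes non-negative integers: the last digit is the leaf and
    the remaining leading digits form the stem.
    """
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]

# Hypothetical exam scores (illustrative only).
scores = [55, 62, 70, 70, 74, 78, 81, 90, 95]
plot_lines = stem_and_leaf(scores)

for line in plot_lines:
    print(line)
```

Each row groups the scores that share a tens digit, so the longest row shows at a glance where the scores cluster, much like a histogram turned on its side.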
Top Statistics Fields
While everyone in today’s information economy can benefit from knowing statistics, for some industries, mastering statistical methods is a must. Actuaries use probability and statistics to predict mortality and anticipate illness and other human conditions. They then factor those predictions into equations to set insurance premiums that are profitable for the company, but also reasonably priced for the consumer.
Data miners are often computer scientists who use statistical methods to sift through large amounts of data, looking for patterns. Using techniques such as cluster analysis, association rule mining and anomaly detection, while guarding against data dredging (mistaking chance patterns for real ones), data miners help businesses target likely customers and help geneticists identify how a person’s DNA may affect their risk of developing an inherited disease like cystic fibrosis.
Today’s economists have become masters of statistical data analysis. In their sub-field of econometrics, economists use statistics to “sift through mountains of data to extract simple relationships.” Economists study data sets of observations over a period of time (time series), observations of a variety of individuals at one point in time (cross-sectional data), or both (panel data), and use them to forecast all sorts of economic conditions, such as inflation, wages and Gross Domestic Product (GDP).
Epidemiologists, who study disease patterns, rely on statistical information to understand the causes, effects and spread of disease in specific populations. These professionals design statistical studies, collect data and analyze it to better understand disease outbreaks, as well as to follow their spread and even monitor the effectiveness of experimental treatments.
The recent presidential election demonstrated how political scientists and other analysts rely on statistics to predict election results. Nate Silver is perhaps the most famous of these. Using computer models that incorporated poll results, economic data, likely turnout, demographics and historical data, Silver and his colleagues at the FiveThirtyEight blog produced probability estimates of each candidate’s chances of winning. And their predictions were correct.
Students looking to major in statistics should have at least four semesters of coursework in calculus, linear algebra and differential equations. Frequently, students in related fields, such as computer science or chemistry, choose to minor in statistics to improve their employability. Regardless of your level of interest, get started on your exploration of statistics with free and open courses and other materials at online schools like Coursera, MIT OpenCourseWare and Udacity.