Regression

Scientists use regression techniques to find and illustrate trends in the relationship between two items, such as the relationship between degree of supervision and unsafe behaviours. These trends can then be used to make predictions.

Suppose you were wondering if workers were more likely to work safely if they were more closely supervised. To learn more, you’d have to set up a study and use a common concept in research called regression.

Your study might look at different ratios of workers to supervisors at several large plants: 10 workers per supervisor, 20 per supervisor, 30, 40, and so on up to 60 workers per supervisor. Then you might observe each worker for a set time period over different shifts, and record any unsafe behaviours.

To find out how these two items (the worker-supervisor ratio and unsafe behaviours) are related, and to see any trends in the relationship, you would fit a regression line. A regression is used to learn more about the relationship between two items or “variables.” Have a look at the chart below (remember, these numbers are fictional):

In this example, you see there are more and more unsafe behaviours as supervisors become responsible for a growing number of workers. The two “variables” are the worker-supervisor ratio and the number of unsafe behaviours.

This type of chart is called a scatter plot. It’s the first step in identifying a regression line.

In this case, the worker-supervisor ratio is called the “independent” or predictor variable. Its values are fixed in advance by the study design. The independent variable is mapped out along the horizontal or X axis. We want to see how this ratio affects or predicts our other variable, unsafe behaviours. The number of unsafe behaviours is called the “dependent” or response variable. It goes along the vertical or Y axis.

Using regression to predict outcomes

Now let’s see how the chart can be used. The next step is to draw the line that best “fits” the dots, typically the one that minimizes the squared vertical distances between the dots and the line. This is a linear regression line, and it is usually calculated by a software program. It’s also called a trend line because it shows the trend between the two variables.
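To make the idea of a "best fit" concrete, here is a minimal sketch of how a program might calculate a straight regression line by ordinary least squares. The data values and the function name `fit_line` are assumptions made up for illustration; they are not the numbers from the study described in this article.

```python
# Fictional data in the spirit of the article's example (assumed values).
ratios = [10, 20, 30, 40, 50, 60]   # workers per supervisor (X, predictor)
unsafe = [2, 3, 5, 6, 8, 9]         # unsafe behaviours observed (Y, response)


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How X and Y vary together, and how much X varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(ratios, unsafe)
print(f"unsafe behaviours = {intercept:.2f} + {slope:.3f} * ratio")
```

With these made-up numbers the slope is positive, which matches the upward trend described in the text: more workers per supervisor goes with more unsafe behaviours.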

We can now use the regression line to make predictions. One example might be to estimate the number of unsafe behaviours when there are 25 workers per supervisor, which we didn’t look at in the study.
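A prediction of this kind amounts to plugging a new X value into the fitted line. The sketch below fits a least-squares line to made-up data and then estimates the outcome at 25 workers per supervisor. The data and the names `fit_line` and `predict` are hypothetical, chosen only to illustrate the step.

```python
# Fictional observations (assumed values); 25 workers per supervisor
# is deliberately not among them.
ratios = [10, 20, 30, 40, 50, 60]   # workers per supervisor
unsafe = [2, 3, 5, 6, 8, 9]         # unsafe behaviours observed


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(ratio, intercept, slope):
    """Read the estimated number of unsafe behaviours off the line."""
    return intercept + slope * ratio


intercept, slope = fit_line(ratios, unsafe)
estimate = predict(25, intercept, slope)
print(f"Estimated unsafe behaviours at 25 workers per supervisor: {estimate:.1f}")
```

Because 25 lies between two ratios that were observed (20 and 30), the estimate here falls between the counts recorded at those two ratios.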

Scientists may estimate a mathematical equation for the line, called the regression equation. They can further analyze their results in different ways. They might consider how far each point falls from the line. They might also test how well their predictions match actual values by examining safety behaviours in units whose worker-supervisor ratios differ from those they measured.
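As a rough illustration of "how far each point falls from the line," the sketch below computes the vertical distance of each point from a fitted line (often called a residual). It also computes R-squared, one common summary of fit that is added here as an assumption and is not mentioned in the article. The data are made up.

```python
# Fictional observations (assumed values).
ratios = [10, 20, 30, 40, 50, 60]   # workers per supervisor
unsafe = [2, 3, 5, 6, 8, 9]         # unsafe behaviours observed


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(ratios, unsafe)

# Residual = observed value minus the value the line predicts.
residuals = [y - (intercept + slope * x) for x, y in zip(ratios, unsafe)]

# R-squared: share of the variation in Y that the line accounts for.
mean_y = sum(unsafe) / len(unsafe)
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in unsafe)
r_squared = 1 - ss_res / ss_tot

for x, r in zip(ratios, residuals):
    print(f"ratio {x}: point is {r:+.2f} from the line")
print(f"R-squared: {r_squared:.3f}")
```

Small residuals and an R-squared close to 1 suggest the line describes these points well; large or patterned residuals would suggest it does not.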

Sometimes, a straight line is not the best way to describe the relationship between the variables under study. Maybe a curve would be better. Regression techniques can accommodate these types of relationships, too.
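One way a curve can be fitted is to add a squared term to the regression equation. The sketch below, which assumes made-up data constructed to bend upward, fits both a straight line and a quadratic curve and compares how far the points fall from each. The helper names `solve`, `fit_polynomial` and `sse` are hypothetical, and a quadratic is only one of several possible curve shapes.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns coefficients [c0, c1, ..., c_degree]."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)


# Fictional data (assumed values) that rise faster and faster.
ratios = [10, 20, 30, 40, 50, 60]
curved_unsafe = [1, 2, 4, 7, 11, 16]


def sse(coeffs):
    """Sum of squared distances between the points and the fitted curve."""
    return sum(
        (y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
        for x, y in zip(ratios, curved_unsafe)
    )


line = fit_polynomial(ratios, curved_unsafe, 1)
curve = fit_polynomial(ratios, curved_unsafe, 2)
print(f"straight line misses by (squared total): {sse(line):.3f}")
print(f"quadratic curve misses by (squared total): {sse(curve):.3f}")
```

For data that bend like these, the curve sits much closer to the points than the straight line does.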

Working with more than one variable

But you may be thinking that other things affect workers’ safety behaviours, too. For example, what about workers’ experience? What about the amount of training they’ve had?

In such situations, with more than one item or variable that could predict an outcome, scientists can do a multiple regression. Multiple regressions are often used in research studies. They are more complicated to calculate, but they can make the prediction more accurate when the added variables are genuinely related to the outcome.
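The sketch below shows, under stated assumptions, what a multiple regression with two predictors might look like: the worker-supervisor ratio and hours of training. The data are fictional and were constructed to follow an exact rule so the result is easy to check; real data would be noisier. The names `solve` and `fit_multiple` are hypothetical, and the method assumes the predictors are not perfect copies of one another.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Least-squares fit with several predictors.
    rows: one tuple of predictor values per observation.
    Returns [intercept, coefficient_1, coefficient_2, ...]."""
    design = [[1.0, *row] for row in rows]  # leading 1.0 gives the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Fictional data (assumed values): (workers per supervisor, training hours).
predictors = [(10, 8), (20, 2), (30, 6), (40, 0), (50, 4), (60, 2)]
unsafe = [1, 5, 4, 8, 7, 9]

intercept, per_worker, per_training_hour = fit_multiple(predictors, unsafe)
print(f"unsafe = {intercept:.2f} + {per_worker:.2f} * ratio "
      f"+ {per_training_hour:.2f} * training_hours")
```

Each coefficient estimates how the outcome changes with one predictor while the other is held constant. In this made-up example, unsafe behaviours rise with the ratio and fall with training.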

Regressions are used in many different ways to help inform decision-making. You might use regression to measure the relationship between training hours and injury rates, mammogram screening rates and breast cancer mortality, or Grade 12 averages and first-year university averages.

Source: At Work, Issue 57, Summer 2009: Institute for Work & Health, Toronto