Compensation Models, Algorithms and Projections, Oh My!

Posted January 15, 2016

In preparation for our “2016 Compensation Resolutions #3” post on the need to rely on internal equity planning over the vagaries of using market data in establishing compensation programs, I realized that my “preface” was over 1,600 words, and that my social media manager was going to have a stroke. That led to three posts out of one. The first spoke to the issue of using recruiters’ “data” in your planning. Now we take on something just as scary – the “prediction” method of market data collection based on models, algorithms and projections.

Popular on the internet, and really cool if you can afford them, these programs allow you to input some demographic information, push “submit” and voila, a highly accurate prediction of the market rate for any given job springs up. This purportedly eliminates the need to have an actual compensation program or consult published salary surveys. Now, one might be curious about how these programs are able to tell me that the market rate for a Food Science Engineer with 11 years of experience in a $1 billion global agri-business company located in Royal Oak, Michigan is $113,273. Curious, perhaps, because there are no $1 billion global agri-business companies located here in our headquarters town, and I highly doubt there are any Food Science Engineers living among the approximately 60,000 residents here (hey, if you are out there, raise your hand, because I have some questions about glutens to ask you!).

Whichever program you use, take a look at the “methodology” page – assuming you can find it. Many of these programs tell you about all the real surveys they use, whether those are compilations of published data, “millions of submissions by users” or Bureau of Labor Statistics data. We’ll give them the benefit of the doubt that they do have real data buried in there somewhere. But now comes the fun part. They have to find a way to smooth all the data out, running it through multiple regression models and carefully designed algorithms. They need to put in all sorts of other data, like cost of living or “regional pay differentials,” to fill in the holes. Remember how the scientists in Jurassic Park used frog DNA to fill in the blanks of the dinosaurs – that didn’t work out so well, did it?

The truth is, the more complex the model, the more absurd the result. Sitting in an airport a few months back, one of my colleagues and I thought about how much fun it would be to start consulting with micro-breweries. We did some research and found an actual survey by an industry group, but at the time weren’t that interested in buying the data. So instead, we went to one of the online compensation gurus and input our information. Guess what: it turns out that a brewmaster in Naples, Florida earns twice as much as a brewmaster in nearby Ft. Myers! We were surprised, given that we’d been to most of the micro-breweries in the area and knew for a fact it wasn’t true. However – I bet that “algorithm” took into account the much higher cost of living and average incomes in Naples, and applied that to create the silly projection.

The models and algorithms also explain why (and local Detroit-area folks will appreciate this) the market rate for an accountant in manufacturing in Oak Park, Michigan would project to be 30% less than for a similar job in neighboring Pleasant Ridge, Michigan. Of course, the fact that beautiful Pleasant Ridge (I grew up right next door in Oak Park) is a 0.57 square mile suburb that doesn’t have a single manufacturing plant apparently doesn’t count, but the fact that housing there costs a lot more than in Oak Park does.
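For the curious, here is a minimal sketch of the kind of adjustment I suspect is at work. It is my guess at the mechanism, not any vendor's actual method, and every name and number in it is invented for illustration: a single national rate is scaled by a town-level cost-of-living index, with no check on whether the job even exists in that town.

```python
# Hypothetical illustration only: the base rate and both index values are
# invented, and no real compensation tool is known to work exactly this way.
NATIONAL_BASE_RATE = 50_000  # assumed national rate for the job, in dollars

# Assumed cost-of-living indices, where 100 means the national average.
COST_OF_LIVING_INDEX = {
    "Naples, FL": 160,
    "Ft. Myers, FL": 80,
}


def project_market_rate(base_rate: float, town: str) -> float:
    """Naive projection: scale one national rate by the town's index."""
    return base_rate * COST_OF_LIVING_INDEX[town] / 100


naples = project_market_rate(NATIONAL_BASE_RATE, "Naples, FL")
ft_myers = project_market_rate(NATIONAL_BASE_RATE, "Ft. Myers, FL")

print(f"Naples projection:    ${naples:,.0f}")
print(f"Ft. Myers projection: ${ft_myers:,.0f}")
print(f"Ratio: {naples / ft_myers:.1f}x")
```

Nothing in that calculation knows that the two towns share one labor market, or that the same brewmasters could work in either one. The gap in the output comes entirely from the index, not from anything an employer actually pays.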

The morals of the story are:

Attempting to create precision projections based on multiple variables is both silly and counter-productive. Labor markets aren’t restricted to single zip-codes, or even BLS geographic regions, and breakdowns based on the fifth or sixth digit of a North American Industry Classification System (NAICS) code ignore the reality that many markets don’t have a single employer in that code.
Labor market data analysis is NOT a science. The use of labor market data requires understanding where it comes from, who provided it, how representative a sample it is, and most of all, its limitations.

Along with data provided by recruiters, educational institutions and advocacy groups, “projections” based on models and algorithms need to be used with great care.