Double standards in evaluation
RESEARCH IN BRIEF 
Double standards in evaluation:
Average men get the benefit of the doubt
Are women held to a different standard than men in real-world evaluative situations? While lab-based evidence of double standards in evaluation exists, some have argued that competitive pressure in the market resolves any bias by creating a disincentive to favor candidates based on gender. However, this paper demonstrates that even in a competitive financial market setting, where users are highly motivated to seek recommendations that yield the highest returns, evaluators do use gender to rank candidates (favoring men). As a result, average-quality men were given the benefit of the doubt while average-quality women were penalized. The researchers also found evidence against the widely held belief that women are more risk-averse than men: examining the excess return volatility, expected return, investment horizon, and short position of investment recommendations, they found that women and men are similarly risk-loving.
Evaluations take place frequently in the course of a person’s career (e.g. hiring, performance assessments) and they have a significant impact on economic outcomes. However, evaluations in real-world settings are more of an art than a science and when information is lacking or uncertain, evaluators often rely on other indicators to determine expected quality. Research has shown that gender is associated with widely held beliefs about expected performance and is often used as one such indicator of quality.
Research has also shown that gender is used in evaluations even when information about the actual quality of an individual is known. This is the double standard bias: women are held to a higher standard and must outperform men (who, research has shown, are generally expected to be more competent) in order to receive similar evaluations and recognition. The different standards by which men and women are judged therefore have far-reaching effects and contribute to both social and economic inequality.
This paper uses a field-based study to determine whether double standards exist in real-world evaluations and explores when and how gender informs evaluations. The study used data from a private online knowledge-sharing platform used by buy-side investment professionals. Users of the platform submit investment recommendations and also evaluate the recommendations submitted by other users. In this competitive setting, users are highly motivated to select the best recommendations in order to earn the highest returns, and the subsequent performance of each recommendation in the financial markets provides an unbiased measure of its quality.
The study found that users clicked on the submissions of recommenders with male names (e.g. Matthew) 33% more often than those with obviously female names (e.g. Mary). Average-performing men were preferred over similarly performing women, and even in the top-performing quartile, men received more page views. In fact, women had to be the cream of the crop (in the top 10% of users) for their page views to equal those of men with a similar performance level. This finding provides evidence that women are indeed held to a double standard of evaluation.
To categorize recommenders by gender, each was given a female name score from 0 (Matthew) to 99 (Mary), signifying how likely it was that the name belonged to a woman. Interestingly, the study found that even men with more feminine-sounding names (and therefore higher female name scores) received fewer page views than their counterparts with more masculine-sounding names.
The study also observed users’ behavior after they had clicked on a submission and been shown more information about the recommendation. The findings suggest that once evaluators have reviewed the submission in full, the feedback they provide reveals no significant difference between the evaluation of performance for male and female recommenders.
The researchers also examined whether some inherent difference in the female recommenders’ behavior may have been influencing their evaluation. While ruling out any systematic gender differences, they paid special attention to risk-aversion behavior, given its relevance to the financial services industry and the pervasiveness of perceived female risk-aversion. The study examined the excess return volatility, expected return, investment horizon, and short position of investment recommendations and found that women and men act in similarly risk-loving ways.
- Many organizations and companies are now exploring how to be gender blind in their hiring and evaluation processes. For example, many symphony orchestras have adopted blind auditions where the musicians play behind a screen as they are evaluated – this has directly contributed to significant gains in female musicians’ representation in this highly competitive field.
- Another practice many companies are increasingly adopting is to remove applicants’ names or use only initials when generating long-lists for consideration. While it is best practice to ensure that the short-list is intentionally diverse, the generation of the long-list in a gender-blind way can aid in this effort.
- This study demonstrates that even in the most competitive settings, gender bias still influences decision-making. Meritocracy cannot be assumed, even in situations where there is a financial disincentive to be gender-biased. Gender is considered a salient characteristic now in part because women are so underrepresented; increasing the representation of women at all levels could help to reduce this bias.
Title: Pursuing Quality: How search costs and uncertainty magnify double standards in a multistage evaluation process
Authors: Tristan Botelho, Mabel Abraham
Institutions: Massachusetts Institute of Technology, Columbia University
Source: Administrative Science Quarterly
Published: February 2017
Subject: Double standards, evaluation, inequality, markets, status, stratification
Research brief prepared by Celeste Jalbert