Research Interests

My primary research interests are moral judgments and behaviors, and stereotyping and discrimination in the workplace. In recent years my collaborators and I have increasingly examined these topics using diverse methodological approaches, including crowdsourced initiatives designed to increase the rigor and transparency of the research.
 
Moral judgments and behaviors
 
My first major line of research examines the intuitive underpinnings of human morality. The person-centered account of moral judgment my colleagues and I have proposed (Landy & Uhlmann, in press; Pizarro, Tannenbaum, & Uhlmann, 2012; Pizarro & Tannenbaum, 2011; Uhlmann, Pizarro, & Diermeier, 2015) posits that human beings are fundamentally motivated to understand the moral character of others. Virtue ethics may therefore more accurately describe intuitive moral judgments than ethics that focus on the harmfulness or permissibility of acts. In a series of empirical investigations, we find that features of an act that signal virtues and vices can carry more weight in moral judgment than the consequences of the act, or whether a moral rule has been broken.
 
Consider the disproportionate outrage elicited by frivolous corporate perks. Such expenses are financially minor for large firms, yet lead to widespread negative publicity and public indignation. We suggest this occurs because in addition to assessing the permissibility of the action (an act-based judgment), people use behavior to make inferences about the agent performing the act (a person-based judgment). The character information signaled by a behavior then serves as an additional input to judgments of blame, over and above evaluations of the act. In one illustrative study, when deciding between a candidate for a corporate executive position who requested a $40,000 marble table and another candidate who requested an additional $1 million in salary, participants preferred the latter because they interpreted the former candidate’s request for a perk as a strong, negative signal regarding the candidate’s moral integrity (Tannenbaum, Uhlmann, & Diermeier, 2011). This and related studies indicate that the perceived informativeness of social behaviors regarding moral character can drive judgments to a greater degree than the objective material consequences of the act.
 
My collaborators and I have further documented striking act-person dissociations, such that some acts are perceived as not especially immoral in and of themselves, yet as strong indicators of poor moral character (Uhlmann & Zhu, 2014; Uhlmann, Zhu, & Diermeier, 2014). The focus on the diagnostic value of behaviors when it comes to character assessments means that judgments of acts and persons can diverge under circumstances in which acts cause little harm but are highly suggestive of the agents’ lack of moral character. One such case is harmless-but-offensive transgressions that violate moral taboos (Haidt, 2001), which are perceived as less immoral behaviors than harmful transgressions, but more diagnostic of the moral character of the person who commits the transgression (Uhlmann & Zhu, 2014).
 
We have also applied our model to lay understandings of ethical leadership. We find that leaders, whose acts have important practical implications at an aggregate level, are especially apt to be judged based on whether their decisions lead to positive aggregate outcomes. For example, a hospital administrator who chooses to forgo saving one life now in order to save a greater number of future lives is perceived as deficient in empathy and moral character, but also as having made the morally right decision and as a good leader (Uhlmann, Zhu, & Tannenbaum, 2013). Under some circumstances, people appear to prefer leaders who are willing to make hard moral choices and are therefore relatively low in positive moral traits such as empathy.
 
In other research in the moral domain, my colleagues and I demonstrate that moral principles are invoked flexibly to justify preferred judgments (Uhlmann, Pizarro, Tannenbaum, & Ditto, 2009), that moral taint and responsibility for financial restitution intuitively spread from a person who engages in immoral acts to other individuals who are related by ties of blood kinship (Uhlmann, Zhu, Pizarro, & Bloom, 2012), and that a strong moral identity is associated with a suppressed sense of humor in both laboratory and field settings (Yam, Barnes, Leavitt, Uhlmann, & Wei, 2016). I also have a longstanding interest in cultural differences in intuitive moral values related to work (Uhlmann & Sanchez-Burks, 2014; Uhlmann, Heaphy, Ashford, Zhu, & Sanchez-Burks, 2013).
 
Stereotyping and discrimination in the workplace
 
My second major line of research examines the influence of gender stereotypes on evaluations of male and female professionals. A series of studies conducted with my advisor and mentor Geoffrey Cohen demonstrates that people change the qualifications they deem important for the job in favor of male applicants for stereotypically male jobs (Uhlmann & Cohen, 2005). For example, if a male applicant for the job of police chief has a formal education, a formal education is rated as important for the job. But if he lacks a formal education, its importance is downplayed. No such favoritism is exhibited toward female applicants for police chief. Encouragingly, when evaluators decide the qualifications important for a job before they know the gender of the applicant (reducing the ability to rationalize discrimination), male and female applicants are equally likely to be hired.
 
We have further examined the role of self-perceived objectivity in selection decisions, hypothesizing that conviction in one’s personal objectivity licenses people to act on their sexist thoughts and beliefs. Consistent with this idea, evaluators who perceive themselves as objective are actually more likely to revise the qualifications they deem important for a leadership position so as to rationalize discrimination (Uhlmann & Cohen, 2005). Moreover, experimentally enhancing a sense of personal objectivity increases gender discrimination, particularly among decision-makers who endorse stereotypes (Uhlmann & Cohen, 2007).
 
A related line of work with longtime collaborator Victoria Brescoll examines backlash against women who violate prescriptive norms about the behavior “appropriate” for their gender. One series of studies demonstrates that female leaders who express anger in the workplace are perceived as less deserving of a high-status job and given a lower salary (Brescoll & Uhlmann, 2008). While female targets’ emotional reactions are attributed to internal characteristics (e.g., “She is an angry person”), men’s emotional reactions are attributed to external circumstances (e.g., “The situation was very frustrating”). Providing a situational explanation for why the person is angry eliminates the backlash against female leaders who express anger. Other work shows that making a mistake on the job is more damaging to a female leader in a stereotypically male field than it is for a man (Brescoll, Dawson, & Uhlmann, 2010).
 
Recently I have been taking this research in a new direction by testing predictions derived from micro-level psychological theories in large archival datasets. For example, my collaborators Raphael Silberzahn, Luke Zhu, and I have documented gender disparities in an online market for temporary labor involving 100,000 hiring decisions made by over 9,000 employers. We hypothesize and find that women actually hired for stereotypically male jobs are often paid by the hour rather than offered fixed contracts, a risk-averse strategy on the part of employers that makes these workers easier to terminate in case of unsatisfactory performance. Female workers are much more successful at attaining fixed contracts for stereotypically female work. Gender disparities in selection and type of contract emerge even controlling for male and female workers’ degree of interest in the job category, and despite the fact that female workers in this online labor market who are interested in stereotypically male jobs are on average more qualified than their male counterparts.
 
Crowdsourcing science
 
The ongoing crisis of confidence in science (Pashler & Wagenmakers, 2012) has profoundly affected the ways in which I approach research questions. In recent years I have relied on increasingly diverse methodological approaches, combining experiments with field studies and archival analyses, to boost statistical power and enhance the generalizability of the findings (e.g., Silberzahn, Uhlmann, & Zhu, 2016; Yam, Barnes, Leavitt, Uhlmann, & Wei, 2016). At the same time, I have worked to develop new crowdsourced methodologies aimed at increasing the robustness of both experimental and archival research. These projects leverage crowds of scientists to carry out empirical investigations that would be too resource-intensive for any single research team to accomplish.
 
In one such initiative, my collaborators and I introduce the pre-publication independent replication (PPIR) approach, in which experimental findings are replicated in independent laboratories before (rather than after) they are published (Schweinsberg et al., 2016). Twenty-five research groups conducted replications of all ten unpublished moral judgment effects from my laboratory, collecting data from over 10,000 research participants in the process. Six of our findings replicated according to all replication criteria, one finding replicated but with a significantly smaller effect size than the original, one finding replicated consistently in the original culture but not outside of it, and two findings failed to receive empirical support. Encouragingly, five out of six findings related to person-centered morality replicated robustly, with the sixth finding found to be culturally bounded. The results of this first PPIR initiative indicate that: 1) replicating unpublished research findings in independent laboratories prior to submission is a viable approach to improving the reliability of the published literature, and 2) even when both original and replication studies are conducted transparently, some findings that appeared reliable in the original investigation will subsequently fail to replicate. Thus, failures to reproduce findings should not call into question the competence or good faith of either original authors or replicators.
 
In another initiative, we recruited a crowd of scientists to collectively test theoretical predictions regarding workplace discrimination. Our crowdsourcing data analysis approach involves providing the same complex dataset to numerous scientists, who independently test the same hypotheses (Silberzahn & Uhlmann, 2015; Silberzahn, Uhlmann, Martin, et al., & Nosek, 2016). The first such project distributed a large archival dataset on over 300,000 referee decisions to 29 independent teams of analysts to investigate whether players with darker skin tone are more likely to receive red cards during football (soccer) matches (Silberzahn et al., 2016). Although approximately two-thirds of teams observed a significant effect in the expected direction, effect size estimates ranged all the way from a nonsignificant tendency for light skin toned players to receive more red cards to a strong tendency for dark skin toned players to receive more red cards. These results demonstrate that defensible but subjective analytic choices can lead to highly variable effect size estimates. We argue that high levels of transparency regarding the contingency of research results on analytic strategies are particularly important for controversial topics with public policy implications, such as potential disparities based on race and gender.
 
The common theme of these projects is to crowdsource science from the early stages of the research process, with the goal of increasing the reliability of the conclusions drawn regarding moral judgments, workplace discrimination, and other important topics.