The (Lack Of) Gender Equality Paradox
Chances are that the Gender Equality Paradox is not real, and such findings, even if true, tell us nothing about gender neutrality.
When deliberating upon the matter of group disparities and their underlying determinants, it is not uncommon for certain assertions to posit the predominant influence of genetics in the emergence of such distinctions. Advocates of genetic perspectives have advanced arguments about racial differentials in cognitive aptitude (e.g., Levin, 1997; Jensen, 1988; Bronski, 2023; cf., Schwartz, 1974; Adams, Ghodsian, and Richardson, 1976), variances in mate preferences between sexes (Black, 2023), and, pertinent to the scope of this article, disparities in traits and choices between genders (e.g., Levin, 1987).
Drawing on studies of the Gender Equality Paradox (GEP, hereafter), it has been argued that sex differences in many domains are largely a result of genetics, primarily because sex differences increase, not decrease, in “egalitarian” societies. GEP findings stand in direct contrast to social role theory, which argues that people (in this case, men and women) are expected to behave according to socially defined roles (Bosak, 2018). As one commentator perfectly put it, “It's like sex differences. The more egalitarian a society becomes, the more of the differences we see are genetic” (in Bronski: see comment section). Such genetics-based claims are not uncommon and pervade evolutionary psychology (e.g., Buss and Wood, 1999; cf. Shackelford, Schmitt, and Buss, 2005; Alice and Wendy, 1999), the field from which GEP studies tend to come.
Multiple GEP studies report that sex differences widen across a range of variables. For example, GEP has been invoked when discussing sex differences in STEM achievements (Stoet and Geary, 2018), differences in the priorities that the sexes value (Schwartz and Rubel-Lifschitz, 2009), sex differences in moral judgment (Atari, Lai, and Dehghani, 2020), and even personality differences (Giolla and Kajonius, 2018). Such findings are taken to mean that genetics plays a large role in sex differences, especially since one would assume that if sex differences were a result of social learning, they would be minimized in egalitarian areas. However, such studies have not painted a consistent picture, and it is dubious to assume that, even if such widening differences do exist in egalitarian areas, GEP studies are telling us anything about the “egalitarian” aspects they attempt to measure.
Issues with GEP Studies
An example of a flawed GEP study comes from Stoet and Geary (2018). In this analysis, the authors find that, despite women being as competent as or more competent than men in science, mathematics, and reading, women were underrepresented in STEM achievements, with such sex differences in representation increasing with gender egalitarianism. Such differences were said to be “near universal.” In response, Richardson et al. (2020) raise pertinent critiques of Stoet and Geary's analysis, asserting that the identified evidence signals fundamental distinctions in preferences rather than a direct reflection of gender-based propensities. A primary concern highlighted by Richardson et al. is Stoet and Geary's reliance on ratios, which presupposes that such ratios inherently signify the "propensity" of women to graduate in STEM fields. Richardson et al. argue that a more rigorous approach requires measuring baseline preferences directly, rather than implicitly assuming that the observed propensity, as denoted by the ratio, is an accurate proxy for preferences.
Furthermore, Richardson et al. underscore a crucial point about Stoet and Geary's metric: it does not measure gender equality per se, but rather the divergence in parity between men and women across various indicators. Richardson et al. contend that the association disappears when an alternative metric, one explicitly gauging STEM achievement and gender equality, is employed. This underscores the importance of measurements that accurately capture the constructs under examination. 1 2
The authors posit that the observed relationships are susceptible to the nuances of the measurements employed for gender equality, STEM achievement—specifically quantified as the percentage of women graduating with STEM degrees—and the diversity within the pool of countries under consideration. The challenge associated with the selection of countries in such analyses has been previously highlighted, as demonstrated by Fryer and Levitt (2010). Their findings indicated that gender disparities in mathematical achievement exhibited sensitivity to the inclusion of Muslim countries, wherein no discernible sex gap was observed. In contrast to Stoet and Geary’s claim that such sex differences in STEM achievement are “universal”, Reilly, Neumann, and Andrews (2019) found varied directions for sex differences and no global gender differences in STEM achievement. According to the authors, “Such a pattern cross-culturally is incompatible with the notion of immutable gender differences.”
Jergins (2023) conducted an investigation that challenges the applicability of the GEP model to STEM outcomes when additional contextual factors are considered. Jergins adopted a novel approach by associating immigrants to the USA with measures reflecting the institutional, political, economic, and cultural landscapes of their countries of origin. Given their status as immigrants, devoid of direct ties to their home countries, these measures were posited to influence immigrant outcomes through cultural and shared belief systems. This methodological innovation allowed for the inclusion of a more expansive dataset not utilized in Stoet and Geary's original study.
The comprehensive analysis, encompassing the entire sample, failed to substantiate the GEP model in the realm of STEM outcomes. This outcome prompts a reconsideration of the model's generalizability in the presence of broader contextual considerations.
Turning to the values people prioritize, Schwartz and Rubel-Lifschitz (2009) find support for the GEP model: as gender equality increases, so do sex differences.
Schwartz and Rubel-Lifschitz seem to take a largely genetic stance on the issue, even, unsurprisingly, pointing to evolutionary psychology as the mechanism. As they argue in their study, men placing higher importance on power aligns with the evolutionary psychology finding that women prefer a man with higher earning potential than their own, essentially implying that such increases in sex differences are a result of genetics. 3
However, another analysis found no support for the evolutionary and biosocial model when it comes to sex differences in values prioritized (Connolly, Goosen, and Hjerm, 2019). While gender equality correlated with gender differences, lining up with the previous GEP studies, there was a convergence in the traits analyzed (negative values indicate a divergence).
The average reduction was 15%. When running a longitudinal model, there was a convergence for benevolence, power, and achievement but not universalism and stimulation. In their regression, they found that there was no support for the idea that an increase in gender equality related to changes in gender differences in values. In other words, gender equality was not causally associated with differences in values.
Future research should examine whether there is a causal relationship between gender equality and a widening of sex differences across different variables.
Concerning personality, Berrgen and Bergh (2023) found that when controlling for language within Protestant Western countries (adjusted because cultural regions were more strongly correlated with gender differences than gender equality), higher gender equality was associated with smaller gender differences in personality, not larger.
Adjusting for cultural confounds substantially reduces the effect sizes that other studies report in support of GEP. Once cultural regions are controlled for, the associations with gender equality are no longer statistically significant.
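The reversal described here is an instance of Simpson's paradox, and a toy calculation can make it concrete. The sketch below uses entirely made-up numbers (the region names, equality scores, and gap sizes are hypothetical, not data from any study cited in this article) to show how a pooled correlation can be positive even when the correlation inside every cultural region is negative.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL data: (gender-equality score, size of sex difference) per country,
# grouped by an invented cultural region. Not taken from any cited study.
regions = {
    "region_a": [(1, 3), (2, 2), (3, 1)],
    "region_b": [(6, 8), (7, 7), (8, 6)],
}

# Correlation when all countries are pooled together, ignoring region.
pooled = [pair for pairs in regions.values() for pair in pairs]
pooled_r = pearson([eq for eq, _ in pooled], [gap for _, gap in pooled])

# Correlation computed separately inside each region.
within_r = {
    name: pearson([eq for eq, _ in pairs], [gap for _, gap in pairs])
    for name, pairs in regions.items()
}

print("pooled:", round(pooled_r, 3))
for name, r in within_r.items():
    print(name, round(r, 3))
```

In this toy data the pooled correlation is positive only because region_b sits higher on both axes; within each region, more equality goes with a smaller gap. Whether real cross-national data behave this way is an empirical question, which is exactly why adjusting for cultural region matters.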
Given the small differences that remain once further variables are adjusted for, and the lack of a causal effect, it is hard to argue that the GEP reflects genetic sex differences when those differences fail to increase with gender equality. If the genetic hypothesis were the true explanation for GEP, we should expect the gaps to widen rather than close. Given the failure of the GEP, the genetic hypothesis is left without this line of support.
Guo et al. (2024) found that sex differences in subjective well-being increased with gender egalitarianism. In gender-egalitarian areas, men reported higher subjective well-being than women. However, this does not seem to be strong evidence for the GEP stance because, according to data from Erikkson and Strimling (2023), adjusting for sex differences in competitiveness and fear of failure explained 40% of the variance.
In conclusion, the discourse surrounding the GEP seems to rely on a false assumption that is biased by Simpson's paradox and cultural confounds, as noted by Berrgen and Bergh. Critiques of Stoet and Geary's analysis, articulated by Richardson et al., underscore the importance of meticulous measurement and nuanced approaches, challenging the purported universality of sex differences in STEM achievement. Contrasting findings from Reilly, Neumann, and Andrews and the innovative investigation by Jergins further highlight the contextual sensitivity of GEP, prompting a reconsideration of its applicability across diverse settings. While Schwartz and Rubel-Lifschitz align with a genetic perspective, drawing from evolutionary psychology, Connolly, Goosen, and Hjerm's analysis challenges this stance, particularly in the realm of values prioritization. The complexities extend to personality differences, where Berrgen and Bergh caution against oversimplification, advocating for adjustments for cultural confounds. GEP, then, does not seem to be a reflection of anything real and certainly does not lend strong support to the genetic hypothesis if such differences get smaller rather than larger.
Egalitarianism Measurements
When constructing a measurement, test creators try to ensure that it measures what it is intended to measure, judged against some criterion. In the case of GEP studies, it is dubious whether such measurements of “egalitarianism” capture gender neutrality rather than something else. Individuals may live in “egalitarian” societies, but this does not mean that they will not face obstacles that affect them at an interpersonal level.
Along the lines of social learning theory, sex differences can be “baked in”. In other words, the way girls and boys are raised and how society portrays them can shape their behaviors, which might lead to GEP (assuming it even exists). Crowley et al. (2001) found that while parents were equally likely to talk to their sons and daughters about how to use an exhibit at a museum and about the evidence gathered at the exhibit, parents were three times more likely to explain science to boys than to girls when using interactive exhibits. This gives boys a hands-on experience that might encourage them to pursue STEM, an experience girls are less likely to receive. Indeed, fathers use more cognitively demanding language with their sons than with their daughters, and parents believed that science was less interesting for girls than for boys (Tenenbaum and Leaper, 2003). These issues might make girls less likely to pursue STEM, as the way they are raised might lead them to believe that they are not cut out for it. Since this might lead to fewer women in STEM, there would be fewer female scientists, and women might not want to pursue STEM because they feel they do not belong.
Indeed, this issue has been noted in other critical commentaries when discussing the reasons for the GEP. As Noll (2020) argues, the measurement used by Stoet and Geary, the GGGI, is not a true measure of egalitarianism because it includes such items as literacy rates, representation in elected offices, and labor force participation, but not norms or stereotypes. Noll remarks that such issues show that gender equality is not the same as gender neutrality. Since children use gender knowledge to guide how they behave and how they police the behaviors of other children, these gender stereotypes and norms “bake in” sex differences that would lead to the GEP. Furthermore, Noll cites evidence that women do not feel welcome in STEM spaces, for example, because of the strongly male “vibes” these sectors typically have. When such areas are made gender-neutral, women feel welcome in these spaces; similar issues are also noted for race.
Noll's position was also argued by Yalcinkya and Adams (2020). Their contention revolves around the assertion that the escalation of sex differences in gender-egalitarian societies is not a consequence of individuals freely opting for pursuits in alignment with their intrinsic preferences (as suggested by the biological model, where freedom to choose corresponds to alignment with one's genotype). Instead, they propose that the newfound liberty in these societies is directed toward the expression of gendered preferences. In essence, personal choices are molded by gender essentialist ideologies, such as stereotypes, which presuppose innate sex differences. Consequently, individuals may be inclined to choose professions or majors that align with these gender stereotypes, reflecting a selection that may not authentically represent genuine personal preferences.
This dynamic further contributes to the perpetuation of gender differences through a process of socialization, effectively "baking" these distinctions into societal norms. The argument posits that the freedom to express preferences, when influenced by gender essentialist ideologies, results in choices that may be more reflective of societal expectations than a genuine expression of unfettered individual choice.
In addition, Boulicault (2019) offers several criticisms of the common measurements used in GEP studies, which include the Gender Inequality Index, the United Nations Development Programme’s Gender Empowerment Measure, Social Watch’s Gender Equity Index, the Basic Index of Gender Inequality and the OECD’s Social Institutions and Gender Index. Such measures do not take into account the issues noted above with baked-in stereotypes, instead measuring items like political participation and women in public office. These items do not tell us whether men and women experience gender neutrality in their regions; they cover only specific domains that can still be plagued by sexism. As was said when discussing the GGGI,
“However, this is not what the GGGI measures. The GGGI measures “the gap between men and women across four fundamental categories (subindexes): Economic Participation and Opportunity, Educational Attainment, Health and Survival and Political Empowerment.” First, none of these subindices include any indicators that measure the extent to which a country promotes women’s participation in STEM. Second, the GGGI does not measure education and empowerment “opportunities,” which are input measures; it only measures outcomes, i.e. it only measures the gap between men and women’s educational achievements, and doesn’t measure anything about why that gap does or does not exist (e.g. whether these gaps were caused by differences in opportunities). The GGGI is thus an invalid measure in the context of “Gender Equality Paradox” research. It is not the right tool for the job.”
This is not to say that such measurements are bad, but rather that they are not appropriate for such studies.
In summary, the examination of gender equality, particularly in the context of the GEP, raises concerns about the adequacy of employed measurements. The contention arises regarding whether these metrics genuinely reflect gender neutrality or inadvertently encompass broader societal elements. Social learning theory highlights the potential 'baking in' of sex differences through societal influences, as evidenced by studies revealing disparities in how parents engage with their children, shaping early preferences and potentially steering individuals away from certain fields. This notion aligns with critiques from Noll and Yalcinkya and Adams, arguing that gender equality does not necessarily equate to gender neutrality, as societal norms and stereotypes may still influence individual choices. Furthermore, Boulicault's critique of common measurements used in GEP studies emphasizes the need for more context-specific assessments, shedding light on the limitations of indices that primarily focus on specific domains without accounting for the nuanced influence of gender stereotypes. In conclusion, refining measurement tools is crucial to better align with the complexities of gender-related phenomena, avoiding oversimplification, and acknowledging the intricate societal influences on individual choices within ostensibly egalitarian contexts.
Stoet and Geary (2020) assert the validity of their STEM achievement metric by considering the proportion of men and women attending college. Their illustration using Algeria exemplifies this, where 67.2% of college students are female, yet only 8.9% choose STEM majors compared to 13% of male students. However, it is contended that such disparities do not necessarily indicate inherent sex differences in preferences. The Algerian education system is structured in a manner where individuals tend to pursue majors aligned with their high school specializations, taking into account the relevance of acquired skills for their chosen university and major, as well as university prestige (Ahmaid, 2021: 45). Consequently, social variables, including stereotypes, can still exert influence over individuals' decisions regarding their chosen majors. In light of these contextual factors, the critique posits that Stoet and Geary's response to criticisms related to their STEM achievement measurement appears tenuous.
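A short calculation shows why the choice of metric matters here. The sketch below takes the three Algerian figures quoted above and computes two different summaries: a rate-based ratio and women's share of all STEM students. It assumes that 8.9% and 13% are the STEM rates among female and male students respectively, and that 67.2% is the female share of that same student body. The ratio formula is an illustrative assumption about how such a propensity ratio can be built, not a confirmed reproduction of Stoet and Geary's procedure, and the function names are hypothetical.

```python
def stem_rate_ratio(women_stem_rate, men_stem_rate):
    """Women's STEM rate divided by the sum of women's and men's STEM rates.

    ASSUMPTION: an illustrative rate-based "propensity" ratio, not
    necessarily the exact formula used in the studies discussed.
    """
    return women_stem_rate / (women_stem_rate + men_stem_rate)


def female_share_of_stem(female_share_of_students, women_stem_rate, men_stem_rate):
    """Fraction of all STEM students who are women, weighting each sex's
    STEM rate by its share of the student body."""
    women_in_stem = female_share_of_students * women_stem_rate
    men_in_stem = (1 - female_share_of_students) * men_stem_rate
    return women_in_stem / (women_in_stem + men_in_stem)


# Figures for Algeria as quoted in the text above.
FEMALE_SHARE_OF_STUDENTS = 0.672  # 67.2% of college students are female
WOMEN_STEM_RATE = 0.089           # 8.9% of female students choose STEM
MEN_STEM_RATE = 0.13              # 13% of male students choose STEM

ratio = stem_rate_ratio(WOMEN_STEM_RATE, MEN_STEM_RATE)
share = female_share_of_stem(FEMALE_SHARE_OF_STUDENTS, WOMEN_STEM_RATE, MEN_STEM_RATE)

print(f"rate-based ratio: {ratio:.3f}")
print(f"female share of STEM students: {share:.3f}")
```

Under these assumptions the rate-based ratio is about 0.41, while women make up about 58% of STEM students. The same three numbers can therefore be read as under-representation or over-representation depending on the metric chosen.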
A replication attempt from other researchers also failed to match the propensity values produced by Stoet and Geary using UNESCO data. When the replicated values were compared to the original Stoet and Geary ones, 15 countries shifted 10 spots or more in the rankings. The replication also pointed to a lack of causal reasoning and failed to find a significant correlation in support of GEP (Richardson and Bruch, 2020).
Ironically, this is a bad example to use to bolster the genetic (evolutionary) argument, as such sex differences in preferences go away once further variables are adjusted for. In manipulated environments, men place a greater importance on a partner’s earnings than women do, and there are no sex differences in the importance placed on earning potential once further partner traits are included in measurements (Black, 2023).