The research publishing gender gap is slowly getting better

Institution-level analysis of the gender gap in chemistry and physical sciences using the chatbot ChatGPT: An additional metric for inferring author genders

The institution-level analysis incorporates an extra metric: the expected proportion of female authorships at an institution, based on how much of its research is published in male- or female-heavy sciences. An institution that primarily publishes in physics would be expected to have a smaller proportion of female authors than one that specializes in health-science research.
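One way to picture such an expected value is as a weighted average: each field's overall proportion of female authorships, weighted by how much of the institution's output falls in that field. The sketch below is a minimal illustration under that assumption and is not the Nature Index team's actual calculation. The function name, the field labels and all of the numbers are hypothetical.

```python
def expected_female_share(output_by_field, baseline_by_field):
    """Weighted average of field-level baselines.

    output_by_field: research output per field for one institution.
    baseline_by_field: proportion of female authorships per field overall.
    Both inputs are hypothetical structures used for illustration.
    """
    total = sum(output_by_field.values())
    if total == 0:
        raise ValueError("institution has no recorded output")
    weighted = sum(
        amount * baseline_by_field[field]
        for field, amount in output_by_field.items()
    )
    return weighted / total


# Made-up baselines and output counts, not taken from the Nature Index.
baseline = {"physics": 0.20, "health": 0.40}
physics_heavy = {"physics": 80, "health": 20}
health_heavy = {"physics": 20, "health": 80}

print(round(expected_female_share(physics_heavy, baseline), 2))
print(round(expected_female_share(health_heavy, baseline), 2))
```

With these made-up inputs, the physics-heavy institution has the lower expected proportion. Comparing an institution's observed proportion with its expected one would then show whether it does better or worse than its mix of subjects alone would suggest.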

Data at the field or country level will allow individual disciplinary communities not only to know what is happening inside their fields, but also to understand the reasons for the trends, mark successes, study how to do better and, crucially, learn from each other.

The better the access to information that lays bare the inequities faced by women in pursuing careers in science, technology, engineering and mathematics, especially for positions of influence and leadership, the better equipped society will be to push back on unfair, antiquated systems and build a stronger research environment for all.

Of the top ten countries in the Nature Index, only three — the United States, Canada and France (all at 34%) — had a female author representation of more than 30% in 2024.

The lowest-ranking topics — in which the gender gap is the widest — were mainly chemistry and physical sciences topics, many of which had less than 20% female co-authorship.

To create the data set, the Nature Index data-analysis team used the chatbot ChatGPT, created by OpenAI, based in San Francisco, California, to infer author genders on the basis of their most likely country of origin and the name–gender association trends in that country (see ‘Methodology’).

The same name can be strongly associated with one gender in some countries but not in others. Authors whose gender the model could not infer with high confidence have been excluded from the analysis; in some countries, such as China and Singapore, this applies to the majority of names.
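The exclusion step can be sketched as a simple filter applied before any proportions are calculated. This is an illustrative sketch only. The record layout, the confidence score and the 0.9 cut-off are assumptions made for the example, and the article does not state how confidence was measured or what threshold was used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Inference:
    author: str              # placeholder label, not a real name
    country: str             # most likely country of origin (assumed field)
    gender: Optional[str]    # "female", "male", or None if not inferred
    confidence: float        # hypothetical score between 0 and 1


def keep_confident(records: List[Inference], threshold: float = 0.9) -> List[Inference]:
    """Drop records with no inferred gender or a score below the threshold."""
    return [
        r for r in records
        if r.gender is not None and r.confidence >= threshold
    ]


def female_share(records: List[Inference], threshold: float = 0.9) -> Optional[float]:
    """Proportion of female authorships among the records that survive filtering."""
    kept = keep_confident(records, threshold)
    if not kept:
        return None  # nothing left to analyse
    return sum(r.gender == "female" for r in kept) / len(kept)


# Made-up records for illustration.
sample = [
    Inference("Author A", "FR", "female", 0.98),
    Inference("Author B", "US", "male", 0.95),
    Inference("Author C", "SG", "female", 0.55),  # excluded: low confidence
    Inference("Author D", "CA", "male", 0.97),
]
```

In this sketch, excluded authors are removed from both the numerator and the denominator, so a country in which most names are filtered out contributes only a small part of its authorships to the result.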