Beauty sleep: experimental study on the perceived health and attractiveness of sleep deprived people

  • John Axelsson , researcher 1 2 ,
  • Tina Sundelin , research assistant and MSc student 2 ,
  • Michael Ingre , statistician and PhD student 3 ,
  • Eus J W Van Someren , researcher 4 ,
  • Andreas Olsson , researcher 2 ,
  • Mats Lekander , researcher 1 3
  • 1 Osher Center for Integrative Medicine, Department of Clinical Neuroscience, Karolinska Institutet, 17177 Stockholm, Sweden
  • 2 Division for Psychology, Department of Clinical Neuroscience, Karolinska Institutet
  • 3 Stress Research Institute, Stockholm University, Stockholm
  • 4 Netherlands Institute for Neuroscience, an Institute of the Royal Netherlands Academy of Arts and Sciences, and VU Medical Center, Amsterdam, Netherlands
  • Correspondence to: J Axelsson john.axelsson{at}ki.se
  • Accepted 22 October 2010

Objective To investigate whether sleep deprived people are perceived as less healthy, less attractive, and more tired than after a normal night’s sleep.

Design Experimental study.

Setting Sleep laboratory in Stockholm, Sweden.

Participants 23 healthy, sleep deprived adults (age 18-31) who were photographed and 65 untrained observers (age 18-61) who rated the photographs.

Intervention Participants were photographed after a normal night’s sleep (eight hours) and after sleep deprivation (31 hours of wakefulness after a night of reduced sleep). The photographs were presented in a randomised order and rated by untrained observers.

Main outcome measure Difference in observer ratings of perceived health, attractiveness, and tiredness between sleep deprived and well rested participants using a visual analogue scale (100 mm).

Results Sleep deprived people were rated as less healthy (visual analogue scale scores, mean 63 (SE 2) v 68 (SE 2), P<0.001), more tired (53 (SE 3) v 44 (SE 3), P<0.001), and less attractive (38 (SE 2) v 40 (SE 2), P<0.001) than after a normal night’s sleep. The decrease in rated health was associated with ratings of increased tiredness and decreased attractiveness.

Conclusion Our findings show that sleep deprived people appear less healthy, less attractive, and more tired compared with when they are well rested. This suggests that humans are sensitive to sleep related facial cues, with potential implications for social and clinical judgments and behaviour. Studies are warranted for understanding how these effects may affect clinical decision making and can add knowledge with direct implications in a medical context.

Introduction

“The recognition [of the case] depends in great measure on the accurate and rapid appreciation of small points in which the diseased differs from the healthy state.” Joseph Bell (1837-1911)

Good clinical judgment is an important skill in medical practice. This is well illustrated in the quote by Joseph Bell, 1 who demonstrated impressive observational and deductive skills. Bell was one of Sir Arthur Conan Doyle’s teachers and served as a model for the fictitious detective Sherlock Holmes. 2 Generally, human judgment involves complex processes, whereby ingrained, often less consciously deliberated responses from perceptual cues are mixed with semantic calculations to affect decision making. 3 Thus all social interactions, including diagnosis in clinical practice, are influenced by reflexive as well as reflective processes in human cognition and communication.

Sleep is an essential homeostatic process with well established effects on an individual’s physiological, cognitive, and behavioural functionality 4 5 6 7 and long term health, 8 but with only anecdotal support of a role in social perception, such as that underlying judgments of attractiveness and health. As illustrated by the common expression “beauty sleep,” an individual’s sleep history may play an integral part in the perception and judgments of his or her attractiveness and health. To date, the concept of beauty sleep has lacked scientific support, but the biological importance of sleep may have favoured a sensitivity to perceive sleep related cues in others. It seems warranted to explore such sensitivity, as sleep disorders and disturbed sleep are increasingly common in today’s 24 hour society and often coexist with some of the most common health problems, such as hypertension 9 10 and inflammatory conditions. 11

To describe the relation between sleep deprivation and perceived health and attractiveness we asked untrained observers to rate the faces of people who had been photographed after a normal night’s sleep and after a night of sleep deprivation. We chose facial photographs as the human face is the primary source of information in social communication. 12 A perceiver’s response to facial cues, signalling the bearer’s emotional state, intentions, and potential mate value, serves to guide actions in social contexts and may ultimately promote survival. 13 14 15 We hypothesised that untrained observers would perceive sleep deprived people as more tired, less healthy, and less attractive compared with after a normal night’s sleep.

Methods

Using an experimental design we photographed the faces of 23 adults (mean age 23, range 18-31 years, 11 women) between 14.00 and 15.00 under two conditions in a balanced design: after a normal night’s sleep (at least eight hours of sleep between 23.00-07.00 and seven hours of wakefulness) and after sleep deprivation (sleep 02.00-07.00 and 31 hours of wakefulness). We advertised for participants at four universities in the Stockholm area. Twenty of 44 potentially eligible people were excluded. Reasons for exclusion were reported sleep disturbances, abnormal sleep requirements (for example, sleep need outside the 7-9 hour range), health problems, or unavailability on study days (the main reason). We also excluded smokers and those who had consumed alcohol within two days of the protocol. Overall, we enrolled 12 women and 12 men; one woman failed to participate in both conditions, leaving 23 participants (11 women).

The participants slept in their own homes. Sleep times were confirmed with sleep diaries and text messages. The sleep diaries (Karolinska sleep diary) included information on sleep latency, quality, duration, and sleepiness. Participants sent a text message to the research assistant by mobile phone (SMS) at bedtime and when they got up on the night before sleep deprivation. They had been instructed not to nap. During the normal sleep condition the participants’ mean duration of sleep, estimated from sleep diaries, was 8.45 (SE 0.20) hours. The sleep deprivation condition started with a restriction of sleep to five hours in bed; the participants sent text messages (SMS) when they went to sleep and when they woke up. The mean duration of sleep during this night, estimated from sleep diaries and text messages, was 5.06 (SE 0.04) hours. For the following night of total sleep deprivation, the participants were monitored in the sleep laboratory at all times. Thus, for the sleep deprivation condition, participants came to the laboratory at 22.00 (after 15 hours of wakefulness) to be monitored, and stayed awake for a further 16 hours. We therefore did not observe the participants during the first 15 hours of wakefulness, when they had had a slightly restricted sleep, but had good control over the last 16 hours of wakefulness when sleepiness increased in magnitude. For the sleep condition, participants came to the laboratory at 12.00 (after five hours of wakefulness). They were kept indoors two hours before being photographed to avoid the effects of exposure to sunlight and the weather. We had a series of five or six photographs (resolution 3872×2592 pixels) taken in a well lit room, with a constant white balance (×900l; colour temperature 4200 K, Nikon D80; Nikon, Tokyo). The white balance was differently set during the two days of the study and affected seven photographs (four taken during sleep deprivation and three during a normal night’s sleep). 
Removing these participants from the analyses did not affect the results. The distance from camera to head was fixed, as was the focal length, within 14 mm (between 44 and 58 mm). To ensure a fixed surface area of each face on the photograph, the focal length was adapted to the head size of each participant.
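The timing of the two conditions can be checked with simple clock arithmetic. The sketch below is only an illustration: the calendar date is hypothetical, and only the clock times (wake-up at 07.00, 15 unmonitored hours, 16 monitored hours, seven hours of wakefulness in the normal sleep condition) are taken from the description above.

```python
from datetime import datetime, timedelta

# Hypothetical calendar date; only the clock times come from the article.
wake_up = datetime(2007, 6, 1, 7, 0)

# Sleep deprivation condition: 15 unmonitored hours awake, then 16 monitored hours.
lab_arrival = wake_up + timedelta(hours=15)
photo_deprived = lab_arrival + timedelta(hours=16)
hours_awake_deprived = (photo_deprived - wake_up) / timedelta(hours=1)

# Normal sleep condition: photographed after seven hours of wakefulness.
photo_rested = wake_up + timedelta(hours=7)

print(lab_arrival.strftime("%H.%M"))     # arrival at the laboratory
print(photo_deprived.strftime("%H.%M"))  # photograph, next calendar day
print(hours_awake_deprived)              # total hours awake
print(photo_rested.strftime("%H.%M"))    # photograph, normal sleep condition
```

Under these assumptions both photographs fall at 14.00, the start of the 14.00-15.00 window stated for the photo shoots.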

For the photo shoot, participants wore no makeup, had their hair loose (combed backwards if long), underwent similar cleaning or shaving procedures for both conditions, and were instructed to “sit with a straight back and look straight into the camera with a neutral, relaxed facial expression.” Although the photographer was not blinded to the sleep conditions, she followed a highly standardised procedure during each photo shoot, including minimal interaction with the participants. A blinded rater chose the most typical photograph from each series of photographs. This process resulted in 46 photographs; two (one from each sleep condition) of each of the 23 participants. This part of the study took place between June and September 2007.

In October 2007 the photographs were presented at a fixed interval of six seconds in a randomised order to 65 observers (mainly students at the Karolinska Institute, mean age 30 (range 18-61) years, 40 women), who were unaware of the conditions of the study. They rated the faces for attractiveness (very unattractive to very attractive), health (very sick to very healthy), and tiredness (not at all tired to very tired) on a 100 mm visual analogue scale. After every 23 photographs a brief intermission was allowed, including a working memory task lasting 23 seconds to prevent the faces being memorised. To ensure that the observers were not primed to tiredness when rating health and attractiveness they rated the photographs for attractiveness and health in the first two sessions and tiredness in the last. To avoid the influence of possible order effects we presented the photographs in a balanced order between conditions for each session.

Statistical analyses

Data were analysed using multilevel mixed effects linear regression, with two crossed independent random effects accounting for random variation between observers and participants, using the xtmixed procedure in Stata 9.2. We present the effect of condition as a percentage change from the baseline (normal sleep) condition, calculated from the absolute values in millimetres on the visual analogue scale. No data were missing in the analyses.
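The article does not write the model out. One plausible form of a mixed effects regression with crossed random effects for observers and participants is sketched below. The symbols are assumptions introduced for illustration: y is the rating in millimetres given by observer o to participant p in condition c, and D_c is 1 for the sleep deprived photograph and 0 for the normal sleep photograph.

```latex
\begin{align*}
y_{opc} &= \beta_0 + \beta_1 D_c + u_o + v_p + \varepsilon_{opc},\\
u_o &\sim \mathcal{N}(0, \sigma_u^2), \qquad
v_p \sim \mathcal{N}(0, \sigma_v^2), \qquad
\varepsilon_{opc} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\end{align*}
```

In this notation the fixed effect \beta_1 is the effect of sleep deprivation in millimetres, and the percentage change from baseline would be 100 \beta_1 / \beta_0.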

Results

Sixty five observers rated each of the 46 photographs for attractiveness, health, and tiredness: 138 ratings by each observer and 2990 ratings for each of the three factors rated. When sleep deprived, people were rated as less healthy (visual analogue scale scores, mean 63 (SE 2) v 68 (SE 2)), more tired (53 (SE 3) v 44 (SE 3)), and less attractive (38 (SE 2) v 40 (SE 2); P<0.001 for all) than after a normal night’s sleep (table 1). Compared with the normal sleep condition, perceptions of health and attractiveness in the sleep deprived condition decreased on average by 6% and 4% and tiredness increased by 19%.
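The counts and percentages in this paragraph can be recomputed from figures given in the text. The sketch below is a rough check, not the authors' calculation. It assumes that the health and attractiveness effects are the table 1 coefficients quoted in the mediation analysis (−4.2 mm and −1.6 mm) and that the baselines are the rounded normal sleep means. No tiredness coefficient is quoted in the text, so the rounded means are used instead, which gives a slightly different figure from the reported 19%.

```python
# Study dimensions stated in the article.
OBSERVERS = 65
PHOTOGRAPHS = 46  # two photographs of each of 23 participants
FACTORS = 3       # attractiveness, health, tiredness

ratings_per_observer = PHOTOGRAPHS * FACTORS  # 138
ratings_per_factor = OBSERVERS * PHOTOGRAPHS  # 2990


def percent_change(effect_mm, baseline_mm):
    """Effect of condition as a percentage of the baseline rating."""
    return 100.0 * effect_mm / baseline_mm


# Effects in mm as quoted from table 1; baselines are rounded normal sleep means.
health_change = percent_change(-4.2, 68)          # about -6.2
attractiveness_change = percent_change(-1.6, 40)  # about -4.0
# Assumption: difference of rounded means, since no coefficient is quoted.
tiredness_change = percent_change(53 - 44, 44)    # about 20.5

print(ratings_per_observer, ratings_per_factor)
print(round(health_change, 1), round(attractiveness_change, 1), round(tiredness_change, 1))
```

The small gap between the recomputed tiredness figure and the reported 19% is presumably due to rounding of the published means.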

Table 1  Multilevel mixed effects regression on effect of how sleep deprived people are perceived with respect to attractiveness, health, and tiredness

A 10 mm increase in tiredness was associated with a −3.0 mm change in health, a 10 mm increase in health increased attractiveness by 2.4 mm, and a 10 mm increase in tiredness reduced attractiveness by 1.2 mm (table 2). Expressed as correlations, perceived attractiveness was positively associated with perceived health (r=0.42, fig 1) and negatively associated with perceived tiredness (r=−0.28, fig 1). In addition, the average decrease (for each face) in attractiveness as a result of deprived sleep was associated with changes in tiredness (−0.53, n=23, P=0.03) and in health (0.50, n=23, P=0.01). Moreover, a strong negative association was found between the respective perceptions of tiredness and health (r=−0.54, fig 1). Figure 2 shows an example of observer rated faces.

Table 2  Associations between health, tiredness, and attractiveness

Fig 1  Relations between health, tiredness, and attractiveness of 46 photographs (two each of 23 participants) rated by 65 observers on 100 mm visual analogue scales, with variation between observers removed using empirical Bayes’ estimates

Fig 2  Participant after a normal night’s sleep (left) and after sleep deprivation (right). Faces were presented in a counterbalanced order

To evaluate the mediation effects of sleep loss on attractiveness and health, tiredness was added to the models presented in table 1 following recommendations. 16 The effect of sleep loss was significantly mediated by tiredness on both health (P<0.001) and attractiveness (P<0.001). When tiredness was added to the model (table 1) with an estimated coefficient of −2.9 (SE 0.1; P<0.001) the independent effect of sleep loss on health decreased from −4.2 to −1.8 (SE 0.5; P<0.001). The effect of sleep loss on attractiveness decreased from −1.6 (table 1) to −0.62 (SE 0.4; P=0.133), with tiredness estimated at −1.1 (SE 0.1; P<0.001). The same approach applied to the model of attractiveness and health (table 2), with a decrease in the association from 2.4 to 2.1 (SE 0.1; P<0.001) with tiredness estimated at −0.56 (SE 0.1; P<0.001).
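One way to read these coefficients is as the share of the total effect that disappears once tiredness is in the model. The article does not report such a proportion. The sketch below applies a simple difference-in-coefficients heuristic to the quoted values, and the function name is hypothetical. In a multilevel model this is only an approximation.

```python
def proportion_mediated(total_effect, direct_effect):
    """Share of the total effect that is removed when the mediator is added."""
    return (total_effect - direct_effect) / total_effect


# Effects of sleep loss in mm, before and after adding tiredness to the model.
health_share = proportion_mediated(-4.2, -1.8)
attractiveness_share = proportion_mediated(-1.6, -0.62)

print(round(health_share, 2))          # 0.57
print(round(attractiveness_share, 2))  # 0.61
```

On this rough reading, somewhat more than half of the effect of sleep loss on both rated health and rated attractiveness runs through looking tired.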

Discussion

Sleep deprived people are perceived as less attractive, less healthy, and more tired compared with when they are well rested. Apparent tiredness was strongly related to looking less healthy and less attractive. The mediation analyses supported this, indicating that a large part of the effect of sleep loss on appearing healthy and attractive was mediated by looking tired. The fact that untrained observers detected the effects of sleep loss in others not only provides evidence for a perceptual ability not previously subjected to experimental control, but also supports the notion that sleep history gives rise to socially relevant signals that provide information about the bearer. The adaptiveness of an ability to detect sleep related facial cues resonates well with other research, showing that small deviations from the average sleep duration in the long term are associated with an increased risk of health problems and with a decreased longevity. 8 17 Indeed, even a few hours of sleep deprivation inflict an array of physiological changes, including changes in neural, endocrinological, immunological, and cellular functioning, that if sustained are relevant for long term health. 7 18 19 20 Here, we show that such physiological changes are paralleled by detectable facial changes.

These results relate to photographs taken in an artificial setting and presented to the observers for only six seconds. It is likely that the effects reported here would be larger in real life person to person situations, when overt behaviour and interactions add further information. Blink interval and blink duration are known to be indicators of sleepiness, 21 and trained observers are able to evaluate reliably the drowsiness of drivers by watching their videotaped faces. 22 In addition, a few of the people were perceived as healthier, less tired, and more attractive during the sleep deprived condition. It remains to be evaluated in follow-up research whether this is due to random error in judgments, or associated with specific characteristics of observers or the sleep deprived people they judge. Nevertheless, we believe that the present findings can be generalised to a wide variety of settings, but further studies will have to investigate the impact on clinical studies and other social situations.

Importantly, our findings suggest a prominent role of sleep history in several domains of interpersonal perception and judgment, in which sleep history has previously not been considered of importance, such as in clinical judgment. In addition, because attractiveness motivates sexual behaviour, collaboration, and superior treatment, 13 sleep loss may have consequences in other social contexts. For example, it has been proposed that facial cues perceived as attractive are signals of good health and that this recognition has been selected evolutionarily to guide choice of mate and successful transmission of genes. 13 The fact that good sleep supports a healthy look and poor sleep the reverse may be of particular relevance in the medical setting, where health estimates are an essential part. It is possible that people with sleep disturbances, clinical or otherwise, would be judged as more unhealthy, whereas those who have had an unusually good night’s sleep may be perceived as rather healthy. Further studies are needed to investigate the effects of acute sleep reductions less drastic than the deprivation used in the present investigation, as well as long term clinical effects.

Conclusions

People are capable of detecting sleep loss related facial cues, and these cues modify judgments of another’s health and attractiveness. These conclusions agree well with existing models describing a link between sleep and good health, 18 23 as well as a link between attractiveness and health. 13 Future studies should focus on the relevance of these facial cues in clinical settings. These could investigate whether clinicians are better than the average population at detecting sleep or health related facial cues, and whether patients with a clinical diagnosis exhibit more tiredness and are less healthy looking than healthy people. Perhaps the more successful doctors are those who pick up on these details and act accordingly.

Taken together, our results provide important insights into judgments about health and attractiveness that are reminiscent of the anecdotal wisdom harboured in Bell’s words, and in the colloquial notion of “beauty sleep.”

What is already known on this topic

Short or disturbed sleep and fatigue constitute major risk factors for health and safety

Complaints of short or disturbed sleep are common among patients seeking healthcare

The human face is the main source of information for social signalling

What this study adds

The facial cues of sleep deprived people are sufficient for others to judge them as more tired, less healthy, and less attractive, lending the first scientific support to the concept of “beauty sleep”

By affecting doctors’ general perception of health, the sleep history of a patient may affect clinical decisions and diagnostic precision

Cite this as: BMJ 2010;341:c6614

We thank B Karshikoff for support with data acquisition and M Ingvar for comments on an earlier draft of the manuscript, both without compensation and working at the Department for Clinical Neuroscience, Karolinska Institutet, Sweden.

Contributors: JA designed the data collection, supervised and monitored data collection, wrote the statistical analysis plan, carried out the statistical analyses, obtained funding, drafted and revised the manuscript, and is guarantor. TS designed and carried out the data collection, cleaned the data, drafted, revised the manuscript, and had final approval of the manuscript. JA and TS contributed equally to the work. MI wrote the statistical analysis plan, carried out the statistical analyses, drafted the manuscript, and critically revised the manuscript. EJWVS provided statistical advice, advised on data handling, and critically revised the manuscript. AO provided advice on the methods and critically revised the manuscript. ML provided administrative support, drafted the manuscript, and critically revised the manuscript. All authors approved the final version of the manuscript.

Funding: This study was funded by the Swedish Society for Medical Research, Rut and Arvid Wolff’s Memory Fund, and the Osher Center for Integrative Medicine.

Competing interests: All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any company for the submitted work; no financial relationships with any companies that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.

Ethical approval: This study was approved by the Karolinska Institutet’s ethical committee. Participants were compensated for their participation.

Participant consent: Participant’s consent obtained.

Data sharing: Statistical code and dataset of ratings are available from the corresponding author at john.axelsson{at}ki.se .

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode .

  • 1 Deten A, Volz HC, Clamors S, Leiblein S, Briest W, Marx G, et al. Hematopoietic stem cells do not repair the infarcted mouse heart. Cardiovasc Res 2005;65:52-63.
  • 2 Doyle AC. The case-book of Sherlock Holmes: selected stories. Wordsworth, 1993.
  • 3 Lieberman MD, Gaunt R, Gilbert DT, Trope Y. Reflection and reflexion: a social cognitive neuroscience approach to attributional inference. Adv Exp Soc Psychol 2002;34:199-249.
  • 4 Drummond SPA, Brown GG, Gillin JC, Stricker JL, Wong EC, Buxton RB. Altered brain response to verbal learning following sleep deprivation. Nature 2000;403:655-7.
  • 5 Harrison Y, Horne JA. The impact of sleep deprivation on decision making: a review. J Exp Psychol Appl 2000;6:236-49.
  • 6 Huber R, Ghilardi MF, Massimini M, Tononi G. Local sleep and learning. Nature 2004;430:78-81.
  • 7 Spiegel K, Leproult R, Van Cauter E. Impact of sleep debt on metabolic and endocrine function. Lancet 1999;354:1435-9.
  • 8 Kripke DF, Garfinkel L, Wingard DL, Klauber MR, Marler MR. Mortality associated with sleep duration and insomnia. Arch Gen Psychiatry 2002;59:131-6.
  • 9 Olson LG, Ambrogetti A. Waking up to sleep disorders. Br J Hosp Med (Lond) 2006;67:118, 20.
  • 10 Rajaratnam SM, Arendt J. Health in a 24-h society. Lancet 2001;358:999-1005.
  • 11 Ranjbaran Z, Keefer L, Stepanski E, Farhadi A, Keshavarzian A. The relevance of sleep abnormalities to chronic inflammatory conditions. Inflamm Res 2007;56:51-7.
  • 12 Haxby JV, Hoffman EA, Gobbini MI. The distributed human neural system for face perception. Trends Cogn Sci 2000;4:223-33.
  • 13 Rhodes G. The evolutionary psychology of facial beauty. Annu Rev Psychol 2006;57:199-226.
  • 14 Todorov A, Mandisodza AN, Goren A, Hall CC. Inferences of competence from faces predict election outcomes. Science 2005;308:1623-6.
  • 15 Willis J, Todorov A. First impressions: making up your mind after a 100-ms exposure to a face. Psychol Sci 2006;17:592-8.
  • 16 Krull JL, MacKinnon DP. Multilevel modeling of individual and group level mediated effects. Multivariate Behav Res 2001;36:249-77.
  • 17 Ayas NT, White DP, Manson JE, Stampfer MJ, Speizer FE, Malhotra A, et al. A prospective study of sleep duration and coronary heart disease in women. Arch Intern Med 2003;163:205-9.
  • 18 Bryant PA, Trinder J, Curtis N. Sick and tired: does sleep have a vital role in the immune system? Nat Rev Immunol 2004;4:457-67.
  • 19 Cirelli C. Cellular consequences of sleep deprivation in the brain. Sleep Med Rev 2006;10:307-21.
  • 20 Irwin MR, Wang M, Campomayor CO, Collado-Hidalgo A, Cole S. Sleep deprivation and activation of morning levels of cellular and genomic markers of inflammation. Arch Intern Med 2006;166:1756-62.
  • 21 Schleicher R, Galley N, Briest S, Galley L. Blinks and saccades as indicators of fatigue in sleepiness warnings: looking tired? Ergonomics 2008;51:982-1010.
  • 22 Wierwille WW, Ellsworth LA. Evaluation of driver drowsiness by trained raters. Accid Anal Prev 1994;26:571-81.
  • 23 Horne J. Why we sleep—the functions of sleep in humans and other mammals. Oxford University Press, 1988.

  • Review Article
  • Published: 01 June 2023

Data, measurement and empirical methods in the science of science

  • Lu Liu 1 , 2 , 3 , 4 ,
  • Benjamin F. Jones   ORCID: orcid.org/0000-0001-9697-9388 1 , 2 , 3 , 5 , 6 ,
  • Brian Uzzi   ORCID: orcid.org/0000-0001-6855-2854 1 , 2 , 3 &
  • Dashun Wang   ORCID: orcid.org/0000-0002-7054-2206 1 , 2 , 3 , 7  

Nature Human Behaviour volume 7, pages 1046–1058 (2023)

  • Scientific community

The advent of large-scale datasets that trace the workings of science has encouraged researchers from many different disciplinary backgrounds to turn scientific methods into science itself, cultivating a rapidly expanding ‘science of science’. This Review considers this growing, multidisciplinary literature through the lens of data, measurement and empirical methods. We discuss the purposes, strengths and limitations of major empirical approaches, seeking to increase understanding of the field’s diverse methodologies and expand researchers’ toolkits. Overall, new empirical developments provide enormous capacity to test traditional beliefs and conceptual frameworks about science, discover factors associated with scientific productivity, predict scientific outcomes and design policies that facilitate scientific progress.

Scientific advances are a key input to rising standards of living, health and the capacity of society to confront grand challenges, from climate change to the COVID-19 pandemic 1 , 2 , 3 . A deeper understanding of how science works and where innovation occurs can help us to more effectively design science policy and science institutions, better inform scientists’ own research choices, and create and capture enormous value for science and humanity. Building on these key premises, recent years have witnessed substantial development in the ‘science of science’ 4 , 5 , 6 , 7 , 8 , 9 , which uses large-scale datasets and diverse computational toolkits to unearth fundamental patterns behind scientific production and use.

The idea of turning scientific methods into science itself is long-standing. Since the mid-20th century, researchers from different disciplines have asked central questions about the nature of scientific progress and the practice, organization and impact of scientific research. Building on these rich historical roots, the field of the science of science draws upon many disciplines, ranging from information science to the social, physical and biological sciences to computer science, engineering and design. The science of science closely relates to several strands and communities of research, including metascience, scientometrics, the economics of science, research on research, science and technology studies, the sociology of science, metaknowledge and quantitative science studies 5 . There are noticeable differences between some of these communities, mostly around their historical origins and the initial disciplinary composition of researchers forming these communities. For example, metascience has its origins in the clinical sciences and psychology, and focuses on rigour, transparency, reproducibility and other open science-related practices and topics. The scientometrics community, born in library and information sciences, places a particular emphasis on developing robust and responsible measures and indicators for science. Science and technology studies engage the history of science and technology, the philosophy of science, and the interplay between science, technology and society. The science of science, which has its origins in physics, computer science and sociology, takes a data-driven approach and emphasizes questions on how science works. Each of these communities has made fundamental contributions to understanding science. While they differ in their origins, these differences pale in comparison to the overarching, common interest in understanding the practice of science and its societal impact.

Three major developments have encouraged rapid advances in the science of science. The first is in data 9 : modern databases include millions of research articles, grant proposals, patents and more. This windfall of data traces scientific activity in remarkable detail and at scale. The second development is in measurement: scholars have used data to develop many new measures of scientific activities and examine theories that have long been viewed as important but difficult to quantify. The third development is in empirical methods: thanks to parallel advances in data science, network science, artificial intelligence and econometrics, researchers can study relationships, make predictions and assess science policy in powerful new ways. Together, new data, measurements and methods have revealed fundamental new insights about the inner workings of science and scientific progress itself.

With multiple approaches, however, comes a key challenge. As researchers adhere to norms respected within their disciplines, their methods vary, with results often published in venues with non-overlapping readership, fragmenting research along disciplinary boundaries. This fragmentation challenges researchers’ ability to appreciate and understand the value of work outside of their own discipline, much less to build directly on it for further investigations.

Recognizing these challenges and the rapidly developing nature of the field, this paper reviews the empirical approaches that are prevalent in this literature. We aim to provide readers with an up-to-date understanding of the available datasets, measurement constructs and empirical methodologies, as well as the value and limitations of each. Owing to space constraints, this Review does not cover the full technical details of each method, referring readers to related guides to learn more. Instead, we will emphasize why a researcher might favour one method over another, depending on the research question.

Beyond a positive understanding of science, a key goal of the science of science is to inform science policy. While this Review mainly focuses on empirical approaches, with its core audience being researchers in the field, the studies reviewed are also germane to key policy questions. For example, what is the appropriate scale of scientific investment, in what directions and through what institutions 10 , 11 ? Are public investments in science aligned with public interests 12 ? What conditions produce novel or high-impact science 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 ? How do the reward systems of science influence the rate and direction of progress 13 , 21 , 22 , 23 , 24 , and what governs scientific reproducibility 25 , 26 , 27 ? How do contributions evolve over a scientific career 28 , 29 , 30 , 31 , 32 , and how may diversity among scientists advance scientific progress 33 , 34 , 35 ? These are among many other questions relevant to science policy 36 , 37 .

Overall, this review aims to facilitate entry to science of science research, expand researcher toolkits and illustrate how diverse research approaches contribute to our collective understanding of science. Section 2 reviews datasets and data linkages. Section 3 reviews major measurement constructs in the science of science. Section 4 considers a range of empirical methods, focusing on one study to illustrate each method and briefly summarizing related examples and applications. Section 5 concludes with an outlook for the science of science.

Historically, data on scientific activities were difficult to collect and were available in limited quantities. Gathering data could involve manually tallying statistics from publications 38 , 39 , interviewing scientists 16 , 40 , or assembling historical anecdotes and biographies 13 , 41 . Analyses were typically limited to a specific domain or group of scientists. Today, massive datasets on scientific production and use are at researchers’ fingertips 42 , 43 , 44 . Armed with big data and advanced algorithms, researchers can now probe questions previously not amenable to quantification and with enormous increases in scope and scale, as detailed below.

Publication datasets cover papers from nearly all scientific disciplines, enabling analyses of both general and domain-specific patterns. Commonly used datasets include the Web of Science (WoS), PubMed, CrossRef, ORCID, OpenCitations, Dimensions and OpenAlex. Datasets incorporating papers’ text (CORE) 45 , 46 , 47 , data entities (DataCite) 48 , 49 and peer review reports (Publons) 33 , 50 , 51 have also become available. These datasets further enable novel measurement, for example, representations of a paper’s content 52 , 53 , novelty 15 , 54 and interdisciplinarity 55 .

Notably, databases today capture more diverse aspects of science beyond publications, offering a richer and more encompassing view of research contexts and of researchers themselves (Fig. 1 ). For example, some datasets trace research funding to the specific publications these investments support 56 , 57 , allowing high-scale studies of the impact of funding on productivity and the return on public investment. Datasets incorporating job placements 58 , 59 , curriculum vitae 21 , 59 and scientific prizes 23 offer rich quantitative evidence on the social structure of science. Combining publication profiles with mentorship genealogies 60 , 61 , dissertations 34 and course syllabi 62 , 63 provides insights on mentoring and cultivating talent.

figure 1

This figure presents commonly used data types in science of science research, information contained in each data type and examples of data sources. Datasets in the science of science research have not only grown in scale but have also expanded beyond publications to integrate upstream funding investments and downstream applications that extend beyond science itself.

Finally, today’s scope of data extends beyond science to broader aspects of society. Altmetrics 64 captures news media and social media mentions of scientific articles. Other databases incorporate marketplace uses of science, including through patents 10 , pharmaceutical clinical trials and drug approvals 65 , 66 . Policy documents 67 , 68 help us to understand the role of science in the halls of government 69 and policy making 12 , 68 .

While datasets of the modern scientific enterprise have grown exponentially, they are not without limitations. As is often the case for data-driven research, drawing conclusions from specific data sources requires scrutiny and care. Datasets are typically based on published work, which may favour easy-to-publish topics over important ones (the streetlight effect) 70 , 71 . The publication of negative results is also rare (the file drawer problem) 72 , 73 . Meanwhile, English language publications account for over 90% of articles in major data sources, with limited coverage of non-English journals 74 . Publication datasets may also reflect biases in data collection across research institutions or demographic groups. Despite the open science movement, many datasets require paid subscriptions, which can create inequality in data access. Creating more open datasets for the science of science, such as OpenAlex, may not only improve the robustness and replicability of empirical claims but also increase entry to the field.

As today’s datasets become larger in scale and continue to integrate new dimensions, they offer opportunities to unveil the inner workings and external impacts of science in new ways. They can enable researchers to reach beyond previous limitations while conducting original studies of new and long-standing questions about the sciences.

Measurement

Here we discuss prominent measurement approaches in the science of science, including their purposes and limitations.

Modern publication databases typically include data on which articles and authors cite other papers and scientists. These citation linkages have been used to engage core conceptual ideas in scientific research. Here we consider two common measures based on citation information: citation counts and knowledge flows.

First, citation counts are commonly used indicators of impact. The term ‘indicator’ implies that it only approximates the concept of interest. A citation count is defined as how many times a document is cited by subsequent documents and can proxy for the importance of research papers 75 , 76 as well as patented inventions 77 , 78 , 79 . Rather than treating each citation equally, measures may further weight the importance of each citation, for example by using the citation network structure to produce centrality 80 , PageRank 81 , 82 or Eigenfactor indicators 83 , 84 .
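To make the contrast between raw and weighted citation indicators concrete, the following is a minimal sketch in Python. The toy citation graph, the function names and the damping value of 0.85 are illustrative assumptions, not taken from the studies cited above. It counts citations and then runs a basic power-iteration PageRank over the same graph.

```python
from collections import defaultdict

def citation_counts(citations):
    """Count how often each paper is cited; `citations` holds (citing, cited) pairs."""
    counts = defaultdict(int)
    for _citing, cited in citations:
        counts[cited] += 1
    return dict(counts)

def pagerank(citations, damping=0.85, iterations=100):
    """Basic power-iteration PageRank on a citation graph (illustrative only)."""
    papers = sorted({p for pair in citations for p in pair})
    out_links = defaultdict(list)
    for citing, cited in citations:
        out_links[citing].append(cited)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            targets = out_links[p]
            if targets:
                # A citing paper splits its weight across its references.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A paper with no references spreads its weight evenly.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank

# Hypothetical toy graph: A and B are each cited twice,
# but A is cited by papers that are themselves cited.
citations = [("B", "A"), ("C", "A"), ("D", "B"), ("E", "B"), ("F", "C")]
counts = citation_counts(citations)
ranks = pagerank(citations)
```

In this toy graph, papers A and B have the same citation count, yet A receives the higher PageRank because its citing papers are themselves cited. This is the sense in which network-based indicators weight citations unequally.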

Citation-based indicators have also faced criticism 84 , 85 . Citation indicators necessarily oversimplify the construct of impact, often ignoring heterogeneity in the meaning and use of a particular reference, the variations in citation practices across fields and institutional contexts, and the potential for reputation and power structures in science to influence citation behaviour 86 , 87 . Researchers have started to understand more nuanced citation behaviours ranging from negative citations 86 to citation context 47 , 88 , 89 . Understanding what a citation actually measures matters in interpreting and applying many research findings in the science of science. Evaluations relying on citation-based indicators rather than expert judgements raise questions regarding misuse 90 , 91 , 92 . Given the importance of developing indicators that can reliably quantify and evaluate science, the scientometrics community has been working to provide guidance for responsible citation practices and assessment 85 .

Second, scientists use citations to trace knowledge flows. Each citation in a paper is a link to specific previous work from which we can proxy how new discoveries draw upon existing ideas 76 , 93 and how knowledge flows between fields of science 94 , 95 , research institutions 96 , regions and nations 97 , 98 , 99 , and individuals 81 . Combinations of citation linkages can also approximate novelty 15 , disruptiveness 17 , 100 and interdisciplinarity 55 , 95 , 101 , 102 . A rapidly expanding body of work further examines citations to scientific articles from other domains (for example, patents, clinical drug trials and policy documents) to understand the applied value of science 10 , 12 , 65 , 66 , 103 , 104 , 105 .
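As a rough illustration of how citation linkages can proxy knowledge flows, the sketch below aggregates citations into field-to-field counts and computes a very simple share of a paper's references that fall outside its own field. The paper identifiers, field labels and the `outside_field_share` measure are hypothetical and far cruder than the interdisciplinarity indicators used in the literature.

```python
from collections import Counter

def knowledge_flows(citations, field_of):
    """Count citations by (source field of the cited work, field of the citing work)."""
    flows = Counter()
    for citing, cited in citations:
        flows[(field_of[cited], field_of[citing])] += 1
    return flows

def outside_field_share(paper, citations, field_of):
    """Share of a paper's references that point outside its own field."""
    refs = [cited for citing, cited in citations if citing == paper]
    if not refs:
        return 0.0
    outside = sum(field_of[r] != field_of[paper] for r in refs)
    return outside / len(refs)

# Hypothetical toy data.
field_of = {"p1": "physics", "p2": "physics", "b1": "biology", "b2": "biology"}
citations = [("b1", "p1"), ("b1", "p2"), ("b1", "b2"), ("p2", "p1")]

flows = knowledge_flows(citations, field_of)
share_b1 = outside_field_share("b1", citations, field_of)
```

Here two citations flow from physics into biology, and two-thirds of the references of paper `b1` come from outside its field.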

Individuals

Analysing individual careers allows researchers to answer questions such as: How do we quantify individual scientific productivity? What is a typical career lifecycle? How are resources and credits allocated across individuals and careers? A scholar’s career can be examined through the papers they publish 30 , 31 , 106 , 107 , 108 , with attention to career progression and mobility, publication counts and citation impact, as well as grant funding 24 , 109 , 110 and prizes 111 , 112 , 113 .

Studies of individual impact focus on output, typically approximated by the number of papers a researcher publishes and citation indicators. A popular measure for individual impact is the h-index 114 , which takes both volume and per-paper impact into consideration. Specifically, a scientist is assigned the largest value h such that they have h papers that were each cited at least h times. Later studies build on the idea of the h-index and propose variants to address its limitations 115 , ranging from emphasizing highly cited papers in a career 116 , to field differences 117 and normalizations 118 , to the relative contribution of an individual in collaborative works 119 .
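The definition of the h-index translates directly into a few lines of code. The following Python sketch (the function name and the example citation counts are illustrative) sorts a researcher's per-paper citation counts and finds the largest h for which h papers have at least h citations each.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break
    return h

# Hypothetical careers: per-paper citation counts.
steady_career = [10, 8, 5, 4, 3]
one_hit_career = [25, 8, 5, 3, 3]
```

The two hypothetical careers show a known property of the indicator: a single very highly cited paper does little to raise h, which is one motivation for the variants mentioned above.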

To study dynamics in output over the lifecycle, individuals can be studied according to age, career age or the sequence of publications. A long-standing literature has investigated the relationship between age and the likelihood of outstanding achievement 28 , 106 , 111 , 120 , 121 . Recent studies further decouple the relationship between age, publication volume and per-paper citation, and measure the likelihood of producing highly cited papers in the sequence of works one produces 30 , 31 .

As simple as it sounds, representing careers using publication records is difficult. Collecting the full publication list of a researcher is the foundation for studying individuals yet remains a key challenge, requiring name disambiguation techniques to match specific works to specific researchers. Although algorithms are increasingly capable of identifying millions of career profiles 122 , they vary in accuracy and robustness. ORCID can help to alleviate the problem by offering researchers the opportunity to create, maintain and update individual profiles themselves, and it goes beyond publications to collect broader outputs and activities 123 . A second challenge is survivorship bias. Empirical studies tend to focus on careers that are long enough to afford statistical analyses, which limits the applicability of the findings to scientific careers as a whole. A third challenge is the breadth of scientists’ activities, where focusing on publications ignores other important contributions such as mentorship and teaching, service (for example, refereeing papers, reviewing grant proposals and editing journals) or leadership within their organizations. Although researchers have begun exploring these dimensions by linking individual publication profiles with genealogical databases 61 , 124 , dissertations 34 , grants 109 , curriculum vitae 21 and acknowledgements 125 , scientific careers beyond publication records remain under-studied 126 , 127 . Lastly, citation-based indicators only serve as an approximation of individual performance with similar limitations as discussed above. The scientific community has called for more appropriate practices 85 , 128 , ranging from incorporating expert assessment of research contributions to broadening the measures of impact beyond publications.

Over many decades, science has exhibited a substantial and steady shift away from solo authorship towards coauthorship, especially among highly cited works 18 , 129 , 130 . In light of this shift, a research field, the science of team science 131 , 132 , has emerged to study the mechanisms that facilitate or hinder the effectiveness of teams. Team size can be proxied by the number of coauthors on a paper, which has been shown to predict distinctive types of advance: whereas larger teams tend to develop ideas, smaller teams tend to disrupt current ways of thinking 17 . Team characteristics can be inferred from coauthors’ backgrounds 133 , 134 , 135 , allowing quantification of a team’s diversity in terms of field, age, gender or ethnicity. Collaboration networks based on coauthorship 130 , 136 , 137 , 138 , 139 offer nuanced network-based indicators to understand individual and institutional collaborations.
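A minimal sketch of the two proxies described above, assuming each paper comes with a list of disambiguated author names (the names and paper identifiers below are hypothetical): team size is the number of coauthors on a paper, and a collaboration network can be built by counting how often each pair of authors appears together.

```python
from collections import Counter
from itertools import combinations

def team_sizes(papers):
    """Proxy team size by the number of coauthors on each paper."""
    return {paper_id: len(authors) for paper_id, authors in papers.items()}

def coauthorship_network(papers):
    """Weighted coauthorship edges: how many papers each author pair shares."""
    edges = Counter()
    for authors in papers.values():
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical toy data.
papers = {
    "p1": ["Ada", "Ben"],
    "p2": ["Ada", "Ben", "Chen"],
    "p3": ["Dee"],
}
sizes = team_sizes(papers)
network = coauthorship_network(papers)
```

Edge weights of this kind are the raw material for the network-based indicators of collaboration mentioned in the text.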

However, there are limitations to using coauthorship alone to study teams 132 . First, coauthorship can obscure individual roles 140 , 141 , 142 , which has prompted institutional responses to help to allocate credit, including authorship order and individual contribution statements 56 , 143 . Second, coauthorship does not reflect the complex dynamics and interactions between team members that are often instrumental for team success 53 , 144 . Third, collaborative contributions can extend beyond coauthorship in publications to include members of a research laboratory 145 or co-principal investigators (co-PIs) on a grant 146 . Initiatives such as CRediT may help to address some of these issues by recording detailed roles for each contributor 147 .

Institutions

Research institutions, such as departments, universities, national laboratories and firms, encompass wider groups of researchers and their corresponding outputs. Institutional membership can be inferred from affiliations listed on publications or patents 148 , 149 , and the output of an institution can be aggregated over all its affiliated researchers 150 . Institutional research information systems (CRIS) contain more comprehensive research outputs and activities from employees.

Some research questions consider the institution as a whole, investigating the returns to research and development investment 104 , inequality of resource allocation 22 and the flow of scientists 21 , 148 , 149 . Other questions focus on institutional structures as sources of research productivity by looking into the role of peer effects 125 , 151 , 152 , 153 , how institutional policies impact research outcomes 154 , 155 and whether interdisciplinary efforts foster innovation 55 . Institution-oriented measurement faces similar limitations as with analyses of individuals and teams, including name disambiguation for a given institution and the limited capacity of formal publication records to characterize the full range of relevant institutional outcomes. It is also unclear how to allocate credit among multiple institutions associated with a paper. Moreover, relevant institutional employees extend beyond publishing researchers: interns, technicians and administrators all contribute to research endeavours 130 .

In sum, measurements allow researchers to quantify scientific production and use across numerous dimensions, but they also raise questions of construct validity: Does the proposed metric really reflect what we want to measure? Testing the construct’s validity is important, as is understanding a construct’s limits. Where possible, using alternative measurement approaches, or qualitative methods such as interviews and surveys, can improve measurement accuracy and the robustness of findings.

Empirical methods

In this section, we review two broad categories of empirical approaches (Table 1 ), each with distinctive goals: (1) to discover, estimate and predict empirical regularities; and (2) to identify causal mechanisms. For each method, we give a concrete example to help to explain how the method works, summarize related work for interested readers, and discuss contributions and limitations.

Descriptive and predictive approaches

Empirical regularities and generalizable facts.

The discovery of empirical regularities in science has had a key role in driving conceptual developments and the directions of future research. By observing empirical patterns at scale, researchers unveil central facts that shape science and present core features that theories of scientific progress and practice must explain. For example, consider citation distributions. de Solla Price first proposed that citation distributions are fat-tailed 39 , indicating that a few papers have extremely high citations while most papers have relatively few or even no citations at all. de Solla Price proposed that citation distribution was a power law, while researchers have since refined this view to show that the distribution appears log-normal, a nearly universal regularity across time and fields 156 , 157 . The fat-tailed nature of citation distributions and its universality across the sciences has in turn sparked substantial theoretical work that seeks to explain this key empirical regularity 20 , 156 , 158 , 159 .

Empirical regularities are often surprising and can contest previous beliefs of how science works. For example, it has been shown that the age distribution of great achievements peaks in middle age across a wide range of fields 107 , 121 , 160 , rejecting the common belief that young scientists typically drive breakthroughs in science. A closer look at the individual careers also indicates that productivity patterns vary widely across individuals 29 . Further, a scholar’s highest-impact papers come at a remarkably constant rate across the sequence of their work 30 , 31 .

The discovery of empirical regularities has had important roles in shaping beliefs about the nature of science 10 , 45 , 161 , 162 , sources of breakthrough ideas 15 , 163 , 164 , 165 , scientific careers 21 , 29 , 126 , 127 , the network structure of ideas and scientists 23 , 98 , 136 , 137 , 138 , 139 , 166 , gender inequality 57 , 108 , 126 , 135 , 143 , 167 , 168 , and many other areas of interest to scientists and science institutions 22 , 47 , 86 , 97 , 102 , 105 , 134 , 169 , 170 , 171 . At the same time, care must be taken to ensure that findings are not merely artefacts due to data selection or inherent bias. To differentiate meaningful patterns from spurious ones, it is important to stress test the findings through different selection criteria or across non-overlapping data sources.

Regression analysis

When investigating correlations among variables, a classic method is regression, which estimates how one set of variables explains variation in an outcome of interest. Regression can be used to test explicit hypotheses or predict outcomes. For example, researchers have investigated whether a paper’s novelty predicts its citation impact 172 . Adding additional control variables to the regression, one can further examine the robustness of the focal relationship.
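As a simple illustration, a one-variable ordinary least squares fit can be written in closed form. The sketch below uses made-up numbers for a hypothetical novelty score and citation outcome; it is not the specification used in the cited study, which would typically include many control variables.

```python
def ols_fit(x, y):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: novelty score (x) and citation outcome (y).
novelty = [0, 1, 2, 3]
citations = [1, 3, 5, 7]
intercept, slope = ols_fit(novelty, citations)
fit = r_squared(novelty, citations, intercept, slope)
```

The slope is the quantity of interest: the estimated change in the outcome associated with a one-unit change in the explanatory variable.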

Although regression analysis is useful for hypothesis testing, it bears substantial limitations. If the question one wishes to ask concerns a ‘causal’ rather than a correlational relationship, regression is poorly suited to the task as it is impossible to control for all the confounding factors. Failing to account for such ‘omitted variables’ can bias the regression coefficient estimates and lead to spurious interpretations. Further, regression models often have low goodness of fit (small R²), indicating that the variables considered explain little of the outcome variation. As regressions typically focus on a specific relationship in simple functional forms, regressions tend to emphasize interpretability rather than overall predictability. The advent of predictive approaches powered by large-scale datasets and novel computational techniques offers new opportunities for modelling complex relationships with stronger predictive power.
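The omitted-variable problem can be seen in a small constructed example. In the Python sketch below, all numbers and variable names are hypothetical: the outcome depends only on a confounder, yet a naive regression of the outcome on the focal variable yields a large positive slope. Controlling for the confounder, here by first partialling it out of both variables and then regressing the residuals on each other, removes the spurious relationship.

```python
def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Slope from a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after regressing it on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data: confounder z drives both x and y; x has no effect on y.
z = [0, 1, 2, 3, 4, 5]
x = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1])]
y = [3 * zi for zi in z]

naive = slope(x, y)  # ignores the confounder
controlled = slope(residuals(z, x), residuals(z, y))  # partials out z first
```

The naive slope is clearly positive while the controlled slope is zero. In real data the difficulty is that the relevant confounder may be unobserved, so it cannot be partialled out in this way.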

Mechanistic models

Mechanistic modelling is an important approach to explaining empirical regularities, drawing from methods primarily used in physics. Such models predict macro-level regularities of a system by modelling micro-level interactions among basic elements with interpretable and modifiable formulae. While theoretical by nature, mechanistic models in the science of science are often empirically grounded, and this approach has developed together with the advent of large-scale, high-resolution data.

Simplicity is the core value of a mechanistic model. Consider, for example, why citations follow a fat-tailed distribution. de Solla Price modelled the citing behaviour as a cumulative advantage process on a growing citation network 159 and found that if the probability a paper is cited grows linearly with its existing citations, the resulting distribution would follow a power law, broadly aligned with empirical observations. The model is intentionally simplified, ignoring myriad factors. Yet the simple cumulative advantage process is by itself sufficient to explain a power law distribution of citations. In this way, mechanistic models can help to reveal key mechanisms that can explain observed patterns.
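A cumulative advantage process of this kind is straightforward to simulate. The Python sketch below is a simplified illustration rather than a reproduction of the original model: it assumes each new paper cites a fixed number of earlier papers and that a paper is chosen with probability proportional to its citations plus one, so that uncited papers can still be cited. The parameter values and the random seed are arbitrary.

```python
import random

def simulate_cumulative_advantage(n_papers, refs_per_paper=3, seed=0):
    """Grow a citation network in which already-cited papers attract more citations."""
    rng = random.Random(seed)
    citations = [0]  # citation count of each paper, starting with paper 0
    # Each paper appears in the urn once, plus once per citation received,
    # so a uniform draw picks paper i with probability proportional to citations[i] + 1.
    urn = [0]
    for new_paper in range(1, n_papers):
        k = min(refs_per_paper, new_paper)
        cited = set()
        while len(cited) < k:
            cited.add(rng.choice(urn))
        for target in sorted(cited):
            citations[target] += 1
            urn.append(target)
        citations.append(0)
        urn.append(new_paper)
    return citations

citations = simulate_cumulative_advantage(2000)
mean_citations = sum(citations) / len(citations)
most_cited = max(citations)
```

Even with identical papers and no notion of quality, the simulated distribution is highly skewed: a handful of early papers accumulate many times the average number of citations while most papers remain rarely cited.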

Moreover, mechanistic models can be refined as empirical evidence evolves. For example, later investigations showed that citation distributions are better characterized as log-normal 156 , 173 , prompting researchers to introduce a fitness parameter to encapsulate the inherent differences in papers’ ability to attract citations 174 , 175 . Further, older papers are less likely to be cited than expected 176 , 177 , 178 , motivating more recent models 20 to introduce an additional aging effect 179 . By combining the cumulative advantage, fitness and aging effects, one can already achieve substantial predictive power not just for the overall properties of the system but also the citation dynamics of individual papers 20 .

In addition to citations, mechanistic models have been developed to understand the formation of collaborations 136 , 180 , 181 , 182 , 183 , knowledge discovery and diffusion 184 , 185 , topic selection 186 , 187 , career dynamics 30 , 31 , 188 , 189 , the growth of scientific fields 190 and the dynamics of failure in science and other domains 178 .

At the same time, some observers have argued that mechanistic models are too simplistic to capture the essence of complex real-world problems 191 . While it has been a cornerstone for the natural sciences, representing social phenomena in a limited set of mathematical equations may miss complexities and heterogeneities that make social phenomena interesting in the first place. Such concerns are not unique to the science of science, as they represent a broader theme in computational social sciences 192 , 193 , ranging from social networks 194 , 195 to human mobility 196 , 197 to epidemics 198 , 199 . Other observers have questioned the practical utility of mechanistic models and whether they can be used to guide decisions and devise actionable policies. Nevertheless, despite these limitations, several complex phenomena in the science of science are well captured by simple mechanistic models, showing a high degree of regularity beneath complex interacting systems and providing powerful insights about the nature of science. Mixing such modelling with other methods could be particularly fruitful in future investigations.

Machine learning

The science of science seeks in part to forecast promising directions for scientific research 7 , 44 . In recent years, machine learning methods have substantially advanced predictive capabilities 200 , 201 and are playing increasingly important parts in the science of science. In contrast to the previous methods, machine learning does not emphasize hypotheses or theories. Rather, it leverages complex relationships in data and optimizes goodness of fit to make predictions and categorizations.

Traditional machine learning models include supervised, semi-supervised and unsupervised learning. The model choice depends on data availability and the research question, ranging from supervised models for citation prediction 202 , 203 to unsupervised models for community detection 204 . Take for example mappings of scientific knowledge 94 , 205 , 206 . The unsupervised method applies network clustering algorithms to map the structures of science. Related visualization tools make sense of clusters from the underlying network, allowing observers to see the organization, interactions and evolution of scientific knowledge. More recently, supervised learning, and deep neural networks in particular, have witnessed especially rapid developments 207 . Neural networks can generate high-dimensional representations of unstructured data such as images and texts, which encode complex properties difficult for human experts to perceive.

Take text analysis as an example. A recent study 52 utilizes 3.3 million paper abstracts in materials science to predict the thermoelectric properties of materials. The intuition is that the words currently used to describe a material may predict its hitherto undiscovered properties (Fig. 2 ). Compared with a random material, the materials predicted by the model are eight times more likely to be reported as thermoelectric in the next 5 years, suggesting that machine learning has the potential to substantially speed up knowledge discovery, especially as data continue to grow in scale and scope. Indeed, predicting the direction of new discoveries represents one of the most promising avenues for machine learning models, with neural networks being applied widely to biology 208 , physics 209 , 210 , mathematics 211 , chemistry 212 , medicine 213 and clinical applications 214 . Neural networks also offer a quantitative framework to probe the characteristics of creative products ranging from scientific papers 53 , journals 215 , organizations 148 , to paintings and movies 32 . Neural networks can also help to predict the reproducibility of papers from a variety of disciplines at scale 53 , 216 .

figure 2

This figure illustrates the word2vec skip-gram method 52 , where the goal is to predict useful properties of materials using previous scientific literature. a, The architecture and training process of the word2vec skip-gram model, where the 3-layer, fully connected neural network learns the 200-dimensional representation (hidden layer) from the sparse vector for each word and its context in the literature (input layer). b, The top two principal components of the word embedding. Materials with similar features are close in the 2D space, allowing prediction of a material’s properties. Different targeted words are shown in different colours. Reproduced with permission from ref. 52 , Springer Nature Ltd.
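Once word embeddings have been learned, candidate materials can be ranked by how close their vectors lie to the vector of a target property. The Python sketch below shows only this ranking step with cosine similarity. The three-dimensional vectors and the material names are hypothetical placeholders; they are not embeddings from ref. 52, and training the skip-gram model itself is outside the scope of the sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(embeddings, target, candidates):
    """Order candidate words by similarity to the target word's embedding."""
    target_vec = embeddings[target]
    return sorted(
        candidates,
        key=lambda word: cosine(embeddings[word], target_vec),
        reverse=True,
    )

# Hypothetical toy embeddings (real models use hundreds of dimensions).
embeddings = {
    "thermoelectric": [1.0, 0.0, 0.0],
    "material_A": [0.9, 0.1, 0.0],
    "material_B": [0.0, 1.0, 0.0],
    "material_C": [0.5, 0.5, 0.0],
}
ranking = rank_candidates(
    embeddings, "thermoelectric", ["material_A", "material_B", "material_C"]
)
```

Materials at the top of such a ranking that have not yet been reported as having the property are the model's predictions for future discoveries.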

While machine learning can offer high predictive accuracy, successful applications to the science of science face challenges, particularly regarding interpretability. Researchers may value transparent and interpretable findings for how a given feature influences an outcome, rather than a black-box model. The lack of interpretability also raises concerns about bias and fairness. In predicting reproducible patterns from data, machine learning models inevitably include and reproduce biases embedded in these data, often in non-transparent ways. The fairness of machine learning 217 is heavily debated in applications ranging from the criminal justice system to hiring processes. Effective and responsible use of machine learning in the science of science therefore requires thoughtful partnership between humans and machines 53 to build a reliable system accessible to scrutiny and modification.

Causal approaches

The preceding methods can reveal core facts about the workings of science and develop predictive capacity. Yet, they fail to capture causal relationships, which are particularly useful in assessing policy interventions. For example, how can we test whether a science policy boosts or hinders the performance of individuals, teams or institutions? The overarching idea of causal approaches is to construct some counterfactual world where two groups are identical to each other except that one group experiences a treatment that the other group does not.

Towards causation

Before engaging in causal approaches, it is useful to first consider the interpretative challenges of observational data. As observational data emerge from mechanisms that are not fully known or measured, an observed correlation may be driven by underlying forces that were not accounted for in the analysis. This challenge makes causal inference fundamentally difficult in observational data. An awareness of this issue is the first step in confronting it. It further motivates intermediate empirical approaches, including the use of matching strategies and fixed effects, that can help to confront (although not fully eliminate) the inference challenge. We first consider these approaches before turning to more fully causal methods.

Matching. Matching utilizes rich information to construct a control group that is similar to the treatment group on as many observable characteristics as possible before the treatment group is exposed to the treatment. Inferences can then be made by comparing the treatment and the matched control groups. Exact matching applies to categorical values, such as country, gender, discipline or affiliation 35 , 218 . Coarsened exact matching considers percentile bins of continuous variables and matches observations in the same bin 133 . Propensity score matching estimates the probability of receiving the ‘treatment’ on the basis of the controlled variables and uses the estimates to match treatment and control groups, which reduces the matching task from comparing the values of multiple covariates to comparing a single value 24 , 219 . Dynamic matching is useful for longitudinally matching variables that change over time 220 , 221 .
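To illustrate the logic of matching, the following Python sketch implements a bare-bones version of coarsened exact matching. The records, field names, cutpoints and outcome values are hypothetical. Scientists are grouped into strata by field (exact match) and by a binned count of previous papers (coarsened match); strata lacking either treated or control units are dropped, and the effect is the treated-weighted average of within-stratum differences in mean outcomes.

```python
from collections import defaultdict

def coarsen(value, cutpoints):
    """Index of the bin that `value` falls into, given ascending cutpoints."""
    return sum(value >= c for c in cutpoints)

def matched_effect(records, cutpoints):
    """Average treated-minus-control outcome difference within matched strata."""
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for r in records:
        key = (r["field"], coarsen(r["prior_papers"], cutpoints))
        group = "treated" if r["treated"] else "control"
        strata[key][group].append(r["outcome"])
    total, weight = 0.0, 0
    for groups in strata.values():
        t, c = groups["treated"], groups["control"]
        if t and c:  # keep only strata with both groups present
            diff = sum(t) / len(t) - sum(c) / len(c)
            total += diff * len(t)
            weight += len(t)
    return total / weight if weight else None

# Hypothetical records.
records = [
    {"field": "physics", "prior_papers": 2, "treated": True, "outcome": 10},
    {"field": "physics", "prior_papers": 3, "treated": False, "outcome": 8},
    {"field": "physics", "prior_papers": 12, "treated": True, "outcome": 20},
    {"field": "physics", "prior_papers": 15, "treated": False, "outcome": 17},
    {"field": "biology", "prior_papers": 4, "treated": True, "outcome": 30},
    {"field": "physics", "prior_papers": 40, "treated": False, "outcome": 50},
]
effect = matched_effect(records, cutpoints=[5, 20])
```

The biology scientist and the most prolific physicist have no counterpart in their strata and are excluded, which shows the usual trade-off in matching between comparability and sample size.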

Fixed effects. Fixed effects are a powerful and now standard tool in controlling for confounders. A key requirement for using fixed effects is that there are multiple observations on the same subject or entity (person, field, institution and so on) 222 , 223 , 224 . The fixed effect works as a dummy variable that accounts for the role of any fixed characteristic of that entity. Consider the finding where gender-diverse teams produce higher-impact papers than same-gender teams do 225 . A confounder may be that individuals who tend to write high-impact papers may also be more likely to work in gender-diverse teams. By including individual fixed effects, one accounts for any fixed characteristics of individuals (such as IQ, cultural background or previous education) that might drive the relationship of interest.
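The mechanics of a fixed effect can be shown with the equivalent ‘within’ transformation: subtracting each entity's own mean from its observations removes anything that is constant for that entity. In the hypothetical Python sketch below, two scientists differ in a fixed baseline that also correlates with the explanatory variable. The pooled regression overstates the slope, whereas the within estimator recovers the slope of 2 that was used to construct the data.

```python
from collections import defaultdict

def pooled_slope(rows):
    """Simple regression slope that ignores which entity each row belongs to."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def within_slope(rows):
    """Fixed-effects (within) slope: demean x and y by entity, then regress."""
    by_entity = defaultdict(list)
    for entity, x, y in rows:
        by_entity[entity].append((x, y))
    sxx = sxy = 0.0
    for obs in by_entity.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
    return sxy / sxx

# Hypothetical panel: (scientist, x, y) with y = baseline + 2 * x,
# where scientist A has baseline 10 and scientist B has baseline 0.
rows = [("A", 3, 16), ("A", 5, 20), ("B", 0, 0), ("B", 2, 4)]
pooled = pooled_slope(rows)
within = within_slope(rows)
```

The sketch also makes the key requirement visible: with only one observation per scientist, the demeaned values would all be zero and no within slope could be estimated.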

In sum, matching and fixed effects methods reduce potential sources of bias in interpreting relationships between variables. Yet, confounders may persist in these studies. For instance, fixed effects do not control for unobserved factors that change with time within the given entity (for example, access to funding or new skills). Identifying causal effects convincingly will then typically require distinct research methods that we turn to next.

Quasi-experiments

Researchers in economics and other fields have developed a range of quasi-experimental methods to construct treatment and control groups. The key idea here is exploiting randomness from external events that differentially expose subjects to a particular treatment. Here we review three quasi-experimental methods: difference-in-differences, instrumental variables and regression discontinuity (Fig. 3 ).

figure 3

a–c, This figure presents illustrations of (a) difference-in-differences, (b) instrumental variables and (c) regression discontinuity methods. The solid line in b represents causal links and the dashed line represents the relationships that are not allowed if the IV method is to produce causal inference.

Difference-in-differences. Difference-in-differences regression (DiD) investigates the effect of an unexpected event, comparing the affected group (the treated group) with an unaffected group (the control group). The control group is intended to provide the counterfactual path, that is, what would have happened were it not for the unexpected event. Ideally, the treated and control groups are on virtually identical paths before the treatment event, but DiD can also work if the groups are on parallel paths (Fig. 3a ). For example, one study 226 examines how the premature death of superstar scientists affects the productivity of their previous collaborators. The control group consists of collaborators of superstars who did not die in the time frame. The two groups do not show significant differences in publications before a death event, yet upon the death of a star scientist, the treated collaborators on average experience a 5–8% decline in their quality-adjusted publication rates compared with the control group. DiD has wide applicability in the science of science, having been used to analyse the causal effects of grant design 24 , access costs to previous research 155 , 227 , university technology transfer policies 154 , intellectual property 228 , citation practices 229 , evolution of fields 221 and the impacts of paper retractions 230 , 231 , 232 . The DiD literature has grown especially rapidly in the field of economics, with substantial recent refinements 233 , 234 .
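
The basic two-group, two-period DiD calculation can be sketched in a few lines of Python. The publication counts below are hypothetical placeholders, not data from the study discussed above, and the function name `did_estimate` is an assumption introduced for illustration. Applied work typically estimates the same quantity in a regression with additional controls.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical yearly publication counts (not data from any cited study).
treated_pre = [10.0, 12.0]    # collaborators before the event
treated_post = [9.0, 11.0]    # the same collaborators after the event
control_pre = [8.0, 10.0]     # comparison group before
control_post = [9.0, 11.0]    # comparison group after

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

The control group's change (+1) stands in for what would have happened to the treated group without the event, so the treated group's change (−1) is interpreted as an effect of −2 only if the parallel-paths assumption holds.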

Instrumental variables. Another quasi-experimental approach utilizes ‘instrumental variables’ (IV). The goal is to determine the causal influence of some feature X on some outcome Y by using a third, instrumental variable. This instrumental variable is a quasi-random event that induces variation in X and, except for its impact through X , has no other effect on the outcome Y (Fig. 3b ). For example, consider a study of astronomy that seeks to understand how telescope time affects career advancement 235 . Here, one cannot simply look at the correlation between telescope time and career outcomes because many confounds (such as talent or grit) may influence both telescope time and career opportunities. Now consider the weather as an instrumental variable. Cloudy weather will, at random, reduce an astronomer’s observational time. Yet, the weather on particular nights is unlikely to correlate with a scientist’s innate qualities. The weather can then provide an instrumental variable to reveal a causal relationship between telescope time and career outcomes. Instrumental variables have been used to study local peer effects in research 151 , the impact of gender composition in scientific committees 236 , patents on future innovation 237 and taxes on inventor mobility 238 .
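
A minimal sketch of the single-instrument case follows, assuming one binary instrument and a linear effect. The toy data are constructed, not observed: `z` plays the role of cloudy weather, `u` is an unobserved confounder such as talent, and the true effect of telescope time `x` on the career outcome `y` is set to 0.5. All names and numbers are hypothetical.

```python
def covariance(a, b):
    """Population covariance of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


def iv_estimate(z, x, y):
    """Single-instrument IV (Wald) estimate: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


# Hypothetical toy data, constructed so that the true effect of x on y is 0.5
# and the instrument z is uncorrelated with the confounder u.
z = [0, 0, 1, 1]                  # instrument: 1 = cloudy observing period
u = [-1.0, 1.0, -1.0, 1.0]        # unobserved confounder (for example, talent)
x = [10 - 4 * zi + ui for zi, ui in zip(z, u)]       # telescope time
y = [1 + 0.5 * xi + 3 * ui for xi, ui in zip(x, u)]  # career outcome

naive = covariance(x, y) / covariance(x, x)  # OLS slope, biased by u
iv = iv_estimate(z, x, y)
print(f"naive: {naive:.2f}, IV: {iv:.2f}")
```

Because `u` raises both `x` and `y`, the naive slope (1.10) is biased upward, whereas the IV ratio uses only the variation in `x` induced by `z` and returns 0.50. The estimate is only as credible as the assumption that `z` affects `y` through `x` alone.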

Regression discontinuity. In regression discontinuity, policies with an arbitrary threshold for receiving some benefit can be used to construct treatment and control groups (Fig. 3c ). Take the funding paylines for grant proposals as an example. Proposals with scores increasingly close to the payline are increasingly similar in both their observable and unobservable characteristics, yet only those projects with scores above the payline receive the funding. For example, a study 110 examines the effect of winning an early-career grant on the probability of winning a later, mid-career grant. The probability has a discontinuous jump across the initial grant’s payline, providing the treatment and control groups needed to estimate the causal effect of receiving a grant. This example utilizes the ‘sharp’ regression discontinuity that assumes treatment status to be fully determined by the cut-off. If we assume treatment status is only partly determined by the cut-off, we can use ‘fuzzy’ regression discontinuity designs. Here the probability of receiving a grant is used to estimate the future outcome 11 , 110 , 239 , 240 , 241 .
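
The sharp design can be illustrated with a simple local linear sketch: fit one line on each side of the cut-off within a bandwidth and compare the two fitted values at the cut-off. The scores, the payline of 50, the bandwidth and the built-in jump of 0.3 are all hypothetical, and proposals scoring at or above the payline are treated as funded for simplicity.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def sharp_rd_jump(scores, outcomes, cutoff, bandwidth):
    """Difference between the two fitted lines evaluated at the cutoff."""
    # Centre scores on the cutoff so each intercept is the value at the cutoff.
    left = [(s - cutoff, o) for s, o in zip(scores, outcomes)
            if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, o) for s, o in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Hypothetical review scores with a payline at 50; funding adds 0.3 to the
# probability of a later grant, on top of a smooth trend in the score.
scores = list(range(40, 60))
outcomes = [0.2 + 0.01 * (s - 50) + (0.3 if s >= 50 else 0.0) for s in scores]

jump = sharp_rd_jump(scores, outcomes, cutoff=50, bandwidth=10)
print(round(jump, 3))
```

The estimate is local to proposals near the payline, and in practice it is sensitive to the choice of bandwidth and functional form, which is one reason robustness checks are standard in this design.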

Although quasi-experiments are powerful tools, they face their own limitations. First, these approaches identify causal effects within a specific context and often engage small numbers of observations. How representative the samples are for broader populations or contexts is typically left as an open question. Second, the validity of the causal design is typically not ironclad. Researchers usually conduct robustness checks to verify that observable confounders do not differ significantly between the treated and control groups before treatment. However, unobservable features may still differ between treatment and control groups. The quality of instrumental variables, and in particular the claim that they have no effect on the outcome except through the variable of interest, is also difficult to assess. Ultimately, researchers must rely partly on judgement to tell whether appropriate conditions are met for causal inference.

This section emphasized popular econometric approaches to causal inference. Other empirical approaches, such as graphical causal modelling 242 , 243 , also represent an important stream of work on assessing causal relationships. Such approaches usually represent causation as a directed acyclic graph, with nodes as variables and arrows between them as suspected causal relationships. In the science of science, the directed acyclic graph approach has been applied to quantify the causal effect of journal impact factor 244 and gender or racial bias 245 on citations. Graphical causal modelling has also triggered discussions on its strengths and weaknesses compared with econometric methods 246 , 247 .
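
As a rough illustration of the data structure involved, a causal graph can be stored as a mapping from each variable to its direct causes. The three-variable graph below is a hypothetical example, not the model used in the cited studies, and the common-ancestor check is only a simple heuristic for spotting candidate confounders rather than a full identification analysis.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def ancestors(parents, node):
    """All variables with a directed path into `node`."""
    found = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


# Hypothetical causal graph: each key maps to its direct causes (parents).
parents = {
    "paper_quality": set(),
    "journal_impact_factor": {"paper_quality"},
    "citations": {"paper_quality", "journal_impact_factor"},
}

# static_order() raises graphlib.CycleError if the graph is not acyclic.
causal_order = list(TopologicalSorter(parents).static_order())

# Variables that cause both the treatment and the outcome.
common_causes = (ancestors(parents, "journal_impact_factor")
                 & ancestors(parents, "citations"))
print(causal_order, common_causes)
```

Here `paper_quality` is flagged as a common cause of the journal impact factor and citations, so a comparison of citations across journals that ignores it would mix the two pathways.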

Experiments

In contrast to quasi-experimental approaches, laboratory and field experiments conduct direct randomization in assigning treatment and control groups. These methods engage explicitly in the data generation process, manipulating interventions to observe counterfactuals. These experiments are crafted to study mechanisms of specific interest and, by designing the experiment and formally randomizing, can produce especially rigorous causal inference.

Laboratory experiments. Laboratory experiments build counterfactual worlds in well-controlled laboratory environments. Researchers randomly assign participants to the treatment or control group and then manipulate the laboratory conditions to observe different outcomes in the two groups. For example, consider laboratory experiments on team performance and gender composition 144 , 248 . The researchers randomly assign participants into groups to perform tasks such as solving puzzles or brainstorming. Teams with a higher proportion of women are found to perform better on average, offering evidence that gender diversity is causally linked to team performance. Laboratory experiments can allow researchers to test forces that are otherwise hard to observe, such as how competition influences creativity 249 . Laboratory experiments have also been used to evaluate how journal impact factors shape scientists’ perceptions of rewards 250 and gender bias in hiring 251 .

Laboratory experiments allow for precise control of settings and procedures to isolate causal effects of interest. However, participants may behave differently in synthetic environments than in real-world settings, raising questions about the generalizability and replicability of the results 252 , 253 , 254 . To assess causal effects in real-world settings, researchers use randomized controlled trials.

Randomized controlled trials. A randomized controlled trial (RCT), or field experiment, is a staple for causal inference across a wide range of disciplines. RCTs randomly assign participants into the treatment and control conditions 255 and can be used not only to assess mechanisms but also to test real-world interventions such as policy change. The science of science has witnessed growing use of RCTs. For instance, a field experiment 146 investigated whether lower search costs for collaborators increased collaboration in grant applications. The authors randomly allocated principal investigators to face-to-face sessions in a medical school, and then measured participants’ chance of writing a grant proposal together. RCTs have also offered rich causal insights on peer review 256 , 257 , 258 , 259 , 260 and gender bias in science 261 , 262 , 263 .
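
The core mechanics of an RCT, random assignment followed by a comparison of group means, can be sketched as follows. This is a simplified two-arm design with hypothetical investigator identifiers and placeholder outcomes. It is not the design of the field experiment described above, and the function names are assumptions introduced for illustration.

```python
import random
from statistics import mean


def randomize(units, seed):
    """Randomly split units into equal-sized treatment and control groups."""
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, treatment, control):
    """Average outcome among treated units minus the average among controls."""
    return (mean(outcomes[u] for u in treatment)
            - mean(outcomes[u] for u in control))


# Hypothetical investigators; identifiers are placeholders.
investigators = [f"pi_{i}" for i in range(10)]
treatment, control = randomize(investigators, seed=0)

# Placeholder outcomes (1.0 = co-wrote a proposal). In a real trial these are
# measured after the intervention; here they are simulated for illustration.
rng = random.Random(1)
outcomes = {u: float(rng.random() < 0.5) for u in investigators}

estimate = difference_in_means(outcomes, treatment, control)
print(len(treatment), len(control), estimate)
```

Because assignment depends only on the random shuffle, observed and unobserved characteristics are balanced across the two groups in expectation, which is what licenses reading the difference in means as a causal effect.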

While powerful, RCTs are difficult to conduct in the science of science, mainly for two reasons. The first concerns potential risks in a policy intervention. For instance, while randomizing funding across individuals could generate crucial causal insights for funders, it may also inadvertently harm participants’ careers 264 . Second, key questions in the science of science often require a long time horizon to trace outcomes, which makes RCTs costly. It also raises the difficulty of replicating findings. A relative advantage of the quasi-experimental methods discussed earlier is that one can identify causal effects over potentially long periods of time in the historical record. On the other hand, quasi-experiments must be found as opposed to designed, and they are often not available for many questions of interest. While the best approaches are context dependent, a growing community of researchers is building platforms to facilitate RCTs for the science of science, aiming to lower their costs and increase their scale. Performing RCTs in partnership with science institutions can also contribute to timely, policy-relevant research that may substantially improve science decision-making and investments.

Research in the science of science has been empowered by the growth of high-scale data, new measurement approaches and an expanding range of empirical methods. These tools provide enormous capacity to test conceptual frameworks about science, discover factors impacting scientific productivity, predict key scientific outcomes and design policies that better facilitate future scientific progress. A careful appreciation of empirical techniques can help researchers to choose effective tools for questions of interest and propel the field. A better and broader understanding of these methodologies may also build bridges across diverse research communities, facilitating communication and collaboration, and better leveraging the value of diverse perspectives. The science of science is about turning scientific methods on the nature of science itself. The fruits of this work, with time, can guide researchers and research institutions to greater progress in discovery and understanding across the landscape of scientific inquiry.

Bush, V. Science–the Endless Frontier: A Report to the President on a Program for Postwar Scientific Research (National Science Foundation, 1990).

Mokyr, J. The Gifts of Athena (Princeton Univ. Press, 2011).

Jones, B. F. in Rebuilding the Post-Pandemic Economy (eds Kearney, M. S. & Ganz, A.) 272–310 (Aspen Institute Press, 2021).

Wang, D. & Barabási, A.-L. The Science of Science (Cambridge Univ. Press, 2021).

Fortunato, S. et al. Science of science. Science 359 , eaao0185 (2018).

Azoulay, P. et al. Toward a more scientific science. Science 361 , 1194–1197 (2018).

Clauset, A., Larremore, D. B. & Sinatra, R. Data-driven predictions in the science of science. Science 355 , 477–480 (2017).

Zeng, A. et al. The science of science: from the perspective of complex systems. Phys. Rep. 714 , 1–73 (2017).

Lin, Z., Yin, Y., Liu, L. & Wang, D. SciSciNet: a large-scale open data lake for the science of science research. Sci. Data, https://doi.org/10.1038/s41597-023-02198-9 (2023).

Ahmadpoor, M. & Jones, B. F. The dual frontier: patented inventions and prior scientific advance. Science 357 , 583–587 (2017).

Azoulay, P., Graff Zivin, J. S., Li, D. & Sampat, B. N. Public R&D investments and private-sector patenting: evidence from NIH funding rules. Rev. Econ. Stud. 86 , 117–152 (2019).

Yin, Y., Dong, Y., Wang, K., Wang, D. & Jones, B. F. Public use and public funding of science. Nat. Hum. Behav. 6 , 1344–1350 (2022).

Merton, R. K. The Sociology of Science: Theoretical and Empirical Investigations (Univ. Chicago Press, 1973).

Kuhn, T. The Structure of Scientific Revolutions (Princeton Univ. Press, 2021).

Uzzi, B., Mukherjee, S., Stringer, M. & Jones, B. Atypical combinations and scientific impact. Science 342 , 468–472 (2013).

Zuckerman, H. Scientific Elite: Nobel Laureates in the United States (Transaction Publishers, 1977).

Wu, L., Wang, D. & Evans, J. A. Large teams develop and small teams disrupt science and technology. Nature 566 , 378–382 (2019).

Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316 , 1036–1039 (2007).

Foster, J. G., Rzhetsky, A. & Evans, J. A. Tradition and innovation in scientists’ research strategies. Am. Sociol. Rev. 80 , 875–908 (2015).

Wang, D., Song, C. & Barabási, A.-L. Quantifying long-term scientific impact. Science 342 , 127–132 (2013).

Clauset, A., Arbesman, S. & Larremore, D. B. Systematic inequality and hierarchy in faculty hiring networks. Sci. Adv. 1 , e1400005 (2015).

Ma, A., Mondragón, R. J. & Latora, V. Anatomy of funded research in science. Proc. Natl Acad. Sci. USA 112 , 14760–14765 (2015).

Ma, Y. & Uzzi, B. Scientific prize network predicts who pushes the boundaries of science. Proc. Natl Acad. Sci. USA 115 , 12608–12615 (2018).

Azoulay, P., Graff Zivin, J. S. & Manso, G. Incentives and creativity: evidence from the academic life sciences. RAND J. Econ. 42 , 527–554 (2011).

Schor, S. & Karten, I. Statistical evaluation of medical journal manuscripts. JAMA 195 , 1123–1128 (1966).

Platt, J. R. Strong inference: certain systematic methods of scientific thinking may produce much more rapid progress than others. Science 146 , 347–353 (1964).

Ioannidis, J. P. Why most published research findings are false. PLoS Med. 2 , e124 (2005).

Simonton, D. K. Career landmarks in science: individual differences and interdisciplinary contrasts. Dev. Psychol. 27 , 119 (1991).

Way, S. F., Morgan, A. C., Clauset, A. & Larremore, D. B. The misleading narrative of the canonical faculty productivity trajectory. Proc. Natl Acad. Sci. USA 114 , E9216–E9223 (2017).

Sinatra, R., Wang, D., Deville, P., Song, C. & Barabási, A.-L. Quantifying the evolution of individual scientific impact. Science 354 , aaf5239 (2016).

Liu, L. et al. Hot streaks in artistic, cultural, and scientific careers. Nature 559 , 396–399 (2018).

Liu, L., Dehmamy, N., Chown, J., Giles, C. L. & Wang, D. Understanding the onset of hot streaks across artistic, cultural, and scientific careers. Nat. Commun. 12 , 5392 (2021).

Squazzoni, F. et al. Peer review and gender bias: a study on 145 scholarly journals. Sci. Adv. 7 , eabd0299 (2021).

Hofstra, B. et al. The diversity–innovation paradox in science. Proc. Natl Acad. Sci. USA 117 , 9284–9291 (2020).

Huang, J., Gates, A. J., Sinatra, R. & Barabási, A.-L. Historical comparison of gender inequality in scientific careers across countries and disciplines. Proc. Natl Acad. Sci. USA 117 , 4609–4616 (2020).

Gläser, J. & Laudel, G. Governing science: how science policy shapes research content. Eur. J. Sociol. 57 , 117–168 (2016).

Stephan, P. E. How Economics Shapes Science (Harvard Univ. Press, 2012).

Garfield, E. & Sher, I. H. New factors in the evaluation of scientific literature through citation indexing. Am. Doc. 14 , 195–201 (1963).

de Solla Price, D. J. Networks of scientific papers. Science 149 , 510–515 (1965).

Etzkowitz, H., Kemelgor, C. & Uzzi, B. Athena Unbound: The Advancement of Women in Science and Technology (Cambridge Univ. Press, 2000).

Simonton, D. K. Scientific Genius: A Psychology of Science (Cambridge Univ. Press, 1988).

Khabsa, M. & Giles, C. L. The number of scholarly documents on the public web. PLoS ONE 9 , e93949 (2014).

Xia, F., Wang, W., Bekele, T. M. & Liu, H. Big scholarly data: a survey. IEEE Trans. Big Data 3 , 18–35 (2017).

Evans, J. A. & Foster, J. G. Metaknowledge. Science 331 , 721–725 (2011).

Milojević, S. Quantifying the cognitive extent of science. J. Informetr. 9 , 962–973 (2015).

Rzhetsky, A., Foster, J. G., Foster, I. T. & Evans, J. A. Choosing experiments to accelerate collective discovery. Proc. Natl Acad. Sci. USA 112 , 14569–14574 (2015).

Poncela-Casasnovas, J., Gerlach, M., Aguirre, N. & Amaral, L. A. Large-scale analysis of micro-level citation patterns reveals nuanced selection criteria. Nat. Hum. Behav. 3 , 568–575 (2019).

Hardwicke, T. E. et al. Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition. R. Soc. Open Sci. 5 , 180448 (2018).

Nagaraj, A., Shears, E. & de Vaan, M. Improving data access democratizes and diversifies science. Proc. Natl Acad. Sci. USA 117 , 23490–23498 (2020).

Bravo, G., Grimaldo, F., López-Iñesta, E., Mehmani, B. & Squazzoni, F. The effect of publishing peer review reports on referee behavior in five scholarly journals. Nat. Commun. 10 , 322 (2019).

Tran, D. et al. An open review of open review: a critical analysis of the machine learning conference review process. Preprint at https://doi.org/10.48550/arXiv.2010.05137 (2020).

Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571 , 95–98 (2019).

Yang, Y., Wu, Y. & Uzzi, B. Estimating the deep replicability of scientific findings using human and artificial intelligence. Proc. Natl Acad. Sci. USA 117 , 10762–10768 (2020).

Mukherjee, S., Uzzi, B., Jones, B. & Stringer, M. A new method for identifying recombinations of existing knowledge associated with high‐impact innovation. J. Prod. Innov. Manage. 33 , 224–236 (2016).

Leahey, E., Beckman, C. M. & Stanko, T. L. Prominent but less productive: the impact of interdisciplinarity on scientists’ research. Adm. Sci. Q. 62 , 105–139 (2017).

Sauermann, H. & Haeussler, C. Authorship and contribution disclosures. Sci. Adv. 3 , e1700404 (2017).

Oliveira, D. F. M., Ma, Y., Woodruff, T. K. & Uzzi, B. Comparison of National Institutes of Health grant amounts to first-time male and female principal investigators. JAMA 321 , 898–900 (2019).

Yang, Y., Chawla, N. V. & Uzzi, B. A network’s gender composition and communication pattern predict women’s leadership success. Proc. Natl Acad. Sci. USA 116 , 2033–2038 (2019).

Way, S. F., Larremore, D. B. & Clauset, A. Gender, productivity, and prestige in computer science faculty hiring networks. In Proc. 25th International Conference on World Wide Web 1169–1179 (ACM, 2016).

Malmgren, R. D., Ottino, J. M. & Amaral, L. A. N. The role of mentorship in protégé performance. Nature 465 , 622–626 (2010).

Ma, Y., Mukherjee, S. & Uzzi, B. Mentorship and protégé success in STEM fields. Proc. Natl Acad. Sci. USA 117 , 14077–14083 (2020).

Börner, K. et al. Skill discrepancies between research, education, and jobs reveal the critical need to supply soft skills for the data economy. Proc. Natl Acad. Sci. USA 115 , 12630–12637 (2018).

Biasi, B. & Ma, S. The Education-Innovation Gap (National Bureau of Economic Research Working papers, 2020).

Bornmann, L. Do altmetrics point to the broader impact of research? An overview of benefits and disadvantages of altmetrics. J. Informetr. 8 , 895–903 (2014).

Cleary, E. G., Beierlein, J. M., Khanuja, N. S., McNamee, L. M. & Ledley, F. D. Contribution of NIH funding to new drug approvals 2010–2016. Proc. Natl Acad. Sci. USA 115 , 2329–2334 (2018).

Spector, J. M., Harrison, R. S. & Fishman, M. C. Fundamental science behind today’s important medicines. Sci. Transl. Med. 10 , eaaq1787 (2018).

Haunschild, R. & Bornmann, L. How many scientific papers are mentioned in policy-related documents? An empirical investigation using Web of Science and Altmetric data. Scientometrics 110 , 1209–1216 (2017).

Yin, Y., Gao, J., Jones, B. F. & Wang, D. Coevolution of policy and science during the pandemic. Science 371 , 128–130 (2021).

Sugimoto, C. R., Work, S., Larivière, V. & Haustein, S. Scholarly use of social media and altmetrics: a review of the literature. J. Assoc. Inf. Sci. Technol. 68 , 2037–2062 (2017).

Dunham, I. Human genes: time to follow the roads less traveled? PLoS Biol. 16 , e3000034 (2018).

Kustatscher, G. et al. Understudied proteins: opportunities and challenges for functional proteomics. Nat. Methods 19 , 774–779 (2022).

Rosenthal, R. The file drawer problem and tolerance for null results. Psychol. Bull. 86 , 638 (1979).

Franco, A., Malhotra, N. & Simonovits, G. Publication bias in the social sciences: unlocking the file drawer. Science 345 , 1502–1505 (2014).

Vera-Baceta, M.-A., Thelwall, M. & Kousha, K. Web of Science and Scopus language coverage. Scientometrics 121 , 1803–1813 (2019).

Waltman, L. A review of the literature on citation impact indicators. J. Informetr. 10 , 365–391 (2016).

Garfield, E. & Merton, R. K. Citation Indexing: Its Theory and Application in Science, Technology, and Humanities (Wiley, 1979).

Kelly, B., Papanikolaou, D., Seru, A. & Taddy, M. Measuring Technological Innovation Over the Long Run Report No. 0898-2937 (National Bureau of Economic Research, 2018).

Kogan, L., Papanikolaou, D., Seru, A. & Stoffman, N. Technological innovation, resource allocation, and growth. Q. J. Econ. 132 , 665–712 (2017).

Hall, B. H., Jaffe, A. & Trajtenberg, M. Market value and patent citations. RAND J. Econ. 36 , 16–38 (2005).

Yan, E. & Ding, Y. Applying centrality measures to impact analysis: a coauthorship network analysis. J. Am. Soc. Inf. Sci. Technol. 60 , 2107–2118 (2009).

Radicchi, F., Fortunato, S., Markines, B. & Vespignani, A. Diffusion of scientific credits and the ranking of scientists. Phys. Rev. E 80 , 056103 (2009).

Bollen, J., Rodriquez, M. A. & Van de Sompel, H. Journal status. Scientometrics 69 , 669–687 (2006).

Bergstrom, C. T., West, J. D. & Wiseman, M. A. The eigenfactor™ metrics. J. Neurosci. 28 , 11433–11434 (2008).

Cronin, B. & Sugimoto, C. R. Beyond Bibliometrics: Harnessing Multidimensional Indicators of Scholarly Impact (MIT Press, 2014).

Hicks, D., Wouters, P., Waltman, L., De Rijcke, S. & Rafols, I. Bibliometrics: the Leiden Manifesto for research metrics. Nature 520 , 429–431 (2015).

Catalini, C., Lacetera, N. & Oettl, A. The incidence and role of negative citations in science. Proc. Natl Acad. Sci. USA 112 , 13823–13826 (2015).

Alcacer, J. & Gittelman, M. Patent citations as a measure of knowledge flows: the influence of examiner citations. Rev. Econ. Stat. 88 , 774–779 (2006).

Ding, Y. et al. Content‐based citation analysis: the next generation of citation analysis. J. Assoc. Inf. Sci. Technol. 65 , 1820–1833 (2014).

Teufel, S., Siddharthan, A. & Tidhar, D. Automatic classification of citation function. In Proc. 2006 Conference on Empirical Methods in Natural Language Processing 103–110 (Association for Computational Linguistics, 2006).

Seeber, M., Cattaneo, M., Meoli, M. & Malighetti, P. Self-citations as strategic response to the use of metrics for career decisions. Res. Policy 48 , 478–491 (2019).

Pendlebury, D. A. The use and misuse of journal metrics and other citation indicators. Arch. Immunol. Ther. Exp. 57 , 1–11 (2009).

Biagioli, M. Watch out for cheats in citation game. Nature 535 , 201 (2016).

Jo, W. S., Liu, L. & Wang, D. See further upon the giants: quantifying intellectual lineage in science. Quant. Sci. Stud. 3 , 319–330 (2022).

Boyack, K. W., Klavans, R. & Börner, K. Mapping the backbone of science. Scientometrics 64 , 351–374 (2005).

Gates, A. J., Ke, Q., Varol, O. & Barabási, A.-L. Nature’s reach: narrow work has broad impact. Nature 575 , 32–34 (2019).

Börner, K., Penumarthy, S., Meiss, M. & Ke, W. Mapping the diffusion of scholarly knowledge among major US research institutions. Scientometrics 68 , 415–426 (2006).

King, D. A. The scientific impact of nations. Nature 430 , 311–316 (2004).

Pan, R. K., Kaski, K. & Fortunato, S. World citation and collaboration networks: uncovering the role of geography in science. Sci. Rep. 2 , 902 (2012).

Jaffe, A. B., Trajtenberg, M. & Henderson, R. Geographic localization of knowledge spillovers as evidenced by patent citations. Q. J. Econ. 108 , 577–598 (1993).

Funk, R. J. & Owen-Smith, J. A dynamic network measure of technological change. Manage. Sci. 63 , 791–817 (2017).

Yegros-Yegros, A., Rafols, I. & D’este, P. Does interdisciplinary research lead to higher citation impact? The different effect of proximal and distal interdisciplinarity. PLoS ONE 10 , e0135095 (2015).

Larivière, V., Haustein, S. & Börner, K. Long-distance interdisciplinarity leads to higher scientific impact. PLoS ONE 10 , e0122565 (2015).

Fleming, L., Greene, H., Li, G., Marx, M. & Yao, D. Government-funded research increasingly fuels innovation. Science 364 , 1139–1141 (2019).

Bowen, A. & Casadevall, A. Increasing disparities between resource inputs and outcomes, as measured by certain health deliverables, in biomedical research. Proc. Natl Acad. Sci. USA 112 , 11335–11340 (2015).

Li, D., Azoulay, P. & Sampat, B. N. The applied value of public investments in biomedical research. Science 356 , 78–81 (2017).

Lehman, H. C. Age and Achievement (Princeton Univ. Press, 2017).

Simonton, D. K. Creative productivity: a predictive and explanatory model of career trajectories and landmarks. Psychol. Rev. 104 , 66 (1997).

Duch, J. et al. The possible role of resource requirements and academic career-choice risk on gender differences in publication rate and impact. PLoS ONE 7 , e51332 (2012).

Wang, Y., Jones, B. F. & Wang, D. Early-career setback and future career impact. Nat. Commun. 10 , 4331 (2019).

Bol, T., de Vaan, M. & van de Rijt, A. The Matthew effect in science funding. Proc. Natl Acad. Sci. USA 115 , 4887–4890 (2018).

Jones, B. F. Age and great invention. Rev. Econ. Stat. 92 , 1–14 (2010).

Newman, M. Networks (Oxford Univ. Press, 2018).

Mazloumian, A., Eom, Y.-H., Helbing, D., Lozano, S. & Fortunato, S. How citation boosts promote scientific paradigm shifts and Nobel prizes. PLoS ONE 6 , e18975 (2011).

Hirsch, J. E. An index to quantify an individual’s scientific research output. Proc. Natl Acad. Sci. USA 102 , 16569–16572 (2005).

Alonso, S., Cabrerizo, F. J., Herrera-Viedma, E. & Herrera, F. h-index: a review focused in its variants, computation and standardization for different scientific fields. J. Informetr. 3 , 273–289 (2009).

Egghe, L. An improvement of the h-index: the g-index. ISSI Newsl. 2 , 8–9 (2006).

Kaur, J., Radicchi, F. & Menczer, F. Universality of scholarly impact metrics. J. Informetr. 7 , 924–932 (2013).

Majeti, D. et al. Scholar plot: design and evaluation of an information interface for faculty research performance. Front. Res. Metr. Anal. 4 , 6 (2020).

Sidiropoulos, A., Katsaros, D. & Manolopoulos, Y. Generalized Hirsch h-index for disclosing latent facts in citation networks. Scientometrics 72 , 253–280 (2007).

Jones, B. F. & Weinberg, B. A. Age dynamics in scientific creativity. Proc. Natl Acad. Sci. USA 108 , 18910–18914 (2011).

Dennis, W. Age and productivity among scientists. Science 123 , 724–725 (1956).

Sanyal, D. K., Bhowmick, P. K. & Das, P. P. A review of author name disambiguation techniques for the PubMed bibliographic database. J. Inf. Sci. 47 , 227–254 (2021).

Haak, L. L., Fenner, M., Paglione, L., Pentz, E. & Ratner, H. ORCID: a system to uniquely identify researchers. Learn. Publ. 25 , 259–264 (2012).

Malmgren, R. D., Ottino, J. M. & Amaral, L. A. N. The role of mentorship in protégé performance. Nature 465 , 622–626 (2010).

Oettl, A. Reconceptualizing stars: scientist helpfulness and peer performance. Manage. Sci. 58 , 1122–1140 (2012).

Morgan, A. C. et al. The unequal impact of parenthood in academia. Sci. Adv. 7 , eabd1996 (2021).

Morgan, A. C. et al. Socioeconomic roots of academic faculty. Nat. Hum. Behav. 6 , 1625–1633 (2022).

San Francisco Declaration on Research Assessment (DORA) (American Society for Cell Biology, 2012).

Falk‐Krzesinski, H. J. et al. Advancing the science of team science. Clin. Transl. Sci. 3 , 263–266 (2010).

Cooke, N. J. et al. Enhancing the Effectiveness of Team Science (National Academies Press, 2015).

Börner, K. et al. A multi-level systems perspective for the science of team science. Sci. Transl. Med. 2 , 49cm24 (2010).

Leahey, E. From sole investigator to team scientist: trends in the practice and study of research collaboration. Annu. Rev. Sociol. 42 , 81–100 (2016).

AlShebli, B. K., Rahwan, T. & Woon, W. L. The preeminence of ethnic diversity in scientific collaboration. Nat. Commun. 9 , 5163 (2018).

Hsiehchen, D., Espinoza, M. & Hsieh, A. Multinational teams and diseconomies of scale in collaborative research. Sci. Adv. 1 , e1500211 (2015).

Koning, R., Samila, S. & Ferguson, J.-P. Who do we invent for? Patents by women focus more on women’s health, but few women get to invent. Science 372 , 1345–1348 (2021).

Barabási, A.-L. et al. Evolution of the social network of scientific collaborations. Physica A 311 , 590–614 (2002).

Newman, M. E. Scientific collaboration networks. I. Network construction and fundamental results. Phys. Rev. E 64 , 016131 (2001).

Newman, M. E. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64 , 016132 (2001).

Palla, G., Barabási, A.-L. & Vicsek, T. Quantifying social group evolution. Nature 446 , 664–667 (2007).

Ross, M. B. et al. Women are credited less in science than men. Nature 608 , 135–145 (2022).

Shen, H.-W. & Barabási, A.-L. Collective credit allocation in science. Proc. Natl Acad. Sci. USA 111 , 12325–12330 (2014).

Merton, R. K. Matthew effect in science. Science 159 , 56–63 (1968).

Ni, C., Smith, E., Yuan, H., Larivière, V. & Sugimoto, C. R. The gendered nature of authorship. Sci. Adv. 7 , eabe4639 (2021).

Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N. & Malone, T. W. Evidence for a collective intelligence factor in the performance of human groups. Science 330 , 686–688 (2010).

Feldon, D. F. et al. Postdocs’ lab engagement predicts trajectories of PhD students’ skill development. Proc. Natl Acad. Sci. USA 116 , 20910–20916 (2019).

Boudreau, K. J. et al. A field experiment on search costs and the formation of scientific collaborations. Rev. Econ. Stat. 99 , 565–576 (2017).

Holcombe, A. O. Contributorship, not authorship: use CRediT to indicate who did what. Publications 7 , 48 (2019).

Murray, D. et al. Unsupervised embedding of trajectories captures the latent structure of mobility. Preprint at https://doi.org/10.48550/arXiv.2012.02785 (2020).

Deville, P. et al. Career on the move: geography, stratification, and scientific impact. Sci. Rep. 4 , 4770 (2014).

Edmunds, L. D. et al. Why do women choose or reject careers in academic medicine? A narrative review of empirical evidence. Lancet 388 , 2948–2958 (2016).

Waldinger, F. Peer effects in science: evidence from the dismissal of scientists in Nazi Germany. Rev. Econ. Stud. 79 , 838–861 (2012).

Agrawal, A., McHale, J. & Oettl, A. How stars matter: recruiting and peer effects in evolutionary biology. Res. Policy 46 , 853–867 (2017).

Fiore, S. M. Interdisciplinarity as teamwork: how the science of teams can inform team science. Small Group Res. 39 , 251–277 (2008).

Hvide, H. K. & Jones, B. F. University innovation and the professor’s privilege. Am. Econ. Rev. 108 , 1860–1898 (2018).

Murray, F., Aghion, P., Dewatripont, M., Kolev, J. & Stern, S. Of mice and academics: examining the effect of openness on innovation. Am. Econ. J. Econ. Policy 8 , 212–252 (2016).

Radicchi, F., Fortunato, S. & Castellano, C. Universality of citation distributions: toward an objective measure of scientific impact. Proc. Natl Acad. Sci. USA 105 , 17268–17272 (2008).

Waltman, L., van Eck, N. J. & van Raan, A. F. Universality of citation distributions revisited. J. Am. Soc. Inf. Sci. Technol. 63 , 72–77 (2012).

Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286 , 509–512 (1999).

de Solla Price, D. A general theory of bibliometric and other cumulative advantage processes. J. Am. Soc. Inf. Sci. 27 , 292–306 (1976).

Cole, S. Age and scientific performance. Am. J. Sociol. 84 , 958–977 (1979).

Ke, Q., Ferrara, E., Radicchi, F. & Flammini, A. Defining and identifying sleeping beauties in science. Proc. Natl Acad. Sci. USA 112 , 7426–7431 (2015).

Bornmann, L., de Moya Anegón, F. & Leydesdorff, L. Do scientific advancements lean on the shoulders of giants? A bibliometric investigation of the Ortega hypothesis. PLoS ONE 5 , e13327 (2010).

Mukherjee, S., Romero, D. M., Jones, B. & Uzzi, B. The nearly universal link between the age of past knowledge and tomorrow’s breakthroughs in science and technology: the hotspot. Sci. Adv. 3 , e1601315 (2017).

Packalen, M. & Bhattacharya, J. NIH funding and the pursuit of edge science. Proc. Natl Acad. Sci. USA 117 , 12011–12016 (2020).

Zeng, A., Fan, Y., Di, Z., Wang, Y. & Havlin, S. Fresh teams are associated with original and multidisciplinary research. Nat. Hum. Behav. 5 , 1314–1322 (2021).

Newman, M. E. The structure of scientific collaboration networks. Proc. Natl Acad. Sci. USA 98 , 404–409 (2001).

Larivière, V., Ni, C., Gingras, Y., Cronin, B. & Sugimoto, C. R. Bibliometrics: global gender disparities in science. Nature 504 , 211–213 (2013).

West, J. D., Jacquet, J., King, M. M., Correll, S. J. & Bergstrom, C. T. The role of gender in scholarly authorship. PLoS ONE 8 , e66212 (2013).

Gao, J., Yin, Y., Myers, K. R., Lakhani, K. R. & Wang, D. Potentially long-lasting effects of the pandemic on scientists. Nat. Commun. 12 , 6188 (2021).

Jones, B. F., Wuchty, S. & Uzzi, B. Multi-university research teams: shifting impact, geography, and stratification in science. Science 322 , 1259–1262 (2008).

Chu, J. S. & Evans, J. A. Slowed canonical progress in large fields of science. Proc. Natl Acad. Sci. USA 118 , e2021636118 (2021).

Wang, J., Veugelers, R. & Stephan, P. Bias against novelty in science: a cautionary tale for users of bibliometric indicators. Res. Policy 46 , 1416–1436 (2017).

Stringer, M. J., Sales-Pardo, M. & Amaral, L. A. Statistical validation of a global model for the distribution of the ultimate number of citations accrued by papers published in a scientific journal. J. Assoc. Inf. Sci. Technol. 61 , 1377–1385 (2010).

Bianconi, G. & Barabási, A.-L. Bose-Einstein condensation in complex networks. Phys. Rev. Lett. 86 , 5632 (2001).

Bianconi, G. & Barabási, A.-L. Competition and multiscaling in evolving networks. Europhys. Lett. 54 , 436 (2001).

Yin, Y. & Wang, D. The time dimension of science: connecting the past to the future. J. Informetr. 11 , 608–621 (2017).

Pan, R. K., Petersen, A. M., Pammolli, F. & Fortunato, S. The memory of science: Inflation, myopia, and the knowledge network. J. Informetr. 12 , 656–678 (2018).

Yin, Y., Wang, Y., Evans, J. A. & Wang, D. Quantifying the dynamics of failure across science, startups and security. Nature 575 , 190–194 (2019).

Candia, C. & Uzzi, B. Quantifying the selective forgetting and integration of ideas in science and technology. Am. Psychol. 76 , 1067 (2021).

Milojević, S. Principles of scientific research team formation and evolution. Proc. Natl Acad. Sci. USA 111 , 3984–3989 (2014).

Guimera, R., Uzzi, B., Spiro, J. & Amaral, L. A. N. Team assembly mechanisms determine collaboration network structure and team performance. Science 308 , 697–702 (2005).

Newman, M. E. Coauthorship networks and patterns of scientific collaboration. Proc. Natl Acad. Sci. USA 101 , 5200–5205 (2004).

Newman, M. E. Clustering and preferential attachment in growing networks. Phys. Rev. E 64 , 025102 (2001).

Iacopini, I., Milojević, S. & Latora, V. Network dynamics of innovation processes. Phys. Rev. Lett. 120 , 048301 (2018).

Kuhn, T., Perc, M. & Helbing, D. Inheritance patterns in citation networks reveal scientific memes. Phys. Rev. X 4, 041036 (2014).

Jia, T., Wang, D. & Szymanski, B. K. Quantifying patterns of research-interest evolution. Nat. Hum. Behav. 1 , 0078 (2017).

Zeng, A. et al. Increasing trend of scientists to switch between topics. Nat. Commun. https://doi.org/10.1038/s41467-019-11401-8 (2019).

Siudem, G., Żogała-Siudem, B., Cena, A. & Gagolewski, M. Three dimensions of scientific impact. Proc. Natl Acad. Sci. USA 117 , 13896–13900 (2020).

Petersen, A. M. et al. Reputation and impact in academic careers. Proc. Natl Acad. Sci. USA 111 , 15316–15321 (2014).

Jin, C., Song, C., Bjelland, J., Canright, G. & Wang, D. Emergence of scaling in complex substitutive systems. Nat. Hum. Behav. 3 , 837–846 (2019).

Hofman, J. M. et al. Integrating explanation and prediction in computational social science. Nature 595 , 181–188 (2021).

Lazer, D. et al. Computational social science. Science 323 , 721–723 (2009).

Lazer, D. M. et al. Computational social science: obstacles and opportunities. Science 369 , 1060–1062 (2020).

Albert, R. & Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 74 , 47 (2002).

Newman, M. E. The structure and function of complex networks. SIAM Rev. 45 , 167–256 (2003).

Song, C., Qu, Z., Blumm, N. & Barabási, A.-L. Limits of predictability in human mobility. Science 327 , 1018–1021 (2010).

Alessandretti, L., Aslak, U. & Lehmann, S. The scales of human mobility. Nature 587 , 402–407 (2020).

Pastor-Satorras, R. & Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 86 , 3200 (2001).

Pastor-Satorras, R., Castellano, C., Van Mieghem, P. & Vespignani, A. Epidemic processes in complex networks. Rev. Mod. Phys. 87 , 925 (2015).

Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).

Bishop, C. M. Pattern Recognition and Machine Learning (Springer, 2006).

Dong, Y., Johnson, R. A. & Chawla, N. V. Will this paper increase your h-index? Scientific impact prediction. In Proc. 8th ACM International Conference on Web Search and Data Mining, 149–158 (ACM, 2015).

Xiao, S. et al. On modeling and predicting individual paper citation count over time. In IJCAI, 2676–2682 (IJCAI, 2016).

Fortunato, S. Community detection in graphs. Phys. Rep. 486 , 75–174 (2010).

Chen, C. Science mapping: a systematic review of the literature. J. Data Inf. Sci. 2 , 1–40 (2017).

Van Eck, N. J. & Waltman, L. Citation-based clustering of publications using CitNetExplorer and VOSviewer. Scientometrics 111 , 1053–1070 (2017).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521 , 436–444 (2015).

Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577 , 706–710 (2020).

Krenn, M. & Zeilinger, A. Predicting research trends with semantic and neural networks with an application in quantum physics. Proc. Natl Acad. Sci. USA 117 , 1910–1916 (2020).

Iten, R., Metger, T., Wilming, H., Del Rio, L. & Renner, R. Discovering physical concepts with neural networks. Phys. Rev. Lett. 124 , 010508 (2020).

Guimerà, R. et al. A Bayesian machine scientist to aid in the solution of challenging scientific problems. Sci. Adv. 6 , eaav6971 (2020).

Segler, M. H., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555 , 604–610 (2018).

Ryu, J. Y., Kim, H. U. & Lee, S. Y. Deep learning improves prediction of drug–drug and drug–food interactions. Proc. Natl Acad. Sci. USA 115 , E4304–E4311 (2018).

Kermany, D. S. et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172 , 1122–1131.e9 (2018).

Peng, H., Ke, Q., Budak, C., Romero, D. M. & Ahn, Y.-Y. Neural embeddings of scholarly periodicals reveal complex disciplinary organizations. Sci. Adv. 7 , eabb9004 (2021).

Youyou, W., Yang, Y. & Uzzi, B. A discipline-wide investigation of the replicability of psychology papers over the past two decades. Proc. Natl Acad. Sci. USA 120 , e2208863120 (2023).

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. & Galstyan, A. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 54 , 1–35 (2021).

Way, S. F., Morgan, A. C., Larremore, D. B. & Clauset, A. Productivity, prominence, and the effects of academic environment. Proc. Natl Acad. Sci. USA 116 , 10729–10733 (2019).

Li, W., Aste, T., Caccioli, F. & Livan, G. Early coauthorship with top scientists predicts success in academic careers. Nat. Commun. 10 , 5170 (2019).

Hendry, D. F., Pagan, A. R. & Sargan, J. D. Dynamic specification. Handb. Econ. 2 , 1023–1100 (1984).

Jin, C., Ma, Y. & Uzzi, B. Scientific prizes and the extraordinary growth of scientific topics. Nat. Commun. 12 , 5619 (2021).

Azoulay, P., Ganguli, I. & Zivin, J. G. The mobility of elite life scientists: professional and personal determinants. Res. Policy 46 , 573–590 (2017).

Slavova, K., Fosfuri, A. & De Castro, J. O. Learning by hiring: the effects of scientists’ inbound mobility on research performance in academia. Organ. Sci. 27 , 72–89 (2016).

Sarsons, H. Recognition for group work: gender differences in academia. Am. Econ. Rev. 107 , 141–145 (2017).

Campbell, L. G., Mehtani, S., Dozier, M. E. & Rinehart, J. Gender-heterogeneous working groups produce higher quality science. PLoS ONE 8 , e79147 (2013).

Azoulay, P., Graff Zivin, J. S. & Wang, J. Superstar extinction. Q. J. Econ. 125 , 549–589 (2010).

Furman, J. L. & Stern, S. Climbing atop the shoulders of giants: the impact of institutions on cumulative research. Am. Econ. Rev. 101 , 1933–1963 (2011).

Williams, H. L. Intellectual property rights and innovation: evidence from the human genome. J. Polit. Econ. 121 , 1–27 (2013).

Rubin, A. & Rubin, E. Systematic Bias in the Progress of Research. J. Polit. Econ. 129 , 2666–2719 (2021).

Lu, S. F., Jin, G. Z., Uzzi, B. & Jones, B. The retraction penalty: evidence from the Web of Science. Sci. Rep. 3 , 3146 (2013).

Jin, G. Z., Jones, B., Lu, S. F. & Uzzi, B. The reverse Matthew effect: consequences of retraction in scientific teams. Rev. Econ. Stat. 101 , 492–506 (2019).

Azoulay, P., Bonatti, A. & Krieger, J. L. The career effects of scandal: evidence from scientific retractions. Res. Policy 46 , 1552–1569 (2017).

Goodman-Bacon, A. Difference-in-differences with variation in treatment timing. J. Econ. 225 , 254–277 (2021).

Callaway, B. & Sant’Anna, P. H. Difference-in-differences with multiple time periods. J. Econ. 225 , 200–230 (2021).

Hill, R. Searching for Superstars: Research Risk and Talent Discovery in Astronomy Working Paper (Massachusetts Institute of Technology, 2019).

Bagues, M., Sylos-Labini, M. & Zinovyeva, N. Does the gender composition of scientific committees matter? Am. Econ. Rev. 107 , 1207–1238 (2017).

Sampat, B. & Williams, H. L. How do patents affect follow-on innovation? Evidence from the human genome. Am. Econ. Rev. 109 , 203–236 (2019).

Moretti, E. & Wilson, D. J. The effect of state taxes on the geographical location of top earners: evidence from star scientists. Am. Econ. Rev. 107 , 1858–1903 (2017).

Jacob, B. A. & Lefgren, L. The impact of research grant funding on scientific productivity. J. Public Econ. 95 , 1168–1177 (2011).

Li, D. Expertise versus bias in evaluation: evidence from the NIH. Am. Econ. J. Appl. Econ. 9 , 60–92 (2017).

Pearl, J. Causal diagrams for empirical research. Biometrika 82 , 669–688 (1995).

Pearl, J. & Mackenzie, D. The Book of Why: The New Science of Cause and Effect (Basic Books, 2018).

Traag, V. A. Inferring the causal effect of journals on citations. Quant. Sci. Stud. 2 , 496–504 (2021).

Traag, V. & Waltman, L. Causal foundations of bias, disparity and fairness. Preprint at https://doi.org/10.48550/arXiv.2207.13665 (2022).

Imbens, G. W. Potential outcome and directed acyclic graph approaches to causality: relevance for empirical practice in economics. J. Econ. Lit. 58 , 1129–1179 (2020).

Heckman, J. J. & Pinto, R. Causality and Econometrics (National Bureau of Economic Research, 2022).

Aggarwal, I., Woolley, A. W., Chabris, C. F. & Malone, T. W. The impact of cognitive style diversity on implicit learning in teams. Front. Psychol. 10 , 112 (2019).

Balietti, S., Goldstone, R. L. & Helbing, D. Peer review and competition in the Art Exhibition Game. Proc. Natl Acad. Sci. USA 113 , 8414–8419 (2016).

Paulus, F. M., Rademacher, L., Schäfer, T. A. J., Müller-Pinzler, L. & Krach, S. Journal impact factor shapes scientists’ reward signal in the prospect of publication. PLoS ONE 10 , e0142537 (2015).

Williams, W. M. & Ceci, S. J. National hiring experiments reveal 2:1 faculty preference for women on STEM tenure track. Proc. Natl Acad. Sci. USA 112 , 5360–5365 (2015).

Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).

Camerer, C. F. et al. Evaluating replicability of laboratory experiments in economics. Science 351 , 1433–1436 (2016).

Camerer, C. F. et al. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat. Hum. Behav. 2 , 637–644 (2018).

Duflo, E. & Banerjee, A. Handbook of Field Experiments (Elsevier, 2017).

Tomkins, A., Zhang, M. & Heavlin, W. D. Reviewer bias in single versus double-blind peer review. Proc. Natl Acad. Sci. USA 114 , 12708–12713 (2017).

Blank, R. M. The effects of double-blind versus single-blind reviewing: experimental evidence from the American Economic Review. Am. Econ. Rev. 81 , 1041–1067 (1991).

Boudreau, K. J., Guinan, E. C., Lakhani, K. R. & Riedl, C. Looking across and looking beyond the knowledge frontier: intellectual distance, novelty, and resource allocation in science. Manage. Sci. 62 , 2765–2783 (2016).

Lane, J. et al. When Do Experts Listen to Other Experts? The Role of Negative Information in Expert Evaluations for Novel Projects Working Paper #21-007 (Harvard Business School, 2020).

Teplitskiy, M. et al. Do Experts Listen to Other Experts? Field Experimental Evidence from Scientific Peer Review (Harvard Business School, 2019).

Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J. & Handelsman, J. Science faculty’s subtle gender biases favor male students. Proc. Natl Acad. Sci. USA 109 , 16474–16479 (2012).

Forscher, P. S., Cox, W. T., Brauer, M. & Devine, P. G. Little race or gender bias in an experiment of initial review of NIH R01 grant proposals. Nat. Hum. Behav. 3 , 257–264 (2019).

Dennehy, T. C. & Dasgupta, N. Female peer mentors early in college increase women’s positive academic experiences and retention in engineering. Proc. Natl Acad. Sci. USA 114 , 5964–5969 (2017).

Azoulay, P. Turn the scientific method on ourselves. Nature 484 , 31–32 (2012).

Acknowledgements

The authors thank all members of the Center for Science of Science and Innovation (CSSI) for invaluable comments. This work was supported by the Air Force Office of Scientific Research under award number FA9550-19-1-0354, National Science Foundation grant SBE 1829344, and the Alfred P. Sloan Foundation G-2019-12485.

Author information

Authors and affiliations.

Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA

Lu Liu, Benjamin F. Jones, Brian Uzzi & Dashun Wang

Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA

Kellogg School of Management, Northwestern University, Evanston, IL, USA

College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, USA

National Bureau of Economic Research, Cambridge, MA, USA

Benjamin F. Jones

Brookings Institution, Washington, DC, USA

McCormick School of Engineering, Northwestern University, Evanston, IL, USA

Dashun Wang

Corresponding author

Correspondence to Dashun Wang.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Human Behaviour thanks Ludo Waltman, Erin Leahey and Sarah Bratt for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article.

Liu, L., Jones, B.F., Uzzi, B. et al. Data, measurement and empirical methods in the science of science. Nat Hum Behav 7, 1046–1058 (2023). https://doi.org/10.1038/s41562-023-01562-4

Received: 30 June 2022

Accepted: 17 February 2023

Published: 01 June 2023

Issue Date: July 2023

DOI: https://doi.org/10.1038/s41562-023-01562-4

How the Experimental Method Works in Psychology

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amanda Tust is a fact-checker, researcher, and writer with a Master of Science in Journalism from Northwestern University's Medill School of Journalism.

  • The Experimental Process
  • Types of Experiments
  • Potential Pitfalls of the Experimental Method

The experimental method is a type of research procedure that involves manipulating variables to determine if there is a cause-and-effect relationship. The results obtained through the experimental method are useful but do not prove with 100% certainty that a singular cause always creates a specific effect. Instead, they show the probability that a cause will or will not lead to a particular effect.

At a Glance

While there are many different research techniques available, the experimental method allows researchers to look at cause-and-effect relationships. Using the experimental method, researchers randomly assign participants to a control or experimental group and manipulate levels of an independent variable. If changes in the independent variable lead to changes in the dependent variable, it indicates there is likely a causal relationship between them.

What Is the Experimental Method in Psychology?

The experimental method involves manipulating one variable to determine if this causes changes in another variable. This method relies on controlled research methods and random assignment of study subjects to test a hypothesis.
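Random assignment can be illustrated with a few lines of Python. The following is a minimal sketch, not a prescribed procedure: the participant IDs, the `randomly_assign` helper, and the even two-way split are all assumptions made for illustration.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into control and experimental groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Hypothetical participant IDs, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=42)
print(len(groups["control"]), len(groups["experimental"]))
```

Because chance alone decides who lands in which group, pre-existing differences between participants tend to even out across the groups as the sample grows.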

For example, researchers may want to learn how different visual patterns may impact our perception. Or they might wonder whether certain actions can improve memory. Experiments are conducted on many behavioral topics.

The scientific method forms the basis of the experimental method. This is a process used to determine the relationship between two variables, in this case to explain human behavior.

Positivism is also important in the experimental method. It is the view that factual knowledge obtained through observation is the kind of knowledge that can be trusted.

When using the experimental method, researchers first identify and define key variables. Then they formulate a hypothesis, manipulate the variables, and collect data on the results. Unrelated or irrelevant variables are carefully controlled to minimize the potential impact on the experiment outcome.

History of the Experimental Method

The idea of using experiments to better understand human psychology began toward the end of the nineteenth century. Wilhelm Wundt established the first formal laboratory in 1879.

Wundt is often called the father of experimental psychology. He believed that experiments could help explain how psychology works, and used this approach to study consciousness.

Wundt coined the term "physiological psychology." This is a hybrid of physiology and psychology, or how the body affects the brain.

Other early contributors to the development and evolution of experimental psychology as we know it today include:

  • Gustav Fechner (1801-1887), who helped develop procedures for measuring sensations according to the size of the stimulus
  • Hermann von Helmholtz (1821-1894), who analyzed philosophical assumptions through research in an attempt to arrive at scientific conclusions
  • Franz Brentano (1838-1917), who called for a combination of first-person and third-person research methods when studying psychology
  • Georg Elias Müller (1850-1934), who performed an early experiment on attitude which involved the sensory discrimination of weights and revealed how anticipation can affect this discrimination

Key Terms to Know

To understand how the experimental method works, it is important to know some key terms.

Dependent Variable

The dependent variable is the effect that the experimenter is measuring. If a researcher was investigating how sleep influences test scores, for example, the test scores would be the dependent variable.

Independent Variable

The independent variable is the variable that the experimenter manipulates. In the previous example, the amount of sleep an individual gets would be the independent variable.

Hypothesis

A hypothesis is a tentative statement or a guess about the possible relationship between two or more variables. In looking at how sleep influences test scores, the researcher might hypothesize that people who get more sleep will perform better on a math test the following day. The purpose of the experiment, then, is to either support or reject this hypothesis.
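One way to check whether data support a hypothesis like this is a permutation test. The sketch below is an illustration under stated assumptions: the scores are made up, the `permutation_test` helper is hypothetical, and a one-sided test on the difference in means is only one of several reasonable analysis choices.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5_000, seed=0):
    """Estimate how often a mean difference (group_a minus group_b) at least as
    large as the observed one arises when group labels are shuffled at random."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            extreme += 1
    return observed, extreme / n_permutations

# Made-up math test scores (the dependent variable), for illustration only.
more_sleep = [82, 88, 75, 91, 85, 79, 90, 84]
less_sleep = [74, 80, 69, 77, 83, 71, 78, 72]

observed_diff, p_value = permutation_test(more_sleep, less_sleep)
print(f"mean difference = {observed_diff:.2f}, one-sided p = {p_value:.4f}")
```

A small p-value means a difference this large would rarely appear if sleep had no effect, which supports the hypothesis without proving it.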

Operational Definitions

Operational definitions are necessary when performing an experiment. When we say that something is an independent or dependent variable, we must have a very clear and specific definition of the meaning and scope of that variable.

Extraneous Variables

Extraneous variables are other variables that may also affect the outcome of an experiment. Types of extraneous variables include participant variables, situational variables, demand characteristics, and experimenter effects. In some cases, researchers can take steps to control for extraneous variables.

Demand Characteristics

Demand characteristics are subtle hints that indicate what an experimenter is hoping to find in a psychology experiment. This can sometimes cause participants to alter their behavior, which can affect the results of the experiment.

Intervening Variables

Intervening variables are factors that can affect the relationship between two other variables. 

Confounding Variables

Confounding variables are variables that can affect the dependent variable, but that experimenters cannot control for. Confounding variables can make it difficult to determine if the effect was due to changes in the independent variable or if the confounding variable may have played a role.
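A short simulation can show why such a variable makes results hard to interpret, and why random assignment helps. This is a sketch under assumptions: caffeine intake is a hypothetical confounder, and every number is simulated rather than taken from a study.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical variable the experimenter does not control: daily caffeine (mg).
caffeine = [rng.uniform(0, 400) for _ in range(1000)]

# Self-selection: heavy caffeine users end up in the short-sleep condition.
self_selected_short = [c for c in caffeine if c > 200]
self_selected_long = [c for c in caffeine if c <= 200]

# Random assignment: a shuffle decides the condition instead.
shuffled = caffeine[:]
rng.shuffle(shuffled)
random_short, random_long = shuffled[:500], shuffled[500:]

# Gap in average caffeine intake between the two sleep conditions.
self_selected_gap = mean(self_selected_short) - mean(self_selected_long)
random_gap = mean(random_short) - mean(random_long)
print(f"self-selected gap: {self_selected_gap:.1f} mg")
print(f"randomly assigned gap: {random_gap:.1f} mg")
```

When groups differ sharply on caffeine, any difference in test scores could be due to caffeine rather than sleep. Random assignment keeps the groups similar on average, so that alternative explanation becomes much less plausible.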

The Experimental Process

Psychologists, like other scientists, use the scientific method when conducting an experiment. The scientific method is a set of procedures and principles that guide how scientists develop research questions, collect data, and come to conclusions.

The five basic steps of the experimental process are:

  • Identifying a problem to study
  • Devising the research protocol
  • Conducting the experiment
  • Analyzing the data collected
  • Sharing the findings (usually in writing or via presentation)

Most psychology students are expected to use the experimental method at some point in their academic careers. Learning how to conduct an experiment is important to understanding how psychologists test, support, and refute theories in this field.

Types of Experiments

There are a few different types of experiments that researchers might use when studying psychology. Each has pros and cons depending on the participants being studied, the hypothesis, and the resources available to conduct the research.

Lab Experiments

Lab experiments are common in psychology because they allow experimenters more control over the variables. These experiments can also be easier for other researchers to replicate. The drawback of this research type is that what takes place in a lab is not always what takes place in the real world.

Field Experiments

Sometimes researchers opt to conduct their experiments in the field. For example, a social psychologist interested in researching prosocial behavior might have a person pretend to faint and observe how long it takes onlookers to respond.

This type of experiment can be a great way to see behavioral responses in realistic settings. But it is more difficult for researchers to control the many variables existing in these settings that could potentially influence the experiment's results.

Quasi-Experiments

While lab experiments are known as true experiments, researchers can also utilize a quasi-experiment. Quasi-experiments are often referred to as natural experiments because the researchers do not have true control over the independent variable.

A researcher looking at personality differences and birth order, for example, is not able to manipulate the independent variable in the situation (birth order). Participants also cannot be randomly assigned because they naturally fall into pre-existing groups based on their birth order.

So why would a researcher use a quasi-experiment? This is a good choice in situations where scientists are interested in studying phenomena in natural, real-world settings. It's also beneficial if there are limits on research funds or time.

Field experiments can be either quasi-experiments or true experiments.

Examples of the Experimental Method in Use

The experimental method can provide insight into human thoughts and behaviors. Researchers use experiments to study many aspects of psychology.

A 2019 study investigated whether splitting attention between electronic devices and classroom lectures had an effect on college students' learning abilities. It found that dividing attention between these two mediums did not affect lecture comprehension. However, it did impact long-term retention of the lecture information, which affected students' exam performance.

An experiment used participants' eye movements and electroencephalogram (EEG) data to better understand cognitive processing differences between experts and novices. It found that experts had higher power in their theta brain waves than novices, suggesting that they also had a higher cognitive load.

A study looked at whether the positive effects of emotional disclosure, often experienced when talking with another person, change when people chat online with a chatbot instead. It found that the effects were the same in both cases.

One experimental study evaluated whether exercise timing impacts information recall. It found that engaging in exercise prior to performing a memory task helped improve participants' short-term memory abilities.

Sometimes researchers use the experimental method to get a bigger-picture view of psychological behaviors and impacts. For example, one 2018 study examined several lab experiments to learn more about the impact of various environmental factors on building occupant perceptions.

A 2020 study set out to determine the role that sensation-seeking plays in political violence. This research found that sensation-seeking individuals have a higher propensity for engaging in political violence. It also found that providing access to a more peaceful, yet still exciting political group helps reduce this effect.

Potential Pitfalls of the Experimental Method

While the experimental method can be a valuable tool for learning more about psychology and its impacts, it also comes with a few pitfalls.

Experiments may produce artificial results, which are difficult to apply to real-world situations. Similarly, researcher bias can impact the data collected. Results that cannot be reproduced have low reliability.

Since humans are unpredictable and their behavior can be subjective, it can be hard to measure responses in an experiment. In addition, political pressure may alter the results. The subjects may not be a good representation of the population, or groups used may not be comparable.

And finally, since researchers are human too, results may be degraded due to human error.

What This Means For You

Every psychological research method has its pros and cons. The experimental method can help establish cause and effect, and it's also beneficial when research funds are limited or time is of the essence.

At the same time, it's essential to be aware of this method's pitfalls, such as how biases can affect the results or the potential for low reliability. Keeping these in mind can help you review and assess research studies more accurately, giving you a better idea of whether the results can be trusted or have limitations.

Colorado State University. Experimental and quasi-experimental research.

American Psychological Association. Experimental psychology studies humans and animals.

Mayrhofer R, Kuhbandner C, Lindner C. The practice of experimental psychology: An inevitably postmodern endeavor. Front Psychol. 2021;11:612805. doi:10.3389/fpsyg.2020.612805

Mandler G. A History of Modern Experimental Psychology.

Stanford University. Wilhelm Maximilian Wundt. Stanford Encyclopedia of Philosophy.

Britannica. Gustav Fechner.

Britannica. Hermann von Helmholtz.

Meyer A, Hackert B, Weger U. Franz Brentano and the beginning of experimental psychology: implications for the study of psychological phenomena today. Psychol Res. 2018;82:245-254. doi:10.1007/s00426-016-0825-7

Britannica. Georg Elias Müller.

McCambridge J, de Bruin M, Witton J. The effects of demand characteristics on research participant behaviours in non-laboratory settings: A systematic review. PLoS ONE. 2012;7(6):e39116. doi:10.1371/journal.pone.0039116

Laboratory experiments. In: The Sage Encyclopedia of Communication Research Methods. Allen M, ed. SAGE Publications, Inc. doi:10.4135/9781483381411.n287

Schweizer M, Braun B, Milstone A. Research methods in healthcare epidemiology and antimicrobial stewardship — quasi-experimental designs . Infect Control Hosp Epidemiol . 2016;37(10):1135-1140. doi:10.1017/ice.2016.117

Glass A, Kang M. Dividing attention in the classroom reduces exam performance. Educ Psychol. 2019;39(3):395-408. doi:10.1080/01443410.2018.1489046

Keskin M, Ooms K, Dogru AO, De Maeyer P. Exploring the cognitive load of expert and novice map users using EEG and eye tracking. ISPRS Int J Geo-Inf. 2020;9(7):429. doi:10.3390/ijgi9070429

Ho A, Hancock J, Miner A. Psychological, relational, and emotional effects of self-disclosure after conversations with a chatbot. J Commun. 2018;68(4):712-733. doi:10.1093/joc/jqy026

Haynes IV J, Frith E, Sng E, Loprinzi P. Experimental effects of acute exercise on episodic memory function: Considerations for the timing of exercise. Psychol Rep. 2018;122(5):1744-1754. doi:10.1177/0033294118786688

Torresin S, Pernigotto G, Cappelletti F, Gasparella A. Combined effects of environmental factors on human perception and objective performance: A review of experimental laboratory works. Indoor Air. 2018;28(4):525-538. doi:10.1111/ina.12457

Schumpe BM, Belanger JJ, Moyano M, Nisa CF. The role of sensation seeking in political violence: An extension of the significance quest theory. J Personal Social Psychol. 2020;118(4):743-761. doi:10.1037/pspp0000223

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Experimental Method In Psychology

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


The experimental method involves the manipulation of variables to establish cause-and-effect relationships. The key features are controlled methods and the random allocation of participants into control and experimental groups.

What is an Experiment?

An experiment is an investigation in which a hypothesis is scientifically tested. An independent variable (the cause) is manipulated in an experiment, and the dependent variable (the effect) is measured; any extraneous variables are controlled.

An advantage of experiments is that they should be objective: the researcher’s views and opinions should not affect a study’s results, which makes the data more valid and less biased.

There are three types of experiments you need to know:

1. Lab Experiment

A laboratory experiment in psychology is a research method in which the experimenter manipulates one or more independent variables and measures the effects on the dependent variable under controlled conditions.

A laboratory experiment is conducted under highly controlled conditions (not necessarily a laboratory) where accurate measurements are possible.

The researcher uses a standardized procedure to determine where the experiment will take place, at what time, with which participants, and in what circumstances.

Participants are randomly allocated to each independent variable group.

Examples are Milgram’s experiment on obedience and Loftus and Palmer’s car crash study.

  • Strength : It is easier to replicate (i.e., copy) a laboratory experiment. This is because a standardized procedure is used.
  • Strength : They allow for precise control of extraneous and independent variables. This allows a cause-and-effect relationship to be established.
  • Limitation : The artificiality of the setting may produce unnatural behavior that does not reflect real life, i.e., low ecological validity. This means it would not be possible to generalize the findings to a real-life setting.
  • Limitation : Demand characteristics or experimenter effects may bias the results and become confounding variables .

2. Field Experiment

A field experiment is a research method in psychology that takes place in a natural, real-world setting. It is similar to a laboratory experiment in that the experimenter manipulates one or more independent variables and measures the effects on the dependent variable.

However, in a field experiment, participants are often unaware that they are being studied, and the experimenter has less control over extraneous variables.

Field experiments are often used to study social phenomena, such as altruism, obedience, and persuasion. They are also used to test the effectiveness of interventions in real-world settings, such as educational programs and public health campaigns.

An example is Hofling’s hospital study on obedience.

  • Strength : Behavior in a field experiment is more likely to reflect real life because of its natural setting, i.e., higher ecological validity than a lab experiment.
  • Strength : Demand characteristics are less likely to affect the results, as participants may not know they are being studied. This occurs when the study is covert.
  • Limitation : There is less control over extraneous variables that might bias the results. This makes it difficult for another researcher to replicate the study in exactly the same way.

3. Natural Experiment

A natural experiment in psychology is a research method in which the experimenter observes the effects of a naturally occurring event or situation on the dependent variable without manipulating any variables.

Natural experiments are conducted in the everyday (i.e., real-life) environment of the participants, but here the experimenter has no control over the independent variable, as it occurs naturally in real life.

Natural experiments are often used to study psychological phenomena that would be difficult or unethical to study in a laboratory setting, such as the effects of natural disasters, policy changes, or social movements.

For example, Hodges and Tizard’s attachment research (1989) compared the long-term development of children who have been adopted, fostered, or returned to their mothers with a control group of children who had spent all their lives in their biological families.

Here is a fictional example of a natural experiment in psychology:

Researchers might compare academic achievement rates among students born before and after a major policy change that increased funding for education.

In this case, the independent variable is the timing of the policy change, and the dependent variable is academic achievement. The researchers would not be able to manipulate the independent variable, but they could observe its effects on the dependent variable.

  • Strength : Behavior in a natural experiment is more likely to reflect real life because of its natural setting, i.e., very high ecological validity.
  • Strength : Demand characteristics are less likely to affect the results, as participants may not know they are being studied.
  • Strength : It can be used in situations in which it would be ethically unacceptable to manipulate the independent variable, e.g., researching stress .
  • Limitation : They may be more expensive and time-consuming than lab experiments.
  • Limitation : There is no control over extraneous variables that might bias the results. This makes it difficult for another researcher to replicate the study in exactly the same way.

Key Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes), which is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

Variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables which are not independent variables but could affect the results (DV) of the experiment. EVs should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of participating in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.
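Random allocation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomly_allocate`, the participant labels, and the two condition names are hypothetical and do not come from any particular study. The sketch shuffles the participant list and then deals people into conditions in turn, so each participant has an equal chance of ending up in each condition.

```python
import random


def randomly_allocate(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions.

    Hypothetical helper for illustration; the seed only makes the
    example reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)

    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        # Dealing in turn keeps the group sizes as equal as possible.
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Twenty made-up participant labels, two conditions.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = randomly_allocate(participants, ["control", "experimental"], seed=42)

for condition, members in groups.items():
    print(condition, len(members), members)
```

Because the allocation is decided by the shuffle rather than by the researcher, participant variables such as age or motivation should be spread roughly evenly across the groups.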

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.


Using experimental methods in higher education research

  • Published: March 2005
  • Volume 16 , pages 39–64, ( 2005 )


  • Steven M. Ross 1 ,
  • Gary R. Morrison 2 &
  • Deborah L. Lowther 1  


EXPERIMENTAL METHODS have been used extensively for many years to conduct research in education and psychology. However, applications of experiments to investigate technology and other instructional innovations in higher education settings have been relatively limited. The present paper examines ways in which experiments can be used productively by higher education researchers to increase the quality and rigor of studies. Specific topics include types of experiments, common validity threats, advantages and disadvantages of experiments, operational procedures for designing and conducting experiments, and reporting and disseminating results. Emphasis is given to helping prospective researchers evaluate the circumstances that favor or disfavor usage of experimental designs relative to other methods.



Author information

Authors and affiliations.

The University of Memphis, USA

Steven M. Ross & Deborah L. Lowther

Old Dominion University, USA

Gary R. Morrison


About this article

Ross, S.M., Morrison, G.R. & Lowther, D.L. Using experimental methods in higher education research. J. Comput. High. Educ. 16 , 39–64 (2005). https://doi.org/10.1007/BF02961474


Issue Date : March 2005

DOI : https://doi.org/10.1007/BF02961474


  • experimental studies


Experimental methods in criminology.

  • Rylan Simpson, Simon Fraser University
  • https://doi.org/10.1093/acrefore/9780190264079.013.841
  • Published online: 20 March 2024

Experimental methods have been a hallmark of the scientific enterprise since its inception. Over time, experiments have become much more sophisticated, complex, and nuanced. Experiments have also become much more diverse, and their use within research settings has expanded from the physical sciences to the social sciences, including criminology.

Within criminology, experimental methods can manifest in the form of laboratory experiments, field experiments, and quasi-experiments, each of which presents its own strengths and weaknesses. Experimental methods can also be applied in the context of between-subject and within-subject paradigms, both of which exhibit unique characteristics and implications. As a research method, experiments are unique in their ability to help establish causal relationships among variables. This article introduces the topic of experimental methods in criminology, with a specific focus on the subfield of policing.

  • criminology
  • experimental criminology
  • experiments
  • field experiment
  • laboratory experiment
  • quasi-experiment
  • randomization
  • social science research


Printed from Oxford Research Encyclopedias, Criminology and Criminal Justice. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 14 May 2024


Open Access

Peer-reviewed

Research Article

Network representation of multicellular activity in pancreatic islets: Technical considerations for functional connectivity analysis

Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

Affiliations Faculty of Natural Sciences and Mathematics, University of Maribor, Maribor, Slovenia, Faculty of Medicine, University of Maribor, Maribor, Slovenia

Roles Data curation, Investigation

Affiliation Department of Pediatrics, Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America

Affiliation Faculty of Medicine, University of Maribor, Maribor, Slovenia

Roles Data curation, Investigation, Writing – review & editing

Roles Investigation, Writing – review & editing

Roles Investigation, Resources, Writing – review & editing

Affiliation Department of Bioengineering, Barbara Davis Center for Diabetes, Aurora, Colorado, United States of America

Roles Conceptualization, Funding acquisition, Investigation, Validation, Writing – review & editing

* E-mail: [email protected] (AS); [email protected] (VK); [email protected] (MG)

Roles Conceptualization, Funding acquisition, Investigation, Resources, Validation, Writing – review & editing

Roles Conceptualization, Funding acquisition, Investigation, Methodology, Software, Supervision, Validation, Writing – original draft

Affiliations Faculty of Natural Sciences and Mathematics, University of Maribor, Maribor, Slovenia, Faculty of Medicine, University of Maribor, Maribor, Slovenia, Alma Mater Europaea, Maribor, Slovenia


  • Marko Šterk, 
  • Yaowen Zhang, 
  • Viljem Pohorec, 
  • Eva Paradiž Leitgeb, 
  • Jurij Dolenšek, 
  • Richard K. P. Benninger, 
  • Andraž Stožer, 
  • Vira Kravets, 
  • Marko Gosak

PLOS

  • Published: May 13, 2024
  • https://doi.org/10.1371/journal.pcbi.1012130

This is an uncorrected proof.


Within the islets of Langerhans, beta cells orchestrate synchronized insulin secretion, a pivotal aspect of metabolic homeostasis. Despite the inherent heterogeneity and multimodal activity of individual cells, intercellular coupling acts as a homogenizing force, enabling coordinated responses through the propagation of intercellular waves. Disruptions in this coordination are implicated in irregular insulin secretion, a hallmark of diabetes. Recently, innovative approaches, such as integrating multicellular calcium imaging with network analysis, have emerged for a quantitative assessment of the cellular activity in islets. However, different groups use distinct experimental preparations, microscopic techniques, apply different methods to process the measured signals and use various methods to derive functional connectivity patterns. This makes comparisons between findings and their integration into a bigger picture difficult and has led to disputes in functional connectivity interpretations. To address these issues, we present here a systematic analysis of how different approaches influence the network representation of islet activity. Our findings show that the choice of methods used to construct networks is not crucial, although care is needed when combining data from different islets. Conversely, the conclusions drawn from network analysis can be heavily affected by the pre-processing of the time series, the type of the oscillatory component in the signals, and by the experimental preparation. Our tutorial-like investigation aims to resolve interpretational issues, reconcile conflicting views, advance functional implications, and encourage researchers to adopt connectivity analysis. As we conclude, we outline challenges for future research, emphasizing the broader applicability of our conclusions to other tissues exhibiting complex multicellular dynamics.

Author summary

Islets of Langerhans, multicellular microorgans in the pancreas, are pivotal for whole-body energy homeostasis. Hundreds of beta cells within these networks synchronize to produce insulin, a crucial hormone for metabolic control. Coordinated activity disruptions in these multicellular networks contribute to irregular insulin secretion, a hallmark of diabetes. Recognizing the significance of collective activity, network science approaches have been increasingly applied in islet research. However, variations in experimental setups, imaging techniques, signal processing, and connectivity analysis methods across different research groups pose challenges for integrating findings into a comprehensive picture. Therefore, we present here a systematic analysis of various approaches impacting results in islet activity network representation. We find that methods for constructing functional connectivity maps aren’t critical, but caution is necessary when aggregating data from different islets. Network analysis conclusions are notably influenced by factors such as time series pre-processing, the oscillatory component of signals, and experimental preparation. Despite these challenges, this paper advocates for the adoption of connectivity analysis in future islet research, emphasizing that the insights gained extend beyond pancreatic islets to provide valuable contributions for understanding connectivity in other multicellular systems.

Citation: Šterk M, Zhang Y, Pohorec V, Leitgeb EP, Dolenšek J, Benninger RKP, et al. (2024) Network representation of multicellular activity in pancreatic islets: Technical considerations for functional connectivity analysis. PLoS Comput Biol 20(5): e1012130. https://doi.org/10.1371/journal.pcbi.1012130

Editor: Jonathan Rubin, University of Pittsburgh, UNITED STATES

Received: January 3, 2024; Accepted: May 2, 2024; Published: May 13, 2024

Copyright: © 2024 Šterk et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All code written in support of this publication is publicly available at https://github.com/MarkoSterk/beta_cell_analysis_suite .

Funding: The authors acknowledge the financial support provided by the Slovenian Research and Innovation Agency (grant nos. P3-3096 (AS), J3-3077 (MG), N3-0133 (AS), IO-0029), by the Burroughs Wellcome Fund Grant (grant no. 25B1756 (VK)), and by the Foundation for the National Institutes of Health (grant nos. R01 DK102950 (RKPB), R01 DK106412 (RKPB)). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Proper insulin secretion and insulin sensitivity of peripheral tissues are crucial in regulating uptake and disposal of energy-rich molecules, thereby sustaining metabolic homeostasis [1]. Pancreatic beta cells constitute part of a crucial negative feedback loop, sensing changes in plasma levels of energy-rich nutrients and accordingly adjusting release of insulin into the bloodstream [2]. The cascade of cellular events connecting the changes in plasma nutrient levels with proper insulin secretion has been studied in detail [3–10]. The crucial steps in the stimulus-secretion coupling cascade involve an increase in intracellular ATP concentration, closure of ATP-sensitive potassium channels, membrane depolarization, opening of voltage-activated Ca²⁺ channels and an increase in intracellular calcium concentration ([Ca²⁺]ᵢ), leading ultimately to exocytosis of insulin-containing vesicles. Beta cell response is further modulated by homo- and heterologous cell-to-cell interactions within islets [11–14], by autonomous nerve control [15–17], and by hormones released by the gut [18–20]. Of importance, beta cells display complex oscillatory activity and are intrinsically heterogeneous [8, 21, 22], with differences observed on the molecular [23], morphological [24, 25] and functional level [26, 27], and it is only due to strong coupling within the islets that beta cells properly respond to glucose excursions.

In a coupled system of beta cells, glucose stimulation triggers two distinct and qualitatively different phases [8, 27–32]. The initial response consists of a phasic increase in activity, characterized by membrane depolarization and an increase in [Ca²⁺]ᵢ, which occurs sooner at higher glucose concentrations [27, 33]. If the stimulus is still present, a complex tonic activity follows. This second phase is characterized by repetitive membrane potential and [Ca²⁺]ᵢ oscillations, as well as pulses of insulin secretion. These oscillations are not generated randomly among cells. Rather, they are phase-lagged between cells, such that waves of membrane depolarization and [Ca²⁺]ᵢ are formed, spreading from cell to cell from different wave-initiating cells near the islet periphery [13, 16, 34–36]. An increase in glucose concentration is coded as a fractional increase in activity within a time period, also termed relative active time [16, 27, 33] or duty cycle. The mechanistic substrate for such cohesive functioning of beta cells is intercellular communication via gap junction channels consisting of connexin 36 (Cx36) [34, 37–40]. Cx36 provides both metabolic and electrical coupling between spatially organized heterogeneous beta cells [34, 39, 41]. While other mechanisms, such as autonomic innervation [42] and autocrine and paracrine signaling [43, 44], also contribute to cell-cell communication, Cx36 has been shown to play the main role in synchronizing beta cell collectives and maintaining proper insulin secretion [45, 46]. Indeed, expression of Cx36 is decreased in diabetic conditions [47, 48], leading to desynchronisation of [Ca²⁺]ᵢ oscillations and perturbations in pulsatile insulin secretion [37, 40, 49–52]. Hence, the gap-junctional connections among beta cells are imperative for optimal beta cell function, and comprehending their collective dynamics holds significance in elucidating the mechanisms underlying diabetes pathogenesis and its treatment.
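The notion of relative active time (duty cycle) mentioned above can be illustrated with a small Python sketch. It assumes, purely for illustration, that a cell counts as "active" whenever its recorded signal exceeds a fixed threshold; the function name `relative_active_time`, the threshold value, and the toy traces are hypothetical and are not taken from the paper or its code.

```python
def relative_active_time(trace, threshold):
    """Fraction of samples in which the signal is above the threshold.

    Assumption: activity is defined by simple thresholding; real
    analyses may binarize oscillations differently.
    """
    if not trace:
        raise ValueError("trace must contain at least one sample")
    active_samples = sum(1 for value in trace if value > threshold)
    return active_samples / len(trace)


# Toy, made-up traces (arbitrary units), ten samples each.
low_stimulation = [0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1, 0.1, 0.1]
high_stimulation = [0.9, 0.9, 0.1, 0.8, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1]

low_fraction = relative_active_time(low_stimulation, threshold=0.5)
high_fraction = relative_active_time(high_stimulation, threshold=0.5)
print(low_fraction, high_fraction)
```

In this toy case the second trace spends a larger fraction of the recording above threshold, which is the sense in which stronger stimulation is described as a fractional increase in activity.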

Due to their highly heterogeneous nature, the presence of distinct subpopulations, and an ever-changing environment, beta cells display intricate yet coherent intercellular activity patterns [8, 53]. Because coordinated intercellular activity is not only crucial for tightly regulated insulin secretion but is also known to be altered in diabetes, researchers are investing considerable effort in describing and studying how collective rhythmicity is established in beta cell populations and how the underlying mechanisms change in disease. In recent years, the emergence of network analyses has provided a promising tool for evaluating data obtained through advanced multicellular imaging, with the goal of objectively characterizing collective activity in islets [16, 42, 54–58]. In this approach, individual cells serve as nodes, and their positions correspond to their physical locations within the tissue. The connections between cells reflect functional associations and are determined based on the temporal similarity of the measured cellular dynamics, most often [Ca²⁺]ᵢ activity [56]. The application of network approaches has uncovered a modular organization in the functional beta cell networks, which exhibit greater heterogeneity than anticipated in a gap-junction-coupled syncytium. The identified indicators of small-worldness and a heavy-tailed degree distribution imply the existence of highly connected cells, called hubs [54, 56]. Although their precise function remains somewhat enigmatic, these hubs are believed to represent a subpopulation with distinct attributes that confer upon them an above-average impact on the synchronized behavior [8, 13, 55, 59–61]. Furthermore, the collective responses to stimulation and the mediation of intercellular signals were also found to be influenced by other beta cell subpopulations. Specifically, the first responder cells were found to be crucial in mediating the responses to increasing stimulation during the first phase of the islet’s response [57], whilst the wave initiator cells act as triggers of intercellular signals that synchronize the cells [13, 34, 62], and are thereby presumably implicated in the regulation of pulsatile insulin release during the second phase [14, 41]. In recent years, advanced methodological approaches, including optogenetics, photopharmacological methods, and RNA sequencing, along with network analyses, have unveiled specific characteristics within these subpopulations [8, 13, 41, 54, 63]. Given their unique attributes and significant contribution to shaping overall islet activity, there is a growing interest in their role in diabetes development [45, 54, 64]. For this reason, it becomes even more important to precisely define these subpopulations and to determine them objectively through network analyses.
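To make the idea of highly connected cells concrete, the following Python sketch counts node degrees in a small functional network and flags the most connected nodes. It is an illustration only: the helper names `node_degrees` and `find_hubs`, the toy edge list, and in particular the rule "top 10% of nodes by degree" are assumptions made here, not the hub definition used in the cited studies.

```python
from collections import Counter


def node_degrees(edges, n_nodes):
    """Count how many functional connections each node (cell) has."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return [counts[i] for i in range(n_nodes)]


def find_hubs(degrees, top_fraction=0.1):
    """Return indices of the most connected nodes.

    Assumption: hubs are taken as the top `top_fraction` of nodes by
    degree; this cutoff is hypothetical and chosen for illustration.
    """
    n_hubs = max(1, int(len(degrees) * top_fraction))
    ranked = sorted(range(len(degrees)), key=lambda i: degrees[i], reverse=True)
    return sorted(ranked[:n_hubs])


# Toy undirected network of 10 cells; each pair is one functional connection.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2),
         (3, 4), (5, 6), (0, 5), (6, 7), (8, 9)]

degrees = node_degrees(edges, n_nodes=10)
hubs = find_hubs(degrees, top_fraction=0.1)
print("degrees:", degrees)
print("hub candidates:", hubs)
```

Any such cutoff is a modeling choice, which is one reason an objective and consistent definition of subpopulations matters when results from different groups are compared.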

Nevertheless, different research groups use different experimental preparations, microscopic imaging techniques, types of recorded signals, subsequent signal processing techniques, and methods for deriving functional connectivity patterns. As a result, comparing findings and integrating them into a comprehensive bigger picture becomes challenging even for experts in islet research. Additionally, the introduction of new terminology has further contributed to disputes in data interpretation, as well as to apparent contradictions regarding functional connectivity and the role of different beta cell subpopulations, which can be attributed in part to the aforementioned methodological discrepancies [8, 13, 27, 53, 65–67]. To at least partly address these issues, we present here a systematic analysis of how different experimental designs and computational approaches impact the results obtained from network representations of multicellular islet activity. Specifically, we analyze how the results are affected by different methods used to evaluate coordinated cellular behavior and network construction, different timescales of observed oscillatory calcium activity, different mouse strains used for tissue slice preparation, and the type of experimental preparation (i.e., tissue slices vs. isolated islets). These factors represent some of the most prevalent genuine variations arising from the diverse nature of work, experimental techniques, and the availability of equipment in laboratories worldwide.

The role of different methods for the evaluation of time series similarities

We start by examining the effect of the type of time series similarity measure used to extract functional beta cell networks. We analyzed the beta cell [Ca 2+ ] i responses to glucose stimulation obtained by means of multicellular confocal imaging in acute tissue slices from NMRI mice. The stimulatory glucose concentration was 12 mM and a 15-minute interval of sustained oscillatory activity (i.e., plateau phase) was used for the analysis, as indicated in Fig 1A . Fig 1B shows the functional networks extracted with three different techniques: Pearson correlation coefficient (red, left panel), coactivity coefficient (blue, middle-left panel), and mutual information (purple, middle-right panel). A variable threshold was used so that roughly the same average node degrees ( k avg between 8 and 9) were obtained in all three networks, facilitating a robust inter-network comparison. The comparison of methods for constructing networks from similarity matrices is analyzed separately below. The right-most panel of Fig 1B shows the calculated network parameters. Upon visual inspection of the networks and their corresponding parameters we can see a high degree of similarity between all displayed networks. Consistent with previous findings, all the networks exhibit high levels of clustering, modularity, and small-worldness [ 16 , 27 , 56 , 68 – 70 ]. In Fig 1C and 1D the degree and edge length distributions of the same networks as in Fig 1B are presented, and Fig 1E shows the calculated internetwork similarities (see Methods section for details). All three panels further underline the observed resemblance between the networks extracted with different methods. Furthermore, the similarity in the degree distribution of all three networks suggests comparable variations in the number of functional connections, and, in addition, the level of heterogeneity indicates the presence of hub cells in all three cases.
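
To make the three similarity measures concrete, the following minimal sketch computes each of them for a pair of signals using only the Python standard library. It is an illustration under stated assumptions and not the authors' implementation: the function names are hypothetical, the coactivity coefficient is assumed to take the commonly used form T_ij / sqrt(T_i T_j) (time both cells are active, normalized by their individual active times), and mutual information is estimated directly from the joint frequencies of the binarized (0/1) signals.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long signals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def coactivity(bx, by):
    """Assumed definition: T_ij / sqrt(T_i * T_j) for binarized (0/1) signals."""
    t_ij = sum(a * b for a, b in zip(bx, by))  # samples where both cells are active
    t_i, t_j = sum(bx), sum(by)                # samples where each cell is active
    return t_ij / math.sqrt(t_i * t_j) if t_i and t_j else 0.0


def mutual_information(bx, by):
    """Mutual information (in bits) between two binarized (0/1) signals."""
    n = len(bx)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for u, v in zip(bx, by) if u == a and v == b) / n
            p_a = sum(1 for u in bx if u == a) / n
            p_b = sum(1 for v in by if v == b) / n
            if p_ab > 0:
                mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy binarized traces of two cells (hypothetical data).
cell_a = [0, 1, 1, 0, 0, 1, 1, 0]
cell_b = [0, 1, 1, 1, 0, 0, 1, 0]
print(pearson(cell_a, cell_b), coactivity(cell_a, cell_b), mutual_information(cell_a, cell_b))
```

Note that mutual information, unlike the other two measures, does not distinguish in-phase from anti-phase activity, so two perfectly alternating cells would still score highly.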

(A) Average signals of unprocessed (black line) and fast oscillatory activity (grey line) (upper panel) and the raster plot (lower panel) showing the binarized fast beta cell dynamics of all cells in the slice. (B) Functional networks designed with fixed average network node degrees ( k avg ≈ 8) based on three distinct time series similarity measures, and the corresponding network parameters. The correlation method (red) is represented on the left, the coactivity method (blue) in the middle, and the mutual information method (purple) on the right. Colored dots indicate physical locations of cells within islets, while grey lines represent functional connections between them. The table on the right shows the average network node degree ( k avg ), average clustering coefficient ( C avg ), modularity ( Q ), global efficiency ( E ), relative largest component ( S max ), and the small-world coefficient ( SW ) for each network. Degree distributions (C) and distribution of functional connection lengths (D) for networks presented in panel (B). Boxes in panel (D) denote the 25 th and 75 th percentile, whiskers denote the 10 th and 90 th percentile, and the lines within boxes indicate the median values. (E) Jaccard internetwork similarity of the three networks extracted with different methods. (F) Pairwise comparison of node degrees in all networks from panel (B). Gray dots represent the node degree of the same cells in the graphs. (G) Relative active time as a function of node degree in networks built with the correlation method (red), coactivity method (blue) and mutual information method (purple). Dots denote average values and bars denote the standard error.

https://doi.org/10.1371/journal.pcbi.1012130.g001

To investigate the level of similarity between networks further, we present the pair-wise relationships between the node degrees of the same cells in all three constructed networks in Fig 1F . A clear relationship can be observed, with cells having a high degree in one network also having a high degree in the others. There is a consistent trend of matching connection numbers between different networks. The highest overlap is noticed between the coactivity and mutual information networks, as both these methods rely on binarized signals. Furthermore, previous studies have indicated that cells with many functional connections tend to exhibit a higher-than-average activity [ 13 , 27 ]. This characteristic was alluded to when describing hub cell [Ca 2+ ] i dynamics as “preceding and outlasting that of the follower cells” in Johnston et al., 2016 [ 54 ]. As such, hub cells typically manifest durations of oscillations exceeding the average, which contributes to higher cellular activity. Nevertheless, it is crucial to emphasize that this does not inherently suggest the role of hubs as wave initiators. In other words, the cells with the longest oscillations do not necessarily initiate the intercellular waves. Subsequent analyses conducted on more extensive datasets in human [ 61 ] and mouse [ 13 ] islets revealed a distinct lack of overlap between wave initiators and hubs, although they confirmed that hubs tend to have the longest oscillations. Here, we aimed to confirm whether a comparable relation between relative active time and node degrees can be obtained when employing different techniques to quantify the similarity between [Ca 2+ ] i signals. In Fig 1G the relationships between the relative active time and node degree of the same cells are depicted. For all three methods, very similar trends are observed.
Specifically, there is a positive relationship between the relative active times of cells and their corresponding node degrees, which indicates that the extracted functional relationships are roughly independent of the methods used to evaluate synchronous cellular activity.

The role of different techniques used for network construction

Since no significant differences were found among various methods for evaluating intercellular synchronicity, we proceeded with the correlation method in subsequent analyses, which also has the advantage of not requiring signal binarization. In Fig 2 we explore the influence of network construction methods on the functional beta cell network structure and their relationship with the cellular activity parameter. Fig 2A displays functional networks of three different islets (rows) constructed using a fixed similarity threshold (left column), fixed average node degree (middle column), and the multilayer minimum spanning tree (MST) technique (right column). Due to the differences in [Ca 2+ ] i signals, such as differences in overall activity, nature of [Ca 2+ ] i signals (e.g., fast electrical vs. slow metabolic oscillations, see below for more details), presence of noise, etc., the average node degrees of networks constructed with the fixed similarity thresholds can vary greatly (between 4.4 and 17.5 in the three islets analyzed, see red networks in Fig 2A ), whereas the average degrees for the other two techniques are fixed around 8 (see blue and purple networks in Fig 2A ). In Fig 2B, 2C and 2D , we present the relationships between the relative active times and node degrees for the data pooled from all three islets. Notably, a positive correlation between node degree and relative active time is inferred only for the variable threshold (i.e., fixed average degree) and multilayer MST techniques. This tendency of hub cells exhibiting higher-than-average activity is in accordance with previous reports. In contrast, this relationship is apparently blurred for the fixed threshold technique due to variations in the number of connections along with differences in intrinsic activities among different islets, and hence even an opposite trend can be obtained, despite the fact that the relation is positive in all individual islets.
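
The difference between the first two construction techniques can be sketched in a few lines. The example below is a minimal illustration, not the authors' code: the function names and the toy similarity matrix are hypothetical. With a fixed threshold, the number of edges depends on how coherent the islet is; with a fixed average degree, the strongest pairs are kept until the target k avg is reached, which amounts to choosing a different (variable) threshold for each islet.

```python
def edges_fixed_threshold(sim, r_th):
    """Connect all cell pairs whose similarity exceeds a fixed threshold."""
    n = len(sim)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if sim[i][j] > r_th]


def edges_fixed_avg_degree(sim, k_avg):
    """Keep the strongest pairs so that the average degree is close to k_avg."""
    n = len(sim)
    pairs = sorted(
        ((sim[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    n_edges = round(k_avg * n / 2)  # average degree = 2 * edges / nodes
    return [(i, j) for _, i, j in pairs[:n_edges]]


def average_degree(edges, n):
    """Average number of connections per cell."""
    return 2 * len(edges) / n


# Toy 4-cell similarity matrix (hypothetical values).
sim = [
    [1.0, 0.9, 0.8, 0.2],
    [0.9, 1.0, 0.7, 0.3],
    [0.8, 0.7, 1.0, 0.4],
    [0.2, 0.3, 0.4, 1.0],
]
print(average_degree(edges_fixed_threshold(sim, 0.75), len(sim)))
print(average_degree(edges_fixed_avg_degree(sim, 1.5), len(sim)))
```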

(A) Networks of three different islets (rows) constructed with three distinct construction methods (columns) with indicated average network node degrees ( k avg ). Construction methods are fixed correlation threshold (left), fixed average network node degree (middle), and multilayer minimum-spanning tree (right). (B-D) Relative active time of cells as a function of the relative node degree for networks constructed with the fixed correlation threshold method (B; red), fixed average network node degree method (C; blue) and multilayer minimum-spanning tree method (D; purple). Dots indicate the average values of cells within the same degree intervals and the error bars the corresponding SE. Note that the node degrees were normalized to facilitate the comparison of different islets.

https://doi.org/10.1371/journal.pcbi.1012130.g002

To further illustrate the challenge of pooling data from different islets, we present experimental data in Fig 3 , where islets were subjected to stimulatory glucose in the first interval and subsequently treated with the GLP-1 receptor agonist exendin-4 (Ex-4) in the second interval. The precise protocol and the average unprocessed Ca 2+ signal of all cells are shown in Fig 3A . Previous research has demonstrated that Ex-4 increases the density of functional beta cell networks [ 71 ], and our results validate this observation under both fixed threshold R th and fixed average degree k avg methods ( Fig 3B ). However, the fixed R th approach yields substantial heterogeneity among islets, obscuring the effects of Ex-4 stimulation when aggregating data across multiple islets, despite discernible trends at the individual islet level. In such scenarios, employing a fixed k avg technique proves to be a more appropriate option for analyzing network differences induced by the pharmacological agent. Specifically, utilizing a variable threshold to maintain a consistent average degree in the initial interval mitigates inter-islet heterogeneity. Subsequently applying the same threshold in the second interval enables an unbiased assessment of the pharmacological intervention, normalized by the network characteristics observed in the first interval of each respective islet. This not only facilitates a robust comparison of network parameters across different islets but also ensures a more accurate statistical evaluation, as demonstrated in Fig 3C .
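
The two-interval normalization described above can be sketched as follows. This is a minimal illustration with hypothetical function names and toy similarity matrices, not the authors' pipeline: the threshold is chosen in Interval 1 so that the target average degree is met, and that same threshold is then reused in Interval 2, so any change in average degree reflects the intervention relative to the islet's own baseline.

```python
def threshold_for_avg_degree(sim, k_avg):
    """Similarity of the weakest pair that is kept when targeting k_avg."""
    n = len(sim)
    values = sorted(
        (sim[i][j] for i in range(n) for j in range(i + 1, n)), reverse=True
    )
    n_edges = max(1, round(k_avg * n / 2))
    return values[min(n_edges, len(values)) - 1]


def average_degree(sim, r_th):
    """Average degree of the network obtained with threshold r_th."""
    n = len(sim)
    n_edges = sum(1 for i in range(n) for j in range(i + 1, n) if sim[i][j] >= r_th)
    return 2 * n_edges / n


# Hypothetical similarity matrices of the same islet in the two intervals.
sim_int1 = [
    [1.0, 0.9, 0.6, 0.2],
    [0.9, 1.0, 0.5, 0.3],
    [0.6, 0.5, 1.0, 0.4],
    [0.2, 0.3, 0.4, 1.0],
]
sim_int2 = [
    [1.0, 0.95, 0.8, 0.5],
    [0.95, 1.0, 0.7, 0.6],
    [0.8, 0.7, 1.0, 0.65],
    [0.5, 0.6, 0.65, 1.0],
]

r_th = threshold_for_avg_degree(sim_int1, k_avg=1.0)  # set in Interval 1 only
k_int1 = average_degree(sim_int1, r_th)
k_int2 = average_degree(sim_int2, r_th)               # same threshold reused
print(r_th, k_int1, k_int2)
```

Because every islet starts from the same average degree in Interval 1, the ratio k_int2 / k_int1 can be pooled across islets without the baseline differences in coherence dominating the statistics.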

(A) Average Ca 2+ signals from all cells within a representative islet are depicted, stimulated with 10 mM glucose and 10 mM glucose + the GLP-1 receptor agonist exendin-4 (Ex-4), with specified intervals for network analysis. (B) Functional networks were constructed for two intervals (Interval 1: 10 mM glucose only; Interval 2: 10 mM glucose + 20 nM Ex-4) across two different islets, utilizing two distinct network design techniques (fixed R th = 0.75 and fixed average degree with k avg (Int. 1) = 8). The use of the fixed R th method resulted in significant variations in network density between the two islets, complicating the comparison of network metrics. Conversely, fixing the average degree in Interval 1 normalized inherent differences in overall coherence of intercellular Ca 2+ activity, facilitating the assessment of the pharmacological manipulation effect across different islets. (C) To illustrate the issue posed by the fixed R th method and the resulting high disparities in network densities, we compared the pooling of data from 10 islets subjected to closely matched protocols using both thresholding techniques (i.e., fixed R th and fixed k avg ). While both methods revealed a denser network in Interval 2 in response to Ex-4, these differences were almost completely masked by inter-islet variability inherent in the fixed R th method. In such scenarios, normalizing the average degree proves to be the superior approach, as it facilitates a robust evaluation of data from different islets. Data used for this analysis is from Ref [ 71 ].

https://doi.org/10.1371/journal.pcbi.1012130.g003

It is worth noting, however, that in some cases, the fixed R th method is the more suitable choice. It is known that with gradually increasing glucose concentration, both activity and intercellular communication levels increase, leading to denser functional networks in these cases [ 72 ]. When using fixed k avg or multilayer MST methods, which impose a specific number of connections regardless of the nature of activity, such networks do not differ in the number of connections, which is incorrect in these scenarios. Moreover, under conditions of low stimulation, an unjustifiably large number of connections is obtained due to, for example, a low threshold, which does not reflect correlations in dynamics but rather random associations. Such an example is illustrated in S1 Fig .

The role of the mouse strain used in tissue slice preparation

Laboratory mice are a vital source of islets of Langerhans in beta cell physiology research; however, different laboratories employ different mouse models. Previous research has indicated that there is considerable phenotypic variation between different mouse strains as well as substrains of the inbred strains [ 73 – 75 ], which also manifests itself in beta cell responses to glucose and [Ca 2+ ] i signalling characteristics [ 33 , 76 ]. For that reason, we investigate here whether the functional beta cell network structure extracted from multicellular [Ca 2+ ] i recordings in tissue slices depends on the mouse strain. To this purpose, we compared the beta cell networks from outbred NMRI mice and inbred C57BL/6J mice. We used the correlation method to evaluate similarity between [Ca 2+ ] i signals and the fixed average degree method ( k avg = 8) to construct networks. In all recordings we used a 6-10-6 mM glucose protocol, as presented in Fig 4A . Intervals of 10–20 min sustained activity in the plateau phase were then used for the analysis. In Fig 4B we show typical networks from both strains, which, upon visual inspection, exhibit a rather similar topological organization. To provide a more detailed and quantitative insight, we computed various network metrics from pooled data from multiple islets. Results in Fig 4C, 4D and 4E indicate that the edge length, clustering coefficient, and degree distributions are very similar. Furthermore, the computation of network parameters presented in Fig 4F has revealed that beta cell networks from different mouse strains exhibit a similar degree of functional segregation, efficiency, and small-worldness; none of the differences were statistically significant.

(A) Average signals of unprocessed and fast oscillatory activity and the raster plot showing the binarized fast beta cell dynamics of all cells in slices from NMRI mice islets (upper panel, blue) and C57BL/6J mice islets (lower panel, purple). (B) Functional networks derived from representative recordings in islets from NMRI (blue, upper panel) and C57BL/6J (purple, lower panel) mice. (C) Edge length distributions, (D) clustering coefficient distributions, and (E) node degree distributions from a pooled data set from NMRI (blue) and C57BL/6J (purple) mouse recordings. (F) Network parameters for extracted networks from NMRI and C57BL/6J mouse recordings: modularity (left), relative largest component (middle left), global efficiency (middle right), and small-worldness coefficient (right). Dots represent values of individual recordings with horizontal lines indicating median values. Boxes in panels (C) and (D) denote the 25 th and 75 th percentile, whiskers denote the 10 th and 90 th percentile and the lines within boxes indicate median values. Data was pooled from islets/cells: 6/779 (NMRI), 6/617 (C57BL/6J). In all recordings, the islets were stimulated with 10 mM glucose and 10–20 minute intervals in the plateau phase were used for the analysis.

https://doi.org/10.1371/journal.pcbi.1012130.g004

The role of different time scales of oscillatory [Ca 2+ ] i activity and time series preparation

Next, we investigate how the type of oscillatory activity and signal preparation impact the functional beta cell network topology. To this purpose, we performed prolonged multicellular imaging in tissue slices from NMRI mice. Fig 5A displays an average [Ca 2+ ] i signal of all cells in a representative islet under stimulation with 8 mM glucose. Three different temporal traces are presented: the unprocessed (i.e., raw recorded) signal (top, red), the filtered slow oscillations (middle, purple), and the filtered fast oscillations (middle, blue). The fast and slow oscillations principally represent the electrical and metabolic activity of cells, respectively [ 77 ]. The lower panels in Fig 5A feature raster plots depicting binarized activity of the slow and fast oscillatory component. Notably, both types of oscillatory activity exhibit distinct, regular patterns. In Fig 5B we present correlation-based functional networks constructed with the fixed average degree technique for the three distinct signal types. A visual assessment points out a clear difference between the three extracted networks. The fast oscillation-based network (middle panel) exhibits shorter edge lengths and a more clustered, localized structure, while the slow oscillation-based network (right panel) shows more long-range edges and a less clustered structure. A quantitative assessment of the networks confirms the observed differences. The slow oscillatory component network is more heterogeneous, less clustered and exhibits longer functional connections ( Fig 5C, 5D and 5E ). The reason for this lies in the type of cellular dynamics the networks encode. The fast oscillations are representative of the electrical activity of cells, which is mediated by gap-junction-driven intercellular waves and thus contributes to the shorter, more clustered network structure which is quite similar to the underlying physical network. 
On the other hand, the slow component signal is associated with cellular metabolism which is to a greater extent affected by the similarity of intrinsic metabolic characteristics of cells and less by cell-to-cell coupling [ 56 , 70 , 78 , 79 ]. Interestingly, the raw-signal-derived functional network appears to be poised in between, which is somewhat expected, as it encompasses both types of oscillatory activity. To evaluate the properties of different networks further, we quantified the extracted functional connectivity patterns using conventional network metrics ( Fig 5F ). The results indicate that the networks derived from different dynamical components have comparable values of the small-world coefficient and the relative largest component, but there are profound differences in modularity and global efficiency. Namely, the fast oscillations-derived network is more segregated and exhibits lower efficiency, primarily due to the less pronounced long-range connections. Moreover, we present in Fig 5G the relationship between the relative active time of cells and their corresponding node degrees in all three types of networks. The tendency of hub cells being the most active is most pronounced in the case of fast oscillations, whereas the relation is less apparent for the raw and slow component. Notably, the latter aligns with recent theoretical predictions [ 78 ]. Finally, we assess the similarities between the three networks and present in Fig 5H the pair-wise relationships between the node degrees in different networks. The results indicate that the strongest relation exists between the fast and raw oscillatory signals, while the relationship is the weakest between the fast and the slow component. To investigate this in further detail, we quantified the overlap between different networks, including the hypothesized structural network that was modeled as a geometric network in which nearby cells are connected. 
From Fig 5I , we can observe a substantial similarity in both inter-network similarity and overlap of hub cells between the unprocessed signal and the signals of both oscillatory components, with a higher level of similarity observed in the fast component. This is expected in signals from slices, as the fast component is very pronounced. However, the key point is that the highest level of similarity between the structural network and the functional networks is obtained from fast oscillations, while the similarity between the structural and slow networks is substantially lower. Similarly, the connection between the fast and slow component-derived networks is relatively low, as previously indicated by the results in Fig 5H . These quantitative results can be further visually assessed with the illustrations in S2 Fig , depicting all four types of networks for all 5 islets included in the analysis. It can be observed that the networks of unprocessed signals and signals of the slow component contain many long-range connections, while those in the fast component network are significantly fewer, making it visually more similar to the structural network. Importantly, fast oscillations may be more strongly determined by slow oscillations, such as in the case of compound oscillations [ 80 , 81 ]. Such an example is depicted in S3 Fig and in this case, the functional network based on slow oscillations rather than fast oscillations is most similar to the functional network based on the raw signal. However, the functional network based on fast oscillations remains the one that is most similar to the structural network ( S3B and S3C Fig ).
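
The two overlap measures used in this comparison can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and toy edge lists are hypothetical, the inter-network similarity is taken to be the Jaccard index of the two edge sets, hubs are taken as the top 1/6 of the most connected cells (as stated in the figure legend), and the hub overlap is assumed to be the shared fraction of the two hub sets.

```python
def jaccard(edges_a, edges_b):
    """Jaccard index of two undirected edge lists: |A & B| / |A | B|."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def degrees(edges, n):
    """Node degree of every cell in a network with n cells."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


def hubs(edges, n, fraction=1 / 6):
    """Set of the most connected cells (top `fraction` of all cells)."""
    deg = degrees(edges, n)
    n_hubs = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda c: deg[c], reverse=True)
    return set(ranked[:n_hubs])


def hub_overlap(edges_a, edges_b, n):
    """Assumed definition: fraction of hubs shared by the two networks."""
    hubs_a, hubs_b = hubs(edges_a, n), hubs(edges_b, n)
    return len(hubs_a & hubs_b) / len(hubs_a)


# Toy networks on six cells (hypothetical), e.g. fast- vs slow-component based.
net_fast = [(0, 1), (0, 2), (0, 3), (1, 2)]
net_slow = [(0, 1), (0, 2), (0, 4), (3, 5)]
print(jaccard(net_fast, net_slow), hub_overlap(net_fast, net_slow, 6))
```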

(A) Unprocessed (red), fast-component only (blue), and slow-component only (purple) average [Ca 2+ ] i signal of all cells in the islet from acute tissue slice from NMRI mouse. The lower panels display raster plots that represent the binarized activity of the slow and fast oscillatory components. (B) Functional networks designed based on raw cellular signals (left), fast-component only signals (middle), and slow-component only signals (right). Networks were constructed with the fixed average network node degree method ( k avg ≈8.0) based on time series correlations as the similarity measure. Distribution of node degrees (C), clustering coefficients (D) and functional connection lengths (E) for the three networks presented in panel B. (F) Network parameters extracted from functional connectivity maps derived from different oscillatory components. (G) Relative active time of cells as a function of their corresponding node degrees in networks constructed based on raw signals (red), fast-component only signals (blue) and slow-component only signals (purple). Colored dots represent average values of cells within the same degree intervals and the error bars denote SE. Individual values were normalized by the average value of the relative active time within the given islet so as to ease comparison between different islets. (H) The pairwise relationships between node degrees in different networks. The grey dots denote values from individual cells and the black line indicates the linear fit, whereby R 2 indicates goodness-of-fit. (I) Similarity between different types of networks (left) and the relative overlap of hub cells (right), identified as the top 1/6 of the most connected cells. The structural networks were modeled as equivalent geometric networks, in which nearby cells are deemed connected (see Materials and Methods and S4 Fig ). 
Boxes in panels (D) and (E) denote the 25 th and 75 th percentile, whiskers denote the 10 th and 90 th percentile and the horizontal lines within boxes indicate the median values. Dots in panel (F) indicate the values from individual islets and the horizontal line denotes the median. Stars denote statistical differences; *p<0.05,**p<0.01. Data presented in panels (F-I) is based on 5 different islets.

https://doi.org/10.1371/journal.pcbi.1012130.g005

Functional connectivity networks in isolated islets

In addition to acute tissue slices, isolated islets play a prevalent role in beta cell research, including in the context of collective activity network analyses. Thus, we proceed with analyzing the nature of multicellular dynamics and the underlying functional networks within islets isolated from C57BL/6J mice. In Fig 6A , we present the responses of a representative isolated islet upon transitioning from 2 mM to 11 mM glucose. The cells exhibit an initial, profound elevation in [Ca 2+ ] i levels, followed by the emergence of coordinated [Ca 2+ ] i oscillations after approximately 8–10 minutes. The raster plots indicate that these oscillations frequently span the entire islet. The functional network extracted from the phase of sustained oscillatory activity, constructed based on time series correlation as the similarity measure along with the fixed average network node degree method, is shown in Fig 6B . The characterization of beta cell networks was based on 5 different isolated islets subjected to the same protocol. In the table shown in Fig 6C the average values of network parameters are provided and Fig 6D shows the pooled degree distributions. We can observe that the topological parameters of networks from isolated islets do not differ much from those in slice-based networks: they are quite modular and exhibit features of small-world networks. However, upon visually evaluating the network illustrated in Fig 6B and considering the characteristics of clustering coefficient ( Fig 6E ) and functional connection length distributions ( Fig 6F ), it becomes evident that the networks observed in isolated islets exhibit properties that are more similar to the networks characterized by slow activity in slices. Note that for comparison the data on fast and slow activity-derived networks from slices from C57BL/6J mice are provided separately. 
For this comparison, the same dataset was used as in Fig 4 , where the stimulatory glucose concentration was also similar (i.e., 10 mM). In contrast to fast oscillation-based networks in slices, isolated islet networks manifest a higher efficiency, a reduced modularity, and low clustering coefficient values. Moreover, the distribution of relative connection lengths indicates that there is a larger fraction of long-range connections in isolated islets. All these attributes can be observed in slow oscillation-based networks in slices. Notably, within isolated islets, a discernible trend emerges where cells with an increased number of functional connections consistently demonstrate higher relative active times, reminiscent of the observed behavior in slices, regardless of the temporal aspect ( Fig 6G ).
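
According to the legend of Fig 6, the relative connection lengths are obtained by normalizing physical lengths with the average distance to the eight nearest neighbors. A minimal sketch of such a normalization is given below. The function names and coordinates are hypothetical, and it is an assumption of this sketch that the nearest-neighbor distance is first averaged per cell and then over all cells of the islet.

```python
import math


def mean_knn_distance(coords, k=8):
    """Average distance from each cell to its k nearest neighbors, averaged over cells.

    Requires at least two cells; with fewer than k + 1 cells, all neighbors are used.
    """
    n = len(coords)
    total = 0.0
    for i in range(n):
        dists = sorted(math.dist(coords[i], coords[j]) for j in range(n) if j != i)
        nearest = dists[:k]
        total += sum(nearest) / len(nearest)
    return total / n


def relative_edge_lengths(coords, edges, k=8):
    """Physical edge lengths divided by the islet's mean k-nearest-neighbor distance."""
    scale = mean_knn_distance(coords, k)
    return [math.dist(coords[i], coords[j]) / scale for i, j in edges]


# Four cells on a line (hypothetical coordinates, arbitrary units).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(relative_edge_lengths(coords, [(0, 3), (1, 2)], k=1))
```

A relative length close to 1 then corresponds to a connection between neighboring cells, whereas larger values indicate long-range functional connections, independent of the islet's size or cell density.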

(A) Average [Ca 2+ ] i signal of a representative isolated islet recording with indicated plateau phase for signal analysis (upper panel) and corresponding binarized oscillatory activity of all cells in the recording (lower panel). (B) Extracted functional network based on cellular signals in panel (A) constructed with the fixed average network node degree method with an average network node degree k avg ≈8.0. Green dots represent physical locations of cells within the islet and grey lines indicate functional connections between them. (C) Extracted average functional network parameters: average network node degree ( k avg ), average clustering coefficient ( C avg ), modularity ( Q ), global efficiency ( E ), average shortest path length ( L avg ), small-world coefficient ( SW ), and relative largest component ( S max ). Degree distributions of all extracted functional networks (D), and corresponding distributions of clustering coefficients (E), and relative edge lengths (F). To ease comparison between different islets, the physical lengths of connections were normalized with the average distance to the eight nearest neighbors. Additionally, in panels (D-F) data illustrating network attributes derived from fast and slow activities in slices from C57BL/6J mice are presented for comparison. (G) Relative active time as a function of node degree for all extracted functional networks. Boxes on panels (E-F) determine the 25 th and 75 th percentile, whiskers denote the 10 th and 90 th percentile, and the lines within boxes indicate the median values. Dots in panel (G) represent average values and vertical bars denote the standard error. Data for panels (C-G) for isolated islets was pooled from islets/cells: 5/468 and for slices the same dataset was used as in Fig 4 (islets/cells: 6/617). *p<0.05,**p<0.01, ***p<0.001.

https://doi.org/10.1371/journal.pcbi.1012130.g006

To further assess the differences and similarities between beta cell networks from slices and isolated islets and how they relate to different types of oscillatory activity, we present in S3 Fig an analysis of an isolated islet in which the fast component of oscillations was clearly present and, in terms of frequency, highly comparable to that in tissue slices. This enabled the separate consideration of individual oscillatory components, and similarly to tissue slices, it was found in this case as well that there is a significant similarity between the structural network and the functional network obtained from the fast component, while the similarity between the slow-component and structural networks is considerably lower. It is also worth mentioning that in isolated islets, there is much greater similarity between networks derived from unprocessed signals and the slow component, whereas in tissue slices, there is greater similarity between networks based on unprocessed signals and the fast component. The reason for this is that in tissue slices, fast oscillations are the more dominant type of signal, while in isolated islets, slow oscillatory activity prevails.

Discussion

Functional connectivity analysis is a powerful tool applicable to studying the interactions between different components in a plethora of real-life systems. In recent years, it has become increasingly popular for describing interactions between individual cells, particularly within the islets (for review see [ 56 ]). However, due to relatively demanding computational approaches, encompassing both data extraction and subsequent analyses of coordinated functioning, obtaining patterns of functional connectivity is not straightforward and can easily become ambiguous. In neuroscience, where the greatest progress in this field has been made, it has become evident that an objective assessment of connectivity patterns is hampered by several factors tied to experimental variations and computational methodologies, such as thresholding techniques [ 82 – 84 ], techniques used for data pooling [ 85 , 86 ], number of sensors used to record brain activity [ 87 , 88 ], and the selection of frequency intervals [ 89 , 90 ]. Most importantly, similar issues are witnessed in the network-based analysis of spatiotemporal cellular dynamics in islets. More specifically, different research groups employ diverse experimental techniques and preparations, leading to discrepancies in types of oscillatory signals, and the multicellular activity is recorded at varying spatial and temporal resolutions. There are also variations in how recordings are preprocessed before network analysis, as well as in the techniques used for the analysis itself. These, along with some terminological discrepancies in the scientific literature, are the primary reasons why we chose to investigate how various factors influence network analyses and their interpretation.

First, we evaluated the role of metrics that are used for the evaluation of synchronized activity between the measured cellular dynamics. We compared three different methods, namely one that is based directly on the recorded [Ca 2+ ] i activity (Pearson’s correlation), and two that are based on binarized time series (coactivity and mutual information). It turned out that irrespective of the method used to quantify synchronous behavior, similar networks are obtained, characterized by small-worldness, modularity, high degree of clustering, a heavy-tailed degree distribution which indicates the presence of hub cells, and a similar relation between the relative active time and the node degree (see Fig 1 ). Another crucial aspect in the process of extracting functional connectivity maps involves the thresholding of similarity matrices. As highlighted in Figs 2A and 3B , utilizing a fixed threshold can yield significant disparities among different islets, potentially introducing biases into the relations drawn from aggregated data. To mitigate this concern, using a variable threshold and a fixed average degree proves advantageous. With this approach we can reliably evaluate the effect of pharmacological interventions or extract the relations between network and classical physiological parameters when data is pooled from multiple islets, as a variable threshold can mask the inter-islet heterogeneity (see Figs 2C and 3 ). Specifically in multi-phase experiments, where consecutive intervals have to be analyzed [ 71 , 91 ], application of a variable threshold has proven beneficial, as it overcomes inter-islet variability. For example, by establishing the variable threshold based on the first interval, thereby maintaining a fixed average node degree, one can consistently apply the same threshold to construct networks during the second interval. 
This normalization procedure facilitates an objective assessment of alterations in islet network structure, despite inherent differences in networks from different islets ( Fig 3 ). It is important to note, however, that this method has a limitation: its fixed average number of connections prevents it from capturing the variations in overall synchronicity that are depicted by the network density. For instance, it is known that an increase in glucose concentration leads to increased and more global spatiotemporal activity, resulting in denser functional networks [ 27 , 72 ]. If a fixed average degree is then employed, these differences become obscured, and in conditions of low stimulation, numerous connections emerge that lack statistical significance. This occurs because, with a low threshold, these connections predominantly signify random associations rather than synchronized activity ( S1 Fig ). In this study we have also introduced a third option encompassing the construction of functional networks through a multilayered MST. A notable advantage of this method lies in its absence of explicit thresholding, with the singular free parameter being the number of layers, which in turn specify the average degree. Nonetheless, the drawback of the minimum spanning tree method is that it enforces at least one connection to each cell (or more in case of multilayered MST), so that even the cells which are completely desynchronized can have a comparable number of functional connections as an average cell. Therefore, while the method is attractive for its apparent objectivity, its appropriateness diminishes when the signals are rather heterogeneous and if there are subpopulations of cells whose dynamics are weakly or not at all correlated with the rest of the cells (such as those of alpha cells, see S4 Fig ). 
To sum up at this point, the choice of the best method to construct networks is not always straightforward and may depend on the context, i.e., both the experimental protocol and the parameters we want to objectively describe through network analysis. In doing so, we must, of course, be aware of the strengths and weaknesses of different approaches.

In previous studies, variations in glucose-induced [Ca 2+ ] i activity among different mouse strains and substrains have been reported. Compared to outbred NMRI mice, cells from the inbred C57BL/6J and C57BL/6N substrains show a rightward shift in activations and earlier deactivations. In addition, during the plateau phase, the encoding mechanisms to enhance calcium activity in response to glucose differ quantitatively in all three groups [ 33 ]. Secretagogues other than glucose also cause [Ca 2+ ] i oscillations to vary greatly [ 76 ]. Generally, however, there are similarities between C57BL/6J, C57BL/6N, and NMRI mice in the sense that all three groups showed glucose-dependent activation and deactivation responses, as well as a 3% increase in relative active time per millimole of glucose [ 33 ]. Notably, up until now, differences between strains of mice at the level of multicellular activity have not been studied. In this study, we addressed these questions using network analyses and found that the functional networks of islets in different mice are structurally very similar. Apparently, the mechanisms that coordinate fast oscillatory activity across the islets from NMRI or C57BL/6N mice, i.e., gap-junction mediated depolarization and [Ca 2+ ] i waves, are the same and do not differ between mouse strains.

In response to glucose and many other secretagogues, electrical activity, intracellular calcium, and insulin secretion oscillate in synchrony at two different time scales [ 92 , 93 ]. The first are the so-called metabolic or slow oscillations with a frequency of around 0.1–0.2 min⁻¹, and the second the so-called electrical or fast oscillations with a frequency of around 1–5 min⁻¹ [ 53 , 94 ]. Notably, the frequency of fast oscillations varies: it is highest around the peaks of the slow component and lowest around the nadirs [ 77 , 95 , 96 ]. Additionally, the relative active time or duty cycle of the fast component characteristically increases with increasing stimulation, whereas the frequency of slow oscillations remains unaltered [ 27 , 33 , 96 – 98 ]. According to the recent metronome model of beta cell function, slow oscillations set the pace for insulin pulses, whereas the fast oscillations fine-tune their amplitude [ 94 ]. Both slow and fast oscillations are phase-locked between different beta cells within a given islet by means of intercellular waves [ 14 , 34 , 35 , 55 , 62 , 68 ]. In accordance with this, the average correlation between calcium traces of different cells from the same islet decreases with intercellular distance for both the slow and the fast component, implying that intercellular coupling mediates the synchronicity of both types of oscillations [ 78 , 96 ].

If one constructs and compares functional connectivity maps for the raw signal and both dynamic components separately ( Fig 5 ), the distributions of node degrees do not differ significantly. However, the networks of fast oscillatory activity are more locally clustered and segregated, more modular, and have lower average edge lengths and global efficiency, while the slow oscillations are principally more global, resulting in numerous long-range connections and consequently a more cohesive structure that shows a lower modularity and higher global efficiency. Importantly, for the raw signal, it seems that except for the node degree, most of the network measures are more strongly determined by the slow component [ 56 ]. A logical consequence of the abovementioned differences in functional network structure is the finding that there is a relatively weak correlation between the fast and slow network layer [ 96 ], implying that different synchronization principles are at work [ 70 , 78 ], and one should not directly compare results of studies relying on fast oscillations with the ones relying on slow oscillations. Importantly, even with the same experimental model, e.g., isolated mouse islets, and set of analytical tools applied to extracting and analyzing [Ca 2+ ] i oscillations, islets with preponderance of fast, mixed or slow oscillations might coexist [ 99 – 101 ], and in this case, data should not be simply pooled, since this may obscure relevant biological differences, but analyzed for the two temporal components and for oscillatory phenotypes separately. Extrapolating this reasoning further, the caveats we pointed out in this paragraph should also be kept in mind when comparing experimental traces from different animal models, even when using the same experimental approach and the same set of analytical tools. 
For instance, the presence and relative importance of fast and slow oscillations may vary between beta cells from zebrafish [ 55 , 102 ], mice [ 100 , 103 ], rats [ 104 , 105 ], sand rats [ 106 , 107 ], pigs [ 108 , 109 ], and humans [ 61 , 110 ], to name only a few. To facilitate interspecies comparison, future studies should clearly specify the type of oscillations they are addressing. Finally, at present, it is difficult to experimentally compare the relationship between the structural networks of beta cells and their functional counterparts, but modelling studies suggest that the intricate structure of functional beta cell networks based on fast and slow oscillations may be at least partly explained by heterogeneity in beta cell activity and heterogeneous intercellular coupling [ 68 , 70 , 78 ].

Different groups that employ network measures in their analyses typically use different experimental approaches to obtain [Ca 2+ ] i traces. While most groups use cultured isolated islets in combination with CCD camera-based or confocal imaging, some use the acute tissue slices in combination with confocal imaging. To be able to compare findings from different groups or combine them into a coherent bigger picture of islet network properties, these differences also need to be addressed as they are an important possible systematic confounding variable. Essentially, the methodology and experimental setup would not seem to be key parameters if they did not entail differences in the nature of the oscillatory signals. In tissue slices fast or mixed oscillations are more predominant (see Figs 4A or 5A ), whereas in isolated islets the slow oscillations are predominant (see Fig 6A ). Here, we explicitly demonstrated that the distinct nature of oscillations leads to different functional beta cell networks. While some network properties in fast-derived and slow-derived networks are similar, such as heterogeneity and small-worldness, they fundamentally differ from each other, and the significance of certain subpopulations in one network is therefore not equivalent to that in the other network. Moreover, even if oscillations qualify as fast, in isolated islets, they are typically longer than 10 seconds at concentrations > 10 mM glucose [ 34 , 92 , 99 ], whereas in slices, they tend to be shorter than 10 seconds [ 16 , 27 , 33 , 111 ]. 
The exact mechanism behind these differences remains to be explained, but in addition to possible differences in ionic composition and the presence of additional secretagogues in the extracellular fluid that can affect the patterns of oscillations [ 92 , 112 , 113 ], the mechanical and enzymatic stress during preparation of isolated islets [ 114 , 115 ], as well as culture conditions and duration [ 99 , 116 ] have been put forward as possible sources of these differences. More specifically, alpha cells have been suggested as a potential source of local proglucagon peptides [ 117 ]. They are primarily situated in the mantle of pancreatic islets in mice, and this outer region is particularly susceptible to damage during the islet isolation process, potentially resulting in the loss of alpha cells during islet preparation. Given that both glucagon and GLP-1 have been shown to elevate the frequency of oscillations in beta cells, the diminished intra-islet alpha cell signalling could be a contributing factor to the observed decrease in beta cell oscillatory frequency in isolated islets [ 71 , 91 , 118 , 119 ]. Further, there may be a run-down of certain ion channels and changes in the expression [ 120 , 121 ] with time, which obviously impact the identity and physiology of beta cells in the cultured isolated islets more than in the immediately used islets in slices [ 122 , 123 ]. This theory is at least partly supported by the finding that oscillations in mouse islets cultured for less than one day closely resemble oscillations in non-cultured islets [ 103 , 124 ] and in islets studied in vivo [ 125 , 126 ] or rapidly after the death of the animal [ 92 ], as well as the oscillations in tissue slices [ 27 ]. Until the influence of the above factors is fully understood, we can provide at least two practical suggestions. 
First, studies on isolated islets and tissue slices should always exactly state what the composition of the extracellular fluid was, and which type of oscillations were used for the network analyses, as well as provide details about the basic characteristics of these oscillations, i.e., their frequency and duration. Second, freshly microdissected islets or islets cultured for shorter time periods may yield results that are more closely comparable with results from tissue slices. Finally, the above advice also applies for studies utilizing yet other experimental preparations, such as in vivo imaging of isolated and transplanted mouse islets in the anterior chamber of the eye [ 127 ] and islets from other species, as mentioned in the preceding paragraph. In the present study, we used a range of different stimulatory concentrations. They are not intended to illustrate possible glucose-dependencies of different physiological and network metrics as these are covered elsewhere [ 13 , 27 , 33 ], but to demonstrate that the analytical tools work robustly across a range of frequently used stimulatory conditions. Given that the slow oscillations are rather glucose-insensitive in terms of their frequency in both slices [ 96 ] and islets [ 98 ] and that fast oscillations have comparable dose-response relationships in slices [ 27 , 33 ] and isolated islets [ 128 – 130 ], we believe that the different concentrations we used did not introduce any critical bias and that most of our findings are applicable to concentrations beyond the range used here.

In conclusion, we would like to stress that the scope of network analyses has, in recent years, been extended to investigate intercellular interactions and functional connectivity patterns in different types of tissues. These encompass various kinds of neural assemblies [ 131 ], pituitary endocrine cells [ 132 , 133 ], astrocytes [ 134 ], yeast cells [ 135 ], distinct epithelial cell types [ 136 , 137 ], acinar cells [ 138 ], and hepatocytes [ 139 ]. As such, the insights we present herein hold relevance for comprehending the intricacies of collective cellular activity across diverse contexts, where the assessment of multicellular dynamics can be achieved through suitable imaging techniques. Moreover, in tandem with advancements in imaging methods, which are expected to soon enable the simultaneous high-resolution assessment of multiple variables defining multicellular activity, potentially even in three dimensions, it is imperative to stay attuned to progress on the computational front. Over recent years, a plethora of sophisticated methods has emerged for evaluating dynamic interactions within complex systems, such as multilayer networks [ 140 , 141 ], detection of higher-order interactions [ 142 , 143 ], information-theoretic metrics describing causal relationships [ 144 , 145 ], and deep learning-based methods [ 146 , 147 ]. These approaches hold substantial potential for further and more profound research, extending even into the realm of multicellular systems, as already demonstrated by some recent studies [ 13 , 56 , 148 – 150 ]. We strongly believe that future progress in this field will rely on such interdisciplinary endeavors that combine cutting-edge experiments with innovative computational procedures. Along these lines, we anticipate a deeper understanding of how heterogeneous populations of interacting cells, placed within a dynamic and noisy environment, operate to ensure proper functionality, and how the regulatory mechanisms are altered in disease.

Materials and methods

Ethics statement.

We conducted the study in strict accordance with all national and European recommendations on the care and handling of experimental animals, and all efforts were made to minimize the suffering of animals. Mice were used under protocols approved by the University of Colorado Institutional Animal Care and Use Committee (IACUC Protocol number: 00024) and The Administration of the Republic of Slovenia for Food Safety, Veterinary and Plant Protection (permit number: U34401-35/2018-2).

Animals and [Ca2+]i imaging in tissue slices

Slice preparation.

C57BL/6J and NMRI male and female mice were held in a temperature-controlled environment with a 12 h light/dark cycle and given continuous access to food and water. Acute pancreas tissue slices were prepared as described previously in full [ 122 ]. In brief, after sacrifice with CO2 and cervical dislocation, the abdominal cavity is accessed via laparotomy and the papilla Vateri is clamped. Low-melting-point agarose (1.9%) dissolved in extracellular solution (ECS) containing (in mM) 125 NaCl, 26 NaHCO3, 6 glucose, 6 lactic acid, 3 myo-inositol, 2.5 KCl, 2 Na-pyruvate, 2 CaCl2, 1.25 NaH2PO4, 1 MgCl2, and 0.5 ascorbic acid is heated to 40°C and injected through the bile duct. The pancreas is cooled with ice-cold ECS, extracted, and cut into tissue blocks, which are embedded in low-melting-point agarose and cut with a vibratome (VT 1000 S, Leica) to yield 140 μm slices. The slices are kept in HEPES-buffered saline (HBS) consisting of (in mM) 150 NaCl, 10 HEPES, 6 glucose, 5 KCl, 2 CaCl2, and 1 MgCl2, titrated to pH 7.4 with 1 M NaOH at room temperature, and stained with an HBS staining solution containing 7 μM Calbryte 520 AM (AAT Bioquest), 0.03% Pluronic F-127 (w/v), and 0.12% dimethyl sulfoxide (v/v) for 50 min at room temperature. All chemicals were obtained from Sigma-Aldrich (St. Louis, Missouri, USA) unless stated otherwise. Individual tissue slices were placed into the recording chamber and used for one stimulation protocol. Under basal conditions, the recording chamber was continuously perifused with carbogenated ECS containing 6 mM glucose and heated to 37°C. At 20–40 minutes, the perifusion was manually switched to a stimulatory glucose concentration (8–12 mM) before it was returned to the basal glucose concentration.

Beta cell calcium dynamics were imaged using an upright confocal microscope system (Leica TCS SP5 AOBS Tandem II with a 20X HCX APO L water immersion objective, NA 1.0) or an inverted confocal system (Leica TCS SP5 DMI6000 CS with a 20X HC PL APO water/oil immersion objective, NA 0.7). A 488 nm argon laser was used to excite the fluorescent dye, and a Leica HyD hybrid detector operating in the 500–700 nm range was used to detect the emitted fluorescence (all from Leica Microsystems, Germany), as previously described [ 27 , 122 ]. Time series were acquired at a resolution of 512 × 512 pixels and a sampling frequency of 2–10 Hz.

[Ca2+]i imaging in isolated islets

Islet isolation and culture.

Islets were isolated from mice under ketamine/xylazine anaesthesia (80 and 16 mg/kg) by collagenase delivery into the pancreas via injection into the bile duct. The collagenase-inflated pancreas was surgically removed and digested. Islets were handpicked and plated into glass-bottom dishes (MatTek) using CellTak cell tissue adhesive (Sigma-Aldrich). Islets were cultured in RPMI medium (Corning, Tewksbury, MA) containing 10% fetal bovine serum, 100 U/mL penicillin, and 100 mg/mL streptomycin. Islets were incubated at 37°C and 5% CO2 for 24–72 h before imaging.

One hour prior to imaging, the culture medium of the isolated islets was replaced by an imaging solution (125 mM NaCl, 5.7 mM KCl, 2.5 mM CaCl2, 1.2 mM MgCl2, 10 mM HEPES, and 0.1% BSA, pH 7.4) containing 2 mM glucose and the [Ca2+]i-sensitive dye Fluo-4 AM (4 mM). After one hour, the solution was replaced by dye-free imaging solution. During imaging, the glucose level was raised from 2 mM to 11 mM. Islets were imaged using either an LSM780 system (Carl Zeiss, Oberkochen, Germany) with a 40x 1.2 NA objective or an LSM800 system (Carl Zeiss) with a 20x 0.8 NA PlanApochromat objective or a 40x 1.2 NA objective, with samples held at 37°C. The resolution was 512 × 512 pixels and time series were recorded at frequencies of 1–2 Hz.

Pre-processing of recorded [Ca2+]i time series

Fluorescence signals of Calbryte 520 AM or Fluo-4, representing time series for manually selected regions of interest (ROIs), i.e., individual beta cells, were exported along with their corresponding coordinates using custom software called ImageFiltering (copyright Denis Špelič) or ImageJ [ 151 ]. As both dyes can detect the fast and the slow component in beta cells [ 61 , 152 ], data obtained with either dye were pre-processed in the same way. Time series that exhibited large artifacts, a low signal-to-noise ratio, or dynamics inconsistent with beta cells were excluded after visual inspection. The recordings from tissue slices underwent band-pass filtering using a zero-lag filter to extract either the fast-activity component (with typical cut-off frequencies of 0.05 and 2.0 Hz) or the slow-activity component (with cut-off frequencies of 0.001 and 0.07 Hz). Similarly, the recordings from isolated islets underwent band-pass filtering to eliminate baseline drifts and capture the oscillatory component (with typical cut-off frequencies of 0.005 and 0.25 Hz). Fast-component signals from slices and oscillatory signals from isolated islets were further smoothed using an adjacency averaging procedure and then binarized by setting values to 1 (active state) for periods of increased [Ca2+]i signals or 0 (inactivity) for periods of low-amplitude signals. All subsequent analyses were performed on the raw, filtered (fast or slow oscillatory component), or binarized cellular signals. The binarized signals were also used to calculate the relative active time. This metric is the fraction of the recording time that a given cell spends in the active state and thereby indicates the overall cellular activity.
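The pre-processing chain described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions and not the code from the authors' repository: an FFT mask stands in for the zero-lag band-pass filter, the binarization rule (a fixed fraction of the signal range, `rel_threshold`) is hypothetical, and the function names and the synthetic trace are invented for the example.

```python
import numpy as np

def bandpass_fft(signal, fs, f_low, f_high):
    """Zero-phase band-pass by zeroing FFT bins outside [f_low, f_high].
    Illustrative stand-in for the zero-lag filter mentioned in the text."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < f_low) | (freqs > f_high)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def binarize(signal, rel_threshold=0.5):
    """Return 1 where the signal exceeds a fraction of its range (hypothetical rule)."""
    lo, hi = signal.min(), signal.max()
    return (signal > lo + rel_threshold * (hi - lo)).astype(int)

def relative_active_time(binary):
    """Fraction of samples in which the cell is in the active state."""
    return float(np.mean(binary))

# Synthetic trace: slow drift (0.005 Hz) plus fast oscillation (0.1 Hz), sampled at 10 Hz.
fs = 10.0
t = np.arange(6000) / fs
trace = 0.5 * np.sin(2 * np.pi * 0.005 * t) + np.sin(2 * np.pi * 0.1 * t)

fast = bandpass_fft(trace, fs, f_low=0.05, f_high=2.0)  # fast-component cut-offs from the text
binary = binarize(fast)
rat = relative_active_time(binary)
```

For real recordings the threshold rule would need to be adapted to the noise level and the shape of the oscillations, and the smoothing step is omitted here.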

Evaluating synchronicity between [Ca2+]i traces

By using Eqs (1), (2), and (6), we can construct similarity matrices of size (N, N), where N stands for the number of cells, that encode the correlation, coactivity, and normalized mutual information (MI), respectively, between all cell pairs in individual recordings. Notably, MI also captures non-linear relationships between the discretized time series.
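As an illustration of how such (N, N) similarity matrices can be assembled, the following sketch computes a Pearson correlation matrix and a coactivity matrix with NumPy. The coactivity definition used here (shared active time divided by the geometric mean of the two individual active times) is an assumption, since the equations are not reproduced in this text; the normalized mutual information is omitted for the same reason. The function names and the toy data are hypothetical.

```python
import numpy as np

def pearson_matrix(signals):
    """signals: array of shape (N cells, T samples) -> (N, N) Pearson correlations."""
    return np.corrcoef(signals)

def coactivity_matrix(binary):
    """Assumed definition: C_ij = T_ij / sqrt(T_i * T_j), where T_ij is the number of
    samples in which cells i and j are active simultaneously."""
    binary = np.asarray(binary, dtype=float)
    overlap = binary @ binary.T                      # shared active samples
    active = binary.sum(axis=1)                      # active samples per cell
    norm = np.sqrt(np.outer(active, active))
    # Cells that are never active get a coactivity of 0 instead of a division by zero.
    return np.divide(overlap, norm, out=np.zeros_like(overlap), where=norm > 0)

# Toy binarized signals: cells 0 and 1 fire together, cell 2 fires in antiphase.
binary = np.array([[1, 1, 0, 0, 1, 0],
                   [1, 1, 0, 0, 1, 0],
                   [0, 0, 1, 1, 0, 1]])
R = pearson_matrix(binary.astype(float))
C = coactivity_matrix(binary)
```

Both matrices are symmetric, so only the upper triangle needs to be inspected when networks are built from them.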

Network construction and analysis

Alternatively, a variable similarity threshold can be used instead of a fixed threshold. In this case, a target average node degree is pre-set and the threshold is varied until the resulting network attains it. In our analyses the variable threshold was determined so that the resulting network had an average degree of 8. This value was used to mimic the connectivity of realistic beta cell network architectures [ 156 ] and to obtain adequately dense networks suitable for analyses. However, it should be noted that previous studies have demonstrated that, within reasonable limits, the conclusions drawn from network analyses are not significantly influenced by the somewhat arbitrary choice of the average degree [ 13 , 96 ].
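A minimal sketch of the variable-threshold idea is given below, assuming a symmetric similarity matrix as input. Instead of iteratively varying the threshold, it sorts the pairwise similarities and reads off the value that leaves the desired number of edges, which yields the same result when there are no ties. The function names and the random test matrices are hypothetical; the second matrix illustrates how a threshold determined on a first interval could be reused on a second one.

```python
import numpy as np

def threshold_for_avg_degree(sim, k_target):
    """Threshold that keeps the strongest pairs so that the average degree is about k_target."""
    n = sim.shape[0]
    pair_values = np.sort(sim[np.triu_indices(n, k=1)])[::-1]  # strongest pairs first
    n_edges = int(round(k_target * n / 2))                     # k_avg = 2 * E / N
    n_edges = max(1, min(n_edges, pair_values.size))
    return pair_values[n_edges - 1]

def adjacency_from_threshold(sim, threshold):
    """Binary adjacency matrix without self-loops."""
    adj = (sim >= threshold).astype(int)
    np.fill_diagonal(adj, 0)
    return adj

# Random symmetric matrices standing in for two recording intervals of one islet.
rng = np.random.default_rng(0)
a, b = rng.random((50, 50)), rng.random((50, 50))
sim_first, sim_second = (a + a.T) / 2, (b + b.T) / 2

r_th = threshold_for_avg_degree(sim_first, k_target=8)
adj_first = adjacency_from_threshold(sim_first, r_th)
adj_second = adjacency_from_threshold(sim_second, r_th)   # same threshold, second interval
k_first = adj_first.sum() / adj_first.shape[0]
k_second = adj_second.sum() / adj_second.shape[0]
```

Here `k_first` matches the target by construction, whereas `k_second` is free to differ and thus reflects changes in synchronicity between the two intervals.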

Based on the computed abstract distances, an MST can be constructed with so-called greedy algorithms such as Kruskal’s [ 157 ] or Prim’s [ 158 ] algorithm. These algorithms create graphs with N − 1 edges (N being the number of nodes) that have the lowest possible sum of edge weights (Σ D i,j ) without creating any cycles. We extend this idea to the generation of a multilayer MST, in which an MST is computed repeatedly for the same set of nodes, but edges (i.e., cell pairs) that already belong to a previous layer are excluded from the calculation of the next MST layer. In our analyses we calculated four MST layers, which yielded an average node degree of approximately 8 (the average degree of a single MST is approximately 2, and each of the three subsequent layers contributes approximately 2 more).
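The layered construction can be sketched with a plain Kruskal implementation, shown here in pure Python. This is an illustrative sketch rather than the authors' implementation: the distance matrix is random, the function names are hypothetical, and the way distances are derived from similarities is not specified here. Note that a later layer can contain fewer than N − 1 edges if the remaining cell pairs no longer connect all nodes.

```python
import random

def kruskal_mst(n, edges):
    """Kruskal's algorithm with union-find. edges: iterable of (weight, i, j)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in sorted(edges):           # lightest edges first
        ri, rj = find(i), find(j)
        if ri != rj:                        # adding (i, j) creates no cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree

def multilayer_mst(dist, n_layers):
    """Compute n_layers edge-disjoint MST layers from a symmetric distance matrix."""
    n = len(dist)
    all_edges = [(dist[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    used, layers = set(), []
    for _ in range(n_layers):
        candidates = [e for e in all_edges if (e[1], e[2]) not in used]
        tree = kruskal_mst(n, candidates)
        used.update(tree)                   # exclude these pairs from later layers
        layers.append(tree)
    return layers

# Random symmetric distances between 20 hypothetical cells.
rnd = random.Random(0)
n_cells = 20
dist = [[0.0] * n_cells for _ in range(n_cells)]
for i in range(n_cells):
    for j in range(i + 1, n_cells):
        dist[i][j] = dist[j][i] = rnd.random()

layers = multilayer_mst(dist, n_layers=4)
n_total_edges = sum(len(layer) for layer in layers)
k_avg = 2 * n_total_edges / n_cells
```

Because each full layer contributes N − 1 edges, the resulting average degree is slightly below 2 per layer, approaching 8 for four layers as N grows.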

For each extracted network, we calculated several basic network parameters, such as the average node degree ( k avg ) and degree distribution, the average clustering coefficient ( C avg ) and clustering coefficient distribution, modularity ( Q ), global efficiency ( E ), relative largest component ( S max ), edge length distribution, and the small-world coefficient ( SW ). See Ref. [ 159 ] for technical details and Ref. [ 56 ] for the physiological meaning of these specific network parameters.

Quantifying inter-network similarity

In other words, inter-network similarity is defined as the ratio between the cardinality, i.e., the total number of edges of the intersection of edges in networks α and α ′, and the cardinality of the corresponding union. The resulting value of NSI ranges from 0 to 1, where 0 indicates no common edges between the networks and 1 indicates identical networks. This method was used to assess the similarity between functional networks derived from various oscillatory components and constructed with the above-described construction techniques. We additionally quantified the similarity of these networks with the postulated structural networks of islet cells, which we constructed as geometric networks by appropriate intercellular distance thresholding.
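The verbal definition above corresponds to the Jaccard index of the two edge sets, which can be sketched as follows. The function name and the toy edge lists are hypothetical, and returning 1.0 for two empty networks is a convention chosen for this example only.

```python
def network_similarity_index(edges_a, edges_b):
    """NSI = |E_a ∩ E_b| / |E_a ∪ E_b| for undirected edge lists."""
    # frozenset makes (i, j) and (j, i) the same undirected edge.
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    if not union:
        return 1.0  # assumption: two empty networks are treated as identical
    return len(a & b) / len(union)

# Toy example: two edges are shared, four distinct edges exist in total.
fast_edges = [(0, 1), (1, 2), (2, 3)]
slow_edges = [(1, 0), (2, 3), (3, 4)]
nsi = network_similarity_index(fast_edges, slow_edges)
```

The same function can be applied to a functional network and a geometric (structural) network, provided that both use the same cell indices.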

Methods for time series processing, analyses of cellular signals, and network analyses were implemented in the Python programming language (version 3.11.1), using the following packages: NumPy ( https://numpy.org/ ), Matplotlib ( https://matplotlib.org/ ), and NetworkX ( https://networkx.org/ ). All code is available in the GitHub repository: https://github.com/MarkoSterk/beta_cell_analysis_suite

Supporting information

S1 Fig. Collective beta cell activity under the protocol of a glucose ramp.

A) Ca 2+ traces of all responding beta cells in the slice (upper panel) and the corresponding raster plot of binarized fast Ca 2+ oscillations. The glucose concentration was ramped from 6 mM to 12 mM, as indicated at the top. B) Functional beta cell networks extracted in different glucose concentrations and with different thresholding techniques. The fixed threshold approach ( R th = 0.8) leads to very different network structures under different stimulation levels. Under lower glucose, when the degree of correlated beta cell dynamics is low, the networks are sparse and segregated. With increasing stimulation, the networks become progressively more integrated and dense (i.e., average node degree k avg is increasing), highlighting the heightened intercellular coordination. Conversely, the fixed avg. degree and multilayer MST approaches fail to capture this behavior, as they enforce a fixed number of connections, irrespective of the level of coordinated intercellular activity. Furthermore, utilizing a fixed average degree under conditions of low multicellular activity results in exceedingly low thresholds ( R th < 0.5), thereby promoting the establishment of functional connections by chance, which introduces unpredictability into the network analysis. Consequently, techniques that enforce a fixed number of connections are unsuitable for experiments where the level of activity changes significantly.

https://doi.org/10.1371/journal.pcbi.1012130.s001

S2 Fig. Exploring the Impact of Oscillatory Components and Calcium Signal Processing on Functional Network Structure.

The figure presents four types of networks derived from analysis of the five different islets examined in Fig 5 : i) A structural network modelled as a geometric network, wherein nearby cells are deemed connected. ii) A functional network derived from unprocessed signals. iii) A functional network extracted from the fast oscillatory component. iv) A functional network constructed based on the slow oscillatory component. All four networks were designed with a fixed average degree k avg = 8. Remarkably, across all five islets, the functional network based on the fast oscillatory component exhibits the fewest long-range connections and shows the highest similarity to the hypothesized structural network. In contrast, networks derived from unprocessed or slow-component signals display a greater proportion of long-range connections, exhibit similar characteristics to each other, and diverge significantly from the structural network.

https://doi.org/10.1371/journal.pcbi.1012130.s002

S3 Fig. Investigating the influence of oscillatory component on functional network structure in an isolated islet.

A) The average unprocessed (black) and extracted slow-component Ca 2+ signal (blue) from a Gcamp mouse islet are depicted. The inset shows the corresponding derived fast-component signal (red). B) Different types of beta cell networks: structural (modelled as a geometric network) and three functional networks derived from the unprocessed, slow-component, and fast-component Ca 2+ dynamics. Hub cells are highlighted in red. C) Inter-network similarity matrix quantifying the degree of overlap between the four networks. Evidently, the networks extracted from the unprocessed and slow-component traces are very similar, while the fast component network exhibits the highest degree of similarity with the structural network. In contrast, the similarity between the networks derived from unprocessed and slow-component signals and the structural network is notably lower, mirroring observations in tissue slices (see Figs 6 and S2 ).

https://doi.org/10.1371/journal.pcbi.1012130.s003

S4 Fig. Comparative analysis of functional intercellular network design methods.

A) Three representative beta cell signals (red line) and three alpha cell signals (blue line) subjected to the indicated stimulation protocol: 9 mM -> 10 mM -> 11 mM -> 11 mM glucose + 1 μM epinephrine. This protocol was used to functionally discriminate alpha and beta cells, as the addition of 1 μM epinephrine activates alpha cells and inhibits beta cells. B) Functional networks were extracted using two methods: the fixed average degree method (left) and the four-layered multilayer minimum spanning tree (MST) method (right). The multilayer MST method enforced connections to all cells, including those with asynchronous dynamics, such as alpha cells. Consequently, alpha cells were integrated into the functional network despite their lack of correlation with the rest of the syncytium. This highlights the unsuitability of the MST method for network analyses involving elements with diverse dynamics. Alpha cells are indicated with blue circles and beta cells with red circles.

https://doi.org/10.1371/journal.pcbi.1012130.s004

Acknowledgments

We thank Jasmina Jakopiček, Nika Polšak, Rudi Mlakar, and Maruša Plesnik Rošer for their excellent technical assistance.

  • 28. Kravets V, Dwulet JM, Schleicher WE, Hodson DJ, Davis AM, Pyle L, et al. Functional architecture of the pancreatic islets reveals first responder cells which drive the first-phase [Ca2+] response. bioRxiv. 2021; 2020.12.22.424082. https://doi.org/10.1101/2020.12.22.424082
  • 50. Clair JRS, Westacott MJ, Miranda J, Farnsworth NL, Kravets V, Schleicher WE, et al. Restoring Connexin-36 Function in Diabetogenic Environments Precludes Mouse and Human Islet Dysfunction. bioRxiv. 2023; 2020.11.03.366179. https://doi.org/10.1101/2020.11.03.366179

Title: An effectiveness study across baseline and neural network-based force estimation methods on the da Vinci Research Kit Si system

Abstract: In this study, we further investigate the robustness and generalization ability of a neural network (NN) based force estimation method, using the da Vinci Research Kit Si (dVRK-Si). To evaluate our method's performance, we compare the force estimation accuracy with several baseline methods. We conduct comparative studies between the dVRK Classic and dVRK-Si systems to benchmark the effectiveness of these approaches. We conclude that the NN-based method provides comparable force estimation accuracy across the two systems, as the average root mean square error (RMSE) over the average range of force ratio is approximately 3.07% for the dVRK Classic and 5.27% for the dVRK-Si. On the dVRK-Si, the force estimation RMSEs for all the baseline methods are 2 to 4 times larger than those of the NN-based method in all directions. One possible reason is that the baseline methods assume that static forces remain the same or that the dynamics are time-invariant. These assumptions may hold for the dVRK Classic, as it has pre-loaded weight and maintains horizontal self-balance. Since the dVRK-Si configuration does not have this property, the assumptions no longer hold, and the NN-based method therefore significantly outperforms the baselines.

ORIGINAL RESEARCH article

A maize seed variety identification method based on improving deep residual convolutional network.

Jian Li,

  • 1 College of Information Technology, Jilin Agricultural University, Changchun, China
  • 2 College of Information Technology, Jilin Bioinformatics Research Center, Changchun, China
  • 3 School of Data Science and Artificial Intelligence, Jilin Engineering Normal University, Changchun, China
  • 4 College of Engineering Technical, Jilin Agricultural University, Changchun, China

Seed quality and safety are related to national food security, and seed variety purity is an essential indicator in seed quality detection. This study established a maize seed dataset comprising 5877 images of six different types and proposed a maize seed recognition model based on an improved ResNet50 framework. Firstly, we introduced the ResStage structure in the early stage of the original model, which facilitated the network’s learning process and enabled more efficient information propagation across the network layers. Meanwhile, in the later residual blocks of the model, we introduced both the efficient channel attention (ECA) mechanism and depthwise separable (DS) convolution, which reduced the model’s parameter cost and enabled the capturing of more precise and detailed features. Finally, a Swish-PReLU mixed activation function was introduced globally to improve the overall predictive power of the model. The results showed that our model achieved an impressive accuracy of 91.23% in corn seed classification, surpassing other related models. Compared with the original model, our model improved the accuracy by 7.07%, reduced the loss value by 0.19, and decreased the number of parameters by 40%. The research suggested that this method can efficiently classify corn seeds, holding significant value in seed variety identification.

1 Introduction

Corn is the most widely grown cereal crop worldwide and is extensively used in food processing and as a primary component of animal feed ( Shafinas et al., 2022 ). Seed purity refers to the degree of consistency in typical characteristics and directly affects the yield and quality of corn. During seed harvesting and storage, impurities may inadvertently be mixed in with normal seeds, leading to economic losses in agricultural production and processing. During seed sales, some individuals or companies pass off inferior maize varieties as superior ones, aiming to make excessive profits ( Sun and Zou, 2022 ). This fraudulent behavior may damage investors’ interests and disrupt the seed market ( Park et al., 2016 ). Therefore, there is an urgent need for a non-destructive and efficient identification method for screening and grading maize seeds before they are marketed, to support agricultural production, quality control, and market regulation ( Tenaillon and Charcosset, 2011 ).

Traditional methods for seed purity identification include morphological inspection, field planting inspection, chemical identification, and electrophoresis technology ( Ye-Yun et al., 2005 ; Sundaram et al., 2008 ; Pallavi et al., 2011 ; Satturu et al., 2018 ). However, these methods generally take a long time, require professional personnel and specialized equipment, and are often subject to the subjective experience of the testers. Additionally, the identification process may damage the samples. Hence, there is a need to develop a rapid, accurate, and non-destructive classification method for maize seed identification.

Deep learning has emerged as a critical research focus across various domains, particularly in the realm of computer vision. Integrating deep learning techniques with image processing has found widespread applications in seed classification and identification ( Javanmardi et al., 2021 ; Tu et al., 2021 ). For instance, Zhu ( Zhu et al., 2019 ) developed a self-designed Convolutional Neural Network (CNN) to classify seven varieties of cotton seeds, achieving an accuracy rate exceeding 80% and outperforming residual networks and other traditional models. Similarly, Rybacki ( Rybacki et al., 2023 ) constructed a CNN with a fixed architecture comprising five alternating layers of Conv2D, MaxPooling2D, and Dropout. This model successfully identified seeds from three winter rapeseed varieties, attaining the highest validation accuracy of 85.6%. Altuntaş ( Altuntaş et al., 2019 ) employed a transfer learning approach using CNN to automatically differentiate between haploid and diploid corn kernels, achieving accuracy rates of over 90% across all models. In another study, Zhang ( Zhang et al., 2020 ) proposed a deep learning model that combines near-infrared hyperspectral imaging (NIR-HSI) to determine the variety of coated maize seeds. Spectral reflectance values were extracted to train both CNN and Long Short-Term Memory (LSTM) models. The test results demonstrate that all models achieved classification accuracies exceeding 90%. Ma ( Ma et al., 2020 ) integrated NIR-HSI and CNN deep learning techniques to differentiate between viable and non-viable seeds, achieving a seed detection rate of 90% in the process. Zhang ( Zhang et al., 2021 ) investigated the feasibility of combining hyperspectral imaging (HSI) with deep CNN for classifying four varieties of maize seeds. The study showed that the classification performance of the deep CNN model was generally the highest among all varieties, with a validation accuracy of 93.3%. 
Yu ( Yu et al., 2021 ) utilized HSI (948.17-1649.20nm) combined with CNN technology to identify 18 species of hybrid okra seeds, achieving a recognition rate of 93.79%. This demonstrates the reliable advantage of CNN models in achieving high accuracy and stability. Wang ( Wang and Song, 2023 ) utilized hyperspectral imaging technology combined with deep learning methods to identify various varieties of sweet corn seeds. The results indicated that the deep learning model achieved a classification accuracy of over 95% on both the training and testing datasets. Bi ( Bi et al., 2022 ) improved the Swin Transformer model and applied transfer learning to achieve high-precision classification and recognition of corn seed images, with an average accuracy of 96.53%. Xing ( Xing et al., 2023 ) proposed a network model called GC_DRNet, incorporating the concept of dense networks and achieving an accuracy of 96.98% on a wheat seed dataset. Deep learning algorithms are gradually becoming optimal for establishing lossless detection models ( Zhou et al., 2020 ; Zhang et al., 2023 ).

According to the studies above, efficient identification of seed varieties is challenging due to similar appearance, genetic diversity, and growth environment. Therefore, previous work has predominantly relied on combining neural networks with hyperspectral data to recognize seed varieties effectively. Although this approach outperforms using convolutional neural networks alone for recognition, hyperspectral data are not easy to acquire, and the processing involved is complex. To address these limitations, this study is based on a pure image dataset of corn seeds. By improving the classical ResNet50 model, a new convolutional neural network model for corn seed identification is proposed. The main contributions and novelties of this work are listed as follows.

1. We introduced the ResStage structure in the early stages of ResNet50, enhancing the residual blocks to improve the model’s feature extraction and network representation capabilities. This enables it to capture and convey image features more effectively.

2. In the later stages of the network, we incorporated the ECA module and depthwise separable convolution. The attention mechanism strengthens the focus on channel information, while the use of depthwise separable convolution aims to reduce time costs, further enhancing the model’s ability to capture more precise and detailed features, thereby increasing the efficiency of model recognition.

3. We introduced a global hybrid activation function by combining different activation functions, enhancing the model’s generalization ability and accuracy during the prediction phase, allowing it to process input data better and make more accurate predictions.

The remainder of this article is organized as follows: the section “Materials and Methods” describes the datasets and gives an overview of the methods, the section “Results and Discussions” presents and discusses the experimental results, and the section “Conclusions” gives concluding remarks.

2 Materials and methods

2.1 Image acquisition and preprocessing

2.1.1 Data source and acquisition

The six varieties of corn seeds, JD407, JD50, JD83, JD953, JD209, and JD626, were provided by the Corn Institute of Jilin Academy of Agricultural Sciences in Jilin Province. These seeds were photographed using a Canon 70D camera. The high-definition color images are shown in Figure 1 . During sampling, experts selected and certified the seeds and manually screened them to select whole, uniformly shaped seeds as experimental samples while removing impurities and dust. Subsequently, image acquisition was carried out. All samples appeared normal, displaying a neat exterior without any visible damage. Approximately 900 to 1000 samples were randomly selected from each variety for imaging and stored in sealed plastic packaging at room temperature (20 ± 1°C). This sample size was chosen because deep learning networks require a large number of samples for proper training ( Wen, 2020 ).

Figure 1 Image of maize seed varieties.

2.1.2 Image preprocessing and segmentation

The research on maize variety identification focuses on authenticating and ensuring the purity of maize seeds. Seed purity depends on the authenticity of individual seeds. A single-seed identification method is therefore employed, which requires segmenting images that contain multiple maize seeds. Firstly, the image is converted to grayscale to remove color information and highlight brightness-related features. Subsequently, automatic global thresholding and morphological filtering are applied to obtain a binary image, simplifying the image and extracting the contours of the targets. Finally, morphological filtering of the binary image is used to obtain the mask of the maize seed region. This mask is then used to partition the image into individual maize seed images of size 224 × 224, resulting in a total of 5877 original images. The image cutting process is shown in Figure 2 .

Figure 2 Corn seed image cutting processing.
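
The thresholding and region-extraction steps of Section 2.1.2 can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: Otsu's method is assumed as the "automatic global thresholding", seeds are assumed to be brighter than the background, the morphological filtering and the 224 × 224 cropping are omitted, and the names `otsu_threshold` and `seed_bounding_boxes` are hypothetical.

```python
from collections import deque


def otsu_threshold(gray):
    """Return the 0-255 threshold that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]            # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with value > t
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (total_sum - sum0) / w1
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def seed_bounding_boxes(gray):
    """Binarize the image and return one (r0, c0, r1, c1) box per bright region."""
    t = otsu_threshold(gray)
    h, w = len(gray), len(gray[0])
    mask = [[gray[r][c] > t for c in range(w)] for r in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


# Toy grayscale image: dark background (10) with two bright "seeds".
image = [
    [10, 10, 10, 10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10, 210, 210, 10],
    [10, 10, 10, 10, 10, 210, 210, 10],
    [10, 10, 10, 10, 10, 10, 10, 10],
]
boxes = seed_bounding_boxes(image)
```

In a real pipeline each bounding box would be cropped from the color image and resized or padded to the network input size.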

Each model was trained using 5-fold cross-validation to reduce the effect of random variation in the experiments. In each fold, 80% of the dataset was randomly selected as the training set and the remaining 20% was held out. Because of the limited sample size, the held-out validation set also served as the test set for evaluating the results. The dataset comprised 4703 images for training and 1174 images for validation, as shown in Table 1 . The final experimental results reported in this paper are the averages over the five runs.
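
As an illustration of the 80/20 split used in each fold, the following sketch partitions 5877 sample indices into five folds. The exact partition rule used in the study is not stated, so this plain random k-fold (with the hypothetical helper `k_fold_indices`) yields fold sizes that differ by an image or two from the 4703/1174 partition in Table 1.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


folds = list(k_fold_indices(5877, k=5))
fold_sizes = [(len(train), len(val)) for train, val in folds]
```

Each image appears in exactly one validation fold, so averaging the five validation accuracies uses every image once for evaluation.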

Table 1 Dataset partition.

2.2 Building the model

2.2.1 ResNet50 model

ResNet is a deep neural network proposed by He et al ( He et al., 2016 ). Due to its deeper network structure, unique residual connection design, and higher parameter efficiency, it can learn complex features, alleviate the vanishing gradient problem, and exhibit good generalization ability while maintaining a relatively fast inference speed. Due to these advantages, ResNet50 has become an ideal model for maize seed recognition, effectively extracting useful information from images for classification purposes. The residual block is an essential structure of ResNet50, which addresses the vanishing gradient and exploding gradient problems in deep neural networks by introducing skip connections and identity mapping. The residual block consists of two primary operations: the main path and the shortcut connection. The main path comprises a series of convolutional, normalization, and non-linear activation layers, which extract high-level representations of the input features. The shortcut connection is a simple mapping that achieves cross-layer information propagation by directly adding the input to the output of the main path. The structure of the residual block is shown in Figure 3 .

Figure 3 Residual block.

2.2.2 ResStage structure

To facilitate the network’s learning process, we need to provide better pathways for information propagation across network layers. C. Duta et al. proposed a simple, practical, stage-based CNN module called the ResStage structure ( Duta et al., 2020 ), as shown in Figure 4 . The ResStage structure has modified the arrangement of components, dividing each central stage into three parts: a Start ResBlock, a Middle ResBlock, and an End ResBlock. The Start ResBlock includes a BN layer after the last conv operation, preparing for element-wise addition through normalization. The End ResBlock is completed by BN and ReLU operations, preparing for a stable transition into the next stage. The module aims to achieve efficient information flow while maintaining controlled signal propagation through learning in these three stages.

Figure 4 ResStage structure: (A) Start ResBlock, (B) Middle ResBlock, (C) End ResBlock.

In the original residual block, the number of ReLU units on the main propagation path is directly proportional to the network depth. In contrast, ResStage contains a fixed number of ReLU units on the main path, facilitating forward and backward information propagation. In the main stage, there are only four ReLU units along the main information propagation path, and they are not affected by changes in depth. This design enables the network to prevent signal obstruction as information passes through multiple layers, enhancing information extraction and learning capability. The complexity of maize seed morphology may challenge traditional feature extraction methods in capturing all essential features. The ResStage structure effectively reduces information loss, extracts more comprehensive feature information, prevents model gradient vanishing, and reduces hyperparameter demand.

2.2.3 Improved residual block

The attention mechanism plays a crucial role in deep learning, effectively and accurately filtering out valuable information from a large amount of data. This is highly beneficial for various image-processing tasks ( Mi et al., 2020 ; Zang et al., 2022 ; Feng et al., 2024 ). Therefore, in this study, we introduced an attention mechanism called ECA (Efficient Channel Attention) after the first convolution of the subsequent residual block ( Wang et al., 2020 ). The ECA module can adaptively adjust the weights of channel features, allowing the network to focus on essential features better. Most maize seeds have similar shapes and delicate textures, affecting recognition after downsampling and making it difficult to extract detailed features from the network. The ECA module helps improve the discriminative ability of features and suppress unimportant features, thereby reducing the risk of overfitting. Ultimately, this enhances feature representation and improves the model’s generalization ability without significantly increasing computational costs. The structure of the ECA module is shown in Figure 5 .

Figure 5 Structure of the efficient channel attention module.

The forward process of the ECA module is as follows: first, the input feature map with a size of H×W×C undergoes global average pooling to obtain a per-channel descriptor y. Then, new weight values ω are generated through a one-dimensional convolution of kernel size k and a sigmoid activation function, completing inter-channel information interaction, as shown in Equation (1) .

$$ \omega = \sigma\left(\mathrm{C1D}_k(y)\right) \tag{1} $$

where $\mathrm{C1D}_k$ represents a one-dimensional convolution with a kernel size of k, y is the output of global average pooling, and σ is the sigmoid activation function. The number of channels C is related to the kernel size k of the one-dimensional convolution through a nonlinear mapping, as shown in Equation (2) .

$$ C = \phi(k) = 2^{\gamma k - b} \tag{2} $$

Thus, we can obtain the final kernel size k, as shown in Equation (3) .

$$ k = \psi(C) = \left| \frac{\log_2 C}{\gamma} + \frac{b}{\gamma} \right|_{odd} \tag{3} $$

where $|t|_{odd}$ denotes the nearest odd number to t, γ is 2, and b is 1.
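
To make Equations (1) and (3) concrete, here is a minimal pure-Python sketch. The function names are hypothetical, and the rounding rule (truncate, then move even values up to the next odd number) is one common convention assumed here for "nearest odd number"; the convolution weights are passed in by hand rather than learned.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size from the channel count (Equation 3)."""
    t = int(abs(math.log2(channels) / gamma + b / gamma))
    # Assumed convention: even values are bumped up to the next odd number.
    return t if t % 2 == 1 else t + 1


def eca_weights(feature_map, conv_kernel):
    """Channel weights for a C x H x W nested list (Equation 1)."""
    # Global average pooling: one scalar per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    k = len(conv_kernel)
    pad = k // 2
    # Zero-pad so the output has one weight per channel.
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = []
    for i in range(len(pooled)):
        z = sum(conv_kernel[j] * padded[i + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid
    return weights


# Kernel sizes for the output channel counts of the four ResNet50 stages.
kernel_sizes = {c: eca_kernel_size(c) for c in (256, 512, 1024, 2048)}

# Three 1 x 1 channels that pool to zero give a neutral weight of 0.5 each.
neutral = eca_weights([[[0.0]], [[0.0]], [[0.0]]], [1.0, 1.0, 1.0])
```

The resulting weights multiply the corresponding channels of the input feature map, so the module adds only k parameters per block.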

In addition, to reduce the computational cost and time consumption of the network model, we incorporate depthwise separable convolution ( Chollet, 2017 ) into the subsequent residual blocks of the maize seed recognition model. Depthwise separable convolution consists of two sub-layers: depthwise convolution and pointwise convolution, as illustrated in Figure 6 . In the first stage of depthwise convolution, convolution operations are performed individually on each channel. In the second stage of pointwise convolution, the number of channels is adjusted to match a predefined output channel number. Unlike conventional convolution, where each kernel operates on the entire input volume, each kernel is responsible for a single channel in depthwise convolution. For example, in a three-channel color image, the first stage of depthwise convolution performs a two-dimensional convolution operation for each channel independently, resulting in three feature maps. Subsequently, the pointwise convolution process is akin to traditional convolution, as it entails a weighted combination of the preceding feature maps along the channel dimension to produce new feature maps.

Figure 6 Structure of the depthwise separable convolution.

Depthwise convolution utilizes a single convolution kernel to perform channel-wise convolutions on input channels, effectively reducing computational complexity and accelerating forward and backward propagation, lowering computation and storage costs. Furthermore, depthwise separable convolution combines information from different channels through pointwise convolution, thus preserving a specific feature extraction capability.
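
The parameter saving can be checked with a short calculation. This sketch counts weights only (biases are ignored), and the 3 × 3, 256-to-256-channel layer is an illustrative choice, not a layer taken from the proposed model.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


standard = standard_conv_params(256, 256, 3)
separable = depthwise_separable_params(256, 256, 3)
ratio = separable / standard   # equals 1 / c_out + 1 / k**2
```

For this layer the separable form needs 67,840 weights instead of 589,824, roughly 11.5% of the original.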

In conclusion, to recognize maize seeds efficiently, we incorporated the ECA module into the subsequent residual blocks and replaced the second convolution with a depthwise separable convolution. This enables us to reduce the model parameter count while enhancing the model’s overall performance. The improved residual block is shown in Figure 7 .

Figure 7 Structure of the improved residual block.

2.2.4 Using mixed activation functions

The activation function is a non-linear function used to increase the non-linearity of the network model between the output of upper-layer nodes and the input of lower-layer nodes in a multi-layer neural network ( Ohn and Kim, 2019 ). For a specific training model, selecting an appropriate activation function can effectively improve the neural network’s performance ( Apicella et al., 2021 ). In order to maximize the expressive power of the model, this paper selects the Swish and PReLU activation functions to replace the original ReLU function at different positions. The corresponding image is shown in Figure 8 .

Figure 8 ReLU, Swish and PReLU activation function curves.

ReLU is the most commonly used activation function, which effectively alleviates the gradient vanishing problem in deep neural networks. Its proposal has led to significant advances in the field of deep learning ( Wang et al., 2018 ). The expression is defined as shown in Equation (4) :

$$ f(x) = \max(0, x) \tag{4} $$

where x is the input. When the input value is less than or equal to 0, the gradient of ReLU is 0, which means that the neuron becomes “dead” and cannot update its weights, resulting in information loss. Therefore, the PReLU activation function was proposed to address the issues of the ReLU function ( He et al., 2015 ). The expression is defined as shown in Equation (5) :

$$ f(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases} \tag{5} $$

where x is the input, and α is a learnable parameter. PReLU is an improvement over Leaky ReLU (LReLU), as it can adaptively learn the negative slope from the data, offering the advantages of fast convergence and low error rates. Additionally, PReLU can be trained by backpropagation and jointly optimized with other layers.

Swish is a novel composite activation function ( Ramachandran et al., 2017 ), and its expression is defined as shown in Equation (6) :

$$ f(x) = x \cdot \mathrm{sigmoid}(x) = \frac{x}{1 + e^{-x}} \tag{6} $$

where x is the input. The Swish activation function is unbounded above, bounded below, smooth, and non-monotonic, which can alleviate the gradient vanishing problem. Furthermore, its performance in deep models surpasses that of the ReLU activation function.

The PReLU and Swish activation functions can, to some extent, address the drawbacks of the ReLU activation function. Therefore, in this study, a combination of these two activation functions is employed to replace the ReLU function at different positions, aiming to enhance the model’s predictive capability for maize seed classification.
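
The three activation functions can be compared numerically with a small sketch. The PReLU slope `alpha=0.25` is only an illustrative starting value (in training it is learned), and Swish is written here in its parameter-free form x · sigmoid(x), which is an assumption about the variant used.

```python
import math


def relu(x):
    """ReLU: zero for non-positive inputs."""
    return max(0.0, x)


def prelu(x, alpha=0.25):
    """PReLU: negative inputs are scaled by a slope alpha (learnable in practice)."""
    return x if x > 0 else alpha * x


def swish(x):
    """Swish: x * sigmoid(x); smooth and non-monotonic for negative x."""
    return x / (1.0 + math.exp(-x))


samples = [-5.0, -1.0, 0.0, 2.0]
table = [(x, relu(x), prelu(x), swish(x)) for x in samples]
```

Unlike ReLU, both PReLU and Swish return non-zero values for negative inputs, so those units still receive gradient. Swish dips to about -0.27 near x = -1 and rises back toward zero for more negative inputs, which is the non-monotonic behavior mentioned above.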

2.2.5 Proposed model

To minimize information loss during the recognition of corn seeds, we introduced the ResStage structure in the early stages of our model. This structure optimizes the positioning of the BN layers and the ReLU activation function, effectively mitigating the negative impact of non-linear activations on information propagation. These adjustments significantly enhance feature extraction and information propagation capabilities. Furthermore, we enhanced the residual structure in the later stages by incorporating the ECA module and depthwise separable convolution into each residual block. This enhancement fosters effective feature interaction while reducing computational costs, thus improving recognition capabilities. Lastly, we globally integrated a mixed activation function into the model. We replaced the activation functions after the skip connections in the End ResBlock and the improved residual blocks, as well as the initial activation function in the network input layer, with the Swish activation function. In all other positions, we replaced the activation function with the PReLU activation function to enhance the overall predictive capacity of the model. The improved model is shown in Figure 9 .

Figure 9 The improved model.

3 Results and discussions

3.1 Experimental setup

The configuration environment for this maize seed classification experiment was as follows: processor, Xeon 5220R; graphics card, NVIDIA Tesla T4; operating system, Windows 10; deep learning framework, PyTorch 1.13.1 on Python 3.8.16; software environment, Anaconda3-2021.11 (Windows version). The specific parameter settings in the experiment are shown in Table 2 .

Table 2 Training hyperparameter information.

To select the optimal learning rate, comparative experiments were conducted with the learning rate set to 0.01, 0.001, and 0.0001. The test results are shown in Table 3 . The experimental results indicate that with a learning rate of 0.001, the original model achieved the highest recognition accuracy on the test set, 84.16%, higher than with the other settings. Therefore, a learning rate of 0.001 was adopted for training.

Table 3 Performance comparison results of different learning rate.

3.2 Comparison experiments of different models

To validate the effectiveness of the new network model, we used model accuracy, model loss, number of model parameters, floating-point operations (FLOPs), and training time per epoch as evaluation metrics. We compared the new network model with five classic convolutional neural networks (ResNet50, Res2Next50, DenseNet201, ConvNext_T, and RepVgg_A2) to assess its performance. The results are shown in Table 4 and Figures 10 and 11 .

Table 4 Comparison experiments of different models.

Figure 10 Results of the accuracy of different model comparison experiments.

Figure 11 Results of the loss of different model comparison experiments.

The comparative experiments show that the proposed corn seed classification model achieved the best accuracy, 91.23%. It also had the lowest loss value (0.27), the lowest parameter count (14.12 M), the lowest FLOPs value (3.2 GMac), and a running time of only 57 s per epoch. Compared to the original model, it showed an improvement of 7.07% in accuracy, a reduction of 0.19 in loss value, a 40% decrease in parameter count, a decrease of 0.92 GMac in FLOPs, and a 3 s reduction in running time per epoch. In comparison, the other models exhibit slower recognition speed, lower accuracy, and weaker generalization ability when classifying corn seed image samples. These findings provide evidence for the superior performance of the proposed model, which converges rapidly to its optimum.

3.3 Ablation experiments

To assess the impact of the ResStage structure, improved residual structure, and mixed activation functions on model performance, we conducted ablation experiments using ResNet50 as the base network. The results, as shown in Table 5 , indicate that integrating these three modules enhances model performance, thereby improving its suitability for classifying maize seed varieties. Furthermore, the simultaneous integration of these modules further enhances model accuracy, providing more reliable and precise classification results for maize seed classification.

Table 5 Comparison of ResNet50 experimental models with different module combinations.

3.3.1 Effect of depthwise separable convolution on network model performance

This study replaced traditional convolution operations with depthwise separable convolutions, in line with the concept of lightweight design. Compared to the original model, the accuracy improvement was only 0.6%. However, restructuring the residual blocks significantly reduced the number of model parameters while still slightly increasing accuracy. This change also lowered the model’s floating-point operation count, ultimately leading to a practical improvement in training efficiency. The overall results of the model before and after the introduction of depthwise separable convolutions are shown in Table 6 .

Table 6 Comparison results before and after adding depthwise separable convolution to the model.

3.3.2 Effect of attentional mechanisms on network model performance

Adding appropriate attention mechanisms to the network can enhance its ability to extract effective image features. In this experiment, we kept other factors constant and introduced different attention mechanisms into the proposed maize seed classification model for comparison. The results are shown in Figure 12 . After introducing the Squeeze-and-Excitation (SE), Convolutional Block Attention Module (CBAM), Coordinate Attention (CA), and ECA modules, the model’s accuracy increased by 1.18%, 0.59%, 1.65%, and 3.07%, respectively, compared to the original model. Among them, the ECA module had the largest effect on network performance. This indicates that by efficiently and accurately calculating attention across channel dimensions, the ECA module can better capture the dependency between features, utilize contextual information, and suppress irrelevant noise, thereby achieving better performance in the task of maize seed recognition.

Figure 12 Recognition results comparison of different attention mechanism models.

To intuitively analyze the effectiveness of the improved maize seed classification model, we utilized the visualization tool Grad-CAM ( Selvaraju et al., 2016 ). Grad-CAM visualizes the image regions that the model focuses on during prediction: it calculates the gradients of the target class score with respect to the feature maps, uses these gradients to weight the feature maps, and ultimately generates a heatmap. The original images are displayed in the first row, while the second and third columns show the Grad-CAM mapping images before and after incorporating the ECA module. The color spectrum from red to blue indicates the degree of contribution.
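
The Grad-CAM computation described above can be sketched in a few lines of pure Python. This is an illustration on toy numbers, not the visualization code used in the study: the feature maps and gradients would normally come from a chosen convolutional layer of the trained network, and the function name `grad_cam` is hypothetical.

```python
def grad_cam(feature_maps, gradients):
    """Return a normalized H x W heatmap from C x H x W feature maps and gradients."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight = spatial mean of the class-score gradient for that channel.
    alphas = [sum(map(sum, g)) / (h * w) for g in gradients]
    # Weighted sum of feature maps, followed by ReLU to keep positive evidence only.
    heat = [
        [max(0.0, sum(a * fm[r][c] for a, fm in zip(alphas, feature_maps)))
         for c in range(w)]
        for r in range(h)
    ]
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]  # scale to [0, 1]
    return heat


# Toy example: channel 0 supports the class, channel 1 argues against it.
features = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [0.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(features, grads)
```

The heatmap is then upsampled to the input resolution and overlaid on the seed image, with high values mapped to the red end of the color scale.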

The visualization of the experimental results is shown in Figure 13 . Before the introduction of the attention mechanism, the model might have focused more on the local features of the seeds, possibly due to the model’s insufficient grasp of the global features of the entire image. Consequently, the heatmaps mainly concentrated on the local areas of the seeds, causing the model to prioritize certain local features while neglecting overall features during prediction. However, after incorporating the ECA module, the model’s attention to channel information increased, enhancing its ability to grasp global features. This enabled the model to better focus on the features of the entire seed, not just the local features, during prediction. Therefore, the ECA module has enhanced the feature extraction capability of the corn seed classification model, enabling it to locate valuable areas within the corn seed images more accurately.

Figure 13 Visualization results of the new network thermal characteristic map before and after improvement.

3.3.3 Effect of mixed activation function on network model performance

The choice of activation function is also crucial during training, as it significantly impacts the performance of the same model. We experimented with three activation functions (LeakyReLU, Swish, and PReLU) in addition to the original ReLU activation function to improve the ResNet50 network architecture, and explored the impact of mixed activation functions on the performance of deep networks. We divided the activation functions into two groups. Activation 1 is the activation function used after the skip connections in the End ResBlock and the improved residual blocks, as well as the first activation function in the network input layer. Activation 2 is the activation function used in all other positions. The results are shown in Figure 14 .

Figure 14 Comparison of recognition results for different combinations of activation functions. Note: A single name represents the global activation function. The activation function before the “-” symbol denotes activation 1, while the activation function after the “-” symbol denotes activation 2.

The results indicate that compared to the original global use of the ReLU activation function, the accuracy improved by 1.12% when using the Swish-PReLU mixed activation function. It outperformed other global activation functions and combinations of mixed activation functions. The Swish activation function, with its non-zero mean within the input range, preserves more information and helps enhance the expressive power of the model. On the other hand, the PReLU activation function provides more detailed information to maximize inter-class differences, such as the texture, lines, and colors of corn seeds, enabling the extraction of detailed features that are challenging to capture. Using the Swish-PReLU mixed activation function, we can leverage the advantages of both functions to achieve better generalization performance and recognition results. This significantly improves the performance of the corn seed classification model.

3.4 Comparison of relevant indicators

This article also uses three metrics to evaluate the model’s performance on different classes: precision, as seen in Equation 7 ( Kosmopoulos et al., 2015 ), recall, as seen in Equation 8 ( Zhu et al., 2010 ), and F1-score, as seen in Equation 9 ( Hai et al., 2017 ). Precision is the proportion of samples predicted as a given category that truly belong to that category. Recall is the proportion of samples that truly belong to a given category that are predicted as that category. The F1 score is the harmonic mean of precision and recall.

$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{8}$$

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{9}$$

where TP refers to the correctly classified positive samples, FP refers to the negative samples mistakenly classified as positive, TN refers to the correctly classified negative samples, and FN refers to the positive samples mistakenly classified as negative.
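The three metrics can be computed directly from the counts defined above. The sketch below is illustrative only: the function names and the example counts are hypothetical and are not taken from Table 7.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Hypothetical counts for one seed category treated as the positive class.
    tp, fp, fn = 90, 10, 30
    p = precision(tp, fp)
    r = recall(tp, fn)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1_score(p, r):.3f}")
```

In a multi-class setting such as the six seed categories, each category is treated in turn as the positive class and all others as negative.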

The results in Table 7 show that, compared with the original model, the improved model achieves better values on every indicator for all six types of corn seeds. The Precision for the six categories increased by 5.7%, 4.8%, 3.5%, 13.2%, 8.3%, and 6.7%, respectively. The Recall improved by 10.3%, 12.3%, 1%, 4.6%, 2.6%, and 11.6%, respectively. Furthermore, the F1 scores improved by 0.08, 0.085, 0.022, 0.089, 0.056, and 0.093, respectively. These findings indicate that the improved network exhibits better recognition performance in the classification of corn seed images.

Table 7 Comparison of model recognition performance evaluation metrics.

To further validate the recognition capability of the identification model proposed in this paper, we provide confusion matrix visualizations for the model before and after improvement in Figure 15 . The improved network model reduces the error rate for every category. The largest reductions are in the misclassification of first-category seeds as the fourth category, of second-category seeds as the fifth category, and of sixth-category seeds as the fourth category.
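A confusion matrix of the kind discussed above can be built from true and predicted labels, and its off-diagonal cells give the misclassification pairs. The following minimal sketch uses hypothetical toy labels and helper names; it does not reproduce the data behind Figure 15.

```python
def confusion_matrix(y_true: list[int], y_pred: list[int], n_classes: int) -> list[list[int]]:
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def top_confusions(matrix: list[list[int]], k: int = 3) -> list[tuple[int, int, int]]:
    """Return up to k (true_class, predicted_class, count) triples for the
    largest off-diagonal cells, i.e. the most frequent misclassifications."""
    n = len(matrix)
    pairs = [
        (matrix[i][j], i, j)
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] > 0
    ]
    pairs.sort(reverse=True)
    return [(i, j, count) for count, i, j in pairs[:k]]


if __name__ == "__main__":
    # Hypothetical labels for three classes (0, 1, 2).
    y_true = [0, 0, 0, 1, 1, 2, 2, 2]
    y_pred = [0, 2, 2, 1, 1, 2, 0, 2]
    m = confusion_matrix(y_true, y_pred, n_classes=3)
    for row in m:
        print(row)
    print(top_confusions(m, k=2))
```

Comparing the off-diagonal cells of two such matrices, one per model, is one way to quantify which misclassification pairs an improved model reduces.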

Figure 15 Model confusion matrix visualization. (A) Original model confusion matrix visualization. (B) Our model confusion matrix visualization.

In summary, the improved model can better extract fine-grained features such as color and texture information from corn seeds, leading to a significant reduction in recognition error rates. However, the model still needs to improve in identifying seeds in the fourth and fifth categories. Therefore, improving the recognition rates for these particular categories will be a focal point of our future research efforts.

3.5 Comparison of related studies

Detailed comparisons with related studies were not feasible in this experiment because of the different methods, datasets, and classification criteria employed. Nonetheless, we compared several applications in agricultural classification tasks according to dataset size, application, method used, and accuracy. The comparisons, shown in Table 8 , indicate that the accuracy of the different classification tasks is above 85%, with most methods using deep learning models combined with HSI or employing transfer learning. In contrast, the method proposed in this paper achieved an accuracy of over 90% using a CNN alone. This supports the sample size selection and the effectiveness of the proposed approach, and it provides a useful reference for agricultural product classification.

Table 8 Comparison of the proposed model and related studies (seeds).

3.6 Validation of model generalization ability

To further validate the generalizability and robustness of the model, this study used the maize dataset of Chunguang Bi et al. ( Wang and Song, 2023 ). The dataset consists of 19 categories of maize seeds, making it representative and challenging. As shown in Table 9 , the improved model raised accuracy on this dataset from 90.17% to 93.96%. The other performance metrics, including precision, recall, and F1 score, also improved. This indicates that the improved model can adapt to different maize seed conditions and maintain high performance on a new dataset, which supports its generalization ability and robustness on data from different sources and with different characteristics.

Table 9 Comparison of the model before and after improvement on a new dataset.

4 Conclusions

Our research involves image acquisition of six different types of corn seeds, namely JD407, JD50, JD83, JD953, JD209, and JD626. We introduce the ResStage structure early in the model to facilitate better information propagation throughout the network layers, thereby promoting the learning process and reducing information loss. In addition, we introduce both the ECA module and depthwise separable convolution in the residual blocks of the later stages of the model. This combination captures global correlations between features better while significantly reducing the number of model parameters and the computational workload. Finally, we introduce the Swish-PReLU hybrid activation function throughout the network. It combines the properties of the Swish activation function (unbounded above, bounded below, smooth, and non-monotonic) with the adaptive parameter learning of the PReLU activation function, with the aim of enhancing the model’s predictive ability for corn seeds. With these three improvements integrated, experiments on a dataset comprising six different types of corn seeds showed that the proposed method achieved an accuracy of 91.23%.
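The parameter saving attributed to depthwise separable convolution can be seen from a simple count. The sketch below compares the weights of a standard convolution with those of a depthwise convolution followed by a 1×1 pointwise convolution. The kernel size and channel widths are illustrative assumptions rather than the layer sizes used in the paper, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer: 3 x 3 kernel, 256 input and 256 output channels.
    k, c_in, c_out = 3, 256, 256
    std = standard_conv_params(k, c_in, c_out)
    dws = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, depthwise separable: {dws}, ratio: {dws / std:.4f}")
```

The ratio equals 1/c_out + 1/k², so for a 3×3 kernel and wide layers the separable form needs roughly one ninth of the weights of the standard layer it replaces.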

Our proposed network model outperforms other commonly used image classification models, including ResNet50, Res2Next50, DenseNet201, ConvNext_T, and RepVgg_A2, in terms of performance while maintaining lower model complexity. Compared to the original network models, our model has achieved a 7.07% increase in accuracy, reduced the loss value by 0.19, decreased the parameter count by 40%, lowered FLOPs by 0.92GMac, and shortened the training time per epoch by 3s.

In conclusion, our proposed method has shown good performance in maize seed variety identification. However, seed variety identification informs crucial decisions in agricultural production, such as planting time, fertilization methods, and irrigation levels. The design and optimization of the model should therefore provide insight into seed variety characteristics, growing conditions, and agricultural production management decisions. In future research, in addition to considering how factors such as seed storage time, cultivation conditions, and shooting angles affect the model’s performance, we will also focus on the model’s value for management and decision-making, with the aim of providing effective support and guidance for seed variety identification and production.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

JL: Writing – original draft, Writing – review & editing. FX: Writing – original draft, Writing – review & editing. SS: Writing – original draft, Writing – review & editing. JQ: Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was funded by the Natural Science Foundation of Jilin Province (No. 2020122348JC) and the Innovation Capacity Project on Development and Reform Commission of Jilin Province (No. 2020C019-6).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Altuntaş, Y., Cömert, Z., Kocamaz, A. F. (2019). Identification of haploid and diploid maize seeds using convolutional neural networks and a transfer learning approach. Comput. Electron. Agric. 163, 104874. doi: 10.1016/j.compag.2019.104874

Apicella, A., Donnarumma, F., Isgrò, F., Prevete, R. (2021). A survey on modern trainable activation functions. Neural Netw. 138, 14–32. doi: 10.1016/j.neunet.2021.01.026

Bi, C., Hu, N., Zou, Y., Zhang, S., Xu, S., Yu, H. (2022). Development of deep learning methodology for maize seed variety recognition based on improved swin transformer. Agronomy 12, 1843. doi: 10.3390/agronomy12081843

Chollet, F. (2017). “Xception: deep learning with depthwise separable convolutions,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA., 1800–1807. doi: 10.1109/CVPR.2017.195

Duta, I. C., Liu, L., Zhu, F., Shao, L. (2020). “Improved residual networks for image and video recognition,” in 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy., 9415–9422. doi: 10.1109/ICPR48806.2021.9412193

Feng, Y., Liu, C., Han, J., Lu, Q., Xing, X. (2024). Identification of wheat seedling varieties based on MssiapNet. Front. Plant Sci. 14. doi: 10.3389/fpls.2023.1335194

Hai, M., Zhang, Y., Zhang, Y. (2017). A performance evaluation of classification algorithms for big data. Proc. Comput. Sci. 122, 1100–1107. doi: 10.1016/j.procs.2017.11.479

He, K., Zhang, X., Ren, S., Sun, J. (2016). “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 770–778. doi: 10.1109/CVPR.2016.90

He, K., Zhang, X., Ren, S., Sun, J. (2015). “Delving deep into rectifiers: surpassing human-level performance on ImageNet classification,” in 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 1026–1034. doi: 10.1109/ICCV.2015.123

Javanmardi, S., Miraei Ashtiani, S., Verbeek, F. J., Martynenko, A. (2021). Computer-vision classification of corn seed varieties using deep convolutional neural network. J. Stored Prod. Res. 92, 101800. doi: 10.1016/j.jspr.2021.101800

Kosmopoulos, A., Partalas, I., Gaussier, E., Paliouras, G., Androutsopoulos, I. (2015). Evaluation measures for hierarchical classification: A unified view and novel approaches. Data Min. Knowl. Discov. 29, 820–865. doi: 10.1007/s10618-014-0382-x

Ma, T., Tsuchikawa, S., Inagaki, T. (2020). Rapid and non-destructive seed viability prediction using near-infrared hyperspectral imaging coupled with a deep learning approach. Comput. Electron. Agric. 177, 105683. doi: 10.1016/j.compag.2020.105683

Mi, Z., Zhang, X., Su, J., Han, D., Su, B. (2020). Wheat stripe rust grading by deep learning with attention mechanism and images from mobile devices. Front. Plant Sci. 11. doi: 10.3389/fpls.2020.558126

Ohn, I., Kim, Y. (2019). Smooth function approximation by deep neural networks with general activation functions. Entropy 21, 627. doi: 10.3390/e21070627

Pallavi, H. M., Gowda, R., Shadakshari, Y. G., Bhanuprakash, K., Vishwanath, K. (2011). Identification of SSR markers for hybridity and seed genetic purity testing in sunflower (Helianthus annuus L.). Helia 34, 59–66. doi: 10.2298/hel1154059p

Park, H. S., Choi, K. C., Kim, J. H., So, M. J., Lee, S. H., Lee, K. W. (2016). Discrimination and quantification between annual ryegrass and perennial ryegrass seeds by near-infrared spectroscopy. J. Anim. Plant Sci. 26, 1278–1283.

Ramachandran, P., Zoph, B., Le, Q. V. (2017). Swish: a self-gated activation function (Arxiv: Neural and Evolutionary Computing).

Rybacki, P., Niemann, J., Bahcevandziev, K., Durczak, K. (2023). Convolutional neural network model for variety classification and seed quality assessment of winter rapeseed. Sensors 23, 2486. doi: 10.3390/s23052486

Satturu, V., Rani, D., Gattu, S., Md, J., Mulinti, S., Nagireddy, R. K., et al. (2018). DNA fingerprinting for identification of rice varieties and seed genetic purity assessment. Agric. Res. 7, 379–390. doi: 10.1007/s40003-018-0324-8

Selvaraju, R. R., Das, A., Vedantam, R., Cogswell, M., Parikh, D., Batra, D. (2016). Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 128, 336–359. doi: 10.1007/s11263-019-01228-7

Shafinas, M. N. I., Bernard, D., Nazira, M., Rosentrater, K. A. (2022). Effect of grain moisture content and roller mill gap size on various physical properties of yellow dent corn flour. J. Food Res. 11, 16. doi: 10.5539/jfr.v11n2p16

Sun, J., Zou, Y. (2022). Analysis on the method of corn seed purity identification. Hans J. Agric. Sci. 10, 292–298. doi: 10.12677/HJAS.2020.105043

Sundaram, R. M., Naveenkumar, B., Biradar, S. K., Balachandran, S. M., Mishra, B., Ilyasahmed, M., et al. (2008). Identification of informative SSR markers capable of distinguishing hybrid rice parental lines and their utilization in seed purity assessment. Euphytica 163, 215–224. doi: 10.1007/s10681-007-9630-0

Tenaillon, M. I., Charcosset, A. (2011). A European perspective on maize history. C. R. Biol. 334, 221–228. doi: 10.1016/j.crvi.2010.12.015

Tu, K., Wen, S., Cheng, Y., Zhang, T., Pan, T., Wang, J., et al. (2021). A non-destructive and highly efficient model for detecting the genuineness of maize variety ‘JINGKE 968’ using machine vision combined with deep learning. Comput. Electron. Agric. 182, 106002. doi: 10.1016/j.compag.2021.106002

Wang, S. H., Phillips, P., Sui, Y., Liu, B., Yang, M., Cheng, H. (2018). Classification of Alzheimer’s disease based on eight-layer convolutional neural network with leaky rectified linear unit and max pooling. J. Med. Syst. 42, 85. doi: 10.1007/s10916-018-0932-7

Wang, Y., Song, S. R. (2023). Variety identification of sweet maize seeds based on hyperspectral imaging combined with deep learning. Infrared Phys. Technol. 130, 104611. doi: 10.1016/j.infrared.2023.104611

Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., Hu, Q. (2020). “ECA-net: efficient channel attention for deep convolutional neural networks,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA., 11531–11539. doi: 10.1109/CVPR42600.2020.01155

Wen, X. (2020). Modeling and performance evaluation of wind turbine based on ant colony optimization-extreme learning machine. Appl. Soft. Comput. 94, 106476. doi: 10.1016/j.asoc.2020.106476

Xing, X., Liu, C., Han, J., Feng, Q., Lu, Q., Feng, Y. (2023). Wheat-seed variety recognition based on the GC_DRNet model. Agriculture 13, 2056. doi: 10.3390/agriculture13112056

Ye-Yun, X., Zhan, Z., Yi-Ping, X., Long-Ping, Y. (2005). Identification and purity test of super hybrid rice with SSR molecular markers. Rice Sci. 12, 7.

Yu, Z., Fang, H., Zhangjin, Q., Mi, C., Feng, X., He, Y. (2021). Hyperspectral imaging technology combined with deep learning for hybrid okra seed identification. Biosyst. Eng. 212, 46–61. doi: 10.1016/j.biosystemseng.2021.09.010

Zang, H., Wang, Y., Ru, L., Zhou, M., Chen, D., Zhao, Q., et al. (2022). Detection method of wheat spike improved YOLOv5s based on the attention mechanism. Front. Plant Sci. 13. doi: 10.3389/fpls.2022.993244

Zhang, J., Dai, L., Cheng, F. (2021). Corn seed variety classification based on hyperspectral reflectance imaging and deep convolutional neural network. J. Food Meas. Charact. 15, 484–494. doi: 10.1007/s11694-020-00646-3

Zhang, F., Zhang, F., Wang, S., Li, L., Lv, Q., Fu, S., et al. (2023). Hyperspectral imaging combined with CNN for maize variety identification. Front. Plant Sci. 14. doi: 10.3389/fpls.2023.1254548

Zhang, C., Zhao, Y. Y., Yan, T. Y., Bai, X. L., Xiao, Q. L., Gao, P., et al. (2020). Application of near-infrared hyperspectral imaging for variety identification of coated maize kernels with deep learning. Infrared Phys. Technol. 111, 103550. doi: 10.1016/j.infrared.2020.103550

Zhou, L., Zhang, C., Taha, M. F., Wei, X., He, Y., Qiu, Z., et al. (2020). Wheat kernel variety identification based on a large near-infrared spectral dataset and a novel deep learning-based feature selection method. Front. Plant Sci. 11. doi: 10.3389/fpls.2020.575810

Zhu, W., Zeng, N., Wang, N. (2010). “Sensitivity, specificity, accuracy, associated confidence interval and ROC analysis with practical SAS implementations.” NESUG proceedings: health care and life sciences, Baltimore, Maryland. 19, 67.

Zhu, S., Zhou, L., Gao, P., Bao, Y., He, Y., Feng, L. (2019). Near-infrared hyperspectral imaging combined with deep learning to identify cotton seed varieties. Molecules 24, 3268. doi: 10.3390/molecules24183268

Keywords: artificial intelligence, computer vision, corn seeds, variety identification, ResNet model

Citation: Li J, Xu F, Song S and Qi J (2024) A maize seed variety identification method based on improving deep residual convolutional network. Front. Plant Sci. 15:1382715. doi: 10.3389/fpls.2024.1382715

Received: 06 February 2024; Accepted: 19 April 2024; Published: 13 May 2024.

Copyright © 2024 Li, Xu, Song and Qi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shaozhong Song, [email protected]

COMMENTS

  1. Exploring Experimental Research: Methodologies, Designs, and

    Experimental research serves as a fundamental scientific method aimed at unraveling cause-and-effect relationships between variables across various disciplines. This paper delineates the key ...

  2. Beauty sleep: experimental study on the perceived health and

    Methods. Using an experimental design we photographed the faces of 23 adults (mean age 23, range 18-31 years, 11 women) between 14.00 and 15.00 under two conditions in a balanced design: after a normal night's sleep (at least eight hours of sleep between 23.00-07.00 and seven hours of wakefulness) and after sleep deprivation (sleep 02.00-07.00 and 31 hours of wakefulness).

  3. The past, present, and future of experimental methods in the social

    The first trend to emphasize from Fig. 1 is that all three disciplines are utilizing the experimental method more today than they were 30 years ago. On average across the three disciplines, roughly four percent of articles published in the early 1990s used experimental methods; in contrast, over fourteen percent of published articles used experiments in the late 2010s, more than a threefold ...

  4. Experimental Research Design

    Experimental research design is centrally concerned with constructing research that is high in causal (internal) validity. Randomized experimental designs provide the highest levels of causal validity. Quasi-experimental designs have a number of potential threats to their causal validity. Yet, new quasi-experimental designs adopted from fields ...

  5. Study/Experimental/Research Design: Much More Than Statistics

    Study, experimental, or research design is the backbone of good research. It directs the experiment by orchestrating data collection, defines the statistical analysis of the resultant data, and guides the interpretation of the results. When properly described in the written report of the experiment, it serves as a road map to readers, 1 helping ...

  6. Data, measurement and empirical methods in the science of science

    Here we review three quasi-experimental methods: difference-in-differences, instrumental variables and regression discontinuity (Fig. 3). Fig. 3: Quasi-experiment methods.

  7. Full article: Natural experiment methodology for research: a review of

    As such, researchers must use alternative yet robust research methods for determining the impact of such interventions. The evaluation of natural experiments (i.e. an intervention not controlled or manipulated by researchers), using various experimental and non-experimental design options can provide an alternative to the RCT.

  8. Experimental Research

    Experimental science is the queen of sciences and the goal of all speculation. Roger Bacon (1214-1294) Download chapter PDF. Experiments are part of the scientific method that helps to decide the fate of two or more competing hypotheses or explanations on a phenomenon. The term 'experiment' arises from Latin, Experiri, which means, 'to ...

  9. Experimental methods: Between-subject and within-subject design

    In the remainder of this article, we provide an overview in Section 2 and some simple examples in Section 3. We discuss experiments where the two different methods led to different results in Section 4, and to similar results in Section 5. We describe some econometric issues in Section 6, and conclude in Section 7. 2. Overview

  10. Frontiers

    The Practice of Experimental Psychology: An Inevitably Postmodern Endeavor. Roland Mayrhofer * Christof Kuhbandner Corinna Lindner. Department of Psychology, University of Regensburg, Regensburg, Germany. The aim of psychology is to understand the human mind and behavior. In contemporary psychology, the method of choice to accomplish this ...

  11. Journal of Experimental Psychology: General: Sample articles

    February 2011. by Jeff Galak and Tom Meyvis. The Nature of Gestures' Beneficial Role in Spatial Problem Solving (PDF, 181KB) February 2011. by Mingyuan Chu and Sotaro Kita. Date created: 2009. Sample articles from APA's Journal of Experimental Psychology: General.

  12. How the Experimental Method Works in Psychology

    The experimental method involves manipulating one variable to determine if this causes changes in another variable. This method relies on controlled research methods and random assignment of study subjects to test a hypothesis. For example, researchers may want to learn how different visual patterns may impact our perception.

  13. The Use of Research Methods in Psychological Research: A Systematised

    Introduction. Psychology is an ever-growing and popular field (Gough and Lyons, 2016; Clay, 2017). Due to this growth and the need for science-based research to base health decisions on (Perestelo-Pérez, 2013), the use of research methods in the broad field of psychology is an essential point of investigation (Stangor, 2011; Aanstoos, 2014). Research methods are therefore viewed as important ...

  14. Experimental Method In Psychology

    There are three types of experiments you need to know: 1. Lab Experiment. A laboratory experiment in psychology is a research method in which the experimenter manipulates one or more independent variables and measures the effects on the dependent variable under controlled conditions. A laboratory experiment is conducted under highly controlled ...

  15. Full article: An Empirical Review of Research Methodologies and Methods

    RESEARCH METHODOLOGY AND METHOD. Although research articles were often categorized by methods (e.g., Dai, Swanson, & Cheng, ... 62% of quantitative studies used psychometric methodology and 31% were experimental in creativity research. In contrast, there were 15% psychometric studies and 10% experimental studies in gifted education. ...

  16. Using experimental methods in higher education research

    EXPERIMENTAL METHODS have been used extensively for many years to conduct research in education and psychology. However, applications of experiments to investigate technology and other instructional innovations in higher education settings have been relatively limited. The present paper examines ways in which experiments can be used productively by higher education researchers to increase the ...

  17. [PDF] EXPERIMENTAL RESEARCH METHODS

    Published 2003. Education, Psychology. Experimental research has had a long tradition in psychology and education. When psychology emerged as an infant science during the 1900s, it modeled its research methods on the established paradigms of the physical sciences, which for centuries relied on experimentation to derive principles and laws.

  18. Experimental Methods in Criminology

    Experimental methods can also be applied in the context of between-subject and within-subject paradigms, both of which exhibit unique characteristics and implications. Experimental methods—as a research method —are unique in their ability to help establish causal relationships among variables. This article introduces the topic of ...

  19. Planning and Conducting Clinical Research: The Whole Process

    The pinnacle of non-experimental research is the comparative effectiveness study, which is grouped with other non-experimental study designs such as cross-sectional, ... Springer's Journal Author Academy, and SAGE's Research methods [34-37]. Standardized research reporting guidelines often come in the form of checklists and flow diagrams.

  20. Network representation of multicellular activity in pancreatic islets

    Recognizing the significance of collective activity, network science approaches have been increasingly applied in islet research. However, variations in experimental setups, imaging techniques, signal processing, and connectivity analysis methods across different research groups pose challenges for integrating findings into a comprehensive picture.

  21. An Effectiveness Study Across Baseline and Neural Network-based Force

    In this study, we further investigate the robustness and generalization ability of a neural network (NN) based force estimation method, using the da Vinci Research Kit Si (dVRK-Si). To evaluate our method's performance, we compare the force estimation accuracy with several baseline methods. We conduct comparative studies between the dVRK classic and dVRK-Si systems to benchmark the ...

  22. The Practice of Experimental Psychology: An Inevitably Postmodern

    The aim of psychology is to understand the human mind and behavior. In contemporary psychology, the method of choice to accomplish this incredibly complex endeavor is the experiment. This dominance has shaped the whole discipline from the self-concept as an empirical science and its very epistemological and theoretical foundations, via research ...

  23. Deep Neural Network for Direct Prediction of Analytic Signals ...

    Objective: A method for real-time prediction of analytic signals is needed for state-informed stimulation in electroencephalography (EEG) experiments. The currently available methods lack sufficient prediction accuracy or have a complicated selection process of experimental parameters. Approach: The proposed method uses a deep neural network (DNN) to predict current and future EEG phases and ...

  24. Digital Flow in a Pool Induced by a Vertical Jet

    Turbulent water jets remain a critical study area, particularly the relation of the water flow with air entrainment and its role in energy dissipation at different hydraulic structures. Plunge pools, formed by the impact of jets on water cushions, play a pivotal role in energy dissipation. Understanding the complex flow dynamics within these pools is essential for designing efficient hydraulic ...

  25. The Relative Merits of Observational and Experimental Research: Four

    The development of modern, systematic experimental technique for living environments is usually associated with the publication of 'The design and analysis of experiments' and 'Statistical methods for research workers' by Sir Ronald Fisher [30,38,39]. Although Fisher's work is most heavily recognised and cited in the role of risk ...

  26. Frontiers

    The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was funded by the Natural Science Foundation of Jilin Province (No.2020122348JC), Innovation Capacity Project on Development and Reform Commission of Jilin Province (No.2020C019-6). Conflict of interest

  27. Conservation management decreases surface runoff and soil erosion

    Across all conservation management practices, surface runoff and erosion had respective mean decreases of 67% and 80% compared with controls. Use of cover cropping provided the largest decreases in erosion and surface runoff. Coarse- and medium-textured soils had greater decreases in both erosion and runoff than fine-textured soils. Changes in ...

  28. Selecting and Improving Quasi-Experimental Designs in Effectiveness and

    Alternative research methods are needed to test interventions for their effectiveness in many real-world settings—and later when evidence-based interventions are known, for spreading or scaling up these interventions to new settings and populations (23,40). In real-world settings, random allocation of the intervention may not be possible or ...