An Insider Look at One’s Own Publications: Research Trajectories, Long-Term Impact and Mentorship

The great tragedy of science – the slaying of a beautiful hypothesis by an ugly fact.

Thomas Huxley

Reaching 300 peer-reviewed manuscripts (not counting peer-reviewed conference proceedings) is a symbolic milestone in any academic career. With the recent publication of “Retrospective evaluation of high-dose-rate brachytherapy multicriteria planning using physical dose versus radiobiological criteria for prostate cancer” in Scientific Reports (Nature Portfolio), and after 30+ years of publishing, I decided to take a data-driven, deep-dive look at my own research portfolio.

My very first peer-reviewed manuscript, on which I was a co-author, was published in 1992. My first first-authored peer-reviewed manuscript was published in Nuclear Physics A in 1994, two years later. The very first Medical Physics paper was published in 1999, while I was in a postdoctoral fellowship position in Berkeley, for a study linked to cross-section measurements of neutron production in the context of boron neutron capture therapy. The 100th publication, by Bazalova et al., came in 2008 (and is one of the most cited manuscripts I have collaborated on, with close to 300 citations at the time of writing). The 200th peer-reviewed manuscript, by Miksys et al., was published in 2017 as part of a collaboration with Carleton University.

Before turning to those analyses, it is worth situating this body of work within a broader, field-normalized publication and citation context. Since 2018, I have been listed among the top 2% most-cited researchers worldwide in my scientific field, based on the standardized bibliometric database developed by Ioannidis et al. (PLoS Biology, 2019). This classification, which is field-specific and based on composite citation indicators, has been maintained through successive updates up to 2025. In the 2025 release, my profile additionally entered the career-long (lifetime) top-2% category, alongside continued inclusion in the single-year top-2% cohort. Finally, according to OpenAlex data, my field-weighted citation impact (FWCI), computed at the sub-field level, is well above the world average.

The above external benchmarks provide independent validation that the citation patterns discussed below are not merely internally consistent, but also competitive at the international level within my discipline.

The purpose of the present analysis is therefore not to establish rank, but to understand the mechanisms underlying sustained impact: how citations accumulate over time, how impact is distributed across publications, and how recent contributions compare to earlier work.

Raw publication and citation data

Data were extracted using the Publish or Perish software (see reference at the end of this post) with Google Scholar as its source. All figures and analyses presented here were generated using Python.
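For readers who want to reproduce this pipeline, here is a minimal sketch. Publish or Perish can export its result list to a CSV file; the file name and the “Year” column label below are assumptions to check against your own export (the per-year citation series of Figure 2 comes from the Google Scholar profile page itself, not from this export).

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load the Publish or Perish result list exported as CSV.
# File name and "Year" column label are assumptions; adjust to your export.
df = pd.read_csv("pop_export.csv")

# Figure 1-style plot: peer-reviewed manuscripts per publication year.
df.groupby("Year").size().plot.bar(figsize=(8, 3))
plt.ylabel("Manuscripts per year")
plt.tight_layout()
plt.show()
```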

The intent is not to boast (many researchers have much better publication, citation, and overall impact records than me), but to understand how long-term research programs evolve, accumulate influence, and remain relevant (at least I think so) over time.

Figures 1 and 2 above display the number of peer-reviewed manuscripts published per year (left), starting in 1992, and the number of citations received each year (right) over the same time span. Note that I have excluded conference proceedings, published conference abstracts, book chapters, and patents from Figure 1 (even though these tend to have low citation counts, if any).

These figures illustrate three key periods in a scientific career. 

1992–2000 — First step in research: PhD and postdoctoral phase

• low but increasing productivity,

• typical early-career trajectory.

~2000–2013 — Expansion phase

• establishing and growing one research program,

• steady rise culminating around 2012,

• reflects peak trainee throughput and collaborative projects.

~2014–present — Mature mentoring regime

• high mentorship intensity with relatively stable productivity, 

• sustained funding environment,

• coherent long-term research themes,

• strong citation accumulation continues despite stable output.

Total citations over time: why quadratic growth is expected

Figure 3 below shows the total cumulative number of citations per year from 1992 to 2025, together with a quadratic fit and its 95% prediction interval.

The most salient feature is the smooth quadratic increase in cumulative citations. Importantly, this behaviour does not imply accelerating impact per paper. Instead, it reflects a simple and well-understood cumulative mechanism:

• the cumulative number of publications grows approximately linearly with time,

• each paper continues to accrue citations year after year,

• older papers remain active contributors to the citation pool.

Mathematically, when a linearly growing publication base is integrated over time under roughly constant per-paper citation rates, the result is a quadratic growth law: if papers appear at rate p per year and each accrues c citations per year, then C(t) ≈ (p·c/2)·t². The quadratic fit therefore has a mechanistic interpretation in which citation behaviour per paper has remained stable (on average), while the accumulation of work drives the curve.
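This mechanism is easy to verify numerically. The toy model below (all rates are illustrative assumptions, not my actual numbers) adds papers at a constant rate and lets every existing paper earn a constant number of citations per year; the resulting cumulative curve is exactly quadratic.

```python
import numpy as np

# Toy model: p papers/year, each accruing c citations/year indefinitely.
# Rates are illustrative assumptions, not fitted values.
p, c, T = 10, 3, 34
years = np.arange(1, T + 1)
papers = p * years                 # linearly growing publication base
cum_cites = np.cumsum(c * papers)  # integrate citations -> quadratic law

# The leading coefficient of a quadratic fit recovers p*c/2 exactly.
a, b, c0 = np.polyfit(years, cum_cites, 2)
print(a, p * c / 2)                # 15.0  15.0
```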

Citation distributions: heterogeneity with structure

Aggregate trends can obscure important structure. To examine this, the citation distribution of individual manuscripts (≥1992) was analyzed using unbinned data and modeled in log–log space as shown in Figure 4 (below).

The obtained distribution is best described by a smooth double-Pareto model (a minimal fitting sketch follows the list below), characterized by:

• a low-citation (low-visibility, early-career, or new entries) regime, where scaling is weak or absent,

• a high-citation regime following a genuine power law with a slope above 2 (tail of the distribution),

• a smooth crossover at approximately 45 citations.
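For the curious, here is a minimal fitting sketch under stated assumptions: the per-paper citation counts sit in a plain text file (one count per line; the file name is made up), and the smooth double Pareto is parameterized as a smoothly broken power law fitted to the empirical complementary CDF in log–log space. Note that exponent conventions on a CCDF differ by one from those on the underlying density.

```python
import numpy as np
from scipy.optimize import curve_fit

# Per-paper citation counts (hypothetical file; one count per line).
cites = np.loadtxt("citations_per_paper.txt")
cites = np.sort(cites[cites >= 1])[::-1]  # cited papers, most-cited first

# Empirical complementary CDF: fraction of papers with >= x citations.
logx = np.log10(cites)
log_ccdf = np.log10(np.arange(1, len(cites) + 1) / len(cites))

def log_double_pareto(lx, logA, logxc, a1, a2, delta):
    """Smoothly broken power law in log-log space: slope -a1 below the
    crossover x_c, slope -a2 above it, transition width delta."""
    t = (lx - logxc) / delta
    return (logA - a1 * (lx - logxc)
            + (a1 - a2) * delta * np.log10((1 + 10.0 ** t) / 2))

# Starting values are assumptions (crossover near 45, tail slope near 2).
p0 = [0.0, np.log10(45), 0.3, 2.0, 0.3]
popt, _ = curve_fit(log_double_pareto, logx, log_ccdf, p0=p0)
print(f"crossover ~ {10 ** popt[1]:.0f} citations, tail slope ~ {popt[3]:.2f}")
```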

Importantly, the heavy-tailed component is not dominated by a single early contribution; it is populated repeatedly across the career span, indicating selective but sustained high impact. Double-Pareto distributions are observed across economics, urban systems, network science, and natural phenomena, wherever preferential growth, entry of new contributors, selection, and saturation interact. Thus, the observed citation patterns can be explained within a broader class of adaptive systems.

Moreover, the presence of a smooth crossover (rather than a sharp break) can be interpreted (I think) as evidence of healthy system evolution, where growth is neither unconstrained nor artificially capped. The observed citation distribution is related to Lotka’s Law, one of the earliest empirical laws of bibliometrics. While Lotka described the power-law distribution of scientific productivity across authors, the present analysis examines the distribution of impact across papers, over time, for a single author.

h-index, m-index, and temporal consistency

At present, looking at data from Google Scholar, the following can be extracted:

• h-index = 62

• career m-index ≈ 1.8 (first peer-reviewed publication: 34 years ago)

• 5-year m-index ≈ 3, computed using only papers published in the last five years

An m-index close to 2 over more than three decades is well above the conventional benchmark (m ≈ 1) associated with sustained impact. More strikingly, the recent 5-year m-index exceeds the career average, demonstrating that newer publications are entering the h-core at least as fast as the earlier work: they are quite relevant to the field!
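For completeness, here is how these indices are computed; a minimal sketch, where the 5-year h value (15) is an assumption chosen only to be consistent with the quoted m ≈ 3, not my actual number.

```python
def h_index(citations):
    """h = largest rank i such that the i-th most-cited paper has >= i citations."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)

def m_index(h, years_since_first_paper):
    """Hirsch's m: h-index divided by career length in years."""
    return h / years_since_first_paper

print(round(m_index(62, 34), 2))  # career m-index, ~1.8 (values from the text)
# 5-year m-index: recompute h on papers from the last five years only,
# then divide by 5. An h of 15 for that subset is an assumption here.
print(m_index(15, 5))             # ~3
```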

This pattern directly contradicts the late-career scenario in which citation metrics are driven primarily by legacy papers. Instead, it points to continued intellectual leadership and contemporary relevance. Layman (i.e. my) interpretation: I am not ready to retire yet; a few good ideas remain in this brain of mine 😉

Projection: what happens if nothing changes?

Using the quadratic and double-Pareto models fitted to the 1992–2025 data, and assuming:

• stable publication rates,

• stable citation behaviour per paper,

• no change in the double-Pareto behaviour,

the expected total citation count 10 years from now (2035) is approximately 26,000 citations from about 420 manuscripts, with an h-index close to or slightly above 80.
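As a sanity check, the projection amounts to evaluating the fitted quadratic at 2035. The coefficients below are illustrative placeholders chosen to be consistent with the ~26,000 figure, not the actual fitted values (which would come from np.polyfit on the real cumulative series).

```python
import numpy as np

# Quadratic model C(t) = a*t^2 + b*t + c, with t in years since 1992.
# Coefficients are assumed placeholders, NOT the actual fit.
a, b, c = 12.5, 65.0, 0.0
t_2035 = 2035 - 1992
print(f"Projected cumulative citations in 2035: {np.polyval([a, b, c], t_2035):,.0f}")
# -> roughly 26,000 under these assumed coefficients
```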

This projection does not rely on acceleration, step changes, or exceptional future events. It is the direct consequence of maintaining the same structural dynamics observed over the past decades. Conversely, a departure from this behaviour would signal major changes, either a decrease or an increase, in productivity. We will see…

What can—and cannot—be concluded

Taken together, the analyses support several, I would say robust, conclusions:

• Citation growth is structurally cumulative, not speculative.

• Impact is heterogeneous but reproducible, with a persistent high-impact tail, i.e. continued entry of new work into the high-impact regime.

• Recent publications are at least as influential as earlier ones.

• Independent bibliometric indicators — time series, distributional models, and index-based metrics — are mutually consistent.

Equally important are the conclusions that cannot be drawn:

• There is no evidence of exponential runaway growth.

• No reliance on a small number of outlier papers; in fact, a large fraction of the published manuscripts have 10 or more citations (Google Scholar i10-index).

• No indication of declining relevance.

Not the finish line…yet!

From a bibliometrics standpoint, this combination of long-term stability and strong, ongoing contemporary performance is both refreshing to the researcher I am and more informative than any single metric. What is perhaps less visible in bibliometric data, but no less important (I would contend even more important), is how this body of work was produced.

During the first decade of my academic career, I was most often the first author or a co-author on peer-reviewed manuscripts, reflecting a learning phase (under supervision!) spent establishing research directions, methods, and collaborations. Over the last two decades, this authorship pattern has shifted completely. Today, approximately 78% of the manuscripts list trainees as first authors, spanning more than 200 individuals, from undergraduate researchers to postdoctoral fellows.

This transition is not incidental. It reflects the move to independent-researcher status and to being an active supervisor and mentor. It also reflects the strength and capacity of those research themes to generate new questions, ideas, and solutions in the hands of emerging, bright young scientists. The sustained citation impact observed across the portfolio is therefore not driven by a single individual, but by the creativity, independence, and intellectual ownership of successive generations of trainees.

Seen from this perspective, this 300th peer-reviewed manuscript does not sit at the top of a personal achievement pyramid. It rests on a collective effort, built over time, in which mentoring, training, and scientific curiosity are inseparable from research output itself. I have also always been incredibly blessed to have had supervisors and mentors who were wonderful human beings. They created environments that were conducive to open discussion. I learned early on that having and sharing a good idea does not depend on your level as a trainee, and that your level should not prevent you from expressing yourself. They further granted me autonomy in my research activities, allowing me to explore new techniques, approaches, and ideas. However, they also provided the necessary supervision to steer a project back on track when things veered off course; I was encouraged to make mistakes, and that was perfectly acceptable.

Since then, I’ve been trying to replicate this approach.  I always tell trainees that coming to the lab and conducting research should be enjoyable.  It’s not always easy, and setbacks happen, but overall, the experience should be positive.

Thus, I will be eternally grateful to all the supervisors, mentors, colleagues, collaborators, but most of all the trainees that have joined (or will join) me in the roller-coaster adventure that is scientific research. It was, it is, and I hope, it will remain fun.

Reference

Ioannidis JPA, Baas J, Klavans R, Boyack KW. A standardized citation metrics author database annotated for scientific field. PLoS Biology. 2019;17(8):e3000384. https://doi.org/10.1371/journal.pbio.3000384

The 2025 update of the above database is accessible via DOI: 10.17632/btchxktzyw.8 (all data accessible, including previous versions back to the first publication).

Harzing, A.W. (2007) Publish or Perish, available from https://harzing.com/resources/publish-or-perish

My Google Scholar page: https://scholar.google.com/citations?user=X4J8eVUAAAAJ&hl=fr

Analysis of publication impact in predatory journals – Nature

In case you’ve missed this one, an interesting analysis was recently published in Nature on citations of manuscripts published in predatory journals. Contrast this with a previous post of mine (here): when considering all journals, about 24% of publications get 10 or more citations. This fraction falls dramatically for predatory journals. More importantly, no paper in those journals gets more than 32 citations, while 1.8% of all published manuscripts in general get over 100 citations.

 

Source: Predatory-journal papers have little scientific impact

How many citations are actually a lot of citations?

In a previous blog post, I suggested to my younger colleagues that while they should not care so much about the impact factor of the journals they publish in (as long as those journals are well-read in their respective fields of research), they should care quite a lot about their papers being cited, and cited by others, not self-cited!

A few months ago, I was listening to the introductory talk for a prestigious award from our national organization when one statement hit me: a physicist with 2000 or more citations is part of the 1% most-cited physicists worldwide. There might have been a bit more to that statement, but let’s work with it.


Journal Impact Factor: why you should not care… too much

I have been publishing scientific manuscripts for the past 22 years. My educated comments with regard to the journal impact factor have always been the same (if you do not know what the JIF is, please have a look at this Wikipedia entry). To first order, you should publish in the most important journals for your field. If their JIFs are low, who cares, as long as your work is important to your field and well cited. For example, the scientific discovery of 2012 according to Science (very high JIF) was the publication of the experimental finding of the Higgs boson… in Phys Lett B (low JIF relative to Science)!

Do not forget that, from a historical perspective, we are awfully bad at predicting what will be the next important discovery down the road. A number of fundamental discoveries and early engineering feats were discarded at first. Similarly, there are numerous examples of scientists having had tremendous trouble getting game-changing results published, even some who ended up winning Nobel prizes. Kary Mullis’s PCR work is one of many examples of work rejected by journals with top JIFs, while applications of that very technique were later published in Science and Nature, with these second-generation papers receiving more citations than the original, award-winning work!

Now, you do not have to agree with this lone scientist’s opinion, but you should certainly have a look at the San Francisco Declaration on Research Assessment, or DORA petition, which is supported by the “big boys” (no discrimination intended). The declaration statement is actually a very interesting read: it covers the historical origin of the JIF (which was not meant for evaluating researchers at all) and further calls for dropping journal-based metrics in assessing scientific productivity for funding and promotion. Over 240 organizations and 6000 individuals have already signed the declaration.

In conclusion, do not lose a good night’s sleep over your favorite journals’ impact factors…

Applying the 80/20 principle to scientific productivity?

The secret [to scientific success] is comprised in three words— Work, Finish, Publish.
— Michael Faraday

One of the things I really like to do when waiting for a connecting flight at a major airport is to spend time at a bookstore. Not too long ago, I came across this book about the 80/20 principle.

[Image: the 80/20 principle book]

It runs just about 200 pages, which means a quick read, and it references Pareto. Being involved in computer optimization problems, in particular those involving two or more opposing constraints, the notion of a Pareto front is fresh in my mind. Similarly, the notions that 80% of the work can be achieved with only 20% of the features of a piece of software, that 80% of the wealth is held by 20% of the population, or that it takes 80% of the effort to accomplish the most demanding 20% of a project are all well-known applications of the discovery made by Pareto.

The book

The book explains the above principle with examples and also discusses how it applies to business, project management, and personal life. As you might expect, it takes about 20% of the book to reach at least 80% (if not more!) of its stated goals 😉

Still, overall an interesting and very fast read.

Can it be applied to science?

Well, a lot of what we do in research is program-based (a collection of projects) and project-based. Therefore, it is always worth the effort to ask yourself why you are undertaking a new project, whether it will contribute significantly to your overall research program, and whether the resources needed to accomplish it are available. It may very well be that you will need to spend an enormous amount of effort (let’s say 80%!) on a given project, such that you will have to halt almost everything else. It had better make sense and pay off!

Can it be applied to analyzing scientific productivity?

While reading the book, I wondered whether only a small portion of my research program was really contributing to citations and impact on the field. I decided to take a quick look at this using Google Scholar. GS can track citations and the h-index based on all of your papers, and it takes less than 5 minutes to set up (go over to scholar.google.com and choose “My citations” at the top right).

I will not be providing my absolute numbers here. Still, fair enough, my h-index is such that its value corresponds exactly to 20% of my published papers, i.e. 20% of my published papers contribute to my h-index value. For example, if my h-index were 20, this would mean that 20 papers have 20 or more citations, and it would also correspond to the 20% most cited among 100 published manuscripts.

Next, I looked at the citations of each paper individually. In the figure below, you will find the fraction of total citations as a function of the fraction of manuscripts published.

[Figure: fraction of total citations as a function of the fraction of manuscripts published]

It is quite interesting to see that a small fraction of all papers accounts for the majority of the citations. In my case, 13% of the manuscripts contribute 50% of the citations and 42% contribute 80% of the citations. So yes, the Pareto principle is at play, but…
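Here is a minimal sketch of the curve behind the figure above; the synthetic Pareto sample is a stand-in (an assumption) for the real per-paper citation counts, which you would pull from your own Google Scholar record.

```python
import numpy as np

# Synthetic heavy-tailed stand-in for per-paper citation counts.
rng = np.random.default_rng(0)
cites = np.sort(rng.pareto(1.5, 300) * 5)[::-1]  # most-cited first

frac_papers = np.arange(1, len(cites) + 1) / len(cites)
frac_cites = np.cumsum(cites) / cites.sum()

# Smallest fraction of papers accounting for 50% and 80% of all citations.
for target in (0.5, 0.8):
    i = np.searchsorted(frac_cites, target)
    print(f"{frac_papers[i]:.0%} of papers -> {target:.0%} of citations")
```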

Limitations

If you were to ask me about each paper included in the 13% that gathers 50% of the citations, I would reply:

  • Some I knew, as we were preparing them, would be important to the field.
  • Some I thought would be important but are not cited that much.
  • Some I thought were curiosities that would interest only a few but ended up among my most cited papers.

I think you get the message…

Conclusion

I can prove anything by statistics except the truth.
— George Canning

Yes, you can make statistics say anything. In the context of a creative process, predicting which creative act (here, a paper) will become a hit is much easier after the fact than before it. Therefore, the concept might be useful for tracking your resources (grant dollars, materials, projects to start, …), but it cannot be used, as expected I guess, to help you predict your future one-hit wonder!