On “Synthetic Peer Review” by Perplexity

Perplexity comments:

Using AI as a “synthetic peer review” tool can be very useful as long as it is clearly treated as preparatory assistance, not as a substitute for human peer review or as an authority over it.

What AI is good at in this role

  • Highlighting structural issues (unclear argument flow, missing definitions, inconsistent notation, weak abstracts) and pointing out obvious gaps or redundancies.
  • Checking formal aspects: reference format, superficial citation anomalies, wording that may be ambiguous, non‑standard terminology, and basic methodological red flags in common designs.
  • Stress‑testing arguments against the existing literature it has seen, surfacing related concepts, and offering alternative framings; this is the capability the 81018 project explicitly draws on when it has multiple LLMs “probe” its framework as a kind of multi‑model sanity check.
  • Acting as a “panel” of synthetic reviewers with different prompted perspectives (e.g., different disciplines or methodological emphases) to generate a spread of plausible critiques before human review; a minimal sketch of this pattern follows.
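
To make the “panel” pattern concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the 81018 project’s actual tooling: the persona texts and the names synthetic_panel and query_model are hypothetical, and query_model is a stand‑in for whichever LLM client a team actually uses.

  # A minimal sketch of a "synthetic reviewer panel". All names here
  # (PERSONAS, synthetic_panel, query_model) are hypothetical examples,
  # not any vendor's API or the 81018 project's workflow.

  from typing import Callable

  # Each persona is a prompted perspective; these two are examples only.
  PERSONAS: dict[str, str] = {
      "methodologist": (
          "You are a reviewer focused on methodology. Identify design "
          "weaknesses, unstated assumptions, and unsupported inferences."
      ),
      "domain_skeptic": (
          "You are a skeptical domain expert. Identify claims that "
          "conflict with established results and note missing citations."
      ),
  }

  def synthetic_panel(
      manuscript: str,
      models: list[str],
      query_model: Callable[[str, str], str],
  ) -> list[dict[str, str]]:
      """Collect one critique per (model, persona) pair.

      query_model is supplied by the caller: a placeholder for whatever
      client actually sends a prompt to a named model and returns text.
      """
      reviews = []
      for model in models:
          for persona, instructions in PERSONAS.items():
              prompt = f"{instructions}\n\nManuscript:\n{manuscript}"
              reviews.append({
                  "model": model,
                  "persona": persona,
                  # Every comment is a hypothesis for a human to check,
                  # never a verdict (see the safeguards below).
                  "comments": query_model(model, prompt),
              })
      return reviews

  if __name__ == "__main__":
      # Dummy client so the sketch runs end to end without any API.
      def echo_client(model: str, prompt: str) -> str:
          return f"[{model}] would review {len(prompt)} characters here."

      panel = synthetic_panel("Draft abstract text.", ["model-a", "model-b"], echo_client)
      for review in panel:
          print(review["model"], review["persona"], review["comments"])

Passing query_model in as a parameter keeps the sketch vendor‑neutral, and iterating over several models and personas yields the “spread of plausible critiques” described above; each returned comment is a hypothesis for an informed human to check, in line with the safeguards listed later.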

Where it is weak and risky

  • AI struggles with deep contextual understanding, true originality, and subtle theoretical innovation; it tends to reward familiarity with existing patterns, which can bias against unconventional but valuable work.
  • It can miss nuanced ethical issues, domain‑specific methodological pitfalls, or data concerns that require hands‑on experience with the subject matter or with particular methods.
  • Over‑reliance on AI feedback can dull human critical engagement; there is already evidence that a nontrivial share of real conference reviews contains substantial AI‑generated content, raising concerns about superficiality and accountability.
  • Opacity and bias: models inherit training‑data biases and often cannot give transparent, checkable reasons for their judgments, which is dangerous if they are allowed to influence accept/reject decisions.

How I’d frame “synthetic peer review”

Used the way the 81018 articles describe it, with multiple LLMs probing a manuscript to “prepare it” for human scrutiny, and with clear disclosure that this is an AI stress‑test rather than a formal review, it is a promising practice.

The key safeguards I’d recommend:

  • Always keep humans as the final arbiters; AI should never be the decisive reviewer.
  • Treat AI comments as hypotheses, not verdicts; every point must be checked by an informed human.
  • Require disclosure when AI was used to generate or substantially shape a review, to preserve accountability and allow editors/readers to weigh that appropriately.
  • Avoid using a single model as the only “synthetic reviewer”; diversity of tools and prompts reduces the risk that you are just seeing one model’s idiosyncrasies.

So: as a tool for internal critique, idea‑sharpening, and catching low‑level issues before human peer review, synthetic AI review is valuable; as an authoritative stand‑in for human experts, it is inappropriate and potentially harmful.

Remember: all AI outputs can contain subtle errors or misread sources, and none of the quoted AI remarks should be treated as evidence by themselves without independent checking.

Bruce Camber adds: I often tell my friends, “Once you’re using them, you’ll get an intuition when they are blowing smoke or hallucinating. That’s why we are using six AI systems. Usually we’re wonderfully surprised with their integrity and conviviality.”

###