Implementing AI Automation in Systematic Literature Reviews: A Practical Guide

For niche academic researchers, the systematic literature review (SLR) is both essential and arduous. Manual screening and data extraction consume months of valuable time. AI automation, implemented through tools like Rayyan and ASReview, offers a transformative solution. This guide moves from theory to practice.

Foundational AI Concepts for Screening

Effective automation hinges on understanding a few key machine learning strategies. Active learning, specifically uncertainty sampling, is the core query method: the model prioritizes the records it is least confident about, so each human decision teaches it as much as possible. For text representation, TF-IDF (Term Frequency-Inverse Document Frequency) converts abstracts and titles into numerical features that capture how distinctive each term is. To handle the common imbalance of few relevant studies among many irrelevant ones, a dynamic resampling balance strategy adjusts the training data, preventing the model from being biased toward the majority class. As a classifier, Naive Bayes often provides a fast, robust starting point due to its efficiency with text data.
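To make these pieces concrete, here is a minimal sketch of a single active-learning step using scikit-learn: TF-IDF features, a Naive Bayes classifier, and uncertainty sampling. This is illustrative only (tools like ASReview wire these components together for you), and the titles and labels below are invented examples.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Invented seed records: 1 = include, 0 = exclude.
labeled_texts = [
    "deep learning for medical image segmentation",   # include
    "a survey of neural image analysis methods",      # include
    "annual report of the hospital cafeteria",        # exclude
    "municipal water quality measurements 2019",      # exclude
]
labels = [1, 1, 0, 0]

unlabeled_texts = [
    "convolutional networks segment tumors in MRI scans",
    "parking regulations in the city center",
    "image-based diagnosis with machine learning",
]

# Fit TF-IDF on all text so labeled and unlabeled rows share one vocabulary.
vectorizer = TfidfVectorizer()
X_all = vectorizer.fit_transform(labeled_texts + unlabeled_texts)
X_labeled = X_all[: len(labeled_texts)]
X_unlabeled = X_all[len(labeled_texts):]

clf = MultinomialNB().fit(X_labeled, labels)

# Uncertainty sampling: query the record whose predicted inclusion
# probability is closest to 0.5, i.e. where the model is least confident.
proba = clf.predict_proba(X_unlabeled)[:, 1]
query_idx = int(np.argmin(np.abs(proba - 0.5)))
print("Next record to screen:", unlabeled_texts[query_idx])
```

In a real workflow this select-label-retrain loop repeats after every decision; the dynamic resampling step would additionally reweight or duplicate the minority "include" examples before each refit.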

A Step-by-Step Implementation Process

First, prepare your dataset. Export your gathered references (e.g., from PubMed, Scopus) into a CSV file with clear columns for title, abstract, and a binary inclusion label. Start with a small seed set of 10-15 records containing clearly relevant ("include") and clearly irrelevant ("exclude") examples of each class to initialize the model; leave the label blank for everything not yet screened. Import this file into your chosen platform.
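A hypothetical sketch of that file layout, using pandas; the column names and records here are invented for illustration, and you should match whatever column names your platform expects on import.

```python
import pandas as pd

# Invented example records. 1 = include, 0 = exclude, blank = not yet screened.
records = pd.DataFrame({
    "title": [
        "Neural networks for tumor detection",
        "City council meeting minutes",
        "Crop rotation schedules in northern climates",
    ],
    "abstract": [
        "We apply convolutional networks to CT scans ...",
        "Minutes of the March meeting ...",
        "An agronomic study of rotation intervals ...",
    ],
    "label_included": [1, 0, None],
})

# Write the screening file; unlabeled rows get an empty label cell.
records.to_csv("screening_set.csv", index=False)
```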

In ASReview, you can directly configure the AI pipeline using the strategies above: TF-IDF for feature extraction, Naive Bayes as the classifier, uncertainty sampling as the query strategy, and dynamic resampling for class balancing. The software then presents records one by one for your decision, continuously updating the model. Rayyan integrates similar AI functionality, offering a "Prioritize" mode that uses active learning to rank references by predicted relevance.
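The relevance-ranking idea behind prioritization can be sketched as follows: train on the seed labels, score every unscreened record, and sort by predicted relevance so the most promising references surface first. This is not either tool's actual API, just an illustration of the underlying mechanism with invented example texts.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Invented seed records: 1 = include, 0 = exclude.
seed_texts = ["machine learning for radiology", "restaurant hygiene inspections"]
seed_labels = [1, 0]
unscreened = [
    "deep models classify chest x-rays",
    "quarterly parking revenue summary",
    "automated detection of lung nodules",
]

vec = TfidfVectorizer()
X = vec.fit_transform(seed_texts + unscreened)
clf = MultinomialNB().fit(X[: len(seed_texts)], seed_labels)

# Rank unscreened records from most to least likely relevant.
scores = clf.predict_proba(X[len(seed_texts):])[:, 1]
ranked = sorted(zip(scores, unscreened), reverse=True)
for score, title in ranked:
    print(f"{score:.2f}  {title}")
```

Note the contrast with uncertainty sampling: ranking by highest predicted relevance ("certainty") surfaces likely includes fastest, while querying the most uncertain records trains the model fastest; screening tools typically let you choose between the two.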

Screen interactively. As you label each presented record, the AI’s predictions improve, progressively surfacing more relevant studies. This human-in-the-loop process ensures accuracy while drastically reducing the total number of records you must manually assess. After screening, use the model’s predictions to aid in the subsequent data extraction phase, highlighting papers most likely to contain your target variables.

For a comprehensive guide with detailed workflows, templates, and additional strategies, see my e-book: AI for Niche Academic Researchers: How to Automate Systematic Literature Review Screening and Data Extraction.