From Theory to Practice: Implementing AI Screening with Rayyan and ASReview

Bridging the Gap Between Methodology and Tooling

For niche academic researchers, systematic literature reviews (SLRs) are a cornerstone of rigorous scholarship. Yet, screening thousands of abstracts and extracting data from a handful of relevant studies remains a bottleneck. AI automation—specifically active learning—offers a path from tedious manual work to efficient, transparent workflows. This post translates the core mechanics of active learning into a practical implementation using Rayyan and ASReview.

Why Standard Screening Fails Niche Fields

In narrow research domains, relevant records are rare, often fewer than 1% of the total retrieved. This imbalance cripples traditional keyword-based screening: you waste hours scanning irrelevant titles. Active learning addresses the problem in two ways. It continuously re-ranks the unscreened records so that the ones the model considers most informative or most likely relevant reach you first, and it can be paired with a dynamic resampling strategy that rebalances the training data so the overwhelming majority of irrelevant records does not drown out the few relevant ones.
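To make the imbalance concrete, here is a minimal sketch of rebalancing a labeled set before training. It is a simplified stand-in for dynamic resampling, not ASReview's actual algorithm; the function name rebalance, the target_ratio parameter, and the toy records are all hypothetical.

```python
import random

def rebalance(labeled, target_ratio=0.5, seed=0):
    """Oversample the rare 'relevant' class (label 1) until it makes up
    roughly target_ratio of the training set. Simplified illustration only,
    not ASReview's dynamic resampling. Requires 0 < target_ratio < 1."""
    rng = random.Random(seed)
    relevant = [r for r in labeled if r[1] == 1]
    irrelevant = [r for r in labeled if r[1] == 0]
    if not relevant or not irrelevant:
        return list(labeled)  # nothing to balance against
    # Relevant rows needed so that relevant / total is about target_ratio
    needed = round(target_ratio * len(irrelevant) / (1 - target_ratio))
    extra = max(0, needed - len(relevant))
    training = relevant + [rng.choice(relevant) for _ in range(extra)] + irrelevant
    rng.shuffle(training)
    return training

# Toy labeled set: 2 relevant records among 200 (1%)
labeled = [(f"record-{i}", 1 if i < 2 else 0) for i in range(200)]
before = sum(label for _, label in labeled) / len(labeled)
balanced = rebalance(labeled)
after = sum(label for _, label in balanced) / len(balanced)
print(f"relevant share before: {before:.1%}, after: {after:.1%}")
```

Oversampling only duplicates the relevant records you already have; it changes how much weight the classifier gives them, not how much information they carry.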

The Active Learning Engine: What Happens Under the Hood

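ASReview is built around an active learning loop, and Rayyan’s relevance predictions follow a similar learn-from-your-decisions pattern, although its backend is proprietary. Here is the simplified theory behind the tools: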

Feature Extraction: Text from titles and abstracts is converted into numerical vectors. TF-IDF (Term Frequency-Inverse Document Frequency) is a robust, lightweight method that works well for scientific writing, capturing key terms without being overwhelmed by common words.
Model: A classifier predicts relevance for each unseen record. Naive Bayes is a fast and often effective starting point for text classification, especially when you have limited labeled data. It handles the sparse, high-dimensional space of TF-IDF vectors efficiently.
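Query Strategy: The system chooses which records to show you next. Uncertainty sampling is the classic approach: it selects the records the model is most unsure about (e.g., a predicted relevance score near 50%), so your screening effort goes to the most informative cases and the model learns faster. The alternative is certainty-based (maximum) sampling, ASReview’s default, which shows the records most likely to be relevant first and therefore favors finding relevant studies early.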
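The three components above can be sketched as a single round of active learning with scikit-learn. This is an illustration under stated assumptions, not the code either tool runs: the abstracts and labels are invented, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus (hypothetical abstracts). Records 0-3 are already screened.
abstracts = [
    "active learning for systematic review screening",
    "machine learning prioritises abstracts in reviews",
    "soil nitrogen content in alpine meadows",
    "bridge corrosion under marine conditions",
    "text classification to screen review abstracts",
    "nitrogen cycling in meadows soil",
    "learning outcomes in marine biology courses",
]
labels = {0: 1, 1: 1, 2: 0, 3: 0}  # record index -> 1 relevant, 0 irrelevant

# 1. Feature extraction: TF-IDF vectors for every record
X = TfidfVectorizer().fit_transform(abstracts)

labeled_idx = sorted(labels)
pool_idx = [i for i in range(len(abstracts)) if i not in labels]

# 2. Model: Naive Bayes trained on the screened records only
clf = MultinomialNB()
clf.fit(X[labeled_idx], [labels[i] for i in labeled_idx])

# 3. Query strategy: score the unscreened pool, then pick the next record
proba = clf.predict_proba(X[pool_idx])[:, 1]  # column 1 = P(relevant)
uncertain = pool_idx[int(np.argmin(np.abs(proba - 0.5)))]  # closest to 50%
certain = pool_idx[int(np.argmax(proba))]                  # highest score

for i, p in zip(pool_idx, proba):
    print(f"record {i}: P(relevant) = {p:.2f}")
print("uncertainty sampling would show record", uncertain)
print("certainty-based sampling would show record", certain)
```

In a real loop you would label the chosen record, add it to labels, refit the classifier, and repeat until your stopping rule fires.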

Step-by-Step Implementation in Two Tools

Rayyan (Web-Based)

1. Import your RIS/BibTeX file.
2. Start screening; Rayyan’s AI (using a proprietary model) flags records as likely relevant.
3. Use the “Show me uncertain” filter, which implements uncertainty sampling. You review the borderline cases first.
4. Monitor the “AI predictions” pane to see confidence scores. Stop screening when new records are all predicted as irrelevant with high confidence.
Rayyan hides its backend, but this manual filtering mimics active learning.

ASReview (Open-Source, Python or GUI)

1. Install ASReview LAB (GUI) or use the Python API.
2. Load your dataset.
3. Configure the model: select Naive Bayes as the classifier, TF-IDF for feature extraction, and Uncertainty sampling as the query strategy.
4. Run the simulation (or real interactive screening). ASReview applies a balance strategy such as dynamic resampling to handle class imbalance.
5. Decide on a stopping criterion. In simulation mode, where the labels are known, you can measure how many records must be screened to reach a recall threshold (e.g., 95%). In live screening, true recall is unknown, so you need a proxy rule defined in advance, such as a fixed number of consecutive irrelevant records.
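During live screening there is no ground truth, so recall cannot be observed directly and any stopping rule has to rely on a proxy. One commonly used heuristic is to stop after a long run of consecutive irrelevant records. The sketch below illustrates that idea; should_stop and its window of 50 are hypothetical choices, not an ASReview function or a recommended setting.

```python
def should_stop(decisions, window=50):
    """Return True once the most recent `window` screening decisions
    were all irrelevant (0). `decisions` is the list of labels (1/0)
    in the order you made them."""
    if len(decisions) < window:
        return False  # not enough evidence yet
    return not any(decisions[-window:])

# Hypothetical screening history: relevant hits early, then a long dry spell
history = [1, 0, 1, 1, 0, 0, 1] + [0] * 49
print(should_stop(history))        # 49 irrelevant in a row: keep going
print(should_stop(history + [0]))  # 50 irrelevant in a row: rule fires
```

Fix the window before you start screening and report it in your methods section, so the stopping decision is auditable rather than ad hoc.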

Practical Verification

Don’t trust the AI blindly. Use ASReview’s “Simulation Mode” to test on a small pre-labeled set (e.g., 50 records) before full deployment. In Rayyan, manually verify a random 10% of excluded records to check for false negatives. Both tools allow export of screening decisions, including AI confidence scores, for audit trails.
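Both checks can be scripted once you have exported your decisions. The sketch below assumes you have the true labels in the order the tool presented the records (for the pilot) and a list of excluded record IDs (for the audit); the function names, the 0.95 target, and the toy data are hypothetical.

```python
import math
import random

def records_needed_for_recall(ranked_labels, target=0.95):
    """ranked_labels: true labels (1/0) in presentation order.
    Returns how many records had to be screened to find at least
    `target` of all relevant records (target between 0 and 1)."""
    total = sum(ranked_labels)
    if total == 0:
        raise ValueError("no relevant records in the pre-labeled set")
    needed = math.ceil(target * total)
    found = 0
    for n, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            return n

def audit_sample(excluded_ids, fraction=0.10, seed=42):
    """Draw a reproducible random sample of excluded records to re-check."""
    rng = random.Random(seed)
    k = min(len(excluded_ids), max(1, round(fraction * len(excluded_ids))))
    return rng.sample(excluded_ids, k)

# Pilot: 50 pre-labeled records, 4 relevant, in the order the tool showed them
pilot = [1, 1, 0, 1, 0, 0, 1] + [0] * 43
n = records_needed_for_recall(pilot)
print(f"screened {n} of {len(pilot)} to reach 95% recall "
      f"({1 - n / len(pilot):.0%} of screening avoided)")

# Audit: re-check about 10% of the records excluded in the real review
excluded = list(range(100, 143))  # 43 hypothetical record IDs
print("re-check these records:", audit_sample(excluded))
```

A pilot this small gives only a rough signal. With four relevant records, missing a single one drops recall to 75%, so treat the result as a sanity check rather than a performance guarantee.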

The shift from theory to practice requires understanding what the tools do and what they hide. By choosing a sensible active learning configuration (TF-IDF + Naive Bayes + uncertainty sampling) and handling imbalance with dynamic resampling, you can reduce screening time, potentially by 60–80% depending on the dataset, while maintaining rigor. Start with a small pilot, benchmark your recall, then scale confidently.

For a comprehensive guide with detailed workflows, templates, and additional strategies, see my e-book: AI for Niche Academic Researchers: How to Automate Systematic Literature Review Screening and Data Extraction.