Background: Transcriptome sequencing data can contain up to 30% contaminating reads. These may originate from laboratory contamination or from biologically relevant sources, the latter being amenable to metatranscriptomic analysis. Aim: to evaluate the utility of contaminating reads for large-scale screening of plant pests and symbionts.
Materials and methods: We analyzed RNA-seq data from rye (Secale cereale L.), comprising five in-house accessions and 50 public datasets from the NCBI Sequence Read Archive (SRA). Reads that mapped well to the rye genome were filtered out, retaining putative contaminants for downstream analysis.
Results: After removing laboratory contaminants, we compared reads derived from aphids, symbiotic fungi, bacteria, and viruses across accessions. Symbiome-derived reads were reproducible across biological replicates and varied by location, condition, and plant species, enabling post-hoc metatranscriptomic analysis.
Conclusion: Contaminating reads correlated with field-observed species or expected symbionts. Distribution patterns across accessions support repurposing existing and future sequencing data to screen for plant pests, monitor symbiotic organisms, and plan eradication strategies amid global climate change.
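Illustration: the host-filtering step described under Materials and methods can be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: it assumes the reads were aligned to the rye genome and written as SAM text, and it treats a read as a putative contaminant if it is unmapped or its mapping quality falls below a cutoff. The function name `putative_contaminants`, the cutoff `MIN_HOST_MAPQ = 20`, and the example records are hypothetical; the source does not state how "good mapping" was defined.

```python
from typing import Iterable, Iterator

UNMAPPED_FLAG = 0x4   # SAM FLAG bit meaning "segment unmapped"
MIN_HOST_MAPQ = 20    # hypothetical cutoff for "good mapping"; not from the source


def putative_contaminants(
    sam_lines: Iterable[str], min_mapq: int = MIN_HOST_MAPQ
) -> Iterator[str]:
    """Yield names of reads that did NOT map well to the host genome.

    Simplification: each alignment record is judged on its own, so
    paired-end mates and secondary alignments are not reconciled.
    """
    for line in sam_lines:
        # Skip SAM header lines and blank lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # SAM columns: QNAME (0), FLAG (1), RNAME (2), POS (3), MAPQ (4), ...
        name, flag, mapq = fields[0], int(fields[1]), int(fields[4])
        if flag & UNMAPPED_FLAG or mapq < min_mapq:
            yield name


# Made-up records for illustration only.
example = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1R\t100\t60\t50M\t*\t0\t0\t*\t*",  # confident host hit: dropped
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",           # unmapped: kept
    "read3\t0\tchr2R\t500\t3\t50M\t*\t0\t0\t*\t*",   # low MAPQ: kept
]
print(list(putative_contaminants(example)))  # ['read2', 'read3']
```

The retained read names would then be passed to taxonomic classification; choosing a stricter or looser cutoff trades host carry-over against loss of reads from organisms with rye-like sequences.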