Machine learning models accurately predict T-DNA insertion into plant genomes

Authors

  • Sawyer H. Smith, Park University
  • Azin Agah, Park University

DOI:

https://doi.org/10.47611/jsr.v13i3.2542

Keywords:

Agrobacterium, Machine Learning

Abstract

Agrobacterium tumefaciens is a Gram-negative bacterium of the family Rhizobiaceae, known for its pathogenic ability to induce a neoplastic response in over 100 plant species, often leading to a significant decline in individual plant health. Tumors are induced by a segment of DNA, the T-DNA, which is carried on the bacterium’s Ti plasmid and integrated into the host genome. The T-DNA is oncogenic: it encodes enzymes that increase the production of certain plant hormones, ultimately leading to tumor formation. The ability of T-DNA to integrate into plant genomes has made it a common tool for genetic transformation in plants. Although T-DNA insertion has been documented to occur at double-strand breaks, the mechanism of insertion remains elusive. Currently, the point at which the T-DNA is inserted into the host genome is believed to be somewhat random with respect to the surrounding sequences, and uncontrolled insertion at multiple sites appears to be common. In this study, we used machine learning algorithms to assess which nucleotide sequences are important for integration of T-DNA from the Ti plasmid into the host genome. Several machine learning algorithms yielded high-accuracy models when given sequence data alone.
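
The abstract does not state which features or algorithms were used, so the following is only a minimal sketch of how a classifier can be trained on sequence data alone. It assumes k-mer frequency features and a nearest-centroid classifier, both chosen here for illustration rather than taken from the study. The function names, the labels, and the toy sequences are all hypothetical.

```python
from itertools import product


def kmer_features(seq, k=2):
    """Return normalized k-mer frequencies for a DNA sequence, in a fixed order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on very short input
    return [counts[m] / total for m in kmers]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train(sequences, labels, k=2):
    """Build one centroid per class label from the k-mer features."""
    by_label = {}
    for seq, label in zip(sequences, labels):
        by_label.setdefault(label, []).append(kmer_features(seq, k))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, seq, k=2):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    x = kmer_features(seq, k)

    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(model, key=lambda label: dist(model[label]))


# Toy, made-up sequences purely for illustration (not data from the study).
# Their base compositions are arbitrary and imply nothing about real insertion sites.
toy_sequences = ["ATATATATTA", "TTATAATATA", "GCGCGGCCGC", "CGCGCCGGCG"]
toy_labels = ["site", "site", "background", "background"]
model = train(toy_sequences, toy_labels)
```

A call such as predict(model, some_sequence) then returns one of the two toy labels. A real analysis would need labeled flanking sequences from known insertion sites, a held-out test set, and a comparison of several algorithms before any accuracy could be reported.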

Author Biography

Azin Agah, Park University

Department of Chemistry

Published

08-31-2024

How to Cite

Smith, S., & Agah, A. (2024). Machine learning models accurately predict T-DNA insertion into plant genomes. Journal of Student Research, 13(3). https://doi.org/10.47611/jsr.v13i3.2542

Section

Research Articles