Date of Award

Spring 2022

Document Type

Open Access Dissertation


Biological Sciences

First Advisor

Bert Ely


DNA replication, recombination, and repair maintain bacterial genome stability, but these processes may also induce genome rearrangements that lead to inter- and intra-chromosomal structural variations. The genus Caulobacter undergoes extensive genome rearrangements. Genomic studies in bacteria usually focus on the coding regions, but the intergenic DNA contains important information in addition to the regulatory elements involved in transcription. Recently, Ely published a new model for recombination in the genus Caulobacter in which the simultaneous loss and gain of genes results from preferential recombination at non-homologous regions flanked by regions of homology. In my dissertation, I observed and catalogued hairpin structures at known sites of recombination in species both closely and distantly related to Caulobacter crescentus strain NA1000. To automate the process of identifying conserved base patterns in long sequences in bacterial genomes, I developed an unsupervised machine-learning pipeline using agglomerative clustering. These analyses identified sequences capable of forming hairpins at the previously identified recombination hotspots. When additional Caulobacter genomes were examined, an increase in phylogenetic distance led to a decrease in the number of hairpins matching those of the model organism Caulobacter crescentus NA1000, with most of the differences seen in the loop sequence of the hairpin. I also observed that stem structures tend to remain consistent across species, although we did observe changes in either their length or their bases. These changes may be due to differences in sequence conservation as an outcome of phylogenetic distance. The presence of these hairpin structures seems to have been conserved at sites of recombination, suggesting that they may play a role in initiating recombination by acting as substrates. It has also previously been shown that Caulobacter crescentus uses Rho-dependent termination machinery under stress.
We identified some of the hairpin structures at sites of both Rho-dependent and Rho-independent termination in the genus Caulobacter and compared them with structures previously identified using ARNold for intrinsic termination and RhoTermPredict for Rho-dependent termination. Our hairpin structures matched the ones identified with ARNold, but RhoTermPredict is designed for genomes with low GC content. It identified six times as many RUT sites as there were genes, which limited our confirmation of Rho-dependent terminators.
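
As a rough illustration of what "sequences capable of forming hairpins" means computationally, the sketch below scans a DNA string for a stem followed, after a short loop, by the stem's reverse complement. It is not the dissertation's pipeline, which the abstract describes only as unsupervised agglomerative clustering. The function names and the parameters `stem_len`, `min_loop`, and `max_loop` are hypothetical choices made for this example, and only perfect Watson-Crick stems are considered.

```python
# Minimal hairpin (stem-loop) candidate finder. Illustrative assumption only:
# fixed stem length, perfect base pairing, no mismatches or G-T wobble pairs.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem_len=6, min_loop=3, max_loop=8):
    """Return (start, stem, loop) for every perfect stem-loop candidate.

    A candidate is a stem of `stem_len` bases, a loop of `min_loop` to
    `max_loop` bases, and then the reverse complement of the stem.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem = seq[start:start + stem_len]
        target = reverse_complement(stem)
        for loop_len in range(min_loop, max_loop + 1):
            right = start + stem_len + loop_len  # where the 3' arm would begin
            if right + stem_len > len(seq):
                break  # longer loops would run past the end of the sequence
            if seq[right:right + stem_len] == target:
                hits.append((start, stem, seq[start + stem_len:right]))
    return hits


# A toy sequence: stem GCCGCC, loop TTCG, then the stem's reverse complement.
example = "AAGCCGCCTTCGGGCGGCAA"
print(find_hairpins(example))
```

Reporting the stem and the loop separately mirrors the comparison described above, in which the stems tend to stay consistent across species while most of the differences appear in the loop sequence.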


© 2022, Geetha Saarunya Clarke

Included in

Biology Commons