Palindromes in 1000 Genomes

A reference catalog of DNA palindromes in the human genome and their variations in 1000 Genomes

Madhavi K. Ganapathiraju, Sandeep Subramanian, Srilakshmi Chaparala and Kalyani B. Karunakaran.
Human Genome Variation 7, 40 (2020). https://doi.org/10.1038/s41439-020-00127-5

Abstract: A palindrome in DNA is like a palindrome in language, but when read backwards, it is a complement of the forward sequence; effectively, the two halves of a sequence complement each other from its midpoint like in a double strand of DNA. Palindromes are distributed throughout the human genome and play significant roles in gene expression and regulation. Palindromic mutations are linked to many human diseases, such as neuronal disorders, mental retardation, and various cancers. In this work, we computed and analyzed the palindromic sequences in the human genome and studied their conservation in personal genomes using 1000 Genomes data. We found that ~30% of the palindromes exhibit variation, some of which are caused by rare variants. The analysis of disease/trait-associated single-nucleotide polymorphisms in palindromic regions showed that disease-associated risk variants are 14 times more likely to be present in palindromic regions than in other regions. The catalog of palindromes in the reference genome and 1000 Genomes is being made available here with details on their variations in each individual genome to serve as a resource for future and retrospective whole-genome studies identifying statistically significant palindrome variations associated with diseases or traits and their roles in disease mechanisms.
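
The definition in the abstract (a sequence that equals its own reverse complement) can be illustrated with a short Python sketch. This is only an illustration and not the pipeline used to build the catalog: the function names, the restriction to the unambiguous bases A/C/G/T, the absence of any spacer between the two halves, and the length bounds are all assumptions made for this example.

```python
from typing import Iterator, Tuple

# Watson-Crick complement for unambiguous DNA bases (assumption: no N or IUPAC codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if seq reads the same as its reverse complement.

    Empty strings and sequences with characters other than A/C/G/T
    are treated as non-palindromic in this sketch.
    """
    s = seq.upper()
    if not s or set(s) - set("ACGT"):
        return False
    return s == reverse_complement(s)


def find_palindromes(seq: str, min_len: int = 4, max_len: int = 12) -> Iterator[Tuple[int, int]]:
    """Yield (start, length) for every window of seq that is a DNA palindrome.

    Only even lengths are tried: in a spacer-free palindrome the middle base
    of an odd-length window would have to be its own complement, which is
    impossible. The length bounds are hypothetical defaults for illustration.
    """
    s = seq.upper()
    first_len = min_len + (min_len % 2)  # round up to the next even length
    for start in range(len(s)):
        for length in range(first_len, max_len + 1, 2):
            if start + length > len(s):
                break
            if is_dna_palindrome(s[start:start + length]):
                yield (start, length)
```

For example, `is_dna_palindrome("GAATTC")` returns `True` because the reverse complement of GAATTC is again GAATTC, while `is_dna_palindrome("GATTACA")` returns `False`. The brute-force scan in `find_palindromes` is quadratic in the window range and is meant for short sequences only; a genome-scale search would need a more efficient approach.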

Full Catalog (5 GB tar.gz file) (Supplementary File 3) of palindromes in the reference genome and their variations in 1000 Genomes. The README file contains all the necessary information, though it can still be improved; we plan to do that by the end of December. Write to Madhavi if you need files for individual chromosomes or any other related information.

DNA Palindromes
Some students on internships used the palindrome data from the reference genome to gain experience in genome sequence studies. Some of this work resulted in the abstracts and short papers listed below:

  • Abstract by High School Students (Cheng, Gupta, Hammond):
    Sophia Cheng, Ritwik Gupta, Tonya Hammond, LC Viswanathan and MK Ganapathiraju.
    Distribution of Palindromes in the Human Genome
    Journal of Pathology Informatics 5 (12), 2014. (see abstract in the collection).
  • Extended abstract by Undergrad Intern (Li):
    Helen Li, Aman Gupta and MK Ganapathiraju.
    DNA Palindromes in Human Genome
    2015 Summit on Translational Bioinformatics March 245-27, 2015.
  • Paper on palindromes in cancer genomes in comparison to 1000 Genomes:
    Subramanian S, Chaparala S, Avali V, Ganapathiraju MK.
    A pilot study on the prevalence of DNA palindromes in breast cancer genomes
    BMC Medical Genomics 9 (Suppl 3), Article No. 73, Dec 5, 2016. PMCID: PMC5260791.