Data and Materials

  A. Data Sets

  Most of the genome data sets (Human, Chimpanzee, Orangutan, Rhesus, Cow, Horse, Platypus, Dog, Cat, Chicken, ZebraFinch, Opossum, Mouse, Rat, X. tropicalis, Medaka, Zebrafish) used in our analysis were obtained from the UCSC Genome Browser database (http://genome.ucsc.edu; March 2006 release, version hg18). The alignments were produced by BLAT by aligning mRNA sequences to the genome; for human, the assembly is National Center for Biotechnology Information (NCBI) Build 36.1, March 2006 freeze.

Database genomes

No.  Species        No. of Overlapping   No. of Genes
                    Gene Pairs           (UCSC Genome based)
1    Human          10,120               29,772
2    Chimpanzee     10,026               29,948
3    Orangutan      853                  3,400
4    Rhesus         213                  542
5    Cow            3,595                10,208
6    Horse          99                   359
7    Platypus       1                    24
8    Dog            213                  944
9    Cat            25                   459
10   Chicken        1,451                4,363
11   ZebraFinch     3                    47
12   Opossum        115                  171
13   Mouse          8,667                22,286
14   Rat            6,892                15,667
15   X. tropicalis  535                  9,431
16   Medaka         50                   397
17   Zebrafish      1,805                14,580

  Find Common Overlapping Pairs    

  B. Selection of Orientation-Reliable Transcripts

  To reduce the workload and improve the mapping quality, we first selected the transcripts whose sense orientation was reliable. All imperfect alignments and ambiguous multiple alignments were removed. Transcript sequences that aligned to more than one genomic fragment were discarded as suspected chimeras.
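The filtering described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the `Alignment` record, its field names, and the function `select_reliable` are hypothetical, and two assumptions are made that the text does not state. First, an alignment is treated as "perfect" only when every base of the transcript matches. Second, any transcript that appears in more than one alignment record is treated as a multiple alignment or suspected chimera and dropped entirely.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one mRNA-to-genome alignment."""
    transcript: str   # mRNA identifier
    chrom: str        # chromosome or scaffold name
    strand: str       # '+' or '-'
    start: int        # genomic start
    end: int          # genomic end
    matches: int      # number of matching bases
    query_size: int   # length of the mRNA sequence


def is_perfect(aln: Alignment) -> bool:
    # Assumption: "perfect" means every base of the transcript matched.
    return aln.matches == aln.query_size


def select_reliable(alignments: list[Alignment]) -> list[Alignment]:
    """Keep transcripts that align exactly once and align perfectly."""
    # Count how many alignment records each transcript has.
    counts = Counter(a.transcript for a in alignments)
    return [
        a for a in alignments
        if counts[a.transcript] == 1 and is_perfect(a)
    ]


if __name__ == "__main__":
    example = [
        Alignment("mRNA1", "chr1", "+", 100, 500, 400, 400),
        Alignment("mRNA2", "chr1", "-", 300, 800, 480, 500),   # imperfect
        Alignment("mRNA3", "chr2", "+", 10, 210, 200, 200),    # aligned twice
        Alignment("mRNA3", "chr5", "+", 900, 1100, 200, 200),  # aligned twice
    ]
    print([a.transcript for a in select_reliable(example)])  # ['mRNA1']
```

A looser identity threshold or a best-hit rule could be substituted for `is_perfect` if near-perfect alignments are to be kept; the strict version is shown only because the text says that all imperfect alignments were removed.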

  C. Search for Overlapping Genes from Transcript Sequences

  We used only the mRNA-to-genome alignment data from the UCSC Genome Browser database and did not include ESTs, in order to obtain high-quality data for each genome. The reason is that EST sequences can produce artifactual overlapping genes because of chimeric sequences, mislabeling, and genomic sequence contamination. Furthermore, EST sequences have been found unreliable for the detection of antisense transcripts, again because the orientation of EST sequences is difficult to assign. We therefore relied on the genome-to-mRNA alignment data computed in the UCSC Genome Browser database, in which each sequence of the mRNA set is mapped onto the genome sequence. We searched for overlapping genes among genes that are transcribed from opposite strands of the same genomic locus, and we listed all of the putative overlapping genes mapped on the genome.
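The search for genes on opposite strands of the same locus can be illustrated with a simple interval sweep. This is a sketch under stated assumptions, not the procedure actually used: the `Gene` record and the function `find_antisense_overlaps` are hypothetical names, coordinates are assumed to be half-open intervals (start inclusive, end exclusive), and two genes are counted as overlapping when their genomic spans share at least one base, without regard to exon structure.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for one mapped gene or transcript."""
    name: str
    chrom: str
    strand: str  # '+' or '-'
    start: int   # inclusive (assumed half-open coordinates)
    end: int     # exclusive


def find_antisense_overlaps(genes: list[Gene]) -> list[tuple[str, str]]:
    """Return name pairs of genes that overlap on opposite strands."""
    pairs = []
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start, g.end))
    for _, group in groupby(ordered, key=lambda g: g.chrom):
        active = []  # genes whose span may still reach the current start
        for gene in group:
            # Drop genes that end at or before the current gene starts.
            active = [a for a in active if a.end > gene.start]
            for other in active:
                if other.strand != gene.strand:
                    pairs.append((other.name, gene.name))
            active.append(gene)
    return pairs


if __name__ == "__main__":
    example = [
        Gene("A", "chr1", "+", 100, 500),
        Gene("B", "chr1", "-", 400, 900),
        Gene("C", "chr1", "+", 450, 600),
        Gene("D", "chr2", "-", 100, 500),
        Gene("E", "chr1", "-", 500, 700),
    ]
    # A/E only touch at position 500, so they are not reported.
    print(find_antisense_overlaps(example))
    # [('A', 'B'), ('B', 'C'), ('C', 'E')]
```

Sorting by start position keeps the comparison local to genes that can still overlap, which matters for genomes with tens of thousands of mapped genes. Same-strand overlaps are skipped here because the text restricts the search to opposite strands.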


Graphics Application Lab., Dept. of Computer Science,
Molecular Biology & Phylogeny Lab.,
Pusan National University
San-30, Jangjeon-dong, Keumjeong-gu, Pusan, 609-735, South Korea.
Phone: +82-51-582-5009 Fax: +82-51-515-2208