A SET OF ALU-FREE FREQUENT DECAMERS FROM MAMMALIAN GENOMES ENRICHED IN TRANSCRIPTION FACTOR SIGNALS

VOLINIA, Stefano; SCAPOLI, Chiara
1994

Abstract

We recently reported that statistical analysis of the frequency distribution of short oligonucleotides within mammalian and viral genomes can produce sets of DNA sequences enriched in signals for transcription factors. Such statistical approaches could facilitate the identification of new promoter regions that play a role in the transcriptional regulation of gene expression. For mammalian oligonucleotides, we found that the published set of frequent decamers enriched in transcriptional motifs is not suitable for studies on genes of Homo sapiens and evolutionarily related genomes, because it contains decameric sequences that belong to genomic repeats. We report here that most of these repeat-derived decamers belong to Alu repeats. Accordingly, we produced a subset of Alu-free frequent decamers. From this subset we also eliminated the decamers that are frequently present within other common human repeats, including (GT)n, (AT)n, (CA)n, (ATT)n, (CAA)n and (GTT)n. The resulting Alu-free (repeats-free) subset of frequent mammalian decamers is enriched in signals for transcription factors. It allows the identification of putative signals in genes, such as those coding for plasminogen activator, adenosine deaminase and p53, that contain a large number of Alu-like repeats interspersed within their genomic sequences. The newly generated compilation of frequent decamers described here might be used to locate genomic regions that play functional roles in the expression of genes of Homo sapiens and related primates.
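
The repeat-filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract speaks of decamers "frequently present" within repeats, whereas this sketch simply drops any decamer that occurs exactly in a tandem repeat of the listed units or in a user-supplied repeat sequence. The function names, the `repeat_seq` parameter (a stand-in for an Alu consensus, which is not given here), and the decision to treat both DNA strands as equivalent are all assumptions made for illustration.

```python
K = 10  # decamer length, as in the abstract


def repeat_kmers(unit, k=K):
    """Return every distinct k-mer found inside an unbroken tandem repeat of `unit`."""
    tract = unit * (k // len(unit) + 2)  # long enough to cover every phase
    return {tract[i:i + k] for i in range(len(unit))}


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def filter_repeat_free(decamers, units, repeat_seq=""):
    """Keep only the decamers that do not occur in the given repeats.

    `units` are simple-repeat units such as "GT" for (GT)n.
    `repeat_seq` is a hypothetical placeholder for a longer repeat
    sequence (for example an Alu consensus supplied by the caller).
    Both strands are treated as equivalent, which is an assumption.
    """
    blocked = set()
    for unit in units:
        blocked |= repeat_kmers(unit)
    for i in range(len(repeat_seq) - K + 1):
        blocked.add(repeat_seq[i:i + K])
    blocked |= {reverse_complement(s) for s in blocked}
    return [d for d in decamers if d not in blocked]


# Simple-repeat units named in the abstract
SIMPLE_REPEAT_UNITS = ["GT", "AT", "CA", "ATT", "CAA", "GTT"]
```

A call such as `filter_repeat_free(candidates, SIMPLE_REPEAT_UNITS, repeat_seq=alu)` would then return the candidates that survive the filter, where `candidates` and `alu` are values the caller has to provide. A frequency threshold, as implied by the abstract, would need occurrence counts that this sketch does not model.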
1994
Gambari, R; Volinia, Stefano; Nesti, C; Scapoli, Chiara; Barrai, I.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/463100