To predict candidate sRNAs, both strands of the 1396 intergenic regions (IGs) at least 50 nucleotides in length in the N. europaea genome (Chain et al., 2003) were analyzed using a computational approach that integrates primary sequence information and comparative genomics analysis (Tjaden, 2008a, b). Briefly, candidate ρ-independent transcription terminators in the N. europaea genome were predicted using the program transtermhp (Kingsford et al., 2007). For the comparative genomics analysis, evidence of base-pair substitutions that conserve the sRNA secondary structure was identified by comparing both strands
of each of the 1396 IGs of the N. europaea genome with the following betaproteobacterial genomes: Acidovorax JS42, Bordetella bronchiseptica, Burkholderia pseudomallei K96243, Herminiimonas arsenicoxydans, Methylobacillus flagellatus KT, Neisseria meningitidis MC58, Nitrosomonas eutropha C91, Nitrosospira multiformis ATCC 25196, Polaromonas JS666, and Ralstonia solanacearum. For the 15 IGs predicted to contain likely sRNAs, alignments and covarying residues evincing the conserved
RNA secondary structure are shown (Supporting Information, Fig. S1). The 15 predicted sRNA sequences were then searched against the Rfam model library (Griffiths-Jones et al., 2005). Following the Rfam search methodology, each sequence was scanned against the library of Rfam sequences using wu-blast with an E-value threshold of 1.0. Any matches were then scanned against the corresponding covariance model using the Rfam threshold for that family of sequences.

Data from 42 N. europaea Affymetrix microarrays were obtained from the Gene Expression Omnibus (Edgar et al., 2002). The experimental data for these microarrays were derived from cells exposed to chloroform, chloromethane (Gvakharia et al., 2007),
zinc, cadmium, cyanide (Park & Ely, 2008, 2009), benzene, or toluene (Radniecki et al., 2008), and from all the corresponding controls. Tiled oligonucleotide probes on the arrays assayed each of the 2461 protein-coding genes as well as one strand of 1042 IGs of the N. europaea genome. Data from all microarray experiments were normalized so that the median intensity was the same across all arrays.

The GeneRacer® Core Kit (Invitrogen, Carlsbad, CA) was used to confirm the expression and full length of the transcripts of two selected psRNAs (psRNA5 and psRNA11). RNA extracted from chloromethane-treated cells was used to map the 5′- and 3′-ends of the transcripts. cDNA was generated by reverse transcription of the RNA with SuperScript Reverse Transcriptase (Invitrogen). To distinguish primary transcript 5′-ends from internal 5′-processing sites, we analyzed the RNAs by 5′-rapid amplification of cDNA ends (RACE), with and without treatment with tobacco acid pyrophosphatase (TAP).
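
As an illustration of the microarray normalization step described above, the following minimal Python sketch rescales each array so that all arrays share the same median intensity. It is not the authors' code. The function name `median_normalize`, the use of multiplicative scaling, and the choice of the median of the per-array medians as the common target are all assumptions, since the text states only that the medians were made equal across arrays. The intensities are toy values, not data from the study.

```python
from statistics import median


def median_normalize(arrays, target=None):
    """Scale each array so that its median intensity equals a common target.

    arrays: list of lists of probe intensities, one inner list per microarray.
    target: common median to scale to. If omitted, the median of the per-array
            medians is used (an assumption; the text does not specify a target).
    """
    medians = [median(a) for a in arrays]
    if any(m <= 0 for m in medians):
        raise ValueError("each array needs a positive median intensity")
    if target is None:
        target = median(medians)
    # Multiplicative scaling (assumed): every value on an array is multiplied
    # by the same factor, so the array's median becomes `target`.
    return [[x * target / m for x in a] for a, m in zip(arrays, medians)]


# Toy example with three hypothetical arrays (not real data).
toy = [[10.0, 20.0, 30.0], [100.0, 200.0, 300.0], [1.0, 4.0, 9.0]]
normalized = median_normalize(toy)
```

After this scaling, every array in `normalized` has the same median, so intensities for a given gene or IG can be compared across treatments and controls on a common scale.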