In recent years, several initiatives have been launched to promote universal access to the Information and Knowledge Society. It was in this context that the Online Knowledge Library (b-on) was launched in Portugal in 2004, making it easier to access full-text international scientific publications. This study presents and analyses some of the statistical and bibliometric indicators of Portuguese scientific output, seeking to evaluate its link with b-on. We analysed the usage of b-on resources by the public universities that are members of the consortium from 2004 to 2010, and chose as our sample the five universities with the most downloads per FTE (full-time equivalent). In addition to the consortium's usage data, we used the Web of Science (WoS), from which we identified the articles indexed between 2000 and 2010 with affiliations to these five Portuguese universities. Through a quantitative methodology, we identified, among other indicators, the scientific production per subject area, international cooperation and the scientific journals with the highest number of published articles. We conclude that the availability of and access to electronic resources contributes to the increase in the scientific productivity of the universities, and that the study and analysis of their use and output are essential.
High density oligonucleotide arrays have been used extensively for expression studies of eukaryotic organisms. We have designed a prokaryotic high density oligonucleotide array using the complete Escherichia coli genome sequence to monitor expression levels of all genes and intergenic regions in the genome. Because previously described methods for preparing labeled target nucleic acids are not useful for prokaryotic cell analysis using such arrays, a mRNA enrichment and direct labeling protocol was developed together with a cDNA synthesis protocol. The reproducibility of each labeling method was determined using high density oligonucleotide probe arrays as a read-out methodology and the expression results from direct labeling were compared to the expression results from the cDNA synthesis. About 50% of all annotated E.coli open reading frames are observed to be transcribed, as measured by both protocols, when the cells were grown in rich LB medium. Each labeling method individually showed a high degree of concordance in replica experiments (95 and 99%, respectively), but when each sample preparation method was compared to the other, ∼32% of the genes observed to be expressed were discordant. However, both labeling methods can detect the same relative gene expression changes when RNA from IPTG-induced cells was labeled and compared to RNA from uninduced E.coli cells.
Data from gene expression arrays are influenced by many experimental parameters that lead to variations not simply accessible by standard quantification methods. To compare measurements from gene expression array experiments, quantitative data are commonly normalised using reference genes or global normalisation methods based on mean or median values. These methods are based on the assumption that (i) selected reference genes are expressed at a standard level in all experiments or (ii) that mean or median signal of expression will give a quantitative reference for each individual experiment. We introduce here a new ranking diagram, with which we can show how the different normalisation methods compare, and how they are influenced by variations in measurements (noise) that occur in every experiment. Furthermore, we show that an upper trimmed mean provides a simple and robust method for normalisation of larger sets of experiments by comparative analysis.
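The upper-trimmed-mean idea described above can be sketched in a few lines: scale each experiment so that its mean, computed after discarding the highest signals, matches a common reference. The helper below is a hypothetical illustration of the principle, not the authors' published procedure; the function name, the trim fraction and the toy data are my own.

```python
# Sketch of upper-trimmed-mean normalisation for expression arrays
# (illustrative only; not the authors' published code).
import numpy as np

def trimmed_mean_normalise(arrays, trim_fraction=0.2):
    """Scale each array so its upper trimmed mean matches a common target.

    `arrays` is a 2-D array: rows = experiments, columns = genes.
    The top `trim_fraction` of signals is discarded before averaging,
    making the reference robust to a few saturated or outlier spots.
    """
    arrays = np.asarray(arrays, dtype=float)
    refs = []
    for row in arrays:
        cutoff = np.quantile(row, 1.0 - trim_fraction)
        refs.append(row[row <= cutoff].mean())   # upper trimmed mean
    refs = np.array(refs)
    target = refs.mean()                         # common scale across experiments
    return arrays * (target / refs)[:, None]

# Two toy experiments that differ only by a scale factor plus one
# saturated spot; after normalisation their profiles coincide.
data = np.array([[1.0, 2.0, 3.0, 100.0],
                 [2.0, 4.0, 6.0, 200.0]])
normed = trimmed_mean_normalise(data, trim_fraction=0.25)
```

Because the trimmed reference ignores the saturated spot, the two experiments are brought onto an identical scale, which a plain mean normalisation would only achieve if the outlier affected both rows proportionally.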
The accurate determination of the biological effects of low doses of pollutants is a major public health challenge. DNA microarrays are a powerful tool for investigating small intracellular changes. However, the inherent low reliability of this technique, the small number of replicates and the lack of suitable statistical methods for the analysis of such a large number of attributes (genes) impair accurate data interpretation. To overcome this problem, we combined results of two independent analysis methods (ANOVA and RELIEF). We applied this analysis protocol to compare gene expression patterns in Saccharomyces cerevisiae growing in the absence and continuous presence of varying low doses of radiation. Global distribution analysis highlights the importance of mitochondrial membrane functions in the response. We demonstrate that microarrays detect cellular changes induced by irradiation at doses that are 1000-fold lower than the minimal dose associated with mutagenic effects.
Microarray experiments generate data sets with information on the expression levels of thousands of genes in a set of biological samples. Unfortunately, such experiments often produce multiple missing expression values, normally due to various experimental problems. As many algorithms for gene expression analysis require a complete data matrix as input, the missing values have to be estimated in order to analyze the available data. Alternatively, genes and arrays can be removed until no missing values remain. However, for genes or arrays with only a small number of missing values, it is desirable to impute those values. For the subsequent analysis to be as informative as possible, it is essential that the estimates for the missing gene expression values are accurate. A small amount of badly estimated missing values in the data might be enough for clustering methods, such as hierarchical clustering or K-means clustering, to produce misleading results. Thus, accurate methods for missing value estimation are needed. We present novel methods for estimation of missing values in microarray data sets that are based on the least squares principle, and that utilize correlations between both genes and arrays. For this set of methods, we use the common reference name LSimpute. We compare the estimation accuracy of our methods with the widely used KNNimpute on three complete data matrices from public data sets by randomly knocking out data (labeling as missing). From these tests...
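The KNNimpute baseline mentioned above can be illustrated with a minimal sketch: for each missing entry, find the most similar genes (over commonly observed arrays) that do have a value in that column, and average them. This is a simplified, assumption-laden rendition of the KNN idea, not the published KNNimpute or LSimpute implementation; the parameter names and toy matrix are my own.

```python
# Minimal KNN-style imputation sketch in the spirit of KNNimpute
# (illustrative only; not the published algorithm).
import numpy as np

def knn_impute(matrix, k=2):
    """Fill missing values (NaN) in a genes-by-arrays matrix.

    For each gene with a gap, the k genes closest in mean squared
    difference over commonly observed arrays (and observed in the
    missing column) contribute their average value there.
    """
    X = np.array(matrix, dtype=float)
    n_genes, _ = X.shape
    for g in range(n_genes):
        for a in np.where(np.isnan(X[g]))[0]:
            dists = []
            for h in range(n_genes):
                if h == g or np.isnan(X[h, a]):
                    continue
                mask = ~np.isnan(X[g]) & ~np.isnan(X[h])
                if mask.sum() == 0:
                    continue
                dists.append((np.mean((X[g, mask] - X[h, mask]) ** 2), h))
            dists.sort()
            neighbours = [h for _, h in dists[:k]]
            if neighbours:
                X[g, a] = np.mean([X[h, a] for h in neighbours])
    return X

# Gene 0 has a gap; genes 1 and 2 are its near-duplicates, gene 4 is
# unrelated, so the imputed value is the average of genes 1 and 2.
raw = [[1.0, 2.0, np.nan],
       [1.1, 2.1, 3.1],
       [0.9, 1.9, 2.9],
       [5.0, 9.0, 7.0]]
filled = knn_impute(raw, k=2)
```

The least-squares methods in the abstract replace this simple neighbour average with a regression on the correlated genes (and arrays), which is what improves accuracy when expression profiles are linearly related rather than identical.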
Little consideration has been given to the effect of different segmentation methods on the variability of data derived from microarray images. Previous work has suggested that the significant source of variability from microarray image analysis is from estimation of local background. In this study, we used Analysis of Variance (ANOVA) models to investigate the effect of methods of segmentation on the precision of measurements obtained from replicate microarray experiments. We used four different methods of spot segmentation (adaptive, fixed circle, histogram and GenePix) to analyse a total number of 156 172 spots from 12 microarray experiments. Using a two-way ANOVA model and the coefficient of repeatability, we show that the method of segmentation significantly affects the precision of the microarray data. The histogram method gave the lowest variability across replicate spots compared to other methods, and had the lowest pixel-to-pixel variability within spots. This effect on precision was independent of background subtraction. We show that these findings have direct, practical implications as the variability in precision between the four methods resulted in different numbers of genes being identified as differentially expressed. Segmentation method is an important source of variability in microarray data that directly affects precision and the identification of differentially expressed genes.
Methods based on DNA reassociation in solution with the subsequent PCR amplification of certain hybrid molecules, such as coincidence cloning and subtractive hybridization, all suffer from a common imperfection: cross-hybridization between various types of paralogous repetitive DNA fragments. Although the situation can be slightly improved by the addition of repeat-specific competitor DNA into the hybridization mixture, the cross-hybridization outcome is a significant number of background chimeric clones in resulting DNA libraries. In order to overcome this challenge, we developed a technique called mispaired DNA rejection (MDR), which utilizes a treatment of resulting reassociated DNA with mismatch-specific nucleases. We examined the MDR efficiency using cross-hybridization of complex, whole genomic mixtures derived from human and chimpanzee genomes, digested with frequent-cutter restriction enzyme. We show here that both single-stranded DNA-specific and mismatched double-stranded DNA-specific nucleases can be used for MDR separately or in combination, reducing the background level from 60 to 4% or lower. The technique presented here is of universal usefulness and can be applied to both cDNA and genomic DNA subtractions of very complex DNA mixtures. MDR is also useful for the genome-wide recovery of highly conserved DNA sequences...
A remarkable feature of the Yeast Knockout strain collection is the presence of two unique 20mer TAG sequences in almost every strain. In principle, the relative abundances of strains in a complex mixture can be profiled swiftly and quantitatively by amplifying these sequences and hybridizing them to microarrays, but TAG microarrays have not been widely used. Here, we introduce a TAG microarray design with sophisticated controls and describe a robust method for hybridizing high concentrations of dye-labeled TAGs in single-stranded form. We also highlight the importance of avoiding PCR contamination and provide procedures for detection and eradication. Validation experiments using these methods yielded false positive (FP) and false negative (FN) rates for individual TAG detection of 3–6% and 15–18%, respectively. Analysis demonstrated that cross-hybridization was the chief source of FPs, while TAG amplification defects were the main cause of FNs. The materials, protocols, data and associated software described here comprise a suite of experimental resources that should facilitate the use of TAG microarrays for a wide variety of genetic screens.
Post-translational modifications (PTMs) of histones play a role in modifying chromatin structure for DNA-templated processes in the eukaryotic nucleus, such as transcription, replication, recombination and repair; thus, histone PTMs are considered major players in the epigenetic control of these processes. Linking specific histone PTMs to gene expression is an arduous task requiring large amounts of highly purified and natively modified histones to be analyzed by various techniques. We have developed robust and complementary procedures, which use strong protein denaturing conditions and yield highly purified core and linker histones from unsynchronized proliferating, M-phase arrested and butyrate-treated cells, fully preserving their native PTMs without using enzyme inhibitors. Cell hypotonic swelling and lysis, nuclei isolation/washing and chromatin solubilization under mild conditions are bypassed to avoid compromising the integrity of histone native PTMs. As controls for our procedures, we tested the most widely used conventional methodologies and demonstrated that they indeed lead to drastic histone dephosphorylation. Additionally, we have developed methods for preserving acid-labile histone modifications by performing non-acid extractions to obtain highly purified H3 and H4. Importantly...
Because the properties of horizontally-transferred genes will reflect the mutational proclivities of their donor genomes, they often show atypical compositional properties relative to native genes. Parametric methods use these discrepancies to identify bacterial genes recently acquired by horizontal transfer. However, compositional patterns of native genes vary stochastically, leaving no clear boundary between typical and atypical genes. As a result, while strongly atypical genes are readily identified as alien, genes of ambiguous character are poorly classified when a single threshold separates typical and atypical genes. This limitation affects all parametric methods that examine genes independently, and escaping it requires the use of additional genomic information. We propose that the performance of all parametric methods can be improved by using a multiple-threshold approach. First, strongly atypical alien genes and strongly typical native genes would be identified using conservative thresholds. Genes with ambiguous compositional features would then be classified by examining gene context, including the class (native or alien) of flanking genes. By including additional genomic information in a multiple-threshold framework, we observed a remarkable improvement in the performance of several popular...
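The multiple-threshold scheme above can be sketched as a two-pass procedure: conservative thresholds first label the strongly typical and strongly atypical genes, and the ambiguous middle band is then resolved by the class of flanking genes. The scores, threshold values and tie-breaking rule below are invented for illustration; this is not the authors' implementation.

```python
# Sketch of multiple-threshold alien-gene classification
# (scores and thresholds are hypothetical, for illustration only).

def classify_genes(scores, low=0.3, high=0.7):
    """Classify genes by a compositional atypicality score in [0, 1].

    Pass 1: strongly typical (< low) -> 'native'; strongly atypical
    (> high) -> 'alien'; everything else is left ambiguous.
    Pass 2: ambiguous genes take the majority class of their
    classified flanking genes (defaulting to 'native' on a tie).
    """
    labels = []
    for s in scores:
        if s < low:
            labels.append('native')
        elif s > high:
            labels.append('alien')
        else:
            labels.append(None)              # ambiguous: decide by context
    for i, lab in enumerate(labels):
        if lab is None:
            flank = [labels[j] for j in (i - 1, i + 1)
                     if 0 <= j < len(labels) and labels[j] is not None]
            labels[i] = ('alien'
                         if flank.count('alien') > flank.count('native')
                         else 'native')
    return labels

# An ambiguous gene (0.6) flanked by two confident aliens is pulled
# into the alien class; one flanked by mixed neighbours stays native.
calls = classify_genes([0.1, 0.5, 0.9, 0.6, 0.8])
```

A single-threshold classifier would have to force the 0.5 and 0.6 genes to one side purely on composition; the context pass is what recovers the genes that composition alone cannot settle.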
The ability to verify the sequence of a nucleic acid-based therapeutic is an essential step in the drug development process. The challenge associated with sequence identification increases with the length and nuclease resistance of the nucleic acid molecule, the latter being an important attribute of therapeutic oligonucleotides. We describe methods for the sequence determination of Spiegelmers, which are enantiomers of naturally occurring RNA with high resistance to enzymatic degradation. Spiegelmer sequencing is effected by affixing a label or hapten to the 5′-end of the oligonucleotide and chemically degrading the molecule in a controlled fashion to generate fragments that are then resolved and identified using liquid chromatography-mass spectrometry. The Spiegelmer sequence is then derived from these fragments. Examples are shown for two different Spiegelmers (NOX-E36 and NOX-A12), and the specificity of the method is shown using a NOX-E36 mismatch control.
Proteins are covalently trapped on DNA to form DNA–protein crosslinks (DPCs) when cells are exposed to DNA-damaging agents. DPCs interfere with many aspects of DNA transactions. The current DPC detection methods indirectly measure crosslinked proteins (CLPs) through DNA tethered to proteins. However, a major drawback of such methods is the non-linear relationship between the amounts of DNA and CLPs, which makes quantitative data interpretation difficult. Here we developed novel methods of DPC detection based on direct CLP measurement, whereby CLPs in DNA isolated from cells are labeled with fluorescein isothiocyanate (FITC) and quantified by fluorometry or western blotting using anti-FITC antibodies. Both formats successfully monitored the induction and elimination of DPCs in cultured cells exposed to aldehydes and mouse tumors exposed to ionizing radiation (carbon-ion beams). The fluorometric and western blotting formats require 30 and 0.3 μg of DNA, respectively. Analyses of the isolated genomic DPCs revealed that both aldehydes and ionizing radiation produce two types of DPC with distinct stabilities. The stable components of aldehyde-induced DPCs have half-lives of up to days. Interestingly, that of radiation-induced DPCs has an infinite half-life...
Previously, we published a method for creating a novel DNA substrate, the double Holliday junction substrate. This substrate contains two Holliday junctions that are mobile, topologically constrained and separated by a distance comparable with conversion tract lengths. Although useful for studying late stage homologous recombination in vitro, construction of the substrate requires significant effort. In particular, there are three bottlenecks: (i) production of large quantities of single-stranded DNA; (ii) the loss of a significant portion of the DNA following the recombination step; and (iii) the loss of DNA owing to inefficient gel extraction. To address these limitations, we have made the following changes to the protocol: (i) use of a helper plasmid, rather than exogenous helper phage, to produce single-stranded DNA; (ii) use of the unidirectional ϕC31 integrase system in place of the bidirectional Cre recombinase reaction; and (iii) gel extraction by DNA diffusion. Here, we describe the changes made to the materials and methods and characterize the substrates that can be produced, including migratable single Holliday junctions, hemicatenanes and a quadruple Holliday junction substrate.
Rapid accumulation of large and standardized microarray data collections is opening up novel opportunities for holistic characterization of genome function. The limited scalability of current preprocessing techniques has, however, formed a bottleneck for full utilization of these data resources. Although short oligonucleotide arrays constitute a major source of genome-wide profiling data, scalable probe-level techniques have been available only for few platforms based on pre-calculated probe effects from restricted reference training sets. To overcome these key limitations, we introduce a fully scalable online-learning algorithm for probe-level analysis and pre-processing of large microarray atlases involving tens of thousands of arrays. In contrast to the alternatives, our algorithm scales up linearly with respect to sample size and is applicable to all short oligonucleotide platforms. The model can use the most comprehensive data collections available to date to pinpoint individual probes affected by noise and biases, providing tools to guide array design and quality control. This is the only available algorithm that can learn probe-level parameters based on sequential hyperparameter updates at small consecutive batches of data, thus circumventing the extensive memory requirements of the standard approaches and opening up novel opportunities to take full advantage of contemporary microarray collections.
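The memory argument above rests on sequential updates over small batches: running statistics are folded in batch by batch, so the full atlas never needs to be held in memory. The sketch below illustrates that generic idea with a Welford-style running mean and variance; it is not the authors' probe-level model, and the batch sizes and data are my own.

```python
# Generic sketch of sequential (online) updates over small batches,
# the memory-saving idea behind scalable preprocessing
# (not the authors' actual probe-level model).
import numpy as np

def update_stats(count, mean, m2, batch):
    """Fold one batch of intensities into running statistics."""
    for x in np.asarray(batch, dtype=float):
        count += 1
        delta = x - mean
        mean += delta / count           # Welford's online mean update
        m2 += delta * (x - mean)        # running sum of squared deviations
    variance = m2 / count if count else 0.0
    return count, mean, m2, variance

# Process 10 000 values in batches of 100 without ever holding them all:
# only three scalars of state survive between batches.
n, mu, m2 = 0, 0.0, 0.0
rng = np.random.default_rng(0)
for _ in range(100):
    n, mu, m2, var = update_stats(n, mu, m2, rng.normal(5.0, 2.0, size=100))
```

The state between batches is constant-sized regardless of how many arrays have been seen, which is why this style of update scales linearly in sample size where batch-free approaches hit memory limits.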
One of the primary aims of synthetic biology is to (re)design metabolic pathways towards the production of desired chemicals. The fast pace of developments in molecular biology increasingly makes it possible to experimentally redesign existing pathways and implement de novo ones in microbes or using in vitro platforms. For such experimental studies, the bottleneck is shifting from implementation of pathways towards their initial design. Here, we present an online tool called ‘Metabolic Tinker’, which aims to guide the design of synthetic metabolic pathways between any two desired compounds. Given two user-defined ‘target’ and ‘source’ compounds, Metabolic Tinker searches for thermodynamically feasible paths in the entire known metabolic universe using a tailored heuristic search strategy. Compared with similar graph-based search tools, Metabolic Tinker returns a larger number of possible paths owing to its broad search base and fast heuristic, and provides for the first time thermodynamic feasibility information for the discovered paths. Metabolic Tinker is available as a web service at http://osslab.ex.ac.uk/tinker.aspx. The same website also provides the source code for Metabolic Tinker, allowing it to be developed further or run on personal machines for specific applications.
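The path-search idea behind a tool like Metabolic Tinker can be illustrated with a toy graph search that only traverses reaction steps whose free-energy change is feasible. The graph, the ΔG values and the feasibility cutoff below are entirely invented; this is a sketch of the general concept, not Metabolic Tinker's actual algorithm or data.

```python
# Toy illustration of searching for thermodynamically feasible paths
# between two compounds (graph and delta-G values are invented).
from collections import deque

def feasible_paths(graph, source, target, max_dg=0.0):
    """Enumerate simple paths in which every step has delta_g <= max_dg.

    `graph` maps a compound to a list of (product, delta_g) edges.
    """
    paths, queue = [], deque([[source]])
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            paths.append(path)
            continue
        for nxt, dg in graph.get(path[-1], []):
            if dg <= max_dg and nxt not in path:   # feasible step, no revisits
                queue.append(path + [nxt])
    return paths

# A -> C -> D has a strongly favourable second step, but the A -> C
# step itself is infeasible, so only A -> B -> D is returned.
toy = {
    'A': [('B', -5.0), ('C', +12.0)],
    'B': [('D', -1.0)],
    'C': [('D', -20.0)],
}
found = feasible_paths(toy, 'A', 'D')
```

This is also why per-step feasibility matters: summing ΔG over a whole route would wrongly accept A → C → D, whose overall change is favourable even though one step is not.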
Although engineered nucleases can efficiently cleave intracellular DNA at desired target sites, major concerns remain on potential ‘off-target’ cleavage that may occur throughout the genome. We developed an online tool: predicted report of genome-wide nuclease off-target sites (PROGNOS) that effectively identifies off-target sites. The initial bioinformatics algorithms in PROGNOS were validated by predicting 44 of 65 previously confirmed off-target sites, and by uncovering a new off-target site for the extensively studied zinc finger nucleases (ZFNs) targeting C-C chemokine receptor type 5. Using PROGNOS, we rapidly interrogated 128 potential off-target sites for newly designed transcription activator-like effector nucleases containing either Asn-Asn (NN) or Asn-Lys (NK) repeat variable di-residues (RVDs) and 3- and 4-finger ZFNs, and validated 13 bona fide off-target sites for these nucleases by DNA sequencing. The PROGNOS algorithms were further refined by incorporating additional features of nuclease–DNA interactions and the newly confirmed off-target sites into the training set, which increased the percentage of bona fide off-target sites found within the top PROGNOS rankings. By identifying potential off-target sites in silico...
non-peer-reviewed; New technologies are creating opportunities for online assessment not previously available to K-12 level teachers. However, most research into this particular aspect of education has focused on university level assessment. This case study placed online assessment into the context of an Irish fourth class primary classroom. To achieve this, focus was put on a comparison between immediate and delayed feedback for online tests. This particular comparison was selected in an attempt to better understand how and when feedback should be provided for frequent online assessments.
A review of the literature and practical research was carried out. The literature review looked at frequent testing as a formative assessment method. Specifically, it focused on multiple-choice vocabulary tests conducted online. A key element of this focus was the timing of feedback for these tests. Two timing methods were examined: immediate, answer-until-correct feedback and delayed feedback.
The literature review findings helped inform the research methodology. The research aspect of this study used online multiple choice vocabulary questions as the platform on which to compare the different timing methods. Online surveys of teachers...
Pattern recognition methods have become a powerful tool for segmentation in the sense that they are capable of automatically building a segmentation model from training images. However, they present several difficulties, such as the requirement of a large set of training data, robustness to imaging conditions not present in the training set, and the complexity of the search process. In this paper we tackle the second problem by using a deep belief network learning architecture, and the third problem by resorting to efficient searching algorithms. As an example, we illustrate the performance of the algorithm in lip segmentation and tracking in video sequences. Quantitative comparisons using different strategies for the search process are presented. We also compare our approach to a state-of-the-art segmentation and tracking algorithm. The comparison shows that our algorithm produces competitive segmentation results and that efficient search strategies reduce the run-time complexity tenfold.; Jacinto C. Nascimento and Gustavo Carneiro
Josh de Leeuw is a doctoral student in the Cognitive Science program and the Department of Psychological and Brain Sciences at Indiana University. His research interests include the role of cognitive constraints in learning, interactions between knowledge and perception, and the methodology of online behavioral experiments. He is the creator of jsPsych, a popular tool for conducting online experiments. His most recent project is FactorsDB, a community-driven collection of open-source online experiments for use in the classroom. He received his BA in Cognitive Science at Vassar College in Poughkeepsie, NY.; Behavioral scientists have been using the internet to conduct research for over two decades, but only recently has the scope of internet research begun to rival the traditional laboratory experiment. In this workshop, I will introduce you to the basics of online data collection and various tools for conducting online research, including jsPsych (http://www.jspsych.org), a programming library for conducting laboratory-like experiments online developed at Indiana University. I'll describe all the necessary components of running an online experiment, the features of jsPsych, and how to create a simple experiment using the jsPsych library.