While a popular strategy in de novo transcriptome assembly algorithms is to assemble the reads by obtaining a de Bruijn graph that represents the transcriptome, an additional step is needed to obtain predicted transcripts from the de Bruijn graph. A similarity search algorithm is then applied to a related organism to obtain information about possible function of these predicted transcripts. We observe...
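As a point of reference for the de Bruijn graph step mentioned above, the following is a minimal sketch of how such a graph can be built from reads. It is not the assembler described in the abstract: the function name `de_bruijn_graph`, the toy reads, and the choice of k = 3 are all assumptions made for illustration, and real assemblers also handle sequencing errors, reverse complements, and coverage.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers.

    Hypothetical helper for illustration only. Returns a dict mapping each
    (k-1)-mer prefix to the sorted list of (k-1)-mer suffixes that follow it.
    """
    edges = defaultdict(set)
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in edges.items()}


# Two overlapping toy reads (assumed data, not from the paper).
graph = de_bruijn_graph(["ACGTC", "CGTCA"], k=3)
for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

Walking the single path AC, CG, GT, TC, CA in this toy graph spells out ACGTCA, which is the kind of path-to-transcript step the abstract refers to as the additional step after graph construction.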
The advent of single-cell RNA sequencing (scRNA-seq) has given researchers the ability to study transcriptomic activity within individual cells, rather than across hundreds or thousands of cells as with bulk RNA-seq techniques. The greater precision afforded by scRNA-seq identifies mutations and gene expression landscapes private to individual cells or subpopulations, enabling us to determine novel...
Introduction: There exist a number of methods that attempt to reconstruct a genome from a set of scaffolds. To do so, they (i) determine the order of scaffolds; and (ii) determine the orientation (i.e., strand of origin) of scaffolds. Some methods attempt to solve these subproblems jointly by using various types of additional data including jumping libraries, long error-prone reads, homology relationships...
Pebble game rigidity analysis is an efficient method for extracting rigidity and flexibility information of biomolecules without performing costly molecular dynamics simulations. The standard algorithm works on a multi-graph associated to a mechanical model constructed from an arbitrary atom-bond network. Motivated by large scale protein flexibility and simulated unfolding applications, we have developed...
Finding candidate genes that could cause specific diseases has been the subject of many studies. This is an important research task; however, in the biological experimentation domain it can be very expensive and time-consuming. An alternative is to obtain gene expression values from partial measurements and try to predict the rest. By using computational methods, we can statistically estimate...
We develop a cache-efficient RNA folding algorithm, ByBox, that is based on Zuker's method. Using a simple LRU cache model, we show that the traditional implementation, Zuker, of Zuker's method has a much higher number of cache misses than ByBox. Extensive experiments conducted on the Xeon E5 server show that cache efficiency translates into time and energy efficiency. Our benchmarking shows that,...
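To illustrate what counting misses under a simple LRU cache model can look like, here is a small sketch. It does not reproduce the access patterns of Zuker or ByBox, which the abstract does not spell out. Instead it compares row-wise and column-wise traversal of a row-major table, and the table size, line size, and cache capacity are assumed toy values.

```python
from collections import OrderedDict


def lru_misses(accesses, capacity):
    """Count misses for a sequence of cache-line ids under LRU replacement."""
    cache = OrderedDict()
    misses = 0
    for line_id in accesses:
        if line_id in cache:
            cache.move_to_end(line_id)      # mark as most recently used
        else:
            misses += 1
            cache[line_id] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict least recently used line
    return misses


# Assumed toy parameters: an n x n table stored row-major,
# `line` elements per cache line, and room for `capacity` lines.
n, line, capacity = 8, 4, 2
row_order = [(i * n + j) // line for i in range(n) for j in range(n)]
col_order = [(i * n + j) // line for j in range(n) for i in range(n)]

print("row-wise misses:", lru_misses(row_order, capacity))
print("column-wise misses:", lru_misses(col_order, capacity))
```

With these toy parameters the row-wise sweep misses once per cache line (16 misses), while the column-wise sweep misses on every access (64 misses), which is the kind of gap a cache-aware reordering of a dynamic programming table aims to close.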
With the development of next-generation sequencing technologies, a large number of transcripts have accumulated in public databases. Long non-coding RNAs (lncRNAs), typically above 200 nucleotides in sequence length, have recently attracted increasing interest because of their important roles in various cellular processes. While it is straightforward to distinguish lncRNAs from most small non-coding...
High-throughput methylation detection approaches are epigenetic exploration strategies that map sites of DNA methylation to the genome and thus provide insight into the regulatory program of specific cells. Using methyl-binding domain affinity proteins, MBD2-based Methyl-Seq has been established to identify short regions in which a minimum of 5 CpG residues are methylated within a 200 bp window requiring...
Warfarin is a popular pharmaceutical anticoagulant that targets the vitamin K epoxide reductase complex subunit 1, encoded by the gene VKORC1, but warfarin can be dangerous because it can cause bleeding. A first step in finding better anticoagulants that target the same enzyme is structure-based lead molecule optimization using computational tools. This is possible because the tertiary structure of...
We have developed linear space algorithms to compute the Damerau-Levenshtein (DL) distance [1], [2] between two strings and also to find a sequence of edit operations of length equal to the DL distance (optimal trace). Our algorithms require O(s min{m, n} + m + n) space, where s is the size of the alphabet and m and n are, respectively, the lengths of the two strings. Previously known algorithms require...
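For readers unfamiliar with the DL distance, the sketch below computes it with the classical dynamic program, which uses a full (m + 2) x (n + 2) table. It is explicitly not the linear-space algorithm announced in the abstract; it only shows the quantity being computed. The function name `damerau_levenshtein` and the test strings are assumptions for illustration.

```python
def damerau_levenshtein(a, b):
    """Unrestricted Damerau-Levenshtein distance via the classical
    quadratic-space dynamic program (not the linear-space method)."""
    m, n = len(a), len(b)
    max_dist = m + n
    # d is shifted by one row and column; row 0 and column 0 hold a sentinel.
    d = [[max_dist] * (n + 2) for _ in range(m + 2)]
    for i in range(m + 1):
        d[i + 1][1] = i
    for j in range(n + 1):
        d[1][j + 1] = j

    last_row = {}  # last row of a in which each character was seen
    for i in range(1, m + 1):
        last_col = 0  # last column of b that matched a[i-1] in this row
        for j in range(1, n + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            cost = 1
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            d[i + 1][j + 1] = min(
                d[i][j] + cost,                            # match or substitution
                d[i + 1][j] + 1,                           # insertion
                d[i][j + 1] + 1,                           # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),   # transposition
            )
        last_row[a[i - 1]] = i
    return d[m + 1][n + 1]


print(damerau_levenshtein("CA", "ABC"))
print(damerau_levenshtein("kitten", "sitting"))
```

The pair "CA" and "ABC" is a useful check: the unrestricted DL distance is 2 (transpose to "AC", then insert "B"), whereas the restricted optimal-string-alignment variant would give 3.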
A crucial task for metagenomic analysis is to annotate the function and taxonomy of the sequencing reads generated from a microbiome sample. In general, the reads can either be assembled into contigs and searched against reference databases, or individually searched without assembly. The first approach may suffer due to the fragmentary and incomplete nature of nucleotide sequence assembly, while the...
Applying genomics in clinical practice has the potential to facilitate personalized medicine. Physicians play a key role in this process since they are the ones who decide whether or not to use genomic information in their clinical practice. This study aims to determine the current status of physicians in using genomics in their clinical practices and to identify the desired features of a patient...
Predicting drug-target interaction through simulation is an immensely important problem with a large impact on drug discovery in the pharmaceutical industry. The FDA reports that it takes close to five billion dollars to introduce a new drug to the market. A slight improvement in prediction accuracy in this domain may save millions of dollars in investment, thereby lowering the cost of production...
A Bayesian optimization technique enables a short search time for a complex prediction model that includes many hyperparameters while maintaining the accuracy of the prediction model. Here, we apply a Bayesian optimization technique to the drug-target interaction (DTI) prediction problem as a method for computational drug discovery. We target neighborhood regularized logistic matrix factorization...
Many biological analysis techniques require measurement of similarity between sequences from large genomic datasets, which often involves extraction of all pairs of close DNA or RNA sequences. We present a k-mer-based tool to efficiently perform such sequence similarity queries for large viral datasets produced by next-generation sequencing.
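One common way to run such k-mer-based similarity queries is to compare the sets of k-mers of each sequence. The sketch below uses Jaccard similarity over k-mer sets and a brute-force scan of all pairs. The abstract does not describe the tool's actual method, so the functions `kmer_set`, `jaccard`, and `close_pairs`, the threshold, and the toy sequences are all assumptions for illustration. A tool built for large datasets would need indexing or sketching rather than an all-pairs loop.

```python
from itertools import combinations


def kmer_set(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def close_pairs(seqs, k, threshold):
    """List name pairs whose k-mer Jaccard similarity is at least threshold.

    Brute force over all pairs; hypothetical helper, for small inputs only.
    """
    profiles = {name: kmer_set(seq, k) for name, seq in seqs.items()}
    return [
        (x, y)
        for x, y in combinations(sorted(profiles), 2)
        if jaccard(profiles[x], profiles[y]) >= threshold
    ]


# Toy sequences (assumed data).
seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTGGGG"}
print(close_pairs(seqs, k=4, threshold=0.5))
```

Here s1 and s2 share four of their five distinct 4-mers and are reported as a close pair, while s3 shares none with either.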
Closing gaps in draft genomes is an important post-processing step in genome assembly. At present, most assembled genomes contain gaps. Usually, genomes assembled from short reads or from hybrid data (both short and long reads) have many more gaps than genomes assembled purely from long reads (with high coverage). A more complete genome is highly desirable since it leads to better annotation, less genotyping...
Pigs not only supply a large portion of worldwide meat consumption but also serve as a model organism in biomedical studies, owing to their physiological and genetic similarity to humans. However, as a diploid organism, a normal pig carries two versions of its genetic code simultaneously, creating an obstacle for many studies in the related field. For the first time in history, we...
Calibrating stochastic biochemical models against experimental insights remains a critical challenge in biological design automation. Stochastic biochemical models incorporate the uncertainty inherent in the system being modeled, thus demanding meticulous calibration techniques. We present an approach for calibrating stochastic biochemical models such that the calibrated model satisfies a given behavioral...
RNA-seq is a mature and well-established method for studying the complexity of the transcriptome in the research setting. As this method moves from the research realm to the clinical context, new opportunities for the development of bioinformatics methods arise. During this talk I will present some of the challenges we have found during our work to release a clinical test for tumor samples using RNA-seq...