Abellange9785

The implementation of the integer linear programming reformulation using current mathematical modeling and integer linear programming software tools provided biologically meaningful alignments of virus-host protein-protein interaction networks.

The identification of early mild cognitive impairment (EMCI), which is an early stage of Alzheimer's disease (AD) and is associated with brain structural and functional changes, is still a challenging task. Recent studies show great promises for improving the performance of EMCI identification by combining multiple structural and functional features, such as grey matter volume and shortest path length. However, extracting which features and how to combine multiple features to improve the performance of EMCI identification have always been a challenging problem. Sunitinib chemical structure To address this problem, in this study we propose a new EMCI identification framework using multi-modal data and graph convolutional networks (GCNs). Firstly, we extract grey matter volume and shortest path length of each brain region based on automated anatomical labeling (AAL) atlas as feature representation from T1w MRI and rs-fMRI data of each subject, respectively. Then, in order to obtain features that are more helpful in identifying EMCI, a coand promising for automatic diagnosis of EMCI in clinical practice.

Integrative network methods are commonly used for interpretation of high-throughput experimental biological data transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual changes in the data and represents a so-called active module. Usually methods implementing this approach find a single subnetwork and thus solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, while no instrument is provided to estimate the confidence level of any particular vertex inclusion. To address this issue, in the current study we consider the active module problem as a soft classification problem.

We propose a method to estimate probabilities of each vertex to belong to the active module based on Markov chain Monte Carlo (MCMC) subnetwork sampling. As an example of the performance of our method on real data, we run it on two gene expression datertices. Given the estimated probabilities, it becomes possible to provide a connected subgraph in a consistent manner for any given FDR level no vertex can disappear when the FDR level is relaxed. We show, on both simulated and real datasets, that the proposed method has good computational performance and high classification accuracy.

DNA methylation in the human genome is acknowledged to be widely associated with biological processes and complex diseases. The Illumina Infinium methylation arrays have been approved as one of the most efficient and universal technologies to investigate the whole genome changes of methylation patterns. As methylation arrays may still be the dominant method for detecting methylation in the anticipated future, it is crucial to develop a reliable workflow to analysis methylation array data.

In this study, we develop a web service MADA for the whole process of methylation arrays data analysis, which includes the steps of a comprehensive differential methylation analysis pipeline pre-processing (data loading, quality control, data filtering, and normalization), batch effect correction, differential methylation analysis, and downstream analysis. In addition, we provide the visualization of pre-processing, differentially methylated probes or regions, gene ontology, pathway and cluster analysis results. Moreover, a customization function for users to define their own workflow is also provided in MADA.

With the analysis of two case studies, we have shown that MADA can complete the whole procedure of methylation array data analysis. MADA provides a graphical user interface and enables users with no computational skills and limited bioinformatics background to carry on complicated methylation array data analysis. The web server is available at http//120.24.94.898080/MADA.

With the analysis of two case studies, we have shown that MADA can complete the whole procedure of methylation array data analysis. MADA provides a graphical user interface and enables users with no computational skills and limited bioinformatics background to carry on complicated methylation array data analysis. The web server is available at http//120.24.94.898080/MADA.

Biological data has grown explosively with the advance of next-generation sequencing. However, annotating protein function with wet lab experiments is time-consuming. Fortunately, computational function prediction can help wet labs formulate biological hypotheses and prioritize experiments. Gene Ontology (GO) is a framework for unifying the representation of protein function in a hierarchical tree composed of GO terms.

We propose GODoc, a general protein GO prediction framework based on sequence information which combines feature engineering, feature reduction, and a novel k-nearest-neighbor algorithm to resolve the multiple GO prediction problem. Comprehensive evaluation on CAFA2 shows that GODoc performs better than two baseline models. In the CAFA3 competition (68 teams), GODoc ranks 10th in Cellular Component Ontology. Regarding the species-specific task, the proposed method ranks 10th and 8th in the eukaryotic Cellular Component Ontology and the prokaryotic Molecular Function Ontology, respectively. In the term-centric task, GODoc performs third and is tied for first for the biofilm formation of Pseudomonas aeruginosa and the long-term memory of Drosophila melanogaster, respectively.

We have developed a novel and effective strategy to incorporate a training procedure into the k-nearest neighbor algorithm (instance-based learning) which is capable of solving the Gene Ontology multiple-label prediction problem, which is especially notable given the thousands of Gene Ontology terms.

We have developed a novel and effective strategy to incorporate a training procedure into the k-nearest neighbor algorithm (instance-based learning) which is capable of solving the Gene Ontology multiple-label prediction problem, which is especially notable given the thousands of Gene Ontology terms.

Autoři článku: Abellange9785 (Abrahamsen Harder)

Práce s článkem

Osobní nástroje

Navigace

Nástroje

Abellange9785