Onealbjerring2344

From Iurium Wiki

Revision as of 7 October 2024, 21:06 by Onealbjerring2344 (talk | contribs) (Created page with "Analysis of probability distributions conditional on species trees has demonstrated the existence of anomalous ranked gene trees (ARGTs), ranked gene trees…")

Analysis of probability distributions conditional on species trees has demonstrated the existence of anomalous ranked gene trees (ARGTs), ranked gene trees that are more probable than the ranked gene tree that accords with the ranked species tree. Here, to improve the characterization of ARGTs, we study enumerative and probabilistic properties of two classes of ranked labeled species trees, focusing on the presence or avoidance of certain subtree patterns associated with the production of ARGTs. We provide exact enumerations and asymptotic estimates for cardinalities of these sets of trees, showing that as the number of species increases without bound, the fraction of all ranked labeled species trees that are ARGT-producing approaches 1. This result extends beyond earlier existence results to provide a probabilistic claim about the frequency of ARGTs.

Proteins fold into complex three-dimensional shapes. Simplified representations of their shapes are central to rationalising, comparing, classifying, and interpreting protein structures. Traditional methods of abstracting protein folding patterns represent the standard secondary structural elements (helices and strands of sheet) as line segments, which discards a significant proportion of the structural information. The motivation of this research is to derive mathematically rigorous and biologically meaningful abstractions of protein folding patterns that maximize the economy of structural description and minimize the loss of structural information. We report a novel method that describes a protein as a non-overlapping set of parametric three-dimensional curves of varying length and complexity. Our approach is supported by information theory and uses the statistical framework of minimum message length (MML) inference.
We demonstrate the effectiveness of our non-linear abstraction in supporting efficient and effective comparison of protein folding patterns on a large scale.

Tikhonov regularized nonnegative matrix factorization (TNMF) is an NMF formulation whose objective function enforces smoothness on the computed solutions; it has been successfully applied in many problem domains, including text mining, spectral data analysis, and cancer clustering. One issue, however, is still insufficiently addressed in the development of TNMF algorithms: how to develop mechanisms that can learn the regularization parameters directly from the data sets. The common approach is to use fixed values based on a priori knowledge about the problem domain. However, it is known from the study of linear inverse problems that the quality of the solutions of Tikhonov regularized least squares problems depends heavily on the choice of appropriate regularization parameters. Since least squares problems are the building blocks of NMF, a similar situation can be expected to apply to NMF. In this paper, we propose two formulas to automatically learn the regularization parameters from the data set based on the L-curve approach. We also develop a convergent algorithm for the TNMF based on additive update rules. Finally, we demonstrate the use of the proposed algorithm in cancer clustering tasks.

In previous years, many studies on the synthesis, as well as on the anti-tumor, anti-inflammatory and anti-bacterial activities, of pyrazole derivatives have been described. Certain pyrazole derivatives exhibit important pharmacological activities and have proved to be useful templates in drug research. Considering the importance of the pyrazole template, in the current work a series of novel inhibitors was designed by replacing the central ring of acridine with a pyrazole ring. These heterocyclic compounds were proposed as a new potential base for telomerase inhibitors.
The resulting dibenzopyrrole structure was used as a novel scaffold, and the inhibitors were extended with different functional groups. Docking of the newly designed compounds in the telomerase active site (telomerase catalytic subunit TERT) was carried out. All dibenzopyrrole derivatives were evaluated with three docking programs: CDOCKER, Ligandfit docking (Scoring Functions) and AutoDock. Compounds C_9g, C_9k and C_9l performed best among all designed inhibitors, both in docking with all methods and in interaction analysis. The introduction of pyrazole and the extension of dibenzopyrrole confirm that such compounds may act as potential telomerase inhibitors.

Gene translation is the process in which intracellular macromolecules called ribosomes decode genetic information in the mRNA chain into the corresponding proteins. Gene translation includes several steps. During the elongation step, ribosomes move along the mRNA in a sequential manner and link amino acids together in the corresponding order to produce the proteins. The homogeneous ribosome flow model (HRFM) is a deterministic computational model for translation-elongation under the assumption of constant elongation rates along the mRNA chain. The HRFM is described by a set of n first-order nonlinear ordinary differential equations, where n represents the number of sites along the mRNA chain. The HRFM also includes two positive parameters: the ribosomal initiation rate and the (constant) elongation rate. In this paper, we show that the steady-state translation rate in the HRFM is a concave function of its parameters. This means that the problem of determining the parameter values that maximize the translation rate is relatively simple. Our results may contribute to a better understanding of the mechanisms and evolution of translation-elongation. We demonstrate this by using the theoretical results to estimate the initiation rate in M. musculus embryonic stem cells.
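The HRFM abstract does not reproduce its equations, so the following is only a minimal sketch under stated assumptions. It assumes the usual ribosome flow model form, in which site i has a normalized occupancy x_i in [0, 1], ribosomes enter at rate lam * (1 - x_1), hop from site i to site i+1 at rate lam_c * x_i * (1 - x_{i+1}), and leave at rate lam_c * x_n, which is taken as the translation rate. The names `hrfm_rhs`, `steady_state_rate`, `lam` and `lam_c`, and the choice of forward Euler integration, are illustrative and not taken from the paper.

```python
def hrfm_rhs(x, lam, lam_c):
    """Right-hand side of the assumed HRFM equations for occupancies x."""
    n = len(x)
    # flows[0] is the initiation flow into site 1,
    # flows[i] is the flow from site i to site i+1,
    # flows[n] is the exit flow (the translation rate).
    flows = [lam * (1.0 - x[0])]
    for i in range(n - 1):
        flows.append(lam_c * x[i] * (1.0 - x[i + 1]))
    flows.append(lam_c * x[-1])
    # Each site changes by (flow in) minus (flow out).
    return [flows[i] - flows[i + 1] for i in range(n)]


def steady_state_rate(lam, lam_c, n=10, dt=0.01, steps=20000):
    """Integrate from an empty mRNA with forward Euler and return lam_c * x_n."""
    x = [0.0] * n
    for _ in range(steps):
        dx = hrfm_rhs(x, lam, lam_c)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return lam_c * x[-1]


if __name__ == "__main__":
    # Scan the initiation rate at a fixed elongation rate.
    for lam in (0.1, 0.3, 0.5, 1.0):
        print(lam, steady_state_rate(lam, 1.0))
```

For a single site (n = 1) the assumed model reduces to dx/dt = lam * (1 - x) - lam_c * x, whose steady-state rate is lam * lam_c / (lam + lam_c), which gives a quick sanity check on the integration.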
The underlying assumption is that evolution optimized the translation mechanism. For the infinite-dimensional HRFM, we derive a closed-form solution to the problem of determining the initiation and transition rates that maximize the protein translation rate. We show that these expressions provide good approximations of the optimal values in the n-dimensional HRFM already for relatively small values of n. These results may have applications in synthetic biology, where an important problem is to re-engineer genomic systems in order to maximize the protein production rate.

Identifying the genes responsible for various types of cancer is an important problem. In this context, important genes are the marker genes that change their expression level in correlation with the risk or progression of a disease, or with the susceptibility of the disease to a given treatment. Gene expression profiling by microarray technology has been successfully applied to the classification and diagnostic prediction of cancers. However, extracting these marker genes from the huge set of genes contained in a microarray data set is a major problem. Most existing methods for identifying marker genes find a set of genes that may be redundant. Motivated by this, a multiobjective optimization method has been proposed that can find a small set of non-redundant disease-related genes providing high sensitivity and specificity simultaneously. In this article, the optimization problem is modeled as a multiobjective one based on the framework of variable-length particle swarm optimization. Using some real-life data sets, the performance of the proposed algorithm is compared with that of other state-of-the-art techniques.

Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases.
To this end, one usually starts from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar, while information about the biological function of annotated database enzymes is ignored. In this work, we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.

Gene selection based on microarray data is highly important for classifying tumors accurately. Existing gene selection schemes are mainly based on ranking statistics. From the manifold learning standpoint, local geometrical structure is more essential for characterizing features than global information. In this study, we propose a supervised gene selection method called locality sensitive Laplacian score (LSLS), which incorporates discriminative information into local geometrical structure by minimizing local within-class information and maximizing local between-class information simultaneously. In addition, variance information is considered in our algorithm framework.
Finally, to find better gene subsets, which is significant for biomarker discovery, a two-stage feature selection method that combines the LSLS with a wrapper method (sequential forward selection or sequential backward selection) is presented. Experimental results on six publicly available gene expression profile data sets demonstrate the effectiveness of the proposed approach compared with a number of state-of-the-art gene selection methods.

Gene expression deviates from its normal composition when a patient has cancer. This variation can be used as an effective tool to detect cancer. In this study, we propose a novel gene-expression-based colon classification scheme (GECC) that exploits the variations in gene expression to classify colon gene samples into normal and malignant classes. The novelty of GECC is twofold. First, to cope with the overwhelmingly large size of gene-based data sets, various feature extraction strategies, such as chi-square, F-score, principal component analysis (PCA) and minimum redundancy and maximum relevancy (mRMR), are employed to select discriminative genes from a set of genes. Second, a majority-voting-based ensemble of support vector machines (SVMs) is proposed to classify the given gene-based samples. Previously, individual SVM models have been used for colon classification; however, their performance is limited. In this research study, we propose a new SVM-ensemble-based approach for gene-based classification of colon samples, wherein the individual SVM models are constructed by learning different SVM kernels: linear, polynomial, radial basis function (RBF), and sigmoid. The predicted results of the individual models are combined through majority voting. In this way, the combined decision space becomes more discriminative.
The proposed technique has been tested on four colon data sets and several other binary-class gene expression data sets, and improved performance has been achieved compared to previously reported gene-based colon cancer detection techniques. The computational time required for training and testing on the 208 × 5,851 data set was 591.01 s and 0.019 s, respectively.
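The majority-voting step of the GECC abstract can be illustrated with a small sketch. It only shows how per-model labels might be combined; training the four SVM kernels is out of scope here, and the predictions below are made-up placeholders rather than results from the paper. The function names and the tie-breaking rule (falling back to a caller-supplied label when an even number of models splits evenly) are assumptions, since the abstract does not say how ties are handled.

```python
from collections import Counter


def majority_vote(votes, tie_label):
    """Return the most frequent label in votes, or tie_label on a tie."""
    counts = Counter(votes)
    top = max(counts.values())
    winners = [label for label, c in counts.items() if c == top]
    return winners[0] if len(winners) == 1 else tie_label


def combine_predictions(per_model_predictions, tie_label):
    """Combine label lists (one list per model) sample by sample."""
    return [
        majority_vote(sample_votes, tie_label)
        for sample_votes in zip(*per_model_predictions)
    ]


if __name__ == "__main__":
    # Hypothetical outputs of four models (e.g. linear, polynomial,
    # RBF and sigmoid kernels) on four samples.
    predictions = [
        ["normal", "malignant", "malignant", "malignant"],
        ["normal", "malignant", "normal", "normal"],
        ["malignant", "malignant", "normal", "normal"],
        ["normal", "normal", "normal", "malignant"],
    ]
    print(combine_predictions(predictions, tie_label="malignant"))
```

With four voters a 2-2 split is possible, which is why the sketch makes the tie rule explicit instead of leaving it to dictionary ordering.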

Article authors: Onealbjerring2344 (Lancaster Mcmahon)