In this work, we present a thorough review of these important recent methodological advances in high-dimensional mediation analysis. Specifically, we describe in detail more than ten high-dimensional mediation methods, focusing on their motivations, basic modeling ideas, specific modeling assumptions, practical successes, methodological limitations, and future directions. We hope our review will serve as a useful guide for statisticians and computational biologists who develop methods for high-dimensional mediation analysis, as well as for analysts who apply mediation methods to high-throughput genomics studies.

Although remarkable advances have been reported in high-throughput sequencing, the ability to properly analyze the large volume of rapidly generated biological (DNA/RNA/protein) sequencing data remains a critical hurdle. To tackle this issue, the application of natural language processing (NLP) to biological sequence analysis has received increased attention. In this approach, biological sequences are regarded as sentences, while the single nucleic acids/amino acids or k-mers in these sequences represent the words. Embedding, an essential step in NLP, converts these words into vectors. Specifically, representation learning is an approach used for this transformation, and it can be applied to biological sequences. Vectorized biological sequences can then be used for function and structure estimation, or as input for other probabilistic models. Considering the importance of, and the growing trend toward, applying representation learning to biological research, in the present study we review the existing knowledge on representation learning for biological sequence analysis.

Quantum chemical calculations are today an extremely valuable tool for studying enzymatic reaction mechanisms.
In this mini-review, we summarize our recent work on several metal-dependent decarboxylases, where we used the so-called cluster approach to decipher the details of the reaction mechanisms, including elucidation of the identity of the metal cofactors and the origins of substrate specificity. Decarboxylases have growing potential for biocatalytic applications, as they can be used in the synthesis of novel compounds of, e.g., pharmaceutical interest. They can also be employed in the reverse direction, providing a strategy to synthesize value-added chemicals by CO2 fixation. A number of non-redox metal-dependent decarboxylases from the amidohydrolase superfamily have been demonstrated to have promiscuous carboxylation activities and have attracted great attention in recent years. The computational mechanistic studies provide insights that are important for the further modification and utilization of these enzymes in industrial processes. The enzymes discussed are 5-carboxyvanillate decarboxylase, γ-resorcylate decarboxylase, 2,3-dihydroxybenzoic acid decarboxylase, and iso-orotate decarboxylase.

Mass cytometry is a powerful tool for deep immune monitoring studies. To ensure maximal data quality, a careful experimental and analytical design is required. However, even in well-controlled experiments, variability caused by either the operator or the instrument can introduce artifacts that need to be corrected or removed from the data. Here we present a data processing pipeline that minimizes experimental artifacts and batch effects while improving data quality. Data preprocessing and quality control are carried out using an R pipeline and packages such as CATALYST for bead normalization and debarcoding, flowAI and flowCut for signal anomaly cleaning, AOF for file quality control, flowClean and flowDensity for gating, CytoNorm for batch normalization, and FlowSOM and UMAP for data exploration.
As proper experimental design is key to obtaining good-quality events, we also include the sample processing protocol used to generate the data. Both the analysis and the experimental pipelines are easy to scale up; thus, the workflow presented here is particularly suitable for large-scale, multicenter, multibatch, and retrospective studies.

Hi-C and capture Hi-C have greatly advanced our understanding of the principles of higher-order chromatin structure. In line with the evolution of the Hi-C protocols, there is a demand for an advanced computational method that can be applied to the various forms of Hi-C protocols and effectively remove innate biases. To resolve this issue, we developed an implicit normalization method named "covNorm" and implemented it as an R package. The proposed method can perform a complete data processing procedure for Hi-C and its variants. Starting from negative binomial model-based normalization of DNA fragment coverages, removal of the genomic distance-dependent background and calling of significant interactions can be applied sequentially. The performance evaluation of covNorm showed enhanced or similar reproducibility in terms of HiC-spector score, correlation of compartment A/B profiles, and detection of reproducible significant long-range chromatin contacts compared to baseline methods on the benchmark datasets. The developed method is powerful in terms of effective normalization of Hi-C and capture Hi-C data, detection of long-range chromatin contacts, and ready extensibility to other derivative Hi-C protocols. The covNorm R package is freely available on GitHub at https://github.com/kaistcbfg/covNormRpkg.

In plants, the AAA-adenosine triphosphatase (ATPase) Cell Division Control Protein 48 (CDC48) uses the force generated through ATP hydrolysis to pull, extract, and unfold ubiquitylated or sumoylated proteins from the membrane, chromatin, or protein complexes.
The resulting changes in protein or RNA content are an important means for plants to control protein homeostasis and thereby adapt to shifting environmental conditions. The activity and targeting of CDC48 are controlled by adaptor proteins, of which the plant ubiquitin regulatory X (UBX) domain-containing (PUX) proteins constitute the largest family. Emerging knowledge on the structure and function of PUX proteins highlights that these proteins are versatile factors for plant homeostasis and adaptation that might inspire biotechnological applications.
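
As an illustration of the sequence-as-sentence idea in the representation-learning abstract above, the following is a minimal Python sketch of how a DNA sequence might be split into overlapping k-mer "words" and turned into a simple vector. The function names (`kmer_tokens`, `kmer_count_vector`), the choice of k = 3, and the stride are assumptions made for this example and are not taken from the reviewed work. The vector here is a plain k-mer count vector, used as a stand-in for a learned embedding; real representation-learning methods would train the vectors rather than count.

```python
from collections import Counter
from itertools import product


def kmer_tokens(sequence, k=3, stride=1):
    """Split a sequence into k-mers, the "words" of the sequence "sentence".

    With stride=1 the k-mers overlap; a sequence shorter than k yields [].
    """
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def kmer_count_vector(sequence, k=3, alphabet="ACGT"):
    """Return (vocabulary, counts) for all possible k-mers over the alphabet.

    This is a bag-of-k-mers count vector, not a learned embedding; it only
    shows the word-to-vector step in its simplest form.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(kmer_tokens(sequence, k))
    return vocab, [counts.get(word, 0) for word in vocab]


if __name__ == "__main__":
    tokens = kmer_tokens("ATGCGA", k=3)
    print(tokens)  # ['ATG', 'TGC', 'GCG', 'CGA']

    vocab, vector = kmer_count_vector("ATGCGA", k=3)
    print(len(vocab), sum(vector))  # 64 4
```

A count vector like this grows as 4^k for DNA and ignores word order, which is one reason learned, lower-dimensional embeddings are attractive for longer k-mers and for protein alphabets.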