Gadegaardholden6452


The Veterans Affairs Precision Oncology Data Repository (VA-PODR) is a large, nationwide repository of de-identified data on patients diagnosed with cancer at the Department of Veterans Affairs (VA). Data include longitudinal clinical data from the VA's nationwide electronic health record system and the VA Central Cancer Registry, targeted tumor sequencing data, and medical imaging data including computed tomography (CT) scans and pathology slides. A subset of the repository is available at the Genomic Data Commons (GDC) and The Cancer Imaging Archive (TCIA), and the full repository is available through the Veterans Precision Oncology Data Commons (VPODC). By releasing this de-identified dataset, we aim to advance Veterans' health care by enabling translational research on the Veteran population by a wide variety of researchers.

Discovering causal mechanisms underlying firearm acquisition can provide critical insight into firearm-related violence in the United States. Here, we established an information-theoretic framework to address the long-disputed dichotomy between self-protection and fear of firearm regulations as potential drivers of firearm acquisition in the aftermath of a mass shooting. We collected data on mass shootings, federal background checks, media output on firearm control and shootings, and firearm safety laws from 1999 to 2017. First, we conducted a cluster analysis to partition States according to the restrictiveness of their firearm-related legal environment. Then, we performed a transfer entropy analysis to unveil causal relationships at the State level in the Wiener-Granger sense. The analysis suggests that fear of stricter firearm regulations is a stronger driver of firearm acquisition than the desire for self-protection. This fear is likely to cross State borders, thereby shaping a collective pattern of firearm acquisition throughout the Nation. (A note after the following abstract gives the standard definition of transfer entropy.)

Pairwise sequence alignment is often a computational bottleneck in genomic analysis pipelines, particularly in the context of third-generation sequencing technologies. To speed up this process, the pairwise k-mer Jaccard similarity is sometimes used as a proxy for alignment size in order to filter pairs of reads, and min-hashes are employed to efficiently estimate these similarities. However, when the k-mer distribution of a dataset is significantly non-uniform (e.g., due to GC biases and repeats), Jaccard similarity is no longer a good proxy for alignment size. In this work, we introduce a min-hash-based approach for estimating alignment sizes called Spectral Jaccard Similarity, which naturally accounts for uneven k-mer distributions. The Spectral Jaccard Similarity is computed by performing a singular value decomposition on a min-hash collision matrix. We empirically show that this new metric provides significantly better estimates of alignment sizes, and we provide a computationally efficient estimator for these spectral similarity scores.
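To make the min-hash machinery concrete, the following is a minimal sketch of the classical estimator that Spectral Jaccard Similarity refines: the fraction of colliding min-hashes between two reads approximates their k-mer Jaccard similarity. The sequences, k-mer length, and number of seeds are illustrative choices, and the SVD-based correction described in the abstract is not reproduced here.

import random

def kmers(seq, k=16):
    # All k-length substrings of a read, as a set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_signature(kmer_set, seeds):
    # One min-hash per seed: the smallest hashed k-mer under that seed.
    return [min(hash((seed, kmer)) for kmer in kmer_set) for seed in seeds]

def jaccard_estimate(sig_a, sig_b):
    # Collision fraction; a good Jaccard estimator only when the k-mer
    # distribution is roughly uniform, which is the gap the paper targets.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(300))
read_a, read_b = genome[:200], genome[100:]  # two overlapping reads
seeds = [random.getrandbits(32) for _ in range(256)]
sig_a = minhash_signature(kmers(read_a), seeds)
sig_b = minhash_signature(kmers(read_b), seeds)
print(jaccard_estimate(sig_a, sig_b))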
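A note on the firearm-acquisition abstract above: the causal measure it uses in the Wiener-Granger sense is transfer entropy. In its standard first-order form (the abstract does not state the lag structure actually used), the transfer entropy from a process X to a process Y is

T_{X \to Y} = \sum_{y_{t+1},\, y_t,\, x_t} p(y_{t+1}, y_t, x_t) \log \frac{p(y_{t+1} \mid y_t, x_t)}{p(y_{t+1} \mid y_t)},

which is nonzero precisely when the past of X improves the prediction of Y beyond what Y's own past provides.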
Gene expression and protein abundance data of cells or tissues belonging to healthy and diseased individuals can be integrated and mapped onto genome-scale metabolic networks to produce patient-derived models. As the number of available and newly developed genome-scale metabolic models increases, new methods are needed to objectively analyze large sets of models and to identify the determinants of metabolic heterogeneity. We developed a distance-based workflow that combines consensus machine learning and metabolic modeling techniques and used it to apply pattern-recognition algorithms to collections of genome-scale metabolic models, both microbial and human. Model composition, network topology, and flux distribution capture complementary aspects of metabolic heterogeneity in patient-specific genome-scale models of skeletal muscle. Using consensus clustering analysis, we identified the metabolic processes involved in individual responses to endurance training in older adults.

The genetic effect explains the causality from genetic mutations to the development of complex diseases. Existing genome-wide association study (GWAS) approaches are generally built on a linear assumption, which restricts their ability to dissect complicated causality such as recessive genetic effects. A sophisticated and general GWAS model that can work with different types of genetic effects is therefore highly desirable. Here, we introduce a deep association kernel learning (DAK) model that enables automatic causal genotype encoding for GWAS at the pathway level. DAK can detect both common and rare variants with complicated genetic effects where existing approaches fail. When applied to four real-world GWAS datasets, including cancers and schizophrenia, DAK discovered potential causal pathways, including an association between the dilated cardiomyopathy pathway and schizophrenia.

This piece identifies and compares three examples of successful data sharing that sought to improve housing and health outcomes, ultimately improving the lives of vulnerable groups. Data strategists should first consider proving out the benefit in consultation with diverse stakeholders, mitigating legal risks from the beginning, and starting with a minimal data prototype.

Repeating patterns in architecture appear in elements at a variety of scales, from façades to perforated ceilings, wall reliefs, carpeting, and tile stonework. The Truchet tiling concept is one means of developing a modular, non-repeating pattern. This paper explores some of the basic concepts of Truchet tilings, variations that have been developed, and current examples of applying these methods with digital generation and fabrication techniques.
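As a toy illustration of the Truchet idea (not taken from the paper): the classic two-state tile is a square split by a diagonal, and choosing an orientation at random in each grid cell turns a two-tile vocabulary into a non-repeating field. A plain-text rendering with "/" and "\" characters:

import random

def truchet(rows, cols, seed=1):
    # Each cell independently takes one of two diagonal orientations;
    # the per-cell randomness makes the modular pattern non-repeating.
    random.seed(seed)
    return "\n".join(
        "".join(random.choice("/\\") for _ in range(cols)) for _ in range(rows)
    )

print(truchet(8, 32))

The quarter-circle variant popularized by Cyril Stanley Smith works the same way, with arcs replacing the diagonals; digital generation workflows can swap in richer tile geometries without changing the underlying scheme.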
Deep learning, a set of approaches using artificial neural networks, has generated rapid recent advances in machine learning. Deep learning does, however, have the potential to reduce the reproducibility of scientific results. Model outputs depend critically on the data and processing approach used to generate the model, but this provenance information is usually lost during model training. To avoid a future reproducibility crisis, we need to improve our deep-learning model management. The FAIR principles for data stewardship and software/workflow implementation give excellent high-level guidance on ensuring effective reuse of data and software. We suggest some specific guidelines for the generation and use of deep-learning models in science and explain how these relate to the FAIR principles. We then present dtoolAI, a Python package that we have developed to implement these guidelines. The package implements automatic capture of provenance information during model training and simplifies model distribution. (A generic sketch of such provenance capture appears below.)

Expectations are high that machine learning (ML) will discover new patterns in high-throughput biological data, but most practice relies on existing knowledge to design experiments. Investigations of the power and limitations of ML in revealing complex patterns from data without the guidance of existing knowledge have been lacking. In this study, we conducted systematic experiments on such ab initio knowledge discovery with ML methods on single-cell RNA-sequencing data of early embryonic development. The results showed that a strategy combining unsupervised and supervised ML can reveal major cell lineages with minimal involvement of prior knowledge or manual intervention, and the ab initio mining enabled a new discovery about human early embryonic cell differentiation. The study illustrates the feasibility, significance, and limitations of ab initio ML knowledge discovery for complex biological problems.
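A minimal sketch of the unsupervised-then-supervised strategy described above, using scikit-learn on synthetic data. Everything here (the data shape, the three-lineage structure, the specific estimators) is a hypothetical stand-in, not the authors' pipeline:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical expression matrix: 300 cells x 2000 genes, three crude "lineages".
cells = np.vstack([rng.normal(loc, 1.0, size=(100, 2000)) for loc in (0.0, 0.5, 1.0)])

# Unsupervised step: reduce dimensionality, then cluster cells ab initio.
embedding = PCA(n_components=20, random_state=0).fit_transform(cells)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)

# Supervised step: train a classifier on the discovered labels and check how
# consistently they can be re-predicted from held-out cells.
scores = cross_val_score(RandomForestClassifier(random_state=0), cells, labels, cv=5)
print(scores.mean())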
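Returning to the deep-learning provenance abstract above: a generic way to persist training provenance alongside model weights is sketched below. This is not the dtoolAI API (which is documented with the package); it only illustrates the principle of recording data origin and parameters at training time:

import datetime
import hashlib
import json

def save_provenance(path, dataset_uri, hyperparams, code_version):
    # Record where the data came from and how the model was trained,
    # so the model file never travels without its history.
    record = {
        "dataset_uri": dataset_uri,
        # Fingerprint of the dataset identifier; a real system would
        # hash the data itself.
        "dataset_fingerprint": hashlib.sha256(dataset_uri.encode()).hexdigest(),
        "hyperparameters": hyperparams,
        "code_version": code_version,
        "trained_at": datetime.datetime.utcnow().isoformat(),
    }
    with open(path, "w") as fh:
        json.dump(record, fh, indent=2)

# Hypothetical values for illustration only.
save_provenance("model_provenance.json",
                "https://example.org/data/training-set",
                {"learning_rate": 1e-3, "epochs": 20},
                "v0.1.0")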
High-throughput drug screens in cancer cell lines test compounds at low concentrations, which enables the identification of drug-sensitivity biomarkers, while resistance biomarkers remain underexplored. Dissecting meaningful drug responses at high concentrations is challenging due to cytotoxicity, i.e., off-target effects, thus limiting resistance-biomarker discovery to frequently mutated cancer genes. To address this, we interrogate subpopulations carrying sensitivity biomarkers and subsequently investigate unexpectedly resistant (UNRES) cell lines for unique genetic alterations that may drive resistance. By analyzing the GDSC and CTRP datasets, we find 53 and 35 UNRES cases, respectively. For 24 and 28 of them, we highlight putative resistance biomarkers. We find clinically relevant cases, such as the EGFR T790M mutation in NCI-H1975 cells or PTEN loss in NCI-H1650 cells, in lung adenocarcinoma treated with EGFR inhibitors. Interrogating the underpinnings of drug resistance with publicly available CRISPR phenotypic assays assists in prioritizing resistance drivers, offering hypotheses for drug combinations.

The development and growing adoption of the FAIR data principles and associated standards as part of research policies and practices place novel demands on research data services. This article highlights common challenges and priorities and proposes a set of recommendations on how data infrastructures can evolve and collaborate to provide services that support implementation of the FAIR data principles, in particular in the context of building the European Open Science Cloud (EOSC). The recommendations cover a broad range of topics, including certification, infrastructure components, stewardship, costs, rewards, collaboration, training, support, and data management. These recommendations were prioritized according to their perceived urgency by different stakeholder groups and associated with actions as well as suggested action owners. This article is the output of three workshops organized by the projects FAIRsFAIR, RDA Europe, OpenAIRE, EOSC-hub, and FREYA, designed to explore, discuss, and formulate recommendations among stakeholders in the scientific community. While the results are a work in progress, the challenges and priorities outlined provide a detailed and unique overview of current issues that the community sees as crucial, and they can sharpen and improve the roadmap toward a FAIR data ecosystem.

Biological systems are composed of highly complex networks, and decoding the functional significance of individual network components is critical for understanding healthy and diseased states. Several algorithms have been designed to identify the most influential regulatory points within a network. However, current methods do not address all the topological dimensions of a network or correct for inherent positional biases, which limits their applicability. To overcome this computational deficit, we undertook a statistical assessment of 200 real-world and simulated networks to decipher associations between centrality measures, and we developed an algorithm termed Integrated Value of Influence (IVI), which integrates the most important and commonly used network centrality measures in an unbiased way. When compared against 12 other contemporary influential-node identification methods on ten different networks, the IVI algorithm outperformed all of them. Using this versatile method, network researchers can now identify the most influential network nodes. (A toy version of this kind of centrality integration is sketched at the end of this page.)

Most data science is about people, and opinions on the value of human data differ. The author offers a synthesis of overly optimistic and overly pessimistic views of human data science: it should become a science, with errors systematically studied and their effects mitigated, a goal that can only be achieved by bringing together expertise from a range of disciplines.
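As flagged above, here is a toy version of integrating several node-centrality measures into one influence score, using networkx. The choice of measures and the simple normalize-and-sum rule are illustrative; the published IVI algorithm integrates a different set of measures with its own formula:

import networkx as nx

def combined_influence(graph):
    # Three standard centralities capturing different topological dimensions.
    measures = [
        nx.degree_centrality(graph),
        nx.betweenness_centrality(graph),
        nx.closeness_centrality(graph),
    ]

    def rescale(d):
        # Put every measure on a 0-1 scale so none dominates by magnitude.
        top = max(d.values()) or 1.0
        return {node: value / top for node, value in d.items()}

    measures = [rescale(m) for m in measures]
    return {node: sum(m[node] for m in measures) for node in graph}

graph = nx.karate_club_graph()
scores = combined_influence(graph)
print(sorted(scores, key=scores.get, reverse=True)[:5])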

Article authors: Gadegaardholden6452 (Montoya Bishop)