Articles | ✨⌈灵犀·随想⌋

The xaringan package: a killer tool for making slides with rmarkdown

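Some of the packages RStudio has released recently are astonishing; RStudio is no longer just an IDE. It has become my everyday tool for writing code, drawing plots, writing documents, making slides, and blogging. ...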

Metabolomic Databases

Comprehensive Metabolomic Databases
HMDB: The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found (and experimentally verified) in the human body. The database contains three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data. HMDB contains information on more than 6500 metabolites. Additionally, approximately 1500 protein (and DNA) sequences are linked to these metabolite entries. Each MetaboCard entry contains more than 100 data fields, with 2/3 of the information devoted to chemical/clinical data and the other 1/3 devoted to enzymatic or biochemical data. Many data fields are hyperlinked to other databases (KEGG, PubChem, MetaCyc, ChEBI, PDB, Swiss-Prot, and GenBank) and to a variety of structure and pathway viewing applets.
BiGG: The BiGG database is a metabolic reconstruction of human metabolism designed for systems biology simulation and metabolic flux balance modeling. It is a comprehensive literature-based genome-scale metabolic reconstruction that accounts for the functions of 1,496 ORFs, 2,004 proteins, 2,766 metabolites, and 3,311 metabolic and transport reactions. It was assembled from build 35 of the human genome.
SetupX: SetupX, developed by the Fiehn laboratory at UC Davis, is a web-based metabolomics LIMS. It is XML compatible and built around a relational database management core. It is particularly oriented towards the capture and display of GC-MS metabolomic data through its metabolic annotation database, BinBase. The database and source code are available as a free download.
BinBase: BinBase is a GC-TOF metabolomic database. The data can be queried online, and the source code is available as a free download.
SYSTOMONAS: SYSTOMONAS (SYSTems biology of pseudOMONAS) is a database for systems biology studies of Pseudomonas species.
It contains extensive transcriptomic, proteomic and metabolomic data as well as metabolic reconstructions of this pathogen. Reconstruction of metabolic networks in SYSTOMONAS was achieved via comparative genomics. Broad data integration with the well-established databases BRENDA, KEGG and PRODORIC is also maintained. Several tools for analyzing the stored data and visualizing the corresponding results are provided, enabling a quick understanding of metabolic pathways, genomic arrangements or promoter structures of interest.
MetaboLights: MetaboLights is a database for metabolomics experiments and derived information. The database is cross-species and cross-technique, and covers metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and experimental data from metabolic experiments. MetaboLights offers user-submission tools and has strong reporting capabilities. The project will use and further develop de facto standard formats in which various components are encapsulated, such as the encoded spectral and chromatographic data, associated information about the chemical structure, and metadata describing assays and the study as a whole.
Metabolic Pathway Databases
KEGG: KEGG (Kyoto Encyclopedia of Genes and Genomes) is one of the most complete and widely used databases containing metabolic pathways (372 reference pathways) from a wide variety of organisms (>700). These pathways are hyperlinked to metabolite and protein/enzyme information. Currently KEGG has >15,000 compounds (from animals, plants and bacteria), 7742 drugs (including different salt forms and drug carriers) and nearly 11,000 glycan structures.
MetaCyc: MetaCyc is a database of nonredundant, experimentally elucidated metabolic pathways. MetaCyc contains more than 1,100 pathways from more than 1,500 different organisms.
MetaCyc is curated from the scientific experimental literature and contains pathways involved in both primary and secondary metabolism, as well as associated compounds, enzymes, and genes.
HumanCyc: HumanCyc is a bioinformatics database that describes the human metabolic pathways and the human genome. The current version of HumanCyc was constructed using Build 31 of the human genome. The resulting pathway/genome database (PGDB) includes information on 28,783 genes, their products and the metabolic reactions and pathways they catalyze.
BioCyc: BioCyc is a collection of 371 Pathway/Genome Databases. Each database in the BioCyc collection describes the genome and metabolic pathways of a single organism. The databases within the BioCyc collection are organized into tiers according to the amount of manual review and updating they have received. Tier 1 DBs have been created through intensive manual efforts and include EcoCyc, MetaCyc and the BioCyc Open Compounds Database (BOCD). BOCD includes metabolites, enzyme activators, inhibitors, and cofactors derived from hundreds of organisms. Tier 2 and Tier 3 databases contain computationally predicted metabolic pathways, predictions of which genes code for missing enzymes in metabolic pathways, and predicted operons.
Reactome: Reactome is a curated, peer-reviewed knowledgebase of biological pathways, including metabolic pathways as well as protein trafficking and signaling pathways. Reactome includes several types of reactions in its pathway diagram collection, including experimentally confirmed, manually inferred and electronically inferred reactions. Reactome has pathway data on more than 20 different organisms, but the primary organism of interest is Homo sapiens. Reactome has data and pathway diagrams for >2700 proteins, 2800 reactions and 860 pathways for humans.
WikiPathways: WikiPathways is an open, collaborative platform for capturing and disseminating models of biological pathways for data visualization and analysis. The database has pathways for more than 20 species and more than 100 pathways for seven species. The human collection contains more than 800 pathways, covering more than 7500 genes. WikiPathways also contains pathways with more than 1000 metabolites. More info in this paper: http://nar.oxfordjournals.org/content/44/D1/D488.full
Compound or Compound-Specific Databases
PubChem: PubChem is a freely available database of chemical structures of small organic molecules and information on their biological activities. It contains structure, nomenclature and calculated physico-chemical data and is linked with NIH PubMed/Entrez information. PubChem is organized as three linked databases within the NCBI's Entrez information retrieval system: PubChem Substance, PubChem Compound, and PubChem BioAssay. PubChem also provides a fast chemical structure similarity search tool. PubChem has >19 million unique chemical structures.
ChEBI: Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on 'small' chemical compounds. The chemical entities in ChEBI are either products of nature (metabolites) or synthetic products used to intervene in the processes of living organisms (drugs or toxins). ChEBI contains structure and nomenclature information along with hyperlinks to many well-regarded databases. ChEBI uses a carefully developed ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are precisely specified. ChEBI has >15,500 chemical entities in its database.
ChemSpider: ChemSpider is an aggregated database of organic molecules containing more than 20 million compounds from many different providers.
At present the database contains information from such diverse sources as a marine natural products database, ACD-Labs chemical databases, the EPA's DSSTox databases and a series of chemical vendors. It has extensive search utilities, and most compounds have a large number of calculated physico-chemical property values.
KEGG Glycan: The KEGG GLYCAN database is a collection of experimentally determined glycan structures. It contains all unique structures taken from CarbBank, structures entered from recent publications, and structures present in KEGG pathways. KEGG Glycan has >11,000 glycan structures from a large number of eukaryotic and prokaryotic sources.
IIMDB: The In Vivo/In Silico Metabolites Database (IIMDB) consists of both known and computationally generated compounds. The database, which is available at http://metabolomics.pharm.uconn.edu/iimdb/ , includes ∼23 000 known compounds (mammalian metabolites, drugs, secondary plant metabolites, and glycerophospholipids) collected from existing biochemical databases plus more than 400 000 computationally generated human phase-I and phase-II metabolites of these known compounds. The IIMDB database features a user-friendly web interface and a programmer-friendly RESTful web service.
Drug Databases
DrugBank: The DrugBank database is a blended bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains nearly 4800 drug entries including >1,350 FDA-approved small molecule drugs, 123 FDA-approved biotech (protein/peptide) drugs, 71 nutraceuticals and >3,243 experimental drugs. DrugBank also contains extensive SNP-drug data that is useful for pharmacogenomics studies.
Therapeutic Target Database: The Therapeutic Target Database (TTD) is a drug database designed to provide information about the known therapeutic protein and nucleic acid targets described in the literature, the targeted disease conditions, the pathway information and the corresponding drugs/ligands directed at each of these targets. The database currently contains 1535 targets and 2107 drugs/ligands.
PharmGKB: The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and clinical information about people who have participated in pharmacogenomics research studies. The data includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains. Its aim is to aid researchers in understanding how genetic variation among individuals contributes to differences in reactions to drugs. PharmGKB contains searchable data on genes (>20,000), diseases (>3000), drugs (>2500) and pathways (53). It also has detailed information on 470 genetic variants (SNP data) affecting drug metabolism.
STITCH: STITCH ('search tool for interactions of chemicals') is a searchable database that integrates information about interactions from metabolic pathways, crystal structures, binding experiments and drug-target relationships. Text mining and chemical structure similarity are used to predict relations between chemicals. Each proposed interaction can be traced back to the original data sources. The database contains interaction information for over 68 000 different chemicals, including 2200 drugs, and connects them to 1.5 million genes across 373 genomes.
SuperTarget: SuperTarget is a database that contains a core dataset of about 7300 drug-target relations, of which 4900 interactions have been subjected to a more extensive manual annotation effort. SuperTarget provides tools for 2D drug screening and sequence comparison of the targets.
The database contains more than 2500 target proteins, which are annotated with about 7300 relations to 1500 drugs; the vast majority of entries have pointers to the respective literature source. A subset of 775 more extensively annotated drugs is provided separately through the Matador database (Manually Annotated Targets And Drugs Online Resource).
Spectral Databases
HMDB: The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It contains experimental MS/MS data for 800 compounds, experimental 1H and 13C NMR data (and assignments) for 790 compounds and GC/MS spectral and retention index data for 260 compounds. Additionally, predicted 1H and 13C NMR spectra have been generated for 3100 compounds. All spectral databases are downloadable and searchable.
BMRB: The BioMagResBank (BMRB) is the central repository for experimental NMR spectral data, primarily for macromolecules. The BMRB also contains a recently established subsection for metabolite data. The current metabolomics database contains structures, structure viewing applets, nomenclature data, extensive 1D and 2D spectral peak lists (from 1D, TOCSY, DEPT, HSQC experiments), raw spectra and FIDs for nearly 500 molecules. The data is both searchable and downloadable.
MMCD: The Madison Metabolomics Consortium Database (MMCD) is a database on small molecules of biological interest gathered from electronic databases and the scientific literature. It contains approximately 10,000 metabolite entries and experimental spectral data on about 500 compounds.
Each metabolite entry in the MMCD is supported by information in an average of 50 separate data fields, which provide the chemical formula, names and synonyms, structure, physical and chemical properties, NMR and MS data on pure compounds under defined conditions where available, NMR chemical shifts determined by empirical and/or theoretical approaches, information on the presence of the metabolite in different biological species, and extensive links to images, references, and other public databases.
MassBank: MassBank is a mass spectral database of experimentally acquired high resolution MS spectra of metabolites. Maintained and supported by the JST-BIRD project, it offers various query methods for standard spectra obtained from Keio University, RIKEN PSC, and other Japanese research institutions. It is officially sanctioned by the Mass Spectrometry Society of Japan. The database has very detailed MS data and excellent spectral/structure searching utilities. More than 13,000 spectra from 1900 different compounds are available.
Golm Metabolome Database: The Golm Metabolome Database provides public access to custom GC/MS libraries, which are stored as Mass Spectral (MS) and Retention Time Index (RI) Libraries (MSRI). These libraries of mass spectral and retention time indices can be used with the NIST/AMDIS software to identify metabolites according to their spectral tags and RIs. The libraries are both searchable and downloadable and have been carefully collected under defined conditions on several types of GC/MS instruments (quadrupole and TOF).
Metlin: The METLIN Metabolite Database is a repository for mass spectral metabolite data. All metabolites are neutral or free acids. It is a collaborative effort between the Siuzdak and Abagyan groups and the Center for Mass Spectrometry at The Scripps Research Institute. METLIN is searchable by compound name, mass, formula or structure. It contains 15,000 structures, including more than 8000 di- and tripeptides.
METLIN contains MS/MS, LC/MS and FTMS data that can be searched by peak lists, mass range, biological source or disease.
Fiehn GC-MS Database: This library contains data on 713 compounds (name, structure, CAS ID, other links) for which GC/MS data (spectra and retention indices) have been collected by the Fiehn laboratory. A locally maintained program called BinBase/Bellerophon filters input GC/MS spectra and uses the spectral library to identify compounds. The actual GC/MS library is available from several different GC/MS vendors.
BML-NMR: The Birmingham Metabolite Library Nuclear Magnetic Resonance database is a freely available resource containing 3328 NMR spectra of 208 common metabolite standards. This database includes both 2-D 1H J-resolved spectra and 1-D 1H spectra, recorded at 500 MHz using various water suppression methods and acquisition parameters, for solutions at pH values of 6.6, 7.0 and 7.4. The raw and processed data, including associated metadata, are housed in a purpose-built MySQL database that is compliant with the Metabolomics Standards Initiative (MSI) endorsed reporting requirements, with some necessary amendments. Library data can be accessed freely and searched through a custom-written web interface. FIDs, NMR spectra and associated metadata can be downloaded according to a newly implemented MSI-compatible XML schema.
MetaboLights: MetaboLights is a database for metabolomics experiments and derived information. The database is cross-species and cross-technique, and covers metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and experimental data from metabolic experiments. MetaboLights offers user-submission tools and has strong reporting capabilities.
The project will use and further develop de facto standard formats in which various components are encapsulated, such as the encoded spectral and chromatographic data, associated information about the chemical structure, and metadata describing assays and the study as a whole.
mzCloud: mzCloud features a searchable collection of high resolution/accurate mass spectral trees using a new third-generation spectra correlation algorithm. mzCloud is free and available for public use online. mzCloud also represents an open consortium of dedicated research and scientific groups aiming to establish a comprehensive library of high quality spectral trees to improve the structure elucidation of unknowns. mzCloud tries to address the identification bottleneck by considering all mass-spectrometrically relevant aspects, looking at a number of experimental and computational details and, in some cases, allowing identification of unknowns even if they are not present in the library.
Disease & Physiology Databases
OMIM: Online Mendelian Inheritance in Man (OMIM) is a comprehensive compendium of human genes and genetic phenotypes. The full-text, referenced overviews in OMIM contain information on all known Mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype. It is updated daily, and the entries contain many links to other genetics resources. OMIM contains 379 diseases with associated gene sequence data as well as 2385 conditions with a disease phenotype and a known genetic cause.
METAGENE: METAGENE is a knowledgebase for inborn errors of metabolism providing information about the disease, genetic cause, treatment and the characteristic metabolite concentrations or clinical tests that may be used to diagnose or monitor the condition. It has data on 431 genetic diseases.
OMMBID: OMMBID, or the On-Line Metabolic and Molecular Basis to Inherited Disease, is a web-accessible book/encyclopedia describing the genetics, metabolism, diagnosis and treatment of hundreds of metabolic disorders, contributed by hundreds of experts. It also contains extensive reviews, detailed pathways, chemical structures, physiological data and tables that are particularly useful for clinical biochemists. Most university libraries have subscriptions to this resource. OMMBID was originally developed by Charles Scriver at McGill.

EBM Training [Ended]

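Training plan for the "Evidence-Based Medicine" course. Course introduction: Evidence-based medicine is the most important, most active, and most cutting-edge emerging discipline in medicine worldwide today. It means following the best available evidence in clinical practice and in making macro-level health care policy decisions. Practicing evidence-based medicine steadily weeds out ineffective medical interventions currently in use and keeps new ineffective ones from entering medical practice, thereby continually improving the quality and efficiency of health services and making full use of limited medical resources. Course objectives: master the basic theory, methods, and skills of evidence-based medicine; learn to find and apply evidence to solve clinical problems; and, on a voluntary basis, join the extracurricular "second classroom" to learn systematic review methods through practice. The course has 8 chapters; the main contents are as follows: ...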

Getting Started with edgeR

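The edgeR package is mainly used to identify differential expression, or differential marking (ChIP-seq), from read counts produced by different technology platforms (including RNA-seq, SAGE, and ChIP-seq). It relies on either exact statistical methods for multi-group experiments or generalized linear models (GLMs) suited to complex multi-factor experiments. The authors therefore sometimes call the former "classic edgeR" and the latter "GLM edgeR". The read counts here can be at the gene, exon, transcript, or tag level, depending on what the user's own analysis requires. The authors also list some of the literature on identifying differential expression, including the paper published when edgeR was first released, "edgeR: a Bioconductor package for differential expression analysis of digital gene expression data", and several later papers describing improvements. ...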

Reflections on Reading "The Hot Zone" (《血疫：埃博拉的故事》)

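This is a book in which human lives are at stake. Ebola, a dangerous virus from the tropical rainforest, can reach any city on Earth by plane within 24 hours. Air routes connect all the world's cities into a single network. This is not just the story of a virus; it is about human ignorance, greed, courage, and sacrifice, and about our awe in the face of nature. The truth is far more chilling than you imagine. ...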

Parsing JSON Data Files in R

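The data to be analyzed is in JSON format, which both takes up space and is very inconvenient to analyze directly. We therefore need to parse the JSON data into the structures that analysis in R requires, such as data.frame and list. Among R's packages there is already a complete package for parsing JSON, jsonlite, which greatly reduces the analyst's workload. The jsonlite package provides the following functions: ...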
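The post itself works in R with jsonlite. As a language-neutral illustration of the step described above (turning an array of JSON records into a column-oriented table, roughly what a data.frame is), here is a minimal Python sketch using only the standard library. The sample records and the helper name `records_to_columns` are hypothetical and do not come from the post.

```python
import json


def records_to_columns(json_text):
    """Parse a JSON array of objects into a dict of equal-length columns."""
    records = json.loads(json_text)
    # Collect column names in the order they are first seen.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    # A field missing from a record becomes None in that row.
    return {col: [record.get(col) for record in records] for col in columns}


# Hypothetical sample data, for illustration only.
sample = '[{"gene": "TP53", "count": 12}, {"gene": "BRCA1", "count": 7, "note": "low"}]'
table = records_to_columns(sample)
print(table)
```

In R, `jsonlite::fromJSON` performs a comparable simplification, returning a data.frame when the input is an array of objects.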

Fighting Ebola Series (1): Taking Up the Mission

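On May 9, 2015, the China (Hunan) medical team aiding Sierra Leone set out on its national mission to fight Ebola. As a pharmacist, I was fortunate to join the team and travel with it. Stepping down the aircraft stairs after more than twenty hours of flying, and a little tired, I set foot in Freetown, a beautiful city on West Africa's Atlantic coast. Nature has given her enchanting, graceful scenery: as far as the eye can see there are azure skies, tall coconut palms, and a few low tin-roofed and thatched huts scattered among the trees. Her unspoiled beauty makes it impossible to associate her with Ebola. ...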

Fighting Ebola Series (2)

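Figure 1. Exterior of Jui Hospital; the Sierra Leone medical school campus is on the left side of the road and the hospital is on the right.
Figure 2. Interior of Jui Hospital, the largest and most attractive hospital in Sierra Leone, though conditions still fall short of a township health center in China. ...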