ukbtools: An R package to manage and query UK Biobank data.
by: Ken B Hanscombe, Jonathan R I Coleman, Matthew Traylor, Cathryn M Lewis
Format: | Article |
---|---|
Published: | Public Library of Science (PLoS) 2019-01-01 |
Description
<h4>Introduction</h4>The UK Biobank (UKB) is a resource that includes detailed health-related data on about 500,000 individuals and is available to the research community. However, several obstacles limit immediate analysis of the data: data files vary in format, may be very large, and have numerical codes for column names.<h4>Results</h4>ukbtools removes the upfront data wrangling required to get a single dataset for statistical analysis. All associated data files are merged into a single dataset with descriptive column names. The package also provides tools to assist in quality control by exploring the primary demographics of subsets of participants; to query disease diagnoses for one or more individuals and estimate disease frequency relative to a reference variable; and to retrieve genetic metadata.<h4>Conclusion</h4>A dataset with meaningful variable names, a set of UKB-specific exploratory data analysis tools, disease query functions, and a set of helper functions to explore genetic metadata and write it to file will enable UKB users to begin their research quickly.
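
The core idea in the Results paragraph, merging several files into one dataset and replacing numeric field codes with descriptive column names, can be illustrated with a minimal sketch. This is not the ukbtools API (ukbtools is an R package). The Python below is a hedged illustration only, and the field dictionary, the `31-0.0` column pattern, the output naming scheme, and the helper names `descriptive_name` and `merge_and_rename` are all assumptions made for the example.

```python
import re

# Hypothetical field dictionary (assumption): numeric field code -> description.
FIELD_DESCRIPTIONS = {
    "31": "Sex",
    "21022": "Age at recruitment",
}


def descriptive_name(column):
    """Turn a coded column such as '31-0.0' into a readable name.

    The 'field-instance.array' pattern and the output format are
    assumptions chosen for this illustration.
    """
    match = re.fullmatch(r"(\d+)-(\d+)\.(\d+)", column)
    if match is None:
        return column  # leave non-coded columns (e.g. 'eid') unchanged
    field, instance, array = match.groups()
    label = FIELD_DESCRIPTIONS.get(field, "unknown")
    slug = re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")
    return f"{slug}_f{field}_{instance}_{array}"


def merge_and_rename(*tables):
    """Merge rows from several tables on 'eid' and rename coded columns."""
    merged = {}
    for table in tables:
        for row in table:
            target = merged.setdefault(row["eid"], {"eid": row["eid"]})
            for column, value in row.items():
                if column != "eid":
                    target[descriptive_name(column)] = value
    return [merged[eid] for eid in sorted(merged)]


# Two toy "files" with made-up values, keyed by participant id.
demographics = [{"eid": 1, "31-0.0": 0}, {"eid": 2, "31-0.0": 1}]
ages = [{"eid": 1, "21022-0.0": 54}, {"eid": 2, "21022-0.0": 61}]

dataset = merge_and_rename(demographics, ages)
for row in dataset:
    print(row)
```

Keeping the field code inside the new column name preserves the link back to the original coding while making the column readable at a glance.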