
Researchers from all fields, including Policy Management as well as Environment and Information Studies, will apply the techniques from Bioinformatics Data Skills (V. Buffalo, 2015) to tackle problems in diverse areas (e.g., life science, language, music, and design).
This class focuses on the skills bioinformaticians use to explore and extract information from large, complex datasets. These data skills give you freedom: you’ll be able to take data in any format and of any size and begin exploring it to extract meaning.
Throughout the class, I will emphasize working in a robust and reproducible manner. Reproducibility means that other researchers can repeat your work and arrive at the same results. For this to be the case, your work must be well documented, and your methods, code, and data all need to be available so that other researchers have the materials to reproduce everything. If a workflow run on a different machine yields a different outcome, it is neither robust nor fully reproducible. These themes reappear throughout the class.
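As one small illustration of what reproducibility can look like in practice, the sketch below records a checksum of an input file, so that a researcher on another machine can confirm they are starting from byte-identical data. This is only an example under stated assumptions, not a method prescribed by the class: the function name `sha256_of_file` and the chunk size are hypothetical choices, and only Python's standard `hashlib` module is used.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in fixed-size chunks so that large data files
    do not need to fit in memory. `chunk_size` is an arbitrary choice.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Storing the resulting digest alongside your documentation lets anyone rerunning the workflow check their copy of the data before comparing results.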
This class focuses primarily on handling tabular plain-text data formats. Tabular data is terrific for honing your data skills, and it makes great example data to learn with even if your goal is to analyze other types of data in the future. The text-processing skills you develop while working with tabular data carry over to many other data types. Thus, this class will teach you computational tools and data skills that will be helpful in your research.
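To make "tabular plain-text data" concrete, here is a minimal sketch of reading tab-delimited text and summarizing it by group. The data, the column names (`chrom`, `start`, `end`), and the function `total_length_by_group` are all invented for illustration and do not come from the class materials; the parsing itself uses only Python's standard `csv` module.

```python
import csv
import io
from collections import Counter

# Hypothetical tab-delimited data: one header line, then one record per line.
EXAMPLE_TSV = (
    "chrom\tstart\tend\n"
    "chr1\t100\t200\n"
    "chr1\t300\t450\n"
    "chr2\t50\t120\n"
)


def total_length_by_group(text, group_col="chrom", start_col="start", end_col="end"):
    """Sum (end - start) for each distinct value in `group_col`."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    totals = Counter()
    for row in reader:
        # Every field arrives as a string, so convert before doing arithmetic.
        totals[row[group_col]] += int(row[end_col]) - int(row[start_col])
    return dict(totals)


if __name__ == "__main__":
    print(total_length_by_group(EXAMPLE_TSV))
```

The same pattern (read a line, split it into fields, convert types, aggregate) applies to tabular data from any field, which is why it makes good practice material.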