Too often, healthcare workers are led to believe that medical informatics is a complex field that can only be mastered by teams of professional programmers. This is simply not the case. With just a few dozen simple algorithms, easily implemented with open source programming languages, you can fully utilize the medical information contained in clinical and research datasets. The common computational tasks of medical informatics are accessible to anyone willing to learn the basics.
Methods in Medical Informatics: Fundamentals of Healthcare Programming in Perl, Python, and Ruby demonstrates that biomedical professionals with fundamental programming knowledge can master any kind of data collection. Providing you with access to data, nomenclatures, and programming scripts and languages that are all free and publicly available, this book:
- Describes the structure of data sources used, with instructions for downloading
- Includes a clearly written explanation of each algorithm
- Offers equivalent scripts in Perl, Python, and Ruby, for each algorithm
- Shows how to write short, quickly learned scripts, using a minimal selection of commands
- Teaches basic informatics methods for retrieving, organizing, merging, and analyzing data sources
- Provides case studies that detail the kinds of questions that biomedical scientists can ask and answer with public data and an open source programming language
Requiring no more than a working knowledge of Perl, Python, or Ruby, Methods in Medical Informatics will have you writing powerful programs in just a few minutes. Within its chapters, you will find descriptions of the basic methods and implementations needed to complete many of the projects you will encounter in your biomedical career.
PART I FUNDAMENTAL ALGORITHMS AND METHODS OF MEDICAL INFORMATICS
Chapter 1 Parsing and Transforming Text Files
  Peeking into Large Files
  Paging through Large Text Files
  Extracting Lines that Match a Regular Expression
  Changing Every File in a Subdirectory
  Counting the Words in a File
  Making a Word List with Occurrence Tally
  Using Printf Formatting Style
Chapter 2 Utility Scripts
  Random Numbers
  Converting Non-ASCII to Base64 ASCII
  Creating a Universally Unique Identifier
  Splitting Text into Sentences
  One-Way Hash on a Name
  One-Way Hash on a File
  A Prime Number Generator
Chapter 3 Viewing and Modifying Images
  Viewing a JPEG Image
  Converting between Image Formats
  Batch Conversions
  Drawing a Graph from List Data
  Drawing an Image Mashup
Chapter 4 Indexing Text
  ZIPF Distribution of a Text File
  Preparing a Concordance
  Extracting Phrases
  Preparing an Index
  Comparing Texts Using Similarity Scores
PART II MEDICAL DATA RESOURCES
Chapter 5 The National Library of Medicine's Medical Subject Headings (MeSH)
  Determining the Hierarchical Lineage for MeSH Terms
  Creating a MeSH Database
  Reading the MeSH Database
  Creating an SQLite Database for MeSH
  Reading the SQLite MeSH Database
Chapter 6 The International Classification of Diseases
  Creating the ICD Dictionary
  Building the ICD-O (Oncology) Dictionary
Chapter 7 SEER: The Cancer Surveillance, Epidemiology, and End Results Program
  Parsing the SEER Data Files
  Finding the Occurrences of All Cancers in the SEER Data Files
  Finding the Age Distributions of the Cancers in the SEER Data Files
Chapter 8 OMIM: The Online Mendelian Inheritance in Man
  Collecting the OMIM Entry Terms
  Finding Inherited Cancer Conditions
Chapter 9 PubMed
  Building a Large Text Corpus of Biomedical Information
  Creating a List of Doublets from a PubMed Corpus
  Downloading Gene Synonyms from PubMed
  Downloading Protein Synonyms from PubMed
Chapter 10 Taxonomy
  Finding a Taxonomic Hierarchy
  Finding the Restricted Classes of Human Infectious Pathogens
Chapter 11 Developmental Lineage Classification and Taxonomy of Neoplasms
  Building the Doublet Hash
  Scanning the Literature for Candidate Terms
  Adding Terms to the Neoplasm Classification
  Determining the Lineage of Every Neoplasm Concept
Chapter 12 U.S. Census Files
  Total Population of the United States
  Stratified Distribution for the U.S. Census
  Adjusting for Age
Chapter 13 Centers for Disease Control and Prevention Mortality Files
  Death Certificate Data
  Obtaining the CDC Data Files
  How Death Certificates Are Represented in Data Records
  Ranking, by Number of Occurrences, Every Condition in the CDC Mortality Files
PART III PRIMARY TASKS OF MEDICAL INFORMATICS
Chapter 14 Autocoding
  A Neoplasm Autocoder
  Recoding
Chapter 15 Text Scrubber for Deidentifying Confidential Text
Chapter 16 Web Pages and CGI Scripts
  Grabbing Web Pages
  CGI Script for Searching the Neoplasm Classification
Chapter 17 Image Annotation
  Inserting a Header Comment
  Extracting the Header Comment in a JPEG Image File
  Inserting IPTC Annotations
  Extracting Comment, EXIF, and IPTC Annotations
  Dealing with DICOM
  Finding DICOM Images
  DICOM-to-JPEG Conversion
Chapter 18 Describing Data with Data, Using XML
  Parsing XML
  Resource Description Framework (RDF)
  Dublin Core Metadata
  Insert an RDF Document into an Image File
  Insert an Image File into an RDF Document
  RDF Schema
  Visualizing an RDF Schema with GraphViz
  Obtaining GraphViz
  Converting a Data Structure to GraphViz
PART IV MEDICAL DISCOVERY
Chapter 19 Case Study: Emphysema Rates
Chapter 20 Case Study: Cancer Occurrence Rates
Chapter 21 Case Study: Germ Cell Tumor Rates across Ethnicities
Chapter 22 Case Study: Ranking the Death-Certifying Process, by State
Chapter 23 Case Study: Data Mashups for Epidemics
  Tally of Coccidioidomycosis Cases by State
  Creating the Map Mashup
Chapter 24 Case Study: Sickle Cell Rates
Chapter 25 Case Study: Site-Specific Tumor Biology
  Anatomic Origins of Mesotheliomas
  Mesothelioma Records in the SEER Data Sets
  Graphic Representation
Chapter 26 Case Study: Bimodal Tumors
Chapter 27 Case Study: The Age of Occurrence of Precancers
Epilogue for Healthcare Professionals and Medical Scientists
  Learn One or More Open Source Programming Languages
  Don't Agonize Over Which Language You Should Choose
  Learn Algorithms
  Unless You Are a Professional Programmer, Relax and Enjoy Being a Newbie
  Do Not Delegate Simple Programming Tasks to Others
  Break Complex Tasks into Simple Methods and Algorithms
  Write Fast Scripts
  Concentrate on the Questions, Not the Answers
Appendix
  How to Acquire Ruby
  How to Acquire Perl
  How to Acquire Python
  How to Acquire RMagick
  How to Acquire SQLite
  How to Acquire the Public Data Files Used in This Book
  Other Publicly Available Files, Data Sets, and Utilities
Jules Berman, Ph.D., M.D., received two bachelor of science degrees (mathematics and earth sciences) from MIT, a Ph.D. in pathology from Temple University, and an M.D. from the University of Miami School of Medicine. His postdoctoral research was conducted at the National Cancer Institute. His medical residency in pathology was completed at the George Washington University School of Medicine. He became board certified in anatomic pathology and in cytopathology, and served as the chief of Anatomic Pathology, Surgical Pathology and Cytopathology at the Veterans Administration (VA) Medical Center in Baltimore, Maryland. While at the Baltimore VA, Dr. Berman held appointments at the University of Maryland Medical Center and at the Johns Hopkins Medical Institutions. In 1998, he became the program director for pathology informatics in the Cancer Diagnosis Program at the U.S. National Cancer Institute. In 2006, he became president of the Association for Pathology Informatics. Over the course of his career, he has written, as first author, more than 100 publications, including five books in the field of medical informatics. Today, Dr. Berman is a full-time freelance writer.
Reviews for Methods in Medical Informatics: Fundamentals of Healthcare Programming in Perl, Python, and Ruby
As subspecialty board certification in clinical informatics has finally become a reality, Jules Berman's Methods in Medical Informatics could not be more timely. This well-written and informative text combines Dr. Berman's expertise in programming with his vast knowledge of publicly available data sets and everyday healthcare programming needs to result in a book which ... should become a staple in health informatics education programs as well as a standard addition to the personal libraries of informaticists.
-Alexis B. Carter, Journal of Pathology Informatics, October 2011

This book provides an introduction to processing clinical and population health data using rigorous methods and widely available, low cost, but very capable tools. The inclusion of the three leading dynamic programming languages broadens the appeal ... bridges the gap from programming instruction to dealing with specialized medical data, making it possible to teach a relevant programming course in a biomedical environment. I would have loved to have a copy of this when I was teaching introductory programming for medical informatics.
-Professor James H. Harrison, Jr., Director of Clinical Informatics, University of Virginia

... presents students and professionals in the healthcare field (who have some working knowledge of the open-source programming languages Perl, Python, or Ruby) with instruction for applying basic informatics algorithms to medical data sets. He [the author] provides algorithm scripts for each of the languages, along with step-by-step explanations of the algorithms used for retrieving, organizing, merging, and analyzing such data sources as the National Cancer Institute's Surveillance Epidemiology and End Results project, the National Library of Medicine's PubMed service, the mortality records of the US Centers for Disease Control and Prevention, the US Census, and the Online Mendelian Inheritance in Man data set on inherited conditions.
-SciTech Book News, February 2011