MICHAEL JOSEPH WADDELL
Michael AT WaddellInformatics.com
http://www.WaddellInformatics.com

# CAREER FOCUS #

A Research or Teaching position in Bioinformatics or Biostatistics where I can utilize my extensive cross-disciplinary background in the fields of medicine, programming and statistics.

# STRENGTHS #

* Strong problem solving and research skills.
* Strong interdisciplinary background in both biological and physical sciences.
* Intense curiosity and appetite for knowledge.
* Comfortable in unstructured environments where initiative and creativity are encouraged.

# EDUCATION #

2004 - 2008, Ph.D. (expected), University of Wisconsin - Madison, Department of Computer Sciences (Specialization: Artificial Intelligence; Minor: Biomedical Statistics; Research Focus: Analyzing biological data using statistical and machine learning algorithms and improving the quality and efficiency of the interactions between the biological scientist and the software tool; Advisor: Dr. C. David Page; Dissertation: Toward the Development of a Collaborative Machine Learning System (in progress))

2002 - 2004, M.S., University of Wisconsin - Madison, Department of Computer Sciences (Academic Focus: Artificial Intelligence, Computational Biology, Database Systems and Programming Languages & Compilers; Advisor: Dr. C. David Page)

2000 - 2001, University of Wisconsin Medical School (Withdrew after successful completion of first year.)

1995 - 2000, B.S., University of Wisconsin - Madison. (Majors: Mathematics, Biochemistry and Molecular Biology; Minor: Spanish; Research Focus: Using computational and quantum chemical approaches to understand the structural and energetic origins of the high stability of collagen molecules; Advisor: Dr. Ronald T.
Raines; Thesis: Theoretical Analysis of the Basis of Collagen Stability; Graduated with Honors)

# GRANTS RECEIVED #

Computation and Informatics in Biology and Medicine Training Program Fellowship, 2002 - 2005
University of Wisconsin Medical Scholars Training Program Fellowship, 2000 - 2001
Barry M. Goldwater Scholarship, 1997 - 1999
Howard Hughes Scholars Fellowship recipient, 1997 (declined)
Letters and Science Honors Sophomore Summer Apprenticeship recipient, 1996

# HONORS AND AWARDS #

National Merit Scholar
University of Wisconsin Medical Scholar, 1995 - 2000
University of Wisconsin Dean's List, 1995 - 2000 (6 out of 10 semesters)
Phi Beta Kappa Honor Society, 1998 - 2000
Phi Kappa Phi Honor Society, 1997 - 2000
Golden Key National Honor Society, 1997 - 2000 (Web Design Subcommittee, 1998 - 2000)
Boy Scout Eagle Award

# WORK EXPERIENCE (non-research/teaching related) #

AI Developer/Bioinformaticist. Developing artificial intelligence and bioinformatics tools for public health and biosurveillance information management systems. Pangaea Information Technologies. Chicago, Illinois. May 2006 - present. Manager: Miklos Ferenczy.

Intern. Designed and implemented SELDI Filter, an application for automated analysis of proteomic mass spectrometry data. Provided statistical insights into protocols in order to assess the validity and predictive power of the results of those experiments. Abbott Laboratories, Department of Structural Chemistry. Abbott Park, Illinois. May 2003 - August 2003. Manager: Dr. John C. Rogers.

Programmer. Designed and maintained database applications using Microsoft Access and Visual Basic. FlipSide Printing, L.L.C. Madison, Wisconsin. 2001 - 2002. Manager: Daniel C. Kahl.

Programmer. Designed and implemented a payroll system using Visual Basic and Microsoft Access. Designed club website. Waukesha Trap and Skeet Club. Waukesha, Wisconsin. January 1997 - May 1997. Manager: Gary Schaetzel.

Assistant Office Manager.
Cashiered and managed staffing, finance, accounting and inventory. Waukesha Trap and Skeet Club. Waukesha, Wisconsin. May 1991 - August 1995. Manager: Gary Schaetzel.

# RESEARCH EXPERIENCE #

Research Assistant. Integration of human expert feedback into machine learning systems. Analyzing biological data using statistical and machine learning algorithms. Department of Computer Sciences and Department of Biostatistics & Medical Informatics, University of Wisconsin - Madison. November 2001 - present. Advisor: Dr. C. David Page.

Research Assistant. Exploration of abelian sand-pile models through theoretical and computational analysis. Department of Mathematics, University of Wisconsin - Madison. January 2000 - May 2000. Advisor: Dr. James Propp.

Research Assistant. Investigation of the inductive effect in the stabilization of hydroxyproline residues in collagen from a computational chemistry perspective. Department of Biochemistry, University of Wisconsin - Madison. May 1998 - May 2000. Advisor: Dr. Ronald T. Raines.

Research Assistant. Examined DNA mismatch repair damage in human colorectal tumor cells. Comprehensive Cancer Center, University of Wisconsin - Madison. August 1997 - May 1998. Advisor: Dr. David A. Boothman.

Research Assistant. Investigation into the structure/function relationship in NADPH-cytochrome P450 oxidoreductase. McArdle Laboratory for Cancer Research, University of Wisconsin - Madison. April 1996 - May 1997. Advisor: Dr. Charles B. Kasper.

# TEACHING EXPERIENCE #

Lecturer. Taught an accelerated undergraduate summer course in bioinformatics. Summer Research Program in Biostatistics and Bioinformatics, Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison. June 2005. Course Coordinator: Dori Kalish.

Professional Tutor. One-on-one tutoring of undergraduate students in computer science courses. Department of Computer Sciences, University of Wisconsin - Madison. January 2003 - May 2005.

Teaching Assistant.
Teaching assistant for an Introduction to Programming discussion section. Department of Computer Sciences, University of Wisconsin - Madison. August 2002 - December 2002. Course Coordinator: Debra Deppeler.

Student Assistant. Taught group-learning-based discussion sections in the three-semester Calculus sequence. Wisconsin Emerging Scholars Program, Department of Mathematics, University of Wisconsin - Madison. January 1996 - December 1999. Course Coordinator: Dr. Melinda Certain.

Professional Tutor. One-on-one tutoring of undergraduate students in the colleges of Business and Letters and Science. Taught a variety of mathematics, computer science and economics courses. Academic Advancement Program, University of Wisconsin - Madison. Fall 1998 - Spring 2000.

Volunteer Tutor. One-on-one and group tutoring of undergraduate students in a variety of mathematics, physics and chemistry courses. Greater University Tutoring Service, University of Wisconsin - Madison. Summer 1998.

# PUBLICATIONS #

M. Waddell, D. Page, F. Zhan, B. Barlogie, and J. Shaughnessy, Jr. Predicting cancer susceptibility from single-nucleotide polymorphism data: A case study in multiple myeloma. In Proceedings of BIOKDD '05, Chicago, Illinois, Aug 2005.

J. Hardin, M. Waddell, C. D. Page, F. Zhan, B. Barlogie, J. Shaughnessy, Jr., and J. Crowley. Evaluation of multiple models to distinguish closely related forms of disease using DNA microarray data. Statistical Applications in Genetics and Molecular Biology, 3(1), June 2004.

M. Molla, M. Waddell, D. Page, and J. Shavlik. Using machine learning to design and interpret gene-expression microarrays. AI Magazine, 25(1):23-44, 2004.

I. d. C. Dutra, D. Page, V. S. Costa, J. W. Shavlik, and M. Waddell. Toward automatic management of embarrassingly parallel applications. In H. Kosch, L. Boszormenyi, and H. Hellwagner, editors, Euro-Par 2003. Parallel Processing, 9th International Euro-Par Conference, Klagenfurt, Austria, August 26-29, 2003.
Proceedings, volume 2790 of Lecture Notes in Computer Science, pages 509-516. Springer-Verlag, Aug 2003.

M. L. DeRider, S. J. Wilkens, M. J. Waddell, L. E. Bretscher, F. Weinhold, R. T. Raines, and J. L. Markley. Collagen stability: Insights from NMR spectroscopic and hybrid density functional computational investigations of the effect of electronegative substituents on prolyl ring conformations. Journal of the American Chemical Society, 124(11):2497-2505, Mar 2002.

D. Page, F. Zhan, J. Cussens, M. Waddell, J. Hardin, B. Barlogie, and J. Shaughnessy, Jr. Comparative data mining for microarrays: A case study based on multiple myeloma. Technical Report 1453, Computer Sciences Department, University of Wisconsin, Nov 2002.

M. J. Waddell. Theoretical analysis of the basis of collagen stability. Senior Undergraduate Thesis, May 2000. University of Wisconsin, Department of Biochemistry.

# COMPUTER SKILLS #

Programming: C/C++, Java, Perl, Prolog, SQL, Lisp, Visual Basic, XHTML/CSS/JavaScript (AJAX)
Operating Systems: Microsoft Windows, MS-DOS, MacOS X, Linux, BSD, Solaris, IRIX

# COMMUNITY SERVICE #

Judge. University of Wisconsin Math Meet Finals. May 1997.
Emergency Medical Technician. Village of Elm Grove, Wisconsin. Spring 1998 - Spring 2000.
Emergency Room and Trauma Life Center Volunteer. University of Wisconsin Hospital. 1997 - 1998.

# MEMBERSHIPS #

Association for Computing Machinery (ACM)
American Association for Artificial Intelligence (AAAI)
Institute of Electrical and Electronics Engineers (IEEE)

# REFERENCES #

(available upon request)

# STATEMENT OF RESEARCH #

My research focuses on the development of "collaborative machine learning systems" -- machine learning systems that act as collaborators with human users in knowledge discovery -- for biomedical knowledge domains.
The development of such systems draws both from work in machine learning systems and from work in collaborative systems (i.e., systems that collaborate with humans but not necessarily for the purpose of knowledge discovery). The goal of this work is to develop a system that performs the primary task of a collaborative machine learning system:

Given: A set of models for a classification task (from different algorithms, different settings, bagging, boosting, etc.)
Do: Choose the single model that is both highly accurate and best encapsulates the intuition of the user.

In all fields of knowledge, human reasoning is becoming overwhelmed by vast and rapidly growing collections of data. This is especially true in biomedical domains, where high-throughput technologies are quickly creating more data than humans can process manually. Many have proposed machine learning as a way to deal with this wealth of new data. Such tools, running on modern workstations, are capable of examining large collections of data and quickly considering a variety of potential hypotheses to explain or summarize the data. However, these tools lack several advantages that human experts bring to data analysis, including the rich bodies of background knowledge that human experts possess for the formulation and evaluation of hypotheses and the ability to deal effectively with noise in the data. When working with large data collections from fields such as molecular biology or pharmaceuticals, it would be beneficial if a machine learning system and a human expert could act as a team. This team approach would take advantage of the strengths of each: the speed of the computer combined with the knowledge, intuition and skill of the human expert.

For the past five years, I have been working closely with a group at the Lambert Laboratory of Myeloma Genetics at the University of Arkansas for Medical Sciences.
We have been collecting and analyzing genomics data in order to understand the genetic basis of multiple myeloma. My cross-disciplinary background in biology, medicine and computer sciences gave me unique insights into the limitations of current machine learning software when used on biomedical knowledge domains. These insights led to my current research focus.

These collaborations first illustrated to me the types of interactivity that domain experts (people who are experts in a particular domain of knowledge, e.g., genetics, but not necessarily knowledgeable about machine learning) need in a machine learning system. These collaborations also highlighted the shortcomings that many of the current algorithms have with respect to their ability to meet these needs. One such shortcoming is that many current algorithms produce accurate, but incomprehensible, models. Because the models are not directly comprehensible, the researcher cannot easily tell when the learning algorithm has given high importance to irrelevant features or to relevant, but uninformative, features. Given insight into the features that each model is using, the domain expert can more easily determine which models rely on irrelevant or uninformative features.

The first interactive machine learning system that I developed is SELDI Filter (Waddell 2003). I designed this system to automate much of the tedious analysis process associated with mass spectrometry data. It integrates with the Ciphergen ProteinChip(tm) software that researchers are using to collect SELDI data, supports interactive analysis of the data using a variety of machine learning algorithms, and presents the analysis results in either Microsoft Word(tm) or Microsoft PowerPoint(tm) for ease of integration into papers and presentations. SELDI Filter currently automates the following analysis methods: ANOVA, Discriminant Analysis, Partition Analysis and Neural Network Analysis.
However, because of its modular design, other techniques can be added quickly and easily. SELDI Filter also allows the researcher to group the collected spectra hierarchically. This flexible grouping scheme allows SELDI Filter to accommodate many different experimental designs and datasets.

Currently, I am developing the Colleague system (Dutra 2003, Waddell 2004). This system is being developed as a machine learning system with a web-based interface for submitting and managing jobs as well as for analyzing the results returned by the underlying machine learning algorithm. At present, Colleague supports interactive clustering of models, visualization of models customized to the appropriate knowledge domain, and "drilling-down" and "rolling-up" both within groups of models and within groups of examples.

AI systems can be roughly divided into two groups: systems that replace human abilities and systems that augment human abilities. Collaborative systems take over tasks that humans are not adept at and interact efficiently with their users in order to make better use of those users' strengths. In this way, using a collaborative system makes a human expert more efficient at an overall task (augmentation of human abilities) although certain subtasks are being performed entirely by the system (replacement of human abilities). The promise of "systems that complement the abilities of people and that augment their performance, rather than duplicate people's abilities and replace them" (Terveen 1993) is that this interaction allows us to harness knowledge, experience, intuition and other human abilities that we cannot currently replicate in silico. If we adapt current AI techniques using this model, then AI would essentially become "Amplified Intelligence" (Hoffman 2002) instead of "Artificial Intelligence."
This change in focus would mean that instead of trying to develop "artificial superhumans" to replace us, we would focus our efforts on amplifying and extending our cognitive abilities (Ford 1997, Hoffman 2002).

References

Dutra, Page, Costa, Shavlik and Waddell (2003). Toward Automatic Management of Embarrassingly Parallel Applications. In Euro-Par 2003. Parallel Processing, 9th International Euro-Par Conference, Klagenfurt, Austria, August 26-29, 2003. Proceedings, Volume 2790 of Lecture Notes in Computer Science, pages 509-516. Springer-Verlag.

Ford, Glymour, and Hayes (1997). Cognitive Prostheses. AI Magazine, 18(3):104.

Hoffman, Hayes, Ford and Hancock (2002). The Triples Rule. IEEE Intelligent Systems, 17(3):62-65.

Terveen (1993). Intelligent Systems as Cooperative Systems. International Journal of Intelligent Systems, 3(2-4):217-250.

Waddell (2003). SELDI Filter: Automating the Filtering and Analysis of Proteomic Mass Spectrometry Data. Poster. Abbott Laboratories Science Intern Poster Session, Abbott Park, Illinois, July 23, 2003.

Waddell (2004). Validating the Effectiveness of Machine Learning Assistance. Poster. Computation and Informatics in Biology and Medicine (CIBM) Training Program Retreat, University of Wisconsin-Madison, Madison, Wisconsin, October 15, 2004.

# STATEMENT OF TEACHING PLANS #

My first teaching experience at the university level was as an undergraduate teaching assistant for the calculus curriculum. The math department at the University of Wisconsin has a program called "Wisconsin Emerging Scholars" (WES), which provides motivated students with 6 hours of discussion sections per week in addition to their standard 5 hours of lectures. In addition to the activities of a normal discussion section, the students spent the bulk of these 6 hours each week in small groups working on challenging problems related to the lecture material for that week.
I feel very fortunate that my first formal teaching experience was with the WES program, because I learned much more about how students learn than I would have by teaching a traditional lecture-style course. First of all, I was able to spend a good deal of time observing the students working in groups to solve these challenging problems. I was able to see the kinds of errors that were being made, the concepts that were confusing, and the ways in which their reasoning went awry. I worked very hard to give them hints to guide them in the right direction without giving them the answer. By forcing them to work through the analytical process on their own, I found that they were able to generalize what they had learned better than when they were just told how to do a particular type of problem. When I found that multiple groups were misunderstanding a fundamental concept, I would pause the groups and do a "mini-lecture" to address that concept.

The joy that I found in teaching inspired me to work hard to improve my effectiveness as a teacher. As part of this, I took an educational psychology course, which gave me a better understanding of the different ways in which students learn. During this class, I learned the psychological basis for the behavior that I had witnessed during my four years with the WES program. This confirmed that the lessons I had learned were not unique to teaching calculus, but could easily be adapted to any field of study.

As a teacher, my pedagogical goal has been to keep students engaged in the learning process and to encourage individual learning styles. To meet this goal, I have developed a number of effective techniques. First, in order to keep students engaged, I work to minimize the amount of lecturing by focusing on main ideas and minimal examples during "formal" lectures. I find that numerous examples are more effective when students work through them in small groups with guidance rather than watch the teacher demonstrate them.
Not only does this approach help the students to remember the method better, but it also helps them to generalize what they have learned and it builds their self-confidence within the discipline.

Second, in order to encourage individual learning styles, I promote office hours for those students who need individualized help, I encourage independent projects for those motivated students who want to delve deeper into a particular area, and I work to provide multiple modalities for students to demonstrate their knowledge. I understand that every student is different, and although one student may be a good test-taker and another student may be a good public speaker, what really matters is whether they have a solid grasp of the material that they have been presented with. Thus, I believe in offering students more means of demonstrating their knowledge than just a formal exam. To that end, I encourage all students -- from the struggling to the excelling -- to do independent projects that suit their individual learning styles. The challenge for me as a teacher is to balance my desire for individualized instruction with the realities of having limited time and resources. This is a challenge that I address differently for each class depending upon the specific needs of that group of students.

Finally, a technique that I use to deal with the variety of backgrounds of my students is to give short "pre-tests" for each section to identify background areas that are weak as well as to identify material that the class is already familiar with and that I therefore do not need to cover in depth.

I have spent my educational career working to combine the fields of the biological sciences, the medical sciences, mathematics and the computer sciences. Thus, I am qualified to teach many courses in these areas. The areas that I feel most qualified to teach are bioinformatics, medical informatics and artificial intelligence.