MICHAEL JOSEPH WADDELL
Michael AT WaddellInformatics.com
http://www.WaddellInformatics.com
# CAREER FOCUS #
A Research or Teaching position in Bioinformatics or Biostatistics where
I can utilize my extensive cross-disciplinary background in the fields of
medicine, programming and statistics.
# STRENGTHS #
* Strong problem solving and research skills.
* Strong interdisciplinary background in both biological and physical
sciences.
* Intense curiosity and appetite for knowledge.
* Comfortable in unstructured environments where initiative and creativity
are encouraged.
# EDUCATION #
2004 - 2008, Ph.D. (expected), University of Wisconsin - Madison, Department
of Computer Sciences (Specialization: Artificial Intelligence; Minor:
Biomedical Statistics; Research Focus: Analyzing biological data using
statistical and machine learning algorithms and improving the quality
and efficiency of the interactions between the biological scientist and
the software tool; Advisor: Dr. C. David Page; Dissertation: Toward the
Development of a Collaborative Machine Learning System (in progress) )
2002 - 2004, M.S., University of Wisconsin - Madison, Department of
Computer Sciences (Academic Focus: Artificial Intelligence, Computational
Biology, Database Systems and Programming Languages & Compilers; Advisor:
Dr. C. David Page )
2000 - 2001, University of Wisconsin Medical School (Withdrew after
successful completion of first year.)
1995 - 2000, B.S., University of Wisconsin - Madison. (Majors: Mathematics,
Biochemistry and Molecular Biology; Minor: Spanish; Research Focus: Using
computational and quantum chemical approaches to understand the structural
and energetic origins of the high stability of collagen molecules; Advisor:
Dr. Ronald T. Raines; Thesis: Theoretical Analysis of the Basis of Collagen
Stability; Graduated with Honors )
# GRANTS RECEIVED #
Computation and Informatics in Biology and Medicine Training Program
Fellowship, 2002 - 2005
University of Wisconsin Medical Scholars Training Program Fellowship,
2000 - 2001
Barry M. Goldwater Scholarship, 1997 - 1999
Howard Hughes Scholars Fellowship recipient, 1997 (declined)
Letters and Science Honors Sophomore Summer Apprenticeship recipient, 1996
# HONORS AND AWARDS #
National Merit Scholar
University of Wisconsin Medical Scholar, 1995 - 2000
University of Wisconsin Dean's List, 1995 - 2000 (6 out of 10 semesters)
Phi Beta Kappa Honor Society, 1998 - 2000
Phi Kappa Phi Honor Society, 1997 - 2000
Golden Key National Honor Society, 1997 - 2000 (Web Design Subcommittee,
1998 - 2000)
Boy Scout Eagle Award
# WORK EXPERIENCE (non-research/teaching related) #
AI Developer/Bioinformaticist. Developing artificial intelligence and
bioinformatics tools for public health and biosurveillance information
management systems. Pangaea Information Technologies. Chicago, Illinois. May
2006 - present. Manager: Miklos Ferenczy.
Intern. Designed and implemented SELDI Filter, an application for automated
analysis of proteomic mass spectrometry data. Provided statistical insights
into protocols in order to assess the validity and predictive power of the
results of those experiments. Abbott Laboratories, Department of Structural
Chemistry. Abbott Park, Illinois. May 2003 - August 2003. Manager: Dr. John
C. Rogers.
Programmer. Designed and maintained database applications using Microsoft
Access and Visual Basic. FlipSide Printing, L.L.C. Madison, Wisconsin. 2001
- 2002. Manager: Daniel C. Kahl.
Programmer. Designed and implemented a payroll system using Visual Basic
and Microsoft Access. Designed club website. Waukesha Trap and Skeet
Club. Waukesha, Wisconsin. January 1997 - May 1997. Manager: Gary Schaetzel.
Assistant Office Manager. Cashiered and managed staffing, finance, accounting
and inventory. Waukesha Trap and Skeet Club. Waukesha, Wisconsin. May 1991 -
August 1995. Manager: Gary Schaetzel.
# RESEARCH EXPERIENCE #
Research Assistant. Integration of human expert feedback into machine
learning systems. Analyzing biological data using statistical and machine
learning algorithms. Department of Computer Sciences and Department
of Biostatistics & Medical Informatics, University of Wisconsin -
Madison. November 2001 - present. Advisor: Dr. C. David Page.
Research Assistant. Exploration of abelian sand-pile models through
theoretical and computational analysis. Department of Mathematics, University
of Wisconsin - Madison. January 2000 - May 2000. Advisor: Dr. James Propp.
Research Assistant. Investigation of the inductive effect in the
stabilization of hydroxyproline residues in collagen from a computational
chemistry perspective. Department of Biochemistry, University of Wisconsin -
Madison. May 1998 - May 2000. Advisor: Dr. Ronald T. Raines.
Research Assistant. Examined DNA mismatch repair damage in human colorectal
tumor cells. Comprehensive Cancer Center, University of Wisconsin -
Madison. August 1997 - May 1998. Advisor: Dr. David A. Boothman.
Research Assistant. Investigation into the structure/function relationship
in NADPH-cytochrome P450 oxidoreductase. McArdle Laboratory for Cancer
Research, University of Wisconsin - Madison. April 1996 - May 1997. Advisor:
Dr. Charles B. Kasper.
# TEACHING EXPERIENCE #
Lecturer. Taught an accelerated undergraduate summer course in
bioinformatics. Summer Research Program in Biostatistics and Bioinformatics,
Department of Biostatistics and Medical Informatics, University of Wisconsin
- Madison. June 2005. Course Coordinator: Dori Kalish.
Professional Tutor. One-on-one tutoring of undergraduate students in computer
science courses. Department of Computer Sciences, University of Wisconsin -
Madison. January 2003 - May 2005.
Teaching Assistant. Teaching assistant for an Introduction to Programming
discussion section. Department of Computer Sciences, University of Wisconsin
- Madison. August 2002 - December 2002. Course Coordinator: Debra Deppeler.
Student Assistant. Taught group-learning based discussion sections in
the three-semester Calculus sequence. Wisconsin Emerging Scholars Program,
Department of Mathematics, University of Wisconsin - Madison. January 1996 -
December 1999. Course Coordinator: Dr. Melinda Certain.
Professional Tutor. One-on-one tutoring of undergraduate students in
the colleges of Business and Letters and Science. Taught a variety of
mathematics, computer science and economics courses. Academic Advancement
Program, University of Wisconsin - Madison. Fall 1998 - Spring 2000.
Volunteer Tutor. One-on-one and group tutoring of undergraduate students in
a variety of mathematics, physics and chemistry courses. Greater University
Tutoring Service, University of Wisconsin - Madison. Summer 1998.
# PUBLICATIONS #
M. Waddell, D. Page, F. Zhan, B. Barlogie, and J. Shaughnessy, Jr. Predicting
cancer susceptibility from single-nucleotide polymorphism data: A case
study in multiple myeloma. In Proceedings of BIOKDD '05, Chicago,
Illinois, Aug 2005.
J. Hardin, M. Waddell, C. D. Page, F. Zhan, B. Barlogie, J. Shaughnessy,
Jr., and J. Crowley. Evaluation of multiple models to distinguish closely
related forms of disease using DNA microarray data. Statistical
Applications in Genetics and Molecular Biology, 3(1), June 2004.
M. Molla, M. Waddell, D. Page, and J. Shavlik. Using machine learning to
design and interpret gene-expression microarrays. AI Magazine,
25(1):23-44, 2004.
I. d. C. Dutra, D. Page, V. S. Costa, J. W. Shavlik, and M. Waddell. Toward
automatic management of embarrassingly parallel applications. In H. Kosch,
L. Boszormenyi, and H. Hellwagner, editors, Euro-Par 2003. Parallel
Processing, 9th International Euro-Par Conference, Klagenfurt, Austria,
August 26-29, 2003. Proceedings, volume 2790 of Lecture Notes in
Computer Science, pages 509-516. Springer-Verlag, Aug 2003.
M. L. DeRider, S. J. Wilkens, M. J. Waddell, L. E. Bretscher, F. Weinhold,
R. T. Raines, and J. L. Markley. Collagen stability: Insights
from NMR spectroscopic and hybrid density functional computational
investigations of the effect of electronegative substituents on prolyl
ring conformations. Journal of the American Chemical Society,
124(11):2497-2505, Mar 2002.
D. Page, F. Zhan, J. Cussens, M. Waddell, J. Hardin, B. Barlogie,
and J. Shaughnessy, Jr. Comparative data mining for microarrays: A case
study based on multiple myeloma. Technical Report 1453, Computer Sciences
Department, University of Wisconsin, Nov 2002.
M. J. Waddell. Theoretical analysis of the basis of collagen
stability. Senior Undergraduate Thesis, May 2000. University of Wisconsin,
Department of Biochemistry.
# COMPUTER SKILLS #
Programming: C/C++, Java, Perl, Prolog, SQL, Lisp, Visual Basic,
XHTML/CSS/JavaScript (AJAX)
Operating Systems: Microsoft Windows, MS-DOS, Mac OS X, Linux, BSD,
Solaris, IRIX
# COMMUNITY SERVICE #
Judge. University of Wisconsin Math Meet Finals. May 1997.
Emergency Medical Technician. Village of Elm Grove, Wisconsin. Spring 1998 -
Spring 2000.
Emergency Room and Trauma Life Center Volunteer. University of Wisconsin
Hospital. 1997 - 1998.
# MEMBERSHIPS #
Association for Computing Machinery (ACM)
American Association for Artificial Intelligence (AAAI)
Institute of Electrical and Electronics Engineers (IEEE)
# REFERENCES #
(available upon request)
# STATEMENT OF RESEARCH #
My research focuses on the development of "collaborative machine learning
systems" -- machine learning systems that act as collaborators with human
users in knowledge discovery -- for biomedical knowledge domains. The
development of such systems draws both from work in machine learning systems
and from work in collaborative systems (i.e., systems that collaborate with
humans but not necessarily for the purpose of knowledge discovery). The
goal of this work is to develop a system that performs the primary task
of a collaborative machine learning system:
Given: A set of models for a classification task (from different algorithms,
different settings, bagging, boosting, etc.)
Do: Choose the single model that is both highly accurate and best
encapsulates the intuition of the user.
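One possible reading of this task, as a minimal sketch rather than the method used in my systems: keep only the models whose accuracy is within a tolerance of the best one, then pick the model the user rates highest. The names `Candidate` and `choose_model`, the `tolerance` parameter, and the numeric `user_rating` are illustrative assumptions; how a user's intuition is actually captured is left open here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical label for a trained model
    accuracy: float     # held-out accuracy in [0, 1]
    user_rating: float  # assumed expert rating in [0, 1] of fit to intuition


def choose_model(candidates, tolerance=0.02):
    """Return the candidate the user rates highest among the near-best by accuracy."""
    if not candidates:
        raise ValueError("need at least one candidate model")
    best_accuracy = max(c.accuracy for c in candidates)
    # "Highly accurate": within `tolerance` of the best observed accuracy.
    shortlist = [c for c in candidates if c.accuracy >= best_accuracy - tolerance]
    # "Best encapsulates the intuition of the user": highest rating,
    # with ties broken by accuracy.
    return max(shortlist, key=lambda c: (c.user_rating, c.accuracy))


# Made-up numbers, for illustration only.
models = [
    Candidate("decision_tree", accuracy=0.90, user_rating=0.8),
    Candidate("neural_net", accuracy=0.91, user_rating=0.2),
    Candidate("naive_bayes", accuracy=0.80, user_rating=0.9),
]
chosen = choose_model(models)
print(chosen.name)
```

Filtering on accuracy first and ranking on the user's rating second keeps the two criteria separate, so the expert never has to trade accuracy against intuition on a single scale.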
In all fields of knowledge, human reasoning is becoming overwhelmed by
vast and rapidly-growing collections of data. This is especially true in
biomedical domains where high-throughput technologies are quickly creating
more data than humans can process manually. Many have proposed machine
learning as a solution to dealing with this wealth of new data. Such tools,
running on modern workstations, are capable of examining large collections of
data and quickly considering a variety of potential hypotheses to explain or
summarize the data. However, these tools lack several advantages that human
experts bring to data analysis, including the rich bodies of background
knowledge that human experts possess for the formulation and evaluation
of hypotheses and the ability to deal effectively with noise in the data.
When working with large data collections from fields such as molecular
biology or pharmaceuticals, it would be beneficial if a machine learning
system and a human expert could act as a team. This team approach would
take advantage of the strengths of each: the speed of the computer combined
with the knowledge, intuition and skill of the human expert.
For the past five years, I have been working closely with a group at the
Lambert Laboratory of Myeloma Genetics at the University of Arkansas for
Medical Sciences. We have been collecting and analyzing genomics data in
order to understand the genetic basis of multiple myeloma. My
cross-disciplinary background in biology, medicine and computer science
has given me unique insight into the limitations of current machine
learning software when applied to biomedical knowledge domains. It was this
insight that led to my current research focus.
These collaborations first illustrated to me the types of interactivity that
domain experts, humans who are experts in a particular domain of knowledge
(e.g., genetics) but not necessarily knowledgeable about machine learning,
need in a machine learning system. These collaborations also highlighted
the shortcomings that many of the current algorithms have with respect to
their ability to meet these needs.
One such shortcoming is that many current algorithms produce accurate,
but incomprehensible, models. Because such models are not directly
comprehensible to the researcher, it is hard to detect when the learning
algorithm has given high importance to irrelevant features or to relevant,
but uninformative, features. Given insight into the features that each
model is using, the domain expert can more easily determine which models
are relying on irrelevant or uninformative features.
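A sketch of the kind of insight described above, under stated assumptions: each model exposes a mapping from feature name to a non-negative importance weight, and the expert supplies the set of features considered irrelevant. The function `irrelevant_share`, the feature names, and the weights are hypothetical and are not taken from any of the systems described here.

```python
def irrelevant_share(importances, irrelevant):
    """Fraction of a model's total feature importance placed on expert-flagged features."""
    total = sum(importances.values())
    if total == 0:
        return 0.0
    flagged = sum(w for name, w in importances.items() if name in irrelevant)
    return flagged / total


# Hypothetical importance weights for two models over the same features.
model_importances = {
    "model_a": {"gene_1": 0.5, "gene_2": 0.25, "sample_batch": 0.25},
    "model_b": {"gene_1": 0.25, "gene_2": 0.0, "sample_batch": 0.75},
}
expert_irrelevant = {"sample_batch"}  # assumed expert judgment

report = {
    name: irrelevant_share(weights, expert_irrelevant)
    for name, weights in model_importances.items()
}
# List models from least to most reliant on the flagged features.
for name in sorted(report, key=report.get):
    print(f"{name}: {report[name]:.2f}")
```

A summary like this lets the expert discard a model that leans on a flagged feature even when its accuracy looks competitive.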
The first interactive machine learning system that I developed is SELDI
Filter (Waddell 2003). I designed this system to automate much of the tedious
analysis process associated with mass spectrometry data. It integrates with
the Ciphergen ProteinChip(tm) software that researchers are using to collect
SELDI data, supports interactive analysis of the data using a variety of
machine learning algorithms, and presents the analysis results in either
Microsoft Word(tm) or Microsoft PowerPoint(tm) for ease of integration into
papers and presentations. SELDI Filter currently automates the following
analysis methods: ANOVA, Discriminant Analysis, Partition Analysis and Neural
Network Analysis. However, due to its modular design, other techniques can
quickly and easily be added. SELDI Filter also allows the researcher to group
the collected spectra hierarchically. This flexible grouping scheme allows
SELDI Filter to accommodate many different experimental designs and datasets.
Currently, I am developing the Colleague system (Dutra 2003, Waddell
2004). This system is being developed as a machine learning system with a
web-based interface for submitting and managing jobs as well as for analyzing
the results returned by the underlying machine learning algorithm. Currently,
Colleague supports interactive clustering of models, visualization of
models customized to the appropriate knowledge domain, and "drilling-down"
and "rolling-up" both within groups of models and within groups of examples.
AI systems can be roughly divided into two groups: systems that replace
human abilities and systems that augment human abilities. Collaborative
systems do both: they take over tasks that humans are not adept at and
interact with their users in a way that makes better use of those users'
strengths. In this way, using a collaborative system
makes a human expert more efficient at an overall task (augmentation of
human abilities) although certain subtasks are being performed entirely
by the system (replacement of human abilities).
The promise of "systems that complement the abilities of people and that
augment their performance, rather than duplicate people's abilities and
replace them" (Terveen 1993) is that this interaction allows us to harness
knowledge, experience, intuition and other human abilities that we cannot
currently replicate in silico. If we adapt current AI techniques using this
model, then AI would essentially become "Amplified Intelligence" (Hoffman
2002) instead of "Artificial Intelligence." This change in focus would
mean that instead of trying to develop "artificial superhumans" to replace
us, we would focus our efforts on amplifying and extending our cognitive
abilities (Ford 1997, Hoffman 2002).
References
Dutra, Page, Costa, Shavlik and Waddell (2003). Toward Automatic Management
of Embarrassingly Parallel Applications. In Euro-Par 2003. Parallel
Processing, 9th International Euro-Par Conference, Klagenfurt, Austria,
August 26-29, 2003. Proceedings, Volume 2790 of Lecture Notes in Computer
Science, pages 509-516. Springer-Verlag.
Ford, Glymour, and Hayes (1997). Cognitive Prostheses. AI Magazine,
18(3):104.
Hoffman, Hayes, Ford and Hancock (2002). The Triples Rule. IEEE Intelligent
Systems, 17(3):62-65.
Terveen (1993). Intelligent Systems as Cooperative Systems. International
Journal of Intelligent Systems, 3(2-4):217-250.
Waddell (2003). SELDI Filter: Automating the Filtering and Analysis of
Proteomic Mass Spectrometry Data. Poster. Abbott Laboratories Science
Intern Poster Session, Abbott Park, Illinois, July 23, 2003.
Waddell (2004). Validating the Effectiveness of Machine Learning
Assistance. Poster. Computation and Informatics in Biology and Medicine
(CIBM) Training Program Retreat, University of Wisconsin-Madison, Madison,
Wisconsin, October 15, 2004.
# STATEMENT OF TEACHING PLANS #
My first teaching experience at the university level was as an undergraduate
teaching assistant for the calculus curriculum. The math department at
the University of Wisconsin has a program called the "Wisconsin Emerging
Scholars" program which provides motivated students with 6 hours of
discussion sections per week in addition to their standard 5 hours of
lectures. In addition to the activities of a normal discussion section, the
students spent the bulk of these 6 hours each week in small groups working
on challenging problems related to the lecture material for that week.
I feel very fortunate that my first formal teaching experience was with
the WES program, because I learned much more about how students learn than
if I had been teaching a traditional lecture-style course. First of all,
I was able to spend a good deal of time observing the students working in
groups to solve these challenging problems. I was able to see the kinds
of errors that were being made, the concepts that were confusing, and
the ways in which their reasoning went awry. I worked very hard to give
them hints to guide them in the right direction without giving them the
answer. By forcing them to work through the analytical process on their
own, I found that they were able to generalize what they had learned better
than when they were just told how to do a particular type of problem. When
I found that multiple groups were misunderstanding a fundamental concept,
I would pause the groups and do a "mini-lecture" to address that concept.
The joy that I found in teaching inspired me to work hard to improve
my effectiveness as a teacher. As part of this, I took an educational
psychology course, which gave me a better understanding of the different
ways in which students learn. During this class, I learned the psychological
basis for the behavior that I had witnessed during my four years with the
WES program. This verified that the lessons I had learned were not unique
to teaching calculus, but could easily be adapted to any field of study.
As a teacher, my pedagogical goal has been to keep students engaged in the
learning process and to encourage individual learning styles. To meet this
goal, I have developed a number of effective techniques. First, in order
to keep students engaged, I work to minimize the amount of lecturing by
focusing on main ideas and minimal examples during "formal" lectures. I find
that numerous examples are more effective when students work through them
in small groups with guidance rather than watching the teacher demonstrate
the example. Not only does this approach help the students to remember
the method better, but it also helps them to generalize what they have
learned and it builds their self-confidence within the discipline.
Second, in order to encourage individual learning styles, I promote
office hours for those students who need individualized help, I encourage
independent projects for those motivated students who want to delve deeper
into a particular area, and I work to provide multiple modalities for
students to demonstrate their knowledge. I understand that every student is
different, and although one student may be a good test-taker and another
student may be a good public-speaker, what really matters is whether they
have a solid grasp of the material that they have been presented with. Thus,
I believe in offering students more ways to demonstrate their knowledge
than a formal exam alone, and I encourage all students -- from the
struggling to the excelling -- to do independent projects that suit their
individual learning styles. The challenge for me as a teacher is to balance
my desire for individualized instruction with the realities of having
limited time and resources. This is a challenge that I solve differently
for each class depending upon the specific needs of that group of students.
Finally, to deal with the variety of backgrounds among my students, I give
short "pre-tests" for each section to identify background areas that are
weak as well as material that the class is already familiar with and that
I therefore do not need to cover in depth.
I have spent my educational career working to combine the fields of the
biological sciences, the medical sciences, mathematics and the computer
sciences. Thus, I am qualified to teach many courses in these areas. The
areas that I feel most qualified to teach are bioinformatics, medical
informatics and artificial intelligence.