Upon joining the Ph.D. program, each student is assigned an initial advisor who is on the DCDS faculty. This advisor meets with the student to assess their background and advise them on course selection. Immediately prior to each fall semester (starting in 2019), DCDS faculty conduct a “boot camp” in mathematics, statistics, and programming to help bring incoming students up to the level they need to succeed in initial coursework and the program.

All students complete a common core curriculum and a domain depth requirement in a social science area. The first year focuses on acquiring a common set of tools and an understanding of the range and types of problems students may work on as they progress in the program. The entire incoming cohort takes a two-semester seminar sequence offered solely for DCDS students, which includes both general topics and a series of data-driven dives into the types of research questions that may be encountered in each of the domain areas.

In addition, students will be exposed to research in different areas through “rotations”, starting in November of their first year. By the end of the summer following their first year, each student will put together an advisory committee (of at least 2 DCDS faculty, preferably from different tracks) and identify the specific track in which they plan to do research and pursue their Ph.D.

Curriculum

Required Core Courses (24 credit hours)

  1. Computational and Data Sciences (CDS) Seminar Series (6 credits): A two-semester sequence cross-listed across participating departments and team-taught by participating faculty.
    • DCDS 499 Introduction to Graduate Research in Computational and Data Sciences: This course presents topics and ideas that do not require detailed computational or substantive backgrounds. The topics covered include ethics, the nature of research, robustness and reproducibility of research, and presentations on the DCDS core domains (computation, political science, psychology and brain sciences, public health and social work). The course exposes students to research in human and social data analytics across the university.
    • DCDS 500 Computational and Data Sciences Research Explorations: The seminar lays the foundation for future study and success in transdisciplinary research involving the computational and data sciences. Opportunities exist to engage with the conceptual and technical challenges emerging from the increasingly ubiquitous availability of extensive datasets capturing many aspects of human life, social behavior, and scientific discovery. The course emphasizes technical and ethical issues of knowledge development, causal inference, and justice in the context of complex data. Students work in diverse teams to apply methods to case studies.
  2. CSE 502 Data Structures and Algorithms (3 credits): Study of fundamental algorithms, data structures, and their effective use in a variety of applications. Emphasizes the importance of data structure choice and implementation for obtaining the most efficient algorithm for solving a given problem. A key component of this course is worst-case asymptotic analysis, which provides a quick and simple method for determining the scalability and effectiveness of an algorithm. We expect many students will already have this background; the course is intended as a pathway for students with little computational training.
  3. Quantitative Methods Seminar Series (QM) I and II (6 credits): A two-semester sequence covering essential probability and statistics, including hypothesis testing, inference, and experimental methodology, using a modern statistical computing language such as R.
    • QM I (Pol Sci 581 Quantitative Political Methodology I or Psych 5066 Quantitative Methods I): Introduces students to scientific inquiry and basic statistical tools. The course primarily covers the linear regression model, in both scalar and matrix form. It also focuses on how to collect, manage, and analyze data using computer software, and how to communicate the results of statistical analyses effectively to others. Topics include estimation, inference, specification, diagnostic tools, data management, and statistical computation. The course is intended for students without prior exposure to the fundamental topics.
    • QM II (Pol Sci 582 Quantitative Political Methodology II): The course covers advanced methods of statistical analysis in computational and social sciences. Topics include maximum likelihood estimation for various cross-sectional, time series, and measurement models…
  4. DCDS 510 Introduction to Data Wrangling (3 credits): The course introduces students to tools and techniques for collecting, maintaining, and processing large-scale datasets of the kind generated when studying people and social systems. Students learn methods for generating information from multiple sources (e.g., static survey data, dynamic data accessed via API), as well as evaluating information for flaws (e.g., missing or erroneous entries, redundant entries, bias stemming from data collection). The course provides opportunities to ingest data, perform analyses, and document findings using an electronic notebook for reproducibility.
  5. Machine Learning I and II (6 credits): A two-semester sequence in machine learning addressing the fundamental principles of supervised and unsupervised learning.
    • CSE 417 Introduction to Machine Learning: The course covers the foundations of supervised learning and important supervised learning algorithms. Topics include the theory of generalization (including VC-dimension, the bias-variance tradeoff, validation, and regularization) and linear and non-linear learning models (including linear and logistic regression, decision trees, ensemble methods, neural networks, nearest-neighbor methods, and support vector machines).
    • CSE 517 Machine Learning II: This advanced course addresses topics at the frontier of the field of machine learning. Topics to be covered include kernel methods (support vector machines, Gaussian processes), neural networks (deep learning), and unsupervised learning. The course also introduces new developments in the field, such as learning from structured data, active learning, and practical machine learning (feature selection, dimensionality reduction).

Domain Depth

Students will choose one of four focus “tracks” (Political Science, Psychological and Brain Sciences, Social Work and Public Health, or Computational Methodologies). Depending on the track, students must complete the following domain depth requirements:

  1. Political Science track: Students must complete three substantive classes in one subfield (American politics, comparative politics, international relations) from a specified list for each subfield as well as a research design course (PS 540).
  2. Psychological and Brain Sciences track: Students must complete three substantive classes in one subfield (brain, behavior and cognition, clinical science, social/personality, development & aging). With permission, students may substitute the Psychological & Brain Sciences Research Methods Course (PBS 5011) for one of those substantive classes depending on their background in psychological science.
  3. Social Work & Public Health track: Students must complete a three-course core doctoral seminar series, including conceptual foundations of social science, advanced research methods, and a theory seminar, either in public health or in social work. Students will also be required to take an advanced substantive course from an approved list in their area of interest.
  4. Computational Methodologies track: Students must take CSE 541T: Advanced Algorithms and either CSE 511A: Artificial Intelligence or CSE 515T: Bayesian Methods in Machine Learning. In addition, students must take two substantive classes in their area of interest (social work & public health, political science, or psychological and brain sciences) from among the classes acceptable for students in that track as noted above.

A typical progression of classes is described below, separately for students who enter with and without more extensive computational backgrounds.

Students without much Computer Science background:

  • Fall of 1st year: Algorithms (CSE 502); Quant Methods I (PolSci 581 or Psych 5066); Intro to CDS (DCDS 499)
  • Spring of 1st year: Intro to Machine Learning (CSE 417); Data Wrangling (DCDS 510); Explorations in CDS (DCDS 500)
  • Fall of 2nd year: Quantitative Methods II (PolSci 582); Domain Course; Domain Course or Elective
  • Spring of 2nd year: Machine Learning II (CSE 517); Domain Course; Domain Course or Elective

Students with more Computer Science background:

  • Fall of 1st year: Intro to Machine Learning (CSE 417) or Domain Course; Quant Methods I (PolSci 581 or Psych 5066); Intro to CDS (DCDS 499)
  • Spring of 1st year: Intro to Machine Learning (CSE 417) or Domain Course; Data Wrangling (DCDS 510); Explorations in CDS (DCDS 500)
  • Fall of 2nd year: Quantitative Methods II (PolSci 582); Domain Course; Domain Course or Elective
  • Spring of 2nd year: Machine Learning II (CSE 517); Domain Course; Domain Course or Elective

Further Requirements

  • A minimum of 72 credit hours beyond the bachelor’s level, with a minimum of 37 being course credits (including the core curriculum)
  • A minimum of 24 credit hours of doctoral dissertation research
  • Students must maintain an average grade of B (GPA 3.0) for all 72 credit hours
  • Required courses must be completed with no more than one grade below a B-
  • Up to 24 graduate credit hours may be transferred with the approval of the Graduate Studies Committee, chaired by the Director of Graduate Studies
  • In addition to fulfilling the course and research credit requirements, students must:
    • complete at least two three-month-long research rotations
    • pass a qualifying exam
    • successfully defend a thesis proposal
    • present and successfully defend a dissertation
    • complete a teaching requirement consisting of two semesters of mentored teaching experience
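
The credit and GPA thresholds above can be checked mechanically. The following is a minimal sketch, not an official degree audit: the `Record` fields and the `unmet_requirements` helper are hypothetical names introduced for illustration, and the sketch assumes that any approved transfer credits are already counted within `course_credits`, which the requirements above do not specify. It does not cover the rotation, exam, proposal, defense, teaching, or "no more than one grade below a B-" requirements.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the Further Requirements list.
MIN_TOTAL_CREDITS = 72
MIN_COURSE_CREDITS = 37
MIN_RESEARCH_CREDITS = 24
MAX_TRANSFER_CREDITS = 24
MIN_GPA = 3.0


@dataclass
class Record:
    """Hypothetical summary of a student's progress (not an official format)."""
    course_credits: int    # assumed to include approved transfer credits
    research_credits: int  # doctoral dissertation research credits
    transfer_credits: int  # graduate credits transferred in
    gpa: float


def unmet_requirements(r: Record) -> List[str]:
    """Return descriptions of the credit and GPA thresholds not yet met."""
    problems = []
    if r.course_credits + r.research_credits < MIN_TOTAL_CREDITS:
        problems.append("total credits below 72")
    if r.course_credits < MIN_COURSE_CREDITS:
        problems.append("course credits below 37")
    if r.research_credits < MIN_RESEARCH_CREDITS:
        problems.append("dissertation research credits below 24")
    if r.transfer_credits > MAX_TRANSFER_CREDITS:
        problems.append("more than 24 transfer credits")
    if r.gpa < MIN_GPA:
        problems.append("GPA below 3.0")
    return problems
```

Note that the two minimums alone (37 course credits plus 24 research credits) sum to 61, so a record sitting exactly at both minimums still falls 11 credits short of the 72-credit total; under these assumptions, `unmet_requirements(Record(37, 24, 0, 3.0))` reports only the total-credit shortfall.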