2024 Harvard Data Science Initiative Research Fund Request for Proposals

Harvard Data Science Initiative (HDSI) research funding opportunities have been consolidated into this single Request for Proposals (RFP), encompassing the former Competitive Research Fund, Trust in Science Fund, and Postdoctoral Fellows Research Fund. The Faculty Special Projects Fund remains separate.

Overview

The Harvard Data Science Initiative (HDSI) connects faculty and students across schools to advance a new science of data. Asking the right questions and working with data of a size and variety previously unimaginable now make possible breakthrough scientific advances that generate more informed decisions and/or predictive power. Through internal funding, the HDSI supports faculty- and postdoc-led projects that employ data science to promote scholarly research that advances a field (i.e., fundamental research) and that have the potential for real-world impact.

The HDSI Research Funding RFP allows faculty and postdoctoral fellows to apply for funding in one or more of three tracks:

  • Track A: faculty-led, data-driven research in any discipline of study
  • Track B: faculty-led, data-driven research that will lead to insights into Trust in Science
  • Track C: postdoctoral fellow-led, data-driven research in any field of study 

Important Dates

  • January 13, 2025: Deadline for submission of all applications online
  • January-February 2025: Review of applications
  • February-March 2025: Award notification
  • May 2025: Funding begins

General Information: Funding Term, Eligibility, and Diversity

Funding Term

Successful applications will be eligible for one year of funding. PIs of projects funded through Tracks A and B may apply for additional funding at the end of the first year (details under Award Amounts, below).

Eligibility for Faculty-Led Projects

Eligible applicants are individuals who hold a faculty appointment at a Harvard school and who have principal investigator rights at that school. (Please note: Harvard Medical School faculty must hold a faculty appointment with PI rights in one of HMS’s Quad-based, preclinical departments.)

Eligibility for Postdoctoral Fellow-Led Projects

Eligible applicants are Harvard researchers who hold a non-faculty, fixed-term postdoctoral appointment at a Harvard school that extends to at least December 2025. Applicants are not required to have principal investigator rights, although the application should include contact details for the financial officer at the relevant school who would administer the funds should the proposal be selected. Applicants from Harvard Medical School must have an appointment in a Quad-based, preclinical department.

Diversity

HDSI recognizes that strength comes through diversity and encourages proposals from teams with diverse backgrounds, experiences, and identities. HDSI encourages PIs to consider this aspect of their proposal, and whether it can be helpful to involve partners from affected communities.

Award Amounts 

Track A: Up to $50,000 per award available in direct costs. Awarded applicants are eligible to apply for a one-year extension after nine months and may request up to $20,000 in additional direct costs.

Track B: Up to $75,000 per award available in direct costs. Awarded applicants are eligible to apply for a one-year extension after nine months and may request up to $25,000 in additional direct costs.

Track C: Up to $50,000 per award available in direct costs.

General Guidance for All Applications

Project proposals should describe how the proposed research will use (or advance) innovative data science to drive scholarship, while employing one or more of the following:

  • Novel/advanced computational methods/analyses
  • High-performance computing (including cloud computing) 
  • Computer-driven modeling, simulation, forecasting, and/or analytics
  • Artificial Intelligence/Machine Learning

Proposed research projects should describe creative and innovative approaches to advancing data science-driven research over one year, with the possibility of additional funding for a one-year extension; the potential for longer-term research programs with broader scope should be addressed. Interdisciplinary collaborations are particularly encouraged.

A key attribute for all investigator-led research projects is a description of the anticipated impact, even if not directly generated from the study or a near-term deliverable. This could include, for example, insights and/or correlations that influence public policy, outputs such as AI/ML advances or data science methodologies, or artifacts such as software, data cubes, dashboards, databases, workflows, platforms, web portals, algorithms, or foundational and/or large language models (LLMs). Investigators are expected to address the potential for impact in their proposal.

Track-Specific Guidance

Track A: faculty-led, data-driven research in any field of study

Track A is designed to coalesce and accelerate methodologically-focused research across disciplines. We are especially interested in projects that intersect with or are likely to have impact within or across the HDSI’s research themes:

  1. Data-Driven Scientific Discovery (includes discovery of new materials, drug and gene discovery, astronomy, neuroscience, environment and health including climate change mitigation/resilience, greenhouse gases, agriculture/crop science, food security/sustainability, air pollution, water scarcity/contamination, biodiversity loss, deforestation, marine pollution and fishery depletion) 
  2. Markets and Networks (includes networks and influence, innovation and crowds, digital economy, labor economics, data-driven decisions, blockchain)
  3. Health and Biomedical Science (includes precision medicine, drug repurposing, public health, medical informatics, diagnostics, personal devices)
  4. Evidence-Based Policy (includes equality of opportunity, healthcare economics, democracy and governance)

We also welcome work that is primarily methodological or addresses deficiencies in data sets, including filling gaps or remediating poor-quality data. We are interested in promoting advances across many areas that relate to the science of data, including causal inference, visualization, generative AI, scalable and robust inference, experimental design, interpretability and robustness, ethics (including privacy and fairness), control of false discovery, human-in-the-loop, reinforcement learning, multi-agent systems, adaptive data systems, deep learning, theoretical foundations, reproducibility, and data sharing.

Track B: Trust in Science

Proposals submitted via Track B should seek to illuminate the varied factors that currently impede trusting relations between the producers and users of scientific information. Projects should leverage data science to analyze the breakdowns in public trust, and to ask what steps could be taken to promote better mutual understanding. Examples of topics that may be of interest include: 

  • How can the processes and products of data science be made more transparent, and how might strategies of democratization affect the trustworthiness of science?
  • Explorations into the juxtaposition of and tension between individual or group autonomy and trust in science  
  • Methods of visualizing data and how they affect the ways that different groups assess the trustworthiness of that data
  • The role of team structures in science in the trustworthiness of their results
  • Propagation of conspiracy theories
  • Influencing public policy through global data citizenry and communication
  • Misinformation and the relationship between data and public health
  • Science advocacy, transparency, and ethics
  • Improving AI integrity, credibility, and adoption
  • Optimizing the value and credibility of AI deployment, particularly in the clinical realm (e.g., clinical diagnosis and predictive modeling, health monitoring/analytics, and clinical decision-making)
  • COVID-19: data science-driven learnings for the next pandemic

These examples are illustrative only; applicants are encouraged to think more broadly, and we welcome alternative research directions.

This track is reserved for previous applicants to the Trust in Science Fund ONLY.

Track C: Postdoctoral Fellow-led Projects

The goal of Track C is to promote and support cross-disciplinary collaboration between data scientists at the postdoctoral level. We are particularly interested in funding research proposals that aim to: 

  1. Investigate novel applications of data analysis techniques (broadly defined), particularly by transferring methodology from one field to another; 
  2. Share and combine existing but distinct data sets to gain new insights into a problem; 
  3. Involve new collaborations between researchers in separate departments (or fields of research); 
  4. Explore new methods that may help to improve the public understanding of complex technical issues or areas of research.

Reporting Requirements

Successful applicants will be expected to provide a six-month status report and a brief, final report. Applicants are also expected to notify the HDSI promptly of any publications or public presentations arising from the funded research.

Successful applicants will be expected to give a short presentation of the funded project at a symposium during or at the end of the grant period, and to share, through a public GitHub repository, any software code that is developed.

How to Submit an Application

All applications must be submitted online here by 11:59 p.m. on January 13, 2025. All materials should be submitted as PDF documents (unless otherwise specified).

Information requested in the online application includes the following:

Contact Information

Principal investigator and collaborator information. Include affiliations and professional titles.

Project Summary (limit of 250 words)

The summary should be written in layperson’s language and may be used by the HDSI in promotional materials.

Project Description (5-page limit exclusive of figures and references)

Describe the research plan, which should be comprehensible and accessible to non-specialists. The project description should include the following:

Description of proposed research:

  1. Statement of the real-world problem the project addresses and its significance
  2. Goals and proposed research questions
  3. Experimental plan/approach
  4. The potential impact of the proposed work

Data and computational needs*:

  1. Description of any computational services that are needed (e.g., high-performance cloud computing, cloud-based data storage, coding, data research platforms, database architecture, software engineering, analytics, visualization, etc.)
  2. The role of data science and computational analysis 
  3. The role of any specialized technical support including, for example, a data engineer or professional data scientist 
  4. Datasets required for the study: description, availability, validation, quality

Project Management: 

  1. Milestones/phases
  2. Go/no-go decision points (i.e., stage gates)
  3. Contingency plans/Pre-mortem if the research does not go as planned 
  4. Anticipated outputs and measures of success

CV

Abridged CV or biosketch (limit two pages) for each PI and co-PI.

External Funding

List of all current or pending external sources of grant support for the proposed project, and information on any external funding you have applied for or intend to apply for to support the project. 

Budget and Budget Justification (one page)

Budgets should provide enough information to convey the alignment of costs with the project and use of funds. Faculty are encouraged to work with their grants administrators when including personnel and fringe benefits. School assessments and/or indirect costs should not be included in your budget (the HDSI will cover any assessment above the award amount).

Examples of eligible expenses include:

  • Personnel, such as postdocs, research staff, undergraduate students
  • Travel (domestic and international)
  • Acquisition of datasets
  • Publishing costs

The following expenses are NOT eligible for funding:

  • Faculty salary (all tracks) and postdoc salaries (Track C only)
  • Professional development and education
  • Subcontracts outside Harvard
  • Equipment

Please email lawrence_weissbach@harvard.edu with any questions about the substance of the call, and kevin_doyle@harvard.edu for any questions pertaining to the submission form.

* AI-Related Computational Support: All applicants intending to perform AI-related studies that necessitate funding to support high-performance/cloud computing and data storage resources are encouraged to apply to the NSF/National Artificial Intelligence Research Resource (NAIRR) Pilot prior to requesting HDSI funding. The NAIRR Pilot Program “brings together computational, data, software, model, training and user support resources to demonstrate and investigate all major elements of the NAIRR vision first laid out by the NAIRR Task Force.”