The Harvard Data Science Initiative (HDSI), in conjunction with Harvard University Information Technology (HUIT) and the Office of the Vice Provost for Research (OVPR), has selected four projects for support through the Innovation API Fund, which provides OpenAI API credits to accelerate research across Harvard University. Learn more about the fund and how to apply on the Innovation API Fund page.
Beaver: An Academic Research Agent for Zotero
Joscha Legewie, Professor of Sociology at the Harvard Faculty of Arts and Sciences, and his team will use the credits to further develop Beaver, a tool that embeds an AI research agent directly into Zotero to help scholars read, search, and synthesize literature responsibly. The agent performs agentic retrieval‑augmented generation (RAG) across a user’s library, combining metadata, semantic related‑item, and full‑text searches, and returns answers with precise, page‑level citations and links to source PDFs, supporting transparent, citable use of GenAI in academic workflows. The team will use the credits to evaluate answer quality and retrieval fidelity, build task‑specific evaluation sets, and support beta testing, while prioritizing privacy and keeping core functionality free.
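To make the idea of page‑level citations concrete, here is a minimal sketch of retrieval that keeps the page number attached to every hit. It is not Beaver's implementation: the `Page` record, the `search_pages` and `cite` helpers, the item keys, and the sample library are all hypothetical, and simple term overlap stands in for the metadata, semantic, and full‑text searches described above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    item_key: str  # hypothetical library item identifier
    title: str
    page: int
    text: str


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_pages(pages, query, top_k=2):
    """Rank pages by how many query terms they contain (toy full-text search)."""
    terms = tokenize(query)
    scored = []
    for page in pages:
        overlap = len(terms & tokenize(page.text))
        if overlap:
            scored.append((overlap, page))
    # Sort on the score only; the sort is stable, so ties keep library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [page for _, page in scored[:top_k]]


def cite(page):
    """Format a citation that points at the exact page of the source."""
    return f"{page.title}, p. {page.page} [{page.item_key}]"


# Made-up library content, for illustration only.
LIBRARY = [
    Page("ITEM1", "Example Paper A", 3,
         "Survey weights correct for unequal sampling probabilities."),
    Page("ITEM1", "Example Paper A", 7,
         "Panel attrition biases estimates when dropout is not random."),
    Page("ITEM2", "Example Paper B", 12,
         "Interviewer effects vary across survey modes."),
]

hits = search_pages(LIBRARY, "panel attrition bias")
citations = [cite(page) for page in hits]
print(citations)
```

Because each retrieved unit is a page rather than a whole document, any answer built from `hits` can cite the page it came from, which is what makes the output checkable by the reader.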
Learn more about Professor Legewie’s research.
Representing the Consumer in Trade Policy and Politics
Gautam Nair, Assistant Professor of Public Policy at the Harvard John F. Kennedy School of Government, is using large language models to analyze 150 years of U.S. Congressional speeches and hearings to understand who represents consumer interests in trade debates and why. By linking LLM‑based text analyses to legislators’ biographies and district/state economic characteristics, the team will map when and how consumers gain voice, challenging the conventional wisdom that diffuse consumer interests are inevitably overshadowed by organized producer lobbies.
Learn more about Professor Nair’s research.
MAPA‑Enhanced: Democratizing AI‑Powered Pathway Analysis
Peng Gao, Assistant Professor of Environmental Health and Exposomics at the Harvard T.H. Chan School of Public Health, is building on the MAPA framework for multi‑omics pathway analysis. This effort integrates OpenAI language models (e.g., GPT‑4o and GPT‑4o‑mini) and the text‑embedding‑3‑large embedding model to cut through redundant pathway outputs and deliver clearer biological insight. Prior work achieved 85% accuracy in module identification (vs. 33% for existing methods); the upgrade aims to reduce per‑analysis costs by 40–60% while improving annotation quality via RAG optimization and adaptive prompting across databases (GO, KEGG, Reactome, SMPDB), and the team plans a freely available open‑source release for the community.
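One way to picture how embeddings can cut through redundant pathway outputs is to group pathway terms whose vectors point in nearly the same direction. The sketch below is an illustration under stated assumptions, not MAPA's method: the greedy threshold grouping is a stand‑in technique, `group_redundant` and the 0.9 threshold are hypothetical, and the three‑dimensional vectors are invented (in practice they would come from an embedding model such as text‑embedding‑3‑large).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_redundant(terms, vectors, threshold=0.9):
    """Greedy grouping: a term joins the first group whose representative
    vector is at least `threshold` similar; otherwise it starts a new group."""
    groups = []  # list of (representative_vector, member_terms)
    for term, vec in zip(terms, vectors):
        for rep, members in groups:
            if cosine(rep, vec) >= threshold:
                members.append(term)
                break
        else:
            groups.append((vec, [term]))
    return [members for _, members in groups]


# Toy pathway terms with invented vectors, for illustration only.
TERMS = [
    "fatty acid oxidation",
    "fatty acid beta-oxidation",
    "glycolysis",
    "glucose catabolic process",
]
VECTORS = [
    [1.00, 0.10, 0.00],
    [0.95, 0.15, 0.00],
    [0.00, 1.00, 0.20],
    [0.05, 0.90, 0.30],
]

modules = group_redundant(TERMS, VECTORS)
print(modules)
```

Each resulting group can then be summarized once instead of term by term, which is one plausible route to both clearer output and fewer model calls per analysis.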
Learn more about Professor Gao’s research.
Human‑like Agentic Systems for Complex Reasoning and Planning
Samuel Gershman, Professor of Psychology at the Harvard Faculty of Arts and Sciences, is drawing on human cognitive abstraction to build an LLM‑powered agent that writes code‑based models of its environment and abstracts actions (e.g., “open a door,” “move a box”) to improve long‑horizon planning. The agent will parse visual inputs using vision‑language models, convert them into symbolic structure, and generalize across tasks and environments (e.g., MiniGrid, Bait, Procgen), with benchmarks against reasoning‑tuned LLMs. The goal is more efficient, interpretable, and transferable planning, enabling smaller models to outperform larger ones through structured representations.
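As a rough illustration of what a code‑based environment model with abstract actions could look like, the sketch below hand‑writes a transition function for a one‑dimensional corridor and builds an "open a door" action out of primitive steps. Everything here is a hypothetical toy rather than the project's agent: the `State` fields, the `step`, `move_to`, and `open_door` helpers, and the corridor itself are assumptions, and in the project described above such a model would be written by the LLM rather than by hand.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    agent: int       # cell the agent occupies
    door: int        # cell the door occupies
    door_open: bool
    goal: int        # cell the agent must reach


def step(state, action):
    """Transition model: returns the next state for a primitive action."""
    if action == "toggle":
        # The door can only be toggled from an adjacent cell.
        if abs(state.agent - state.door) == 1:
            return replace(state, door_open=not state.door_open)
        return state
    target = state.agent + {"left": -1, "right": 1}[action]
    if target == state.door and not state.door_open:
        return state  # a closed door blocks movement
    return replace(state, agent=target)


def move_to(state, cell):
    """Abstract action: walk to `cell`, simulating each step in the model."""
    plan = []
    while state.agent != cell:
        action = "right" if cell > state.agent else "left"
        nxt = step(state, action)
        if nxt == state:
            raise ValueError("path is blocked")
        plan.append(action)
        state = nxt
    return state, plan


def open_door(state):
    """Abstract action: approach the door and open it.
    Assumes the agent starts to the left of the door."""
    state, plan = move_to(state, state.door - 1)
    if not state.door_open:
        state = step(state, "toggle")
        plan.append("toggle")
    return state, plan


start = State(agent=0, door=3, door_open=False, goal=5)
after_door, plan_a = open_door(start)
final, plan_b = move_to(after_door, after_door.goal)
plan = plan_a + plan_b
print(plan)
```

The planner reasons over two abstract actions instead of six primitive ones, and the same `open_door` routine can be reused in any corridor that fits the model, which is the kind of efficiency and transfer the structured representation is meant to provide.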
Learn more about Professor Gershman’s research.
All projects are supported through the HDSI Innovation API Fund.