Overview
On Site
Full Time
Job Details
Position Description
The Institutional Data Initiative (IDI) is a new research center working to advance society's relationship with knowledge by expanding access to, and deepening our understanding of, the data that underpins AI. By collaborating with library, government, and academic institutions to publish their knowledge collections as AI training sets, IDI seeks to 1) empower those institutions and the cultures they represent, 2) build a foundational pipeline for academic inquiry of AI, and 3) advance the state of the art for all builders of AI systems.
IDI's work spans the AI data ecosystem, from digitization, data structuring, and metadata synthesis, to safety and security analysis, all the way through to benchmarking and the development of ethical and governance frameworks. Institutional collaboration forms the gateway to this work, and IDI places a particular emphasis on opportunities with institutions that expand the cultural breadth of knowledge represented in the building blocks of AI.
At its core, IDI is a data practice around which other interdisciplinary work is convened. While theory and analysis are critical components of IDI's work, our impact is a direct function of our ability to ship novel data. As such, IDI's workflows resemble those of a product studio. Our projects are time-bound and scopes are driven by ambition within time constraints. Our team structure is relatively flat and each member is expected to bring vision for their work and drive it through to completion. We prioritize interdisciplinary collaboration with academic contributors, both internal and external, as essential work that prevents the commodification of the data we help to publish.
The technical capabilities of our Principal Engineers define the depth of analysis and inquiry at IDI, and these engineers develop and deploy repeatable methods and pipelines. The person in this role will have an ability to think creatively about extracting and manipulating data to unlock knowledge collections that have been stubbornly inaccessible, sometimes for centuries. Their understanding of machine learning and AI fundamentals will help identify areas of high impact and utilize models to facilitate this work. Each Principal Engineer must bring a unique set of skills and approaches to a team of engineers whose distinct capabilities complement the whole. This team works together to build an action plan for each corpus that takes it from uncharted territory to a well-defined map that others can traverse.
Beyond data, Principal Engineers also contribute to the building of community around IDI's work to enable outside collaborators (fellow technologists, academics, students, cultural stakeholders) to expand our capabilities, capacity, and perspectives. IDI operates within Harvard and alongside the Library Innovation Lab, the Berkman Klein Center, and the Applied Social Media Lab; engaging these communities, among others, is critical to delivering on our mission.
As a Principal Engineer, you will:
- Develop, refine and evaluate methods for analyzing and augmenting corpora.
- Research, train and evaluate machine learning models.
- Research, adjust and evaluate natural language processing techniques.
- Write and contribute to open-source software.
- Conduct an ongoing technology and research watch in areas pertinent to IDI's focus, including: generative AI, natural language processing, digital preservation and open knowledge.
- Provide technical leadership and guidance to both your team members and your project peers.
- Help build and lead development of multiple discrete projects at once.
- Draft technical and scientific communications outlining datasets and novel methodologies developed in the course of your work.
- Be a technological ambassador for the research center. Help build and engage with the broader academic and AI communities including the Harvard student population.
- Engage with partners to both share our work and explore new opportunities.
Basic Qualifications
- Minimum of seven years' post-secondary education or relevant work experience
Additional Qualifications and Skills
We are looking for people who have:
- A desire to create public-interest impact on AI.
- Deep understanding and experience designing, implementing, testing, and documenting data workflows.
- Advanced working knowledge of machine learning and "AI" systems including local and open toolchains in addition to commercial offerings.
- OCR and/or computer vision experience.
- Strong grasp of at least one general-purpose development technology (Python, JavaScript, Lisp, ...) and of everyday development tools (IDEs, Git, dependency management, Linux & SSH, CI/CD, ...)
- Experience drafting technical or scientific communications to document and disseminate their work.
Working Conditions
Travel is required for quarterly on-site meetings. Occasional travel for conferences and events as needed.
Additional Information
This is a two-year term appointment with potential for renewal, subject to funding and departmental need.
Given the multidisciplinary nature of our work, we encourage a short cover letter to explain how your career trajectory and interests align with our work and mission.
We regret that Harvard Law School is unable to provide visa sponsorship for staff positions.
All offers to be made by HLS Human Resources.
Work Format Details
This position combines on-campus and remote work based on business needs and manager approval. The opportunity for 100% remote work is available for applicants who live more than 50 miles from the Harvard Law School campus. All remote work must be performed within a state in which Harvard is registered to do business (CA, CT, GA, IL, MA, MD, ME, NH, NJ, NY, RI, VA, VT and WA).
Candidates residing in other states are also encouraged to apply, though would become employees of a third-party external payroll company rather than Harvard University.
About Us
Be a part of excellence and leadership in legal education and scholarship at Harvard Law School. We are a community of talented people from diverse backgrounds, lived experiences, and perspectives, dedicated to advancing the cause of justice all over the world. We value our differences and our diversity as a source of strength. We are committed to developing and inspiring our students and our workforce. Whoever you are, whatever you do, however you do it, Harvard Law School is a place where you can thrive.
Benefits
We invite you to visit Harvard's Total Rewards website to learn more about our outstanding benefits package, which may include:
- Paid Time Off: 3-4 weeks of accrued vacation time per year (3 weeks for support staff and 4 weeks for administrative/professional staff), 12 accrued sick days per year, 12.5 holidays plus a Winter Recess in December/January, 3 personal days per year (prorated based on date of hire), and up to 12 weeks of paid leave for new parents who are primary caregivers.
- Health and Welfare: Comprehensive medical, dental, and vision benefits, disability and life insurance programs, along with voluntary benefits. Most coverage begins as of your start date.
- Work/Life and Wellness: Child and elder/adult care resources including on campus childcare centers, Employee Assistance Program, and wellness programs related to stress management, nutrition, meditation, and more.
- Retirement: University-funded retirement plan with contributions from 5% to 15% of eligible compensation, based on age and earnings with full vesting after 3 years of service.
- Tuition Assistance Program: Competitive program including $40 per class at the Harvard Extension School and reduced tuition through other participating Harvard graduate schools.
- Tuition Reimbursement: Program that provides 75% to 90% reimbursement up to $5,250 per calendar year for eligible courses taken at other accredited institutions.
- Professional Development: Programs and classes at little or no cost, including through the Harvard Center for Workplace Development and LinkedIn Learning.
- Commuting and Transportation: Various commuter options handled through the Parking Office, including discounted parking, half-priced public transportation passes and pre-tax transit passes, biking benefits, and more.
- Harvard Facilities Access, Discounts and Perks: Access to Harvard athletic and fitness facilities, libraries, campus events, credit union, and more, as well as discounts to various types of services (legal, financial, etc.) and cultural and leisure activities throughout metro-Boston.