
Katie Amaral

I'm a Senior Software Engineer and Technical Lead, currently working at Harvard University Library Technology Services building modern, AI-powered, accessible, secure, and scalable multimedia services for Harvard's vast collection of digital assets. I have a master's degree in Computer Information Systems with a concentration in IT Security from Boston University, and over ten years of experience working as a software engineer.


About me

  • Location: Boston, MA
  • Interests: Hiking, Traveling, Yoga
  • Study: Master's in Computer Information Systems
  • Employment: Harvard University

Work

Technical Lead

Harvard University | October 2024 - Present

Architecture Leadership - Lead the architectural design and technical implementation of Collections Explorer, a public hybrid search application for exploring Harvard Library’s vast digital collections through natural language interaction
Search Infrastructure - Design and implement high-performance search infrastructure using Elasticsearch, supporting multimodal hybrid search retrieval (KNN + BM25) across ~10 million records with Cohere Multilingual v3 dense vector embeddings, reducing memory utilization by 95% with BBQ compression
Relevance Evaluation - Develop relevance evaluation scripts using the Elasticsearch Rank Evaluation API to calculate precision and recall for tuning search performance; evaluate embedding models with Hugging Face transformers and PyTorch
Large Language Models and Model Context Protocol - Implement retrieval augmented generation (RAG) with LLMs on AWS Bedrock; build MCP services for agentic search capabilities and query augmentation with dynamic faceting
Security - Conduct assessments and implement controls based on OWASP standards, including validation with Pydantic, input sanitization, CSPs, network controls, auth (OAuth2/OIDC), vulnerability scanning, LLM guardrails, etc.
Data Engineering - Build Apache Airflow ETL pipelines to extract and transform data from diverse sources, performing validation, normalization, chunking, and ingestion into Elasticsearch
Full-Stack Software Development - Build backend API services using Python FastAPI and internal PyPI packages for code reuse; build frontend UI with NextJS, React, and a reusable component library in Storybook
Open Source - Contribute to Digital Collections Explorer, a multimodal vector search application by the University of Washington

Senior Director, Software Engineering

OnCorps, Inc | April 2024 - October 2024

ML Operations - Built ML Operations pipelines on the Databricks platform: data ingestion, data transformation, feature engineering, model training and fine-tuning, model deployment, inference, monitoring, and data validation
ML Model Training and Evaluation - Trained and fine-tuned computer vision and NER models for classification tasks using Python libraries such as scikit-learn, PyTorch, and Hugging Face transformers; implemented statistical methods for model performance evaluation, including F1 score, precision, and recall
Data Validation - Wrote data validation classes using Python libraries such as Pandas and NumPy to confirm the accuracy of financial statements in preparation for clients to report to stakeholders, investors, and regulatory agencies

Senior Software Engineer

Harvard University | April 2019 - April 2024

Distributed Systems - Built large-scale, performant distributed systems interconnected with asynchronous task queues (Celery + RabbitMQ) and streaming APIs to process ~3 million individual assets per day for mission-critical applications that give ~500,000 monthly visitors worldwide access to hundreds of millions of digital multimedia assets
Full-Stack Software Development - Built backend services with Python FastAPI, JavaScript NodeJS, and TypeScript NestJS; built SPAs in Angular
Artificial Intelligence - Developed a pilot project to modernize library discovery by enabling natural language interaction with catalog services using Python LangChain and GenAI models (Anthropic Claude Instant and OpenAI GPT-3.5)
CI/CD - Implemented unit and integration tests with CI/CD pipelines, orchestrating containerized deployments (Docker), Kubernetes workloads (Rancher), and secure secrets management using GitHub Actions and ArgoCD on AWS
Database Systems - Designed relational and NoSQL schemas with SQLAlchemy (PostgreSQL) and Mongoose (MongoDB)
SDLC - Worked within Agile/Scrum methodologies across the full software development lifecycle, contributing to iterative development, peer code review, and continuous delivery; Certified ScrumMaster (CSM)

Software Engineer

Broad Institute of Harvard and MIT | March 2015 - April 2019

Biomedical Research Support - Built a secure web portal for transferring terabyte-scale genomic sequencing data with IBM Aspera APIs, applying encryption and access controls to ensure compliance with NIH data policy requirements
Full-Stack Development - Designed and built custom web applications working on all levels of the tech stack including frontend, backend, databases, authentication & authorization, security, testing, and CI/CD deployment automation

Education

Graduate Certificate - Artificial Intelligence

Harvard University | In progress

Artificial Intelligence with Python - Search algorithms, classification, optimization, machine learning, large language models
Foundations of Data Science & Engineering - Data engineering: data management, transformation, transportation, exploratory data analysis, visualization, statistical thinking, machine learning, natural language processing, big data analytics platforms
Deep Learning - Neural networks, transformers with attention, deep learning APIs with Keras, TensorFlow, and PyTorch
Foundations of Large Language Models - Transformer architectures (GPT, BERT, and T5), text generation, language translation, sentiment analysis, chatbots, conversational agents, prompt engineering, retrieval augmented generation (RAG), Hugging Face transformers in Python

Master of Science - Computer Information Systems, concentration in IT Security

Boston University | Graduated 2014 | Cumulative GPA: 4.0

• Software Development – Software development with frontend and backend languages and frameworks, and relational databases
• Network Security – Advanced network security issues and solutions, services, access controls, vulnerabilities, threats, risks, network architectures, attacks, network security capabilities and mechanisms
• Enterprise Information Security – Security in computer systems, networks, applications, memory protection, access control and authentication, file system security, backup and recovery management, intrusion and virus protection mechanisms, application-level protections, cryptography
• IT Security Policies and Procedures – Development and implementation of security policies, risk management plans, standards and procedures on infrastructure, systems, networks, data, operations and user access
• Information Systems Analysis – Analysis and design, object-oriented methods, requirements analysis, UML, software system architecture, implementation, management, and testing