Work experience
- Software Engineer, 2017–present, Cloud Dataproc, Google (Remote)
- Designed, implemented, and benchmarked various experimental and production shuffle backends for Spark, including Dataproc Enhanced Flexibility Mode. (Tech lead.)
- Worked with Spark community to get API changes into Spark core to support disaggregated shuffle implementations. (Tech lead.)
- Worked on initial Native Query Execution engine for Spark on Dataproc.
- Worked on Spark Connect/Python notebook client libraries and integration.
- Designed and implemented the first Portable runners for Apache Beam (on Flink backend).
- Identified and fixed performance bugs in the GCS connector and Dataproc distribution, improving best-case throughput by 20% and worst-case (pathological) inputs by orders of magnitude. (Includes JNI work.)
- Designed and implemented core end-to-end/integration testing framework for Dataproc.
- Specialized in Spark, performance, price-performance, and distributed reliability engineering.
- Software Engineer, 2014–2016, Android Machine Intelligence, Google, Seattle
- Adapted early FaceNet models to run locally on device.
- Compiled and compressed knowledge graph models to run locally on devices. Included distributed processing pipeline and Android client work. Technology was integrated into GBoard and other core Android services.
- Android application-level programming and JNI.
- Software Engineer/Data Scientist, 2013–2014, Shipping Science, eBay, Bellevue
- Added incremental improvements to production Fast 'N Free model, with a focus on latency and model accuracy.
- Developed a new extensible data transformation and training pipeline in Spark.
- Implemented Akka-based system to automate data generation and verification.
- Software Engineer Intern, 2012, Shipping Science, eBay, Redmond
- Designed and implemented new Fast 'N Free shipping estimate machine learning model.
- Trained and tested the model on Hadoop; it outperformed the then-current system.
- Model was used on the live site from 2012 to 2013 (after the internship).
Education
- University of Washington, Seattle, graduated June 2013
- Double Major: Bachelor of Science, cum laude, Computer Science and Physics
- Minor: Math
Skills
- Distributed systems
- Machine learning
- Bayesian probability modeling
- Spark
Programming Languages (recent experience)
- Python
- Rust
- JavaScript
- Java
Programming Languages (professional experience, not recent)
- C++
- Scala
Miscellaneous Interests
- Fully persistent data structures
- Software security
- The power of randomness
- Programming languages
- Skiing
- Biking (road and XC mountain)
- Hiking (goal to complete Washington PCT in individual day-segments)