home

about me

I am a seasoned data scientist with more than 10 years of experience spanning machine learning, web development, and data engineering.

résumé

employment

Vice President of Computation June 2022 — Present
Senior Data Scientist September 2021 — June 2022
Evozyne Chicago, IL

  • Oversight of ML and software engineering initiatives, project planning, hiring/recruitment, establishing team culture, management of 10+ direct reports
  • Research partnership with Nvidia resulting in cutting-edge generative large language model (LLM) for protein engineering
  • Architected and directed the development of two cloud-based systems: a laboratory data and analytics system and a scalable ML system [Python, PyTorch, MetaFlow, Kubernetes]
  • Researched semi-supervised generative AI models (VAE, transformer) to create optimized synthetic proteins
  • Generated novel protein sequences to successfully enhance enzymatic activity and thermostability phenotypes by as much as 10x

Senior Data Scientist February 2021 — September 2021
Data Scientist, Applied Machine Learning May 2019 — February 2021
Tempus Labs Chicago, IL

  • Developed ML survival models for precision medicine that predict the risk of metastasis in cancer patients from patient DNA and RNA genomics data
  • Built general-purpose data science platform for conducting ML experiments [Python, sklearn, TensorFlow, lifelines, Redshift, MetaFlow]

Senior Data Scientist June 2017 — May 2019
Sprout Social Chicago, IL

  • Research and development of data science and machine learning systems to understand customer usage and analyze social media data [Python, Java, Spark, Redshift]

Lead Data Scientist September 2016 — June 2017
GE Transportation Chicago, IL

  • Created descriptive and predictive analytics solutions for asset performance (e.g. fuel optimization) utilizing modern machine learning and big data technologies [Python, Spark, Hadoop, Zeppelin]
  • Analytics and data engineering pipeline for manifests of container ships importing goods into the Port of Los Angeles [Java, PostgreSQL]

Big Data Engineer, iTunes Analytics May 2015 — August 2016
Apple, Inc. Cupertino, CA

  • Developed analytics infrastructure to generate insights into customer experiences on products such as the iTunes Store, App Store, Apple Music, and Apple TV [Java, Python, Splunk, Cassandra, Hadoop, JavaScript]
  • Utilized machine learning, statistics, and data mining to perform data analysis, segmentation, and hypothesis testing

Software Developer August 2013 — April 2015
Signal (formerly known as BrightTag) Chicago, IL

  • Developed data models, algorithms, and back-end services to build and analyze user profile networks for millions of users per day; stored in NoSQL database with billions of records (∼50 TB) [Java, Cassandra, Python, Spark, Kafka, R]
  • Created real-time anomaly detection and network traffic forecasting system using Fourier analysis capable of predicting regular traffic patterns for upcoming week with >90% accuracy [Java, Python, Storm]

Postdoctoral Appointee March 2012 — July 2013
Argonne National Laboratory Leadership Computing Facility Chicago, IL
University of Chicago

  • Optimized massively parallel physics/chemistry simulations on IBM Blue Gene/Q supercomputer (No. 3 on Top500); increased simulation speed over 8x, scalability to ∼0.4 million CPU cores [C++, C, MPI, OpenMP, Python]
  • Invented novel quantum mechanical proton transport model based on fragment electronic structure theory; model fitting via statistical optimization techniques (simulated annealing, regression, swarm intelligence, etc.)

Ph.D. Student Researcher June 2007 — March 2012
The Ohio State University Columbus, OH

  • Published 10 first author journal articles (see publications); presented at 20+ professional and academic events
  • Researched quantum chemistry and statistical thermodynamics; mathematical theory, computation, and algorithms
  • Implemented theoretical physics/chemistry models into efficient code [C++, C, Fortran, MPI, OpenMP]

education

B.S. Chemistry, minor in Microbiology August 2003 — June 2007
The Ohio State University Columbus, OH

Formal Courses:
    Quantum Mechanics, Statistical Thermodynamics, Computational Chemistry, Chemical Physics, Multivariable Calculus, Linear Algebra, Differential Equations, Computer Programming, Numerical Methods
Supplementary Online Courses:
  • Udacity: Web Development, Programming Languages, Parallel Programming (GPU/CUDA), Machine Learning, Artificial Intelligence
  • Coursera: Data Science Signature Track (R Programming, Statistics, Data Wrangling), Machine Learning, Algorithms, Databases, Neural Networks

technical skills

Category: Proficiency in approximate descending order from left to right
Programming Languages: Python, Java, JavaScript, C++, C, awk, Unix/Linux shell (bash), Scala
Web Technologies: HTML, CSS/SCSS, Flask, Node.js, jQuery, Jinja, AJAX, web workers
Databases/Storage: PostgreSQL, Redshift, Cassandra, MySQL, S3, Elasticsearch, Splunk, HDFS, Kafka, Redis
Data Analysis/Modeling: pandas, numpy, scikit-learn, SciPy, Keras, Lasagne, R
Compute Tools: Spark, Hadoop, MPI, OpenMP, blas/lapack
Productivity Tools: git, Jupyter/IPython, vim, LaTeX, JIRA, svn
Software Engineering: Test driven development, scalable architecture design, code review, agile dev
Machine Learning Techniques: Linear/Logistic Regression, Neural Networks (MLP, autoencoder, convolution, recurrent, deep learning), Fourier Analysis, Clustering, k-NN, Random Forests, SVD, PCA, NLP, SVMs

projects & additional experience

To see some code I have written, visit my GitHub account.

Experimental neural network and deep learning library; SGD and backpropagation analytic gradient implemented from scratch for multi-layer perceptron, 1-D convolution net, particle network (my own invented flavor of ANN); exploring data parallelization via Spark and GPU acceleration [Python, Spark, numpy, PyCUDA, PyOpenCL]

Convolutional neural network model for distinguishing between pictures of bacon and/or Kevin Bacon; web app interface for uploading and classifying pictures (formerly hosted at www.isitbacon.net) [Python, Flask, Lasagne, Theano, HTML, CSS, JavaScript, Twitter Bootstrap]

2014 — 2016

Open-source parallel JavaScript math and statistics library built around HTML5 Web Workers and Node.js cluster library capable of speeding up computations on multi-core devices; accompanying documentation website: mathworkersjs.org, available for install on npm [JavaScript, Node.js, HTML5, CSS/SCSS, Python, Flask, Apache Server]

2013 — present

Full stack coding, back-end to front-end; dynamic blog database [HTML, CSS/SCSS, JavaScript/jQuery/Node.js, MySQL, Skeleton]

2013 — present

Recreational mathematics and programming problems from Project Euler; more than 110 problems solved to date, 99th percentile [Python, C++]

Parallel interface to Q-Chem program for propagating chemically reactive proton transport simulations with analytic gradients; demonstrated scalability to >200 CPUs [C++, C, MPI]

open source & community contributions

Simple error handling for input server connection list [Python]

2007 — 2014

Lead author of PCM solvent modeling, QM/MM, parallel linear algebra solvers, and Fast Multipole Method code; software design committee; 7th author of 161 co-authors on software white paper [C++, C, Fortran]

Multi-copy communication interface to open-source molecular dynamics software for parallel tempering/replica exchange (LAMMPS Ensembles); optimized compute kernel for pairwise interactions [C++, C, MPI, OpenMP, Python]

honors & awards

Chair's Prime Choice in Computational Division at American Chemical Society Conference
2013
Presidential Fellowship from The Ohio State University Graduate School ($33,150)
2012
Chemical Computing Group Research Excellence Award from American Chemical Society ($1,150)
2012
Travel Fellowship to present at American Conference on Theoretical Chemistry ($600)
2011
Selected to attend Telluride School on Theoretical Chemistry ($850)
2011
U.S. Department of Energy Merit Scholarship for top poster presentation ($300)
2010
3rd place (out of ∼30) at Ohio State University Denman Undergraduate Research Forum ($300)
2006
American Society for Microbiology Undergraduate Research Fellowship ($4,000)
2006
Ohio State Arts & Sciences Undergraduate Honors Research Scholarship ($3,500)
2006

publications

Google Scholar Statistics: 4000+ total citations, h-index 13

  1. Emre Sevgen, Joshua Moller, Adrian W. Lange, John Parker, et al. ProT-VAE: Protein Transformer Variational AutoEncoder for Functional Protein Design. bioRxiv (2023).
  2. Yihan Shao, Zhengting Gan, Evgeny Epifanovsky, Andrew T.B. Gilbert, Michael Wormit, Joerg Kussmann, Adrian W. Lange, et al. Advances in molecular quantum chemistry contained in the Q-Chem 4 program package. Mol. Phys. 1-32 (2014).
  3. John M. Herbert and Adrian W. Lange. Book chapter: Polarizable Continuum Models for (Bio)Molecular Electrostatics: Basic Theory and Recent Developments for Macromolecules and Simulations (2014).
  4. Adrian W. Lange and Gregory A. Voth. Multi-state Approach to Chemical Reactivity in Fragment Based Quantum Chemistry Calculations. J. Chem. Theory Comput. 9, 4018-4025 (2013).
  5. Adrian W. Lange, Gard Nelson, Christopher Knight, and Gregory A. Voth. Multiscale Molecular Simulations at the Petascale (Parallelization of Reactive Force Field Model for Blue Gene/Q): ALCF-2 Early Science Program Technical Report. Argonne National Laboratory (2013).
  6. Adrian W. Lange and John M. Herbert. Improving generalized Born models by exploiting connections to polarizable continuum models. II. Corrections for salt effects. J. Chem. Theory Comput. 8, 4381-4392 (2012).
  7. Adrian W. Lange and John M. Herbert. Improving generalized Born models by exploiting connections to polarizable continuum models. I. An improved effective Coulomb operator. J. Chem. Theory Comput. 8, 1999-2011 (2012).
  8. Adrian W. Lange and John M. Herbert. A Simple Polarizable Continuum Solvation Model for Electrolyte Solutions. J. Chem. Phys. 134, 204110 (2011).
  9. Adrian W. Lange and John M. Herbert. Symmetric Versus Asymmetric Discretization of the Integral Equations in Polarizable Continuum Solvation Models. Chem. Phys. Lett. 509, 77 (2011).
  10. Adrian W. Lange and John M. Herbert. Response to “Comment on ‘A Smooth, Nonsingular, and Faithful Discretization Scheme for Polarizable Continuum Models: The Switching/Gaussian Approach’”. J. Chem. Phys. 134, 117102 (2011).
  11. Adrian W. Lange and John M. Herbert. A Smooth, Nonsingular, and Faithful Discretization Scheme for Polarizable Continuum Models: The Switching/Gaussian Approach. J. Chem. Phys. 133, 244111 (2010).
  12. Adrian W. Lange and John M. Herbert. Polarizable Continuum Reaction-field Solvation Models Affording Smooth Potential Energy Surfaces. J. Phys. Chem. Lett. 1, 556-561 (2010).
  13. Adrian W. Lange and John M. Herbert. Both Intra- and Interstrand Charge-Transfer Excited States in Aqueous B-DNA Are Present at Energies Comparable to or Just Above the 1ππ* Excitonic Bright States. J. Am. Chem. Soc. 131, 3913-3922 (2009).
  14. Adrian W. Lange, Mary A. Rohrdanz, and John M. Herbert. Charge-Transfer Excited States in a π-Stacked Adenine Dimer, As Predicted Using Long-Range-Corrected Time-Dependent Density Functional Theory. J. Phys. Chem. B 112, 6304 (2008).
  15. Adrian Lange and John M. Herbert. Simple Methods to Reduce Charge-Transfer Contamination in Time-Dependent Density-Functional Calculations of Clusters and Liquids. J. Chem. Theory Comput. 3, 1680 (2007).

contact