I am a seasoned data scientist with more than 10 years of experience spanning machine learning, web development, and data engineering.
Vice President of Computation
June 2022 — Present
Senior Data Scientist
September 2021 — June 2022
Evozyne
Chicago, IL
Senior Data Scientist
February 2021 — September 2021
Data Scientist, Applied Machine Learning
May 2019 — February 2021
Tempus Labs
Chicago, IL
Senior Data Scientist
June 2017 — May 2019
Sprout Social
Chicago, IL
Lead Data Scientist
September 2016 — June 2017
GE Transportation
Chicago, IL
Big Data Engineer, iTunes Analytics
May 2015 — August 2016
Apple, Inc.
Cupertino, CA
Software Developer
August 2013 — April 2015
Signal (formerly known as BrightTag)
Chicago, IL
Postdoctoral Appointee
March 2012 — July 2013
Argonne National Laboratory Leadership Computing Facility
Chicago, IL
University of Chicago
Ph.D. Student Researcher
June 2007 — March 2012
The Ohio State University
Columbus, OH
Ph.D. Physical Chemistry
June 2007 — March 2012
The Ohio State University
Columbus, OH
Dissertation:
Multi-layer Methods for Quantum Chemistry in the Condensed Phase: Combining Density
Functional Theory, Molecular Mechanics, and Continuum Solvation Models
B.S. Chemistry, minor in Microbiology
August 2003 — June 2007
The Ohio State University
Columbus, OH
Category | Proficiency in approximate descending order from left to right |
---|---|
Programming Languages | Python, Java, JavaScript, C++, C, awk, Unix/Linux shell (bash), Scala |
Web Technologies | HTML, CSS/SCSS, Flask, Node.js, jQuery, Jinja, AJAX, web workers |
Databases/Storage | PostgreSQL, Redshift, Cassandra, MySQL, S3, Elasticsearch, Splunk, HDFS, Kafka, Redis |
Data Analysis/Modeling | pandas, numpy, scikit-learn, SciPy, Keras, Lasagne, R |
Compute Tools | Spark, Hadoop, MPI, OpenMP, BLAS/LAPACK |
Productivity Tools | git, Jupyter/IPython, vim, LaTeX, JIRA, svn |
Software Engineering | Test-driven development, scalable architecture design, code review, agile development |
Machine Learning Techniques | Linear/Logistic Regression, Neural Networks (MLP, autoencoder, convolutional, recurrent, deep learning), Fourier Analysis, Clustering, k-NN, Random Forests, SVD, PCA, NLP, SVMs |
2015 — present
Experimental neural network and deep learning library; SGD and analytic backpropagation gradients implemented from scratch for a multi-layer perceptron, a 1-D convolutional net, and a particle network (an ANN variant of my own invention); exploring data parallelization via Spark and GPU acceleration [Python, Spark, numpy, PyCUDA, PyOpenCL]
2015
Convolutional neural network model for distinguishing between pictures of bacon and/or Kevin Bacon; web app interface for uploading and classifying pictures (formerly hosted at www.isitbacon.net) [Python, Flask, Lasagne, Theano, HTML, CSS, JavaScript, Twitter Bootstrap]
2014 — 2016
Open-source parallel JavaScript math and statistics library, built around HTML5 Web Workers and the Node.js cluster module to speed up computations on multi-core devices; available for install on npm, with accompanying documentation website at mathworkersjs.org [JavaScript, Node.js, HTML5, CSS/SCSS, Python, Flask, Apache Server]
2013 — present
Full-stack coding, back end to front end; dynamic blog database [HTML, CSS/SCSS, JavaScript/jQuery/Node.js, MySQL, Skeleton]
2013 — present
Recreational mathematics and programming problems from Project Euler; more than 110 problems solved to date, 99th percentile [Python, C++]
2013
Parallel interface to Q-Chem program for propagating chemically reactive proton transport simulations with analytic gradients; demonstrated scalability to >200 CPUs [C++, C, MPI]
Simple error handling for input server connection list [Python]
2007 — 2014
Lead author of PCM solvent modeling, QM/MM, parallel linear algebra solver, and Fast Multipole Method code; member of software design committee; 7th of 161 co-authors on software white paper [C++, C, Fortran]
2013
Multi-copy communication interface to open-source molecular dynamics software for parallel tempering/replica exchange (LAMMPS Ensembles); optimized compute kernel for pairwise interactions [C++, C, MPI, OpenMP, Python]
Google Scholar Statistics: 4000+ total citations, h-index 13