Career Profile

Driven data scientist with a strong background in core business skills. Consistently pioneers innovative approaches by coupling a strong understanding of analytics with an entrepreneurial mindset. Valued technical skills with programming, analytics, data pipelines, machine learning, visualization, simulation, and predictive forecasting. Demonstrated business acumen with program management, business development, client relationship management, BPR, proposal development and leadership. Committed to leading the efforts that capitalize on the most challenging issues.

Experience

Lead Data Scientist, Underwriting Data Services

3/2022 - Current
CFC Underwriting, London U.K.

• For a scaling specialist insurer focusing on cybersecurity offerings, leading the process design and model development for an autonomous underwriting system that can profitably quote core products without human intervention.

• Lead Data Scientist and Product Owner for the team that developed a core underpinning system at CFC. This system enabled correct firm identification and performed entity resolution across a wide variety of attributes used for automation, pricing, risk selection, and cross-sell purposes throughout the business.

• Developed a document parsing system that extracted attributes needed for automation decisions. The first impact of this system is that it has enabled CFC's renewal book to proceed autonomously at policy expiry.

Senior Data Scientist, Underwriting

9/2020 - 3/2022
Hastings Direct, London U.K.

• For a leading digital insurer, led a variety of projects focused on: market average price prediction, large loss prediction, claims frequency prediction, modernizing deployment architecture via cloud services (Azure Databricks) and establishing MLOps processes.

• Developed and productionized two market-data-based XGBoost price prediction algorithms to significantly increase sales and average policy premium. Effort involved all typical aspects of data science model development, from data sourcing through model selection and hyperparameter tuning.

• Directed a small team in developing and operationalizing an MLOps platform on Databricks. This solution enhances agility by enabling a balance between model complexity and response time, which is crucial for accommodating the quick response time cutoffs imposed by aggregator sites.

• Created model interpretability modules using SHAP values to increase trust in an NLP-based claims prediction algorithm.

Data Scientist, Military Digital Group

5/2019 - 6/2020
GE Aviation, Washington D.C.

• For a major DoD customer, created various data pipelines using Python to connect data from disparate source systems before developing Tableau views. These views will be implemented as part of a governance process to reform readiness understanding within the organization.

• Using open source AutoML technology and working with engine experts, created a blade failure algorithm to correctly predict failures in this critical engine component. Involved typical data science steps of data exploration, feature engineering (including creating synthetic records), model selection, evaluation, and user acceptance.

• Using Docker, established the correct dependencies and containerization procedures to allow a critical engine health algorithm to deploy in any technical environment. This is expected to be utilized across several rotorcraft platforms, become a key component of predicting engine health, and improve predictability at many downstream supply chain and maintenance organizations.

Independent Consultant, Data Science Consulting

8/2019 - 1/2020

Engaged as an independent consultant to provide system architecture and pricing for a new commercial data platform. Pricing project involved the creation of a Monte Carlo simulation to account for unknown variables & provide a holistic Total Cost of Ownership (TCO) to the organization. After the successful delivery of this project, the client is continuing to move forward in constructing the platform we designed & priced.

Senior Consultant, Data Science and Analytics

4/2018 - 4/2019
Fresh Gravity, Washington D.C.

• Diverse role including business development, project leadership, and solution development for a small Data Science consulting firm with a focus on Machine Learning based solutions.

• Led the successful pursuit of several Machine Learning focused projects for various firms, including Fortune 500 companies. This process often involved many phases such as use case development, solution ideation, technical proposal writing & delivery, and proof-of-concept development.

• Took the lead functional role in the delivery of several diverse cutting-edge projects such as Deep Learning for Time Series Forecasting, translating natural language to SQL (NLIDB), and NLP for pharmacovigilance case processing.

Federal Technology Consultant, Analytics Workstream

5/2017 - 3/2018
Deloitte Consulting LLP, Washington D.C.

• In response to a maturing marketing platform, pioneered a suite of R scripts that resulted in major efficiency gains by automatically processing millions of records from hundreds of files into various reporting mechanisms and 200+ metrics.

• In preparation for the 2020 census, led a statistical analysis that combined several demographic factors into a linear regression model for consumer response rates within 39,000+ distinct ZIP codes.

• Successfully positioned our team to win future work by designing and socializing an executive dashboard in Qlik Sense that visualizes system performance, user interaction, and response rates across the United States.

• Created the foundation for a pricing mechanism by developing a Monte Carlo Simulation model that predicts campaign level consumer response rates from various input factors.

Federal Technology Analyst, IT Cost Estimation Lead

7/2015 - 5/2017
Deloitte Consulting LLP, Washington D.C.

• For high-visibility IT projects costing over $5 million, led the process and the junior staff pricing around $2 billion of IT investments. The project financial estimates were used to inform an investment committee's decisions for VP-level IT and finance clients.

• For a budget of approximately a billion dollars, implemented a Time Series Forecasting tool that predicted IT spend to within 7% error and informed critical funding re-allocations.

Projects

Take a look at some of my GitHub repositories below for some work samples and projects!

Note that due to the nature of my work, some of my more advanced scripts and projects cannot be posted publicly. These projects are the product of my own extracurricular work, including my graduate studies. Please note some projects were a group effort, and this is indicated in the ReadMe file if applicable.

Malware Classification Using Deep Learning - This project first translated hex files into a binary stream 1D image, then built various deep learning networks based on CNN and bidirectional LSTM architectures to classify the known malware files into 9 different classes.
Connecting IoT to AutoML - This project combined Raspberry Pi derived sensor data with open source AutoML packages to showcase a proof-of-concept for designing cost-efficient industrial optimization systems.
Network DOS Attack Detection - Using the 1998 DARPA Intrusion Detection Evaluation dataset, I configured a Random Forest model for anomaly detection.
Data Mining Project on World Military Spending - This project focused on data mining the SIPRI Milex dataset of world military spending by country.
Predictive Modeling on Seattle Housing Prices - This project built a predictive linear regression model for housing prices in the Seattle housing market by using the regsubsets method of feature selection, arriving at an adjusted R-squared of 0.65.
Data Mining Project on Commodities Prices since 1960 - This project mines insights from a comprehensive dataset on publicly traded commodities obtained from the World Bank.
Regsubsets Method for Feature Selection - Analysis focusing on using the various regsubsets selection methods to create a linear regression on health data.
Temporal Analysis on Chicago Crime Data - Temporal analysis exploring the factors behind patterns in Chicago crime data.

Skills & Proficiency

Languages:
Python, R, SQL, Visual Basic, VBA

Machine Learning:
AutoML, Neural Networks, Image Processing, Regression Techniques, Tree-Based Methods, Clustering Methods

Analytics:
Data Mining, Time Series Forecasting, Reliability Analysis, Monte Carlo Simulation, Chi-Square Testing, Scenario & Sensitivity Analysis, Oracle Crystal Ball

Cloud Services:
AWS (S3, DynamoDB, SageMaker, RDS), Azure (Databricks)

Data Pipeline:
Apache Spark

Database:
MySQL, MySQL Workbench

Visualization:
Tableau, Qlik Sense, PowerBI, Matplotlib

Containerization:
Docker

Solution Architecting:
Persona Mapping, Use Case Modeling, Conceptual & Technical Architecture Diagramming

MS Office:
Word, PowerPoint, Visio, Teams, Flow, Excel (Custom VBA macros, advanced modeling), SharePoint SCA

Business Acumen:
Client Relationships, Business Development, Project Management, Financial Analysis, Technical & Functional Proposal Writing