Data Scientist
Current
- Automated the ETL pipeline to streamline 50GB+ of multi-modal traffic data using Python and crontab
- Forecasted the congestion rate using XGBoost, decision tree, random forest, and deep learning (CNN) models, and optimized the models to reduce MSE by 5%
- Developed and maintained a real-time Tableau dashboard showing the discrepancy between the forecasted rate and real-time congestion conditions
- Identified the root causes of the discrepancy across 4+ dimensions, including seasonality, traffic zone, and road condition, improving data reporting accuracy by 20%