Data Scientist - Content Risk Management
- Evaluated and optimized pipelines and architecture to keep short-video data flows privacy-compliant, proactively preventing the leakage of abnormal videos;
- Used Python to integrate data from short videos, live streams, merchandise sales, and other sources, increasing the efficiency of data integration for downstream use by 25%;
- Automated operational reporting, increasing report generation efficiency by 30%;
- Directed extract, load, and transform (ELT) processes for massive unstructured datasets with MongoDB;
- Engineered SQL jobs to streamline daily operational data processing, improving processing speed by 20% and supporting downstream visualizations;
- Automated data collection with Python to generate supervisory dashboard reports, managing datasets with over 1 TB of records;