Over 8 years of experience in Data Analysis, Decision Trees, Random Forest, Data Profiling, Data Integration, Data Governance, Migration and Metadata Management, Master Data Management, and Configuration Management.
-
Data Scientist, Innovee Consulting LLC, Apr 2020 - Jan 2021, New York City Metropolitan Area
Performed extensive ad hoc reporting, routine operational reporting, and data manipulation to produce routine metrics and dashboards for management. Created parameters, action filters, and calculated sets for preparing dashboards and worksheets in Tableau. Worked with other data scientists and architects on custom data visualization solutions using Tableau and Python packages. Ran MapReduce jobs to process millions of records. Wrote complex SQL queries using joins and OLAP functions such as COUNT, CSUM, and RANK. Built and published customized interactive reports and dashboards, and scheduled reports, using Tableau Server. Worked with data formats such as JSON and XML and applied machine learning algorithms in Python. Used pandas, NumPy, seaborn, matplotlib, scikit-learn, SciPy, and NLTK in Python to develop various machine learning algorithms. Used Apache Spark with Python to develop and execute big data analytics and machine learning applications, and executed machine learning use cases with Spark ML and MLlib. Designed and developed NLP models for sentiment analysis. Designed and provisioned the platform architecture to execute Hadoop and machine learning use cases on cloud infrastructure using AWS, EMR, and S3.
-
Data Scientist, Moody's Analytics, Aug 2018 - Dec 2019
Prepared the workspace for Markdown. Performed data analysis and statistical analysis, and generated reports, listings, and graphs. Identified outliers, anomalies, and trends in data sets. Assisted in data migration using data pump with the Export/Import utility tool. Implemented various performance tuning techniques in ETL and Teradata BTEQ for efficient development and performance. Used Amazon Simple Storage Service (S3) for snapshots and configured S3 lifecycle rules for application and database logs, including deleting old logs and archiving logs based on the retention policy of the applications and databases. Built models using statistical techniques such as Bayesian HMM and machine learning classification models such as XGBoost, SVM, and Random Forest. Set up storage and data analysis tools in Amazon Web Services cloud computing infrastructure. Created a logical data model from the conceptual model and converted it into the physical database design using Erwin 9.6. Piped and processed massive data streams in distributed computing environments such as Hadoop to facilitate analysis (ETL).
-
Data Scientist, Strategic Research & Business Insights, Inc., Feb 2018 - Jul 2018, Princeton, New Jersey, United States
Used Tableau to automatically generate reports. Worked with partially adjudicated insurance flat files, internal records, third-party data sources, JSON, XML, and more. Built models using Spark (PySpark, Spark SQL, Spark MLlib, and Spark ML). Used cloud services such as AWS EC2, EMR, RDS, and S3 to support big data tools, solve data storage issues, and work on deployment solutions. Worked with several R packages including knitr, dplyr, SparkR, Causal Infer, and spacetime. Performed exploratory data analysis and data visualization using R and Tableau. Implemented end-to-end systems for data analytics and data automation, integrated with custom visualization tools, using R, Mahout, Hadoop, and MongoDB. Extracted knowledge from notes using NLP (Python, NLTK, MLlib, PySpark). Independently coded new programs and designed tables to load and test the programs effectively for the given POCs using Big Data/Hadoop. Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata. Built and optimized data mining pipelines for NLP and text analytics to extract information. Coded R functions to interface with the Caffe deep learning framework. Worked in the Amazon Web Services cloud computing environment. Interacted with other departments to understand and identify data needs and requirements, and worked with other members of the IT organization to deliver data visualization and reporting solutions to address those needs. Performed EDA, including univariate and bivariate analysis, to understand intrinsic and combined effects. Designed data models and data flow diagrams using Erwin and MS Visio. Performed data cleaning and imputation of missing values using R. Developed, implemented, and maintained the conceptual, logical, and physical data models using Erwin for forward- and reverse-engineered databases.
-
Data Scientist, Ellkay, Jun 2016 - Dec 2017, Elmwood Park, New Jersey, United States
Worked with large amounts of structured and unstructured data. Applied machine learning concepts (generalized linear models, regularization, Random Forest, time series models, etc.). Worked with business intelligence and visualization tools such as BusinessObjects, Tableau, and Chartio. Deployed GUI pages using JSP, JSTL, HTML, DHTML, XHTML, CSS, JavaScript, and AJAX. Configured the project on WebSphere 6.1 application servers. Implemented the online application using Core Java, JDBC, JSP, Servlets, EJB 1.1, Web Services, SOAP, and WSDL. Handled an end-to-end project from data discovery to model deployment. Monitored the automated loading processes. Communicated with other health care information systems through Web Services using SOAP, WSDL, and JAX-RPC. Used Singleton, Factory, and DAO design patterns based on the application requirements. Used SAX and DOM parsers to parse the raw XML documents. Used RAD as the development IDE for web applications.
-
Data Analyst, Tech Mahindra Business Services Ltd, Nov 2013 - May 2016
-
Data Analyst, Accenture, Mar 2012 - Oct 2013
Kapil S Education Details
-
Information Technology
Frequently Asked Questions about Kapil S
What is Kapil S's role at the current company?
Kapil S's current role is Data Scientist at Innovee Consulting LLC.
What schools did Kapil S attend?
Kapil S attended University of Mumbai.