Big Data Engineer
• Built and managed Hadoop and Spark cluster environments (CDP 7.1.8), covering cluster sizing, configuration, smoke testing, security setup, and ongoing performance tuning.
• Kept clusters running reliably through minor and major upgrades and OS patching.
• Led the migration of CDH and HDP clusters to CDP (Cloudera Data Platform).
• Wrote Python scripts for data processing and automation, making cluster management more efficient.
• Provided detailed reports on resource and service utilization and performance metrics to user communities.
• Coordinated with vendor teams on installations, bug fixes, upgrades, and escalations to minimize downtime.
• Supported Data Engineering teams in deploying Hadoop/Spark jobs (DevOps support model), applying performance tuning to improve job execution.
• Contributed to the evolution of the systems architecture to meet changing requirements for scaling, reliability, performance, and manageability.
• Administered Hive, Spark, Sqoop, HBase, and Kafka, keeping critical data processing components running smoothly.
• Applied knowledge of networking and security infrastructure, including VLANs, firewalls, Kerberos, and LDAP, to secure the platform.
• Extensive experience administering Red Hat Enterprise Linux environments, maintaining stability and security across systems.
• Proficient in version control tools such as Git for collaborative development and code management.
Environment: MapR 6.1, MEP 6.0.0, RHEL 7.0, RDS, EC2 r3.4xlarge, EMR-5.8.0, MS-SQL, MapR-FS, Hive, ZooKeeper, Oozie, MapReduce, YARN, Nagios, Sqoop, Hue, Drill, Spark, HBase