I am an experienced DevOps Engineer / Site Reliability Engineer with over 20 years of IT experience. During that time, I have worked as a software developer, as a backend systems administrator, and in customer-facing technical support; as a result, I have spent the last 12+ years helping my teams thrive by combining those skillsets in various DevOps-focused roles.

I have extremely strong Python and Bash scripting skills, extensive Linux expertise, an analytic mind, and I am a tenacious problem-solver. I believe in automating tasks whenever possible, in collaborating and knowledge-sharing to make the tasks that can't be automated as painless as possible for everyone involved, and in taking full advantage of my strong troubleshooting skills to solve problems when things don't go according to plan. I view all systems with a security-oriented mindset - when I design a system, I don't just ask "how can I make this work?"; I also ask "what are the ways someone could (intentionally or unintentionally) break this, and how do I prevent that?"

I am a quick learner, and if I don't know something, I know how to research it and figure it out. As an example, I once interviewed for a job as a COBOL programmer; when asked if I knew COBOL, I replied "no, but give me a week and I will". I started that job a week later, worked there for a year and a half, and was even re-hired after a round of layoffs due to the quality of my work.
-
Site Reliability Engineer
Canonical
Sep 2021 - Mar 2024
United States
Accomplishments and Duties:
• Primary network engineer for mixed-vendor lab environment with network equipment from Aruba, Cisco, Dell, FS, HP, Juniper, Mellanox, and Supermicro (Cumulus). This lab was used to test enterprise customer equipment to certify that it would work with our primary product, Ubuntu Linux. Since we had minimal control over the type of equipment that customers would send in, this required a high degree of adaptability and the ability to learn new systems on the fly.
• Created tooling to aggregate and automate discovery of and management login procedures for over 750 internal and external cloud deployments across multiple bastion "jump hosts". This allowed SREs to search for and log into a management environment for any deployment with a single command, with updates and new deployments published automatically via Terraform. This tool is estimated to have saved the company over $230,000 in wasted man-hours per year.
• Developed scripts and procedures to automate migration process for PostgreSQL database deployments, which allowed for rapid mass migration/upgrade of internal deployments from an outdated OpenStack cluster to a newer one in another datacenter. This was part of a massive team effort that allowed our team to clear out a closing datacenter on minimal notice in under 3 months while maintaining downtime SLAs, saving the company from having to pay extremely expensive fees to remain in the datacenter for another month.
• Maintained over 750 internal and external cloud deployments via an Infrastructure as Code (IaC) model using Terraform, Kubernetes, Docker, Juju, and Mojo. These environments were spread across multiple private OpenStack clusters as well as public AWS, Azure, and Google Cloud infrastructure.
• Coordinated work with teams spread across every continent in a follow-the-sun model. The company is fully remote, so this required very good communication and technical documentation skills.
-
DevOps Engineer
Rackspace Technology
Jul 2012 - Jul 2021
San Antonio, Texas, United States
• Designed a distributed work automation system in Python to log into thousands of customer firewall and load balancer devices as well as over 11,000 servers across six datacenters in four countries. Customer servers were running any of several dozen variants of Windows and Linux OSes, several with highly customized environments. For each device, the system had to analyze the environment and make persistent networking changes with minimal service disruption. As customer environments were highly unpredictable, the system had to have minimal dependencies, be extremely fault tolerant, and be designed to fail in safe, sane ways. Final production run completed in under an hour with a <0.1% error rate (with tickets created for each account identifying any failures, to allow manual follow-up by support teams). This saved the company nearly 15,000 man-hours of highly error-prone work that would likely have further cost hundreds of thousands of dollars in customer downtime reimbursements.
• Wrote multiple scripts, libraries, and work-automation utilities in Python, SQL, and Bash. Many of these scripts were designed to be run in high-risk situations where errors could lead to major customer downtime.
• RackConnect is a complex networking architecture that bridges physical network VLANs and software-defined virtual networks. Troubleshooting the product required deep working knowledge of dozens of technologies including Cisco, F5, Arista, and Brocade network gear, Windows and Linux operating systems, MS SQL, MySQL/MariaDB, RabbitMQ, Redis, OpenStack, OpenFlow, Open vSwitch, VMware NSX-MH Software Defined Networking (VXLAN and STT-based SDNs), and Citrix XenServer technologies, as well as various Rackspace systems and APIs.
• Provided 24/7 on-call support, including investigation, troubleshooting, and postmortem root-cause analysis. The final stage of root-cause analysis involved presenting after-action reports to non-technical senior management.
-
Linux Systems Engineer III
Rackspace Technology
Oct 2009 - Jul 2012
San Antonio, Texas, United States
Accomplishments and Duties:
• Architected and managed a complete systems upgrade and datacenter migration for the company's largest and most critical internal application stack. Project involved organizing 5 operational teams moving 8 applications across 100 servers and 3 technology stacks to new network gear and new servers running a different OS and upgraded versions of nearly all system libraries.
• Designed and developed a deployment engine to enable rapid deployment of multiple internal software stacks. This saved our team over 1,000 man-hours per year.
• Enforced SOX-compliant change controls for end-to-end development process from conception to release.
• Created reports and documentation for technical data and procedures suitable for presentation to semi- and non-technical audiences.
-
Linux Systems Administrator III
Rackspace Technology
Dec 2007 - Oct 2009
San Antonio, Texas, United States
Accomplishments and Duties:
• Worked directly with customers, providing telephone and ticket-based support for a wide variety of Linux distributions and software packages.
• Performed forensic analysis to determine if servers were compromised, and assisted customers with recovering from compromised systems.
• Trained and assisted other support technicians, increasing both their technical and customer service skillsets.
-
Senior Network Administrator
Coi Enterprises
Aug 2004 - Nov 2007
Boerne, Texas, United States
Accomplishments and Duties:
• Maintained active DoD Top Secret clearance.
• Designed, programmed, and implemented Preventive Maintenance tracking program for Palm Pilots in C++, with PC backend in Java. This saved in-the-field workers hundreds of man-hours per year and allowed data previously tracked via paper forms to be submitted electronically.
• Designed and implemented port-knocking system in C++/Java for allowing secure remote access to servers without exposing services directly to the internet.
• Lead developer for a small team that designed and implemented a secure messaging system utilizing a rotating cipher system.
• Set up and installed NISPOM-certified Secure AIS systems.
• Set up, installed, and administered web, email, firewall, DNS, proxy, LDAP, database, VPN, network monitoring, ticket tracking, Active Directory, and file sharing servers.
• Set up and administered multi-path network involving T1 premises-to-Cloud and Point-to-Point connections.
Russell L. Education Details
-
Bachelor of Arts (BA)
Bachelor of Arts (BA)
Frequently Asked Questions about Russell L.
What is Russell L.'s role at the current company?
Russell L.'s current role is Site Reliability Engineer.
What schools did Russell L. attend?
Russell L. attended Our Lady of the Lake University.