Site Reliability Engineer Resume Example with 13+ Years of Experience

(555) 432-1000
Montgomery Street, San Francisco, CA 94105
Professional Summary

A results-driven, passionate leader with an exceptional technical background and 12+ years of technical leadership experience in DevOps, Cloud Engineering and Operations (AWS), Technology Operations, and Site Reliability Engineering (SRE). Known for bringing high energy, acting as a change agent, building high-performance teams, devising creative out-of-the-box solutions, and delivering complex programs.

Built monitoring infrastructure by implementing world-class observability and monitoring frameworks.

Key strengths: Leadership, Management, Technology Transformation, Engineering Excellence, Operational Excellence, Change Agent, Creative Technologist, Early Adopter and Influencer, Solution Architect, Site Reliability Engineering, Amazon Web Services (AWS), Hybrid Cloud Operations, DevOps, Open Source First

Reimagined the production support operating model to run effectively with maximum offshore utilization. Reduced operating expenses and successfully delivered a cost rationalization project.

Skills
  • SRE principles
  • Cloud Platforms: AWS & GCP
  • DevOps (CI/CT/CD)
  • Docker/Amazon ECS
  • Release/Change Management
  • Jenkins/Chef/CFT
  • Python, Shell Scripting
  • WebLogic
  • Apache
  • IIS
  • SQL
Education
Calicut University, Thrissur, 2007
B. Tech: Electrical and Electronics Engineering
Certifications
  • AWS Certified Solutions Architect - Professional
  • AWS Certified Security - Specialty
  • AWS Certified SysOps Administrator - Associate
  • Google Certified Professional Cloud DevOps Engineer
  • Google Certified Professional Cloud Network Engineer
  • Google Certified Professional Cloud Security Engineer
  • Google Certified Professional Cloud Developer
  • Google Certified Professional Cloud Architect
Work History
Pearson - Site Reliability Engineer
Norman, OK, 01/2016 - Current

RESPONSIBILITIES: Wipro Limited is a leading global information technology, consulting, and business process services company that harnesses cognitive computing, hyper-automation, robotics, cloud, analytics, and emerging technologies to help clients adapt to the digital world. Serve as an SRE consultant in the Wipro Digital division, helping clients transform end to end: from identifying value drivers and developing a roadmap to implementing, operating, and innovating.

  • Implement and maintain DevOps CI/CT/CD pipelines that automate infrastructure provisioning and software delivery to the Amazon Web Services (AWS) cloud platform
  • Migrate on-premises applications and their infrastructure to the AWS cloud platform, gathering requirements for new enhancements
  • Familiar with reliability techniques, root cause analysis, and reliability-centered maintenance
  • Deploy and maintain stacks for a business product offering that requires 24/7 critical uptime
  • Write automation and self-healing scripts in Python, Bash, and Node.js
  • Structure application, server, network, and database logs and integrate them with monitoring tools to identify anomalies and patterns that speed up investigations and intelligence discovery
  • Configure and maintain monitoring solutions at the server and application level to increase visibility into day-to-day operations and issues, using Splunk, ELK, Prometheus, Grafana, Kibana, CloudWatch, AppDynamics, and New Relic
  • Containerize applications using Docker and Amazon ECS
  • Design, configure, and build disaster recovery environments for all applications hosted in the cloud and in on-premises data centers, which requires in-depth knowledge of server architecture, networking, load balancers, and exception handling
  • Work closely with Engineering and LOB partners to define, drive, and report on strategic programs and problems that continuously improve the stability, efficiency, and performance of infrastructure
  • Define Service Level Objectives (SLOs) to set precise numerical targets for system availability and to reduce downtime
  • Resolve high-priority incidents within defined SLAs and take measures to prevent recurrence, ensuring high reliability and availability
  • Act as top-tier on-call support, managing PagerDuty alerts for critical-uptime business applications
  • Troubleshoot Java applications, client-server software, server architectures, and performance bottlenecks
  • Fix the underlying issues behind incidents, record their impact and the actions taken to mitigate or resolve them, analyze root causes, and define follow-up actions to prevent recurrence
  • Handle change management, release approvals, and release automation using Jenkins and ServiceNow
  • Configure and manage infrastructure using AWS CloudFormation and Chef
American Crystal Sugar Company - Technical Lead
Moorhead, MN, 08/2011 - 12/2015
  • Managed a team of 20 focused on managing production systems
  • Planned and managed resourcing and staffing within a globally sourced delivery model
  • Worked on and maintained solutions which automated the configuration, provisioning, deployment, scaling and monitoring using
  • Oversaw deployments into production and staging environments
  • Led and managed the development of DevOps processes, protocols, and tools
  • Ensured high availability and acceptable levels of performance of mission-critical applications
  • Coordinated with other technical teams (operations, security, development, networking, IT management, etc.) and assisted in joint projects
  • Ensured system configuration was consistent with institutional policies and procedures
  • Managed delivery effort, including configuration, release, change, demand, and operations management, within schedule, quality, effort, and SLAs
  • Managed issues and risks, and prioritized and managed the scope of application development and maintenance work
  • Monitored delivery performance and quality using metrics and implemented continuous improvements
  • Built ongoing customer relationships, monitored business cases, and measured benefits realized
  • Ensured quality management through defined quality standards
  • Led and monitored the performance of team members to ensure efficiency and that objectives were met
  • Identified and implemented strategies for building team effectiveness, recognized areas for improvement, and set up development plans to address them
Jacobs Engineering Group Inc. - Software Engineer
Miami, FL, 01/2008 - 07/2011
  • Gathered requirements for various applications as part of the domain migration and identified validation plans
  • Identified and documented defects and worked with vendor and project teams to fix them
  • Worked with infrastructure support teams to fix infrastructure issues as part of the domain migration
  • Gained experience with disaster recovery procedures and handled recovery of critical applications
  • Documented health checks and troubleshooting guides for the client's major applications
  • Handled critical service failures and worked with various infrastructure support teams to take each issue to resolution
  • Identified monitoring needs and worked with the onsite team to get them prioritized and deployed to production
  • Analyzed and resolved all priority production tickets for the supported applications
  • Successfully drove the weekly and monthly production releases and provided off-hours support during US business hours

I hereby declare that the information furnished above is true to the best of my knowledge.


