Business Function
Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Key Accountabilities
- First and foremost, to build and own operational excellence
- Own production issues and resolve them within the agreed SLA and to users' satisfaction
- Manage all major incidents and recover the system with minimal or no impact to the business
- Produce batch and incident trend reports and measure system performance against KPIs
- Overcome functional limitations and provide the most fit-for-purpose solutions
- Manage the identification and development of monitoring and of process/systemic improvements that increase the reliability of Production systems. Implement SRE practices in CORE Banking
- Ensure strong, clear, and effective communication across all release stakeholders
Responsibilities
- Facilitate and drive recovery calls for major incidents and coordinate with multiple teams to reach resolution
- Communicate on major incidents and provide regular updates to stakeholders
- Ensure preventive and detective measures for the applications are identified and implemented
- Automate manual activities and processes for Production teams (automation experience required)
- Identify persistent or recurring problems and recommend creative solutions
- Strong people skills to build and manage a high-performing team
- Strong communication skills; understand and work well within a global team and ensure proper handoff of incidents and their details
- Ensure incidents are escalated and facilitated to enable efficient and timely service restorations
Requirements
- Minimum 12 years of experience, of which at least 7 years in Production Management, preferably in the Banking industry
- Experience managing Open Systems (ecosystems) and multi-country production environments
- Good command of production infrastructure and of performance monitoring and reporting tools
- Experience implementing Site Reliability Engineering principles for performance, reliability, monitoring and alerting in the Production environment
- 7+ years of experience in Incident Management, Change Management and Problem Management
- Hands-on experience in Unix/Linux/Shell/Python scripting
- Familiar with Xcelerate/TBMS systems, Quadient, JBoss, MariaDB, EDB Postgres and Java (on the Linux operating system)
- 10+ years of experience supporting critical applications using API-driven technologies
- 4+ years of working with a modern stack (AWS, PCF, OpenShift, or Kubernetes)
- Good knowledge of Java and Spring Boot
- Good working experience with Elasticsearch, Logstash, Grafana/Kibana, AppDynamics, etc.