Job Title : Incident & Service Reliability Manager
Position type : Full time
Place of work : Bangkok, Sathorn district
Salary : Negotiable
Working conditions : Normal for an office environment.
Department/Function : IT Service Delivery & CMC
Reporting to Title : Head of IT Service Delivery & CMC
The company:
BRED IT (Thailand) Ltd. is a wholly owned subsidiary of the French bank BRED Banque Populaire, based in Paris.
BRED IT was established in 2008 to become an IT hub and deliver IT operations and support for BRED Group Commercial Banks in the South East Asia and Pacific Ocean areas.
Today, it supports Banque Franco Lao in Laos, BRED Bank Cambodia, BRED Bank Vanuatu, BRED Bank Solomon Islands, BRED Bank Fiji and Banque pour le commerce et l’industrie – Mer Rouge (BCIMR) in Djibouti (Africa).
BRED IT provides end-to-end infrastructure and application management for Core Banking, Internet Banking and E-Payments.
BRED IT has also operated an offshore development center (specialized in COBOL & Java) for Paris headquarters since 2011.
We are a unique company, thanks to our identity and our history: we place our expertise at the service of BRED Group and develop our activities with an entrepreneurial structure. Putting BRED Group’s best interests first allows us to deliver tailor-made, high value-added solutions.
Role Purpose:
As Incident & Service Reliability Manager, you are the central point of leadership during critical incidents and a key driver of service performance and availability across BRED IT’s international perimeter.
You will:
- Lead major incidents end‑to‑end, from impact assessment to resolution and communication.
- Coordinate cross‑functional technical teams, ensuring efficient investigations and sustainable fixes.
- Drive continuous improvement, strengthening processes, monitoring, and overall service reliability.
- Act as a clear, trusted communicator, providing timely, structured updates to management and international business stakeholders.
Main Responsibilities:
1. Incident Management
- Lead and coordinate the response to major incidents across all supported entities.
- Clarify business impact and criticality with all relevant parties (business, IT, vendors).
- Define and organize workstreams to structure the incident resolution (roles, tasks, timelines).
- Ensure complete and accurate incident records (incident details, impact, timeline, actions).
- Produce and coordinate Root Cause Analysis (RCA) and follow‑up actions with relevant teams.
- Clarify ticket ownership when responsibilities are unclear between teams.
- Summarize key facts in the Incident Report (IR) and pre‑fill required fields.
- Measure and monitor ticket quality and timeliness.
- Take ownership of communication during major incidents, providing concise, regular, and transparent updates to all stakeholders (management, banks, internal IT teams).
- Work closely with the Control and Monitoring Center (24/7) to continuously improve incident response and communication.
- Define, maintain, and publish KPIs on incident response, resolution times, and reporting quality.
- Report on major incident management in monthly reports and quality committees with the banks.
2. Problem Management
- Ensure that, for every major incident, corresponding problems are raised to address root causes.
- Follow up on problems with the relevant teams until permanent fixes are implemented.
- Track and report to management on problem creation, status, and resolution.
- Promote a “no recurrence” mindset, focusing on structural improvements rather than workarounds.
3. Monitoring & Observability Management
- Ensure that all critical services (infrastructure and applications) are properly monitored.
- In collaboration with business and technical teams, design and improve alerts (e.g. Zabbix, Splunk) to:
  - Detect incidents early.
  - Provide meaningful information (business impact, procedures, escalation).
- Maintain and improve monitoring processes and procedures, ensuring alignment with best practices.
- Act as a key stakeholder in shaping the overall observability strategy (metrics, logs, alerts, dashboards).
4. Governance, Controls & Reporting
- Respond to controls and audits related to CMC activities (internal, external, regulatory).
- Produce clear, data‑driven reports on incident and problem management for:
  - Internal IT management
  - BRED SA and international banks
- Contribute to quality committees, service reviews, and continuous improvement workshops.
5. Business Continuity & DR (Disaster Recovery)
- Contribute to the planning, organization, coordination, and reporting of DR test activities.
- Provide feedback from incidents and problems to improve DR scenarios, plans, and procedures.
Candidate profile:
You are a hands‑on leader with strong IT operations experience, excellent coordination and communication skills, and a passion for service reliability.
Core Skills & Competencies
- Excellent incident leadership: can manage complex, high‑pressure IT incidents with calm and structure.
- Strong problem‑solving and analytical skills; able to quickly understand technical issues and business impact.
- Excellent coordination skills for multi‑team technical investigations.
- Strong judgment and decision‑making, with a clear sense of priority and urgency.
- High level of initiative, ownership, and reliability; proactive in preventing issues, not only reacting to them.
- Ability to work autonomously and take decisions within the scope of the role.
- Solid understanding of IT governance and operations (ITIL or similar frameworks).
- Strong technical acumen: good understanding of IT systems and operations and willingness to continuously learn new technologies.
- Good business acumen: understands the impact of incidents on costs, customer experience, and production targets.
- Fast learner, self‑driven, highly motivated, with a strong “can‑do” attitude.
- Excellent organizational skills, rigor, and attention to detail.
- Proactive, responsive, and disciplined, with a strong sense of service.
- Demonstrated team spirit and ability to build strong relationships with technical and business teams.
- Proven leadership skills, especially in cross‑functional and international contexts.
Nice to have / Optional skills :
- Knowledge of network concepts.
- Experience with virtualization technologies.
- Familiarity with Linux operating systems.
- Exposure to container platforms (e.g. Docker, Kubernetes, OpenShift).
- Understanding of Microsoft technologies (Windows Server, Active Directory, etc.).
Education
- Minimum Bachelor’s Degree in Computer Science/Engineering or equivalent experience.
Language skills
- English: full professional proficiency (able to work with BRED SA and BRED international banks).
About the Company
BRED IT Thailand
BRED IT (Thailand) Ltd. is a wholly owned subsidiary of the French bank BRED Banque Populaire based out of Paris (BPCE Group).
BRED IT was established in 2008 with the objective of becoming the IT hub for BRED Group Commercial Banks in South East Asia, the Pacific Ocean, and the Horn of Africa.
In parallel, BRED IT has expanded its activities since 2011 to also provide remote IT services to Paris Headquarters.
Today, with more than 200 employees, BRED IT fully supports Banque Franco Lao in Laos, BRED Bank Cambodia, BRED Bank Vanuatu, BRED Bank Solomon Islands, BRED Bank Fiji and Banque pour le commerce et l’industrie Mer Rouge (BCIMR) in Djibouti.
BRED IT hosts and manages all layers of the BRED International Banks’ Information Systems, from infrastructure to applications (Core Banking, Internet/Mobile Banking, E-Payments, etc.), on a 24x7 basis. Half of the activity is currently performed for BRED Headquarters, with a focus on Projects (built with Java, COBOL, PHP, DataStage) and Production/DevOps.