Site Reliability Engineer III - Custody Production Support
J.P. Morgan
Job Information
- Job Identification: 210662971
- Job Category: Software Engineering
- Business Unit: Commercial & Investment Bank
- Posting Date: 09/08/2025, 03:42 PM
- Locations: Chaseside - Hampshire Building, Bournemouth, Dorset, BH7 7DA, GB
- Job Schedule: Full time
- Job Shift: Day
Job Description
There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.
As a Site Reliability Engineer III at JPMorgan Chase within the Commercial & Investment Bank, you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure to independently decompose and iteratively improve on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform.
Job responsibilities
- Independently manage small to medium-sized projects with initial guidance, progressing to designing and delivering projects autonomously
- Utilize technology to address business challenges by developing high-quality, maintainable, and robust code in line with software engineering best practices
- Engage in triaging, analyzing, diagnosing, and resolving incidents, collaborating with others to address root causes
- Identify repetitive tasks within your role and proactively work to eliminate them through appropriate channels
- Comprehend observability patterns and strive to implement and enhance service level indicators, objectives, monitoring, and alerting solutions for optimal transparency and analysis.
- Design, code, test, and deliver software solutions to automate manual operational tasks
- Troubleshoot high-priority incidents, facilitate blameless post-mortems, and ensure the permanent resolution of incidents
- Identify application patterns and analytics to support improved service level objectives. Implement necessary telemetry and observability to monitor and measure service quality in real-time against established SLOs
- Maintain a strong focus on automation and processes, designing, implementing, improving, and utilizing key monitoring tools. Collaborate with SRE, Operations, and Development teams to balance manual operational work with engineering efforts
- Possess a strong understanding of Incident, Problem, and Change Management processes and tools. Participate in Support Rota coverage as needed. Effectively escalate issues and risks across the support framework when necessary
- Support the adoption of site reliability engineering best practices within your team
Required qualifications, capabilities, and skills
- Formal training or certification on SRE concepts and proficient applied experience.
- Proficiency in one or more technology domains, with the ability to solve complex and mission-critical problems within a business or across the firm. Excellent debugging and troubleshooting skills.
- Proficient in coding in at least one programming language, such as Python or Java, and open to learning modern technologies.
- Extensive expertise in the instrumentation, customization, and use of modern monitoring tools and platforms such as Dynatrace, Grafana, Splunk, AWS, Kubernetes, Geneos, Kafka, and MQ.
- Hands-on experience with modern cloud technologies such as AWS, Gaia, etc. Expertise in at least one relational database (e.g., SQL Server, Oracle, DB2).
- Skilled in performance monitoring and capacity management of large systems using various tools. Comfortable working in an Agile environment and proficient in Continuous Integration and Continuous Delivery practices.
- Strong attention to detail and time-management skills. Proficient in Site Reliability Engineering (SRE) concepts, principles, and practices. Proficient with containers or common server operating systems such as Linux and Windows.
- Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision
- Ability to proactively recognize roadblocks and demonstrate interest in learning technology that facilitates innovation
- Ability to identify new technologies and relevant solutions to ensure design constraints are met by the software team
- Ability to initiate and implement ideas to solve business problems
- Certification in programming languages and/or cloud technologies.
- Experience in Custody, Securities, or Trading domains, including areas such as FX Cross Currency, High and Low Value Payments, SWIFT, Real-Time Payments, Trading, Corporate Actions, etc.
- General knowledge of the financial services industry.