ExxonMobil

Senior HPC Systems Engineer

Posted on: 10 Feb 2022

Houston, TX

Job Description

Job Role Summary

The HPC Systems Engineer works within a team to provide a performant, reliable, and secure high performance computing (HPC) environment. The HPC Systems Engineer will help design and engineer our HPC system and will manage day-to-day operations and maintenance activities including, but not limited to, the following: troubleshooting issues as they arise, monitoring overall system health, performing system maintenance tasks, and evaluating new hardware and system software.

Primary Job Functions

Establish strategies for overall support of the system
Evaluate new hardware and software and understand their potential benefits and impacts on the environment
Perform hardware maintenance
Perform software installations and upgrades, including the operating system
Monitor overall system performance and health
Be available periodically for on-call support and weekend maintenance activities
Provide support for the management of data in the environment
Work with users to resolve problems and ensure they are able to effectively utilize the system
Interact with both business customers and technical teams that are globally distributed across varied time zones
Engage with vendors to resolve problems with existing infrastructure and to discuss roadmaps and new technologies for evaluation
Foster a supportive work environment and maintain open, productive interactions within the team and across organizations
Build and maintain cross-organizational contacts to facilitate execution of work

Job Requirements

B.S. in Computer Science or a related field (e.g. Computer Engineering, Information Systems), or equivalent skills and work experience
Excellent technical, analytical, and communication skills
A minimum of 3 years of hands-on Linux experience (e.g. RHEL, CentOS) and production infrastructure support (e.g. networking, storage, monitoring, compute)
Experience in system administration and technical support (e.g. installation, configuration, maintenance, upgrade, retirement, problem resolution)
Experience in HPC technologies such as parallel/distributed file systems (e.g. Lustre, GPFS), high-speed interconnect fabrics (e.g. InfiniBand, Omni-Path), and HPC batch scheduling software suites (e.g. PBS Pro, Slurm)
Proficiency in technical writing and documentation of solutions
Works well in a team environment
Self-motivated

Preferred Knowledge/Skills/Abilities

Strong IT skills in infrastructure and applications
Experience supporting large-scale production environments
Experience in implementing changes and security controls in a global framework
Understanding of data center operations fundamentals in networking, cooling, and power
Knowledge and experience with installing/compiling vendor and open source software
Knowledge and experience with application/infrastructure deployment and support in one or more of the major cloud environments

ExxonMobil

Irving, TX

Exxon Mobil Corporation explores for and produces crude oil and natural gas in the United States, Canada/Other Americas, Europe, Africa, Asia, and Australia/Oceania. It operates through Upstream, Downstream, and Chemical segments. The company is also involved in the manufacture, trade, transport, and sale of crude oil, petroleum products, and other specialty products; and manufactures and markets petrochemicals, including olefins, polyolefins, aromatics, and various other petrochemicals. As of December 31, 2018, it had approximately 24,696 net operated wells with proved reserves of 24.3 billion oil-equivalent barrels. The company was founded in 1870 and is headquartered in Irving, Texas.
