Critical Environment Program Manager - Operations Specialist
San Antonio, TX 
Posted 12 days ago
Job Description
Overview
As a Critical Environment Program Manager - Operations Specialist (CEPM-Ops) in Microsoft's Cloud Operations & Innovation (CO+I) team, you will play an integral role in delivering best-in-class cloud services that meet the highest standards of performance, reliability, safety, and security. You will ensure the smooth operation, maintenance, and repair of the critical infrastructure that keeps our datacenters running 24 hours a day, 365 days a year. A key component of the role involves interpreting technical documents, such as one-lines, engineering documentation/designs, site layouts, and building plans, to provide technical support, manage incidents, and coordinate projects at datacenters. You will influence process-related decisions by providing insights, recommendations, and guidance on best practices to identify and incorporate cost-effective and scalable solutions. You will promote a culture of continuous improvement and ensure operational readiness by shaping the Procedure Program, managing the Fuel and Battery Programs, and leading the Critical Environment (CE) Drill Program to enhance the preparedness, response, and capabilities of the Critical Environments team. The CEPM-Ops Specialist reports directly to the CE Operations Manager and will assist regularly in reporting on critical infrastructure, inclusive of operational OKRs and maintenance KPI metrics, organizational policies, procedures, and standards. This role requires technical, interpersonal, communication, and organizational skills to effectively engage with stakeholders and convey information clearly.
Microsoft's Cloud Operations & Innovation (CO+I) is the engine that powers our cloud services. As a CO+I Critical Environment Program Manager - Operations Specialist, you will perform a key role in delivering the core infrastructure and foundational technologies for Microsoft's online services including Bing, Office 365, Xbox, OneDrive, and the Microsoft Azure platform.
As a group, CO+I is focused on the personal and professional development of all employees and offers training and opportunities including Career Rotation Programs, Diversity & Inclusion trainings and events, and professional certifications. Our infrastructure comprises a large global portfolio of more than 200 datacenters in 32 countries and millions of servers. Our foundation is built upon and managed by a team of subject matter experts working to support services for more than 1 billion customers and 20 million businesses in over 90 countries worldwide. With environmental sustainability and optimization at the forefront of our datacenter design and operations, we continue to grow and evolve as we meet the ever-changing business demands that hold Microsoft as a world-class cloud provider.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
Data Center Operations / Service Delivery:
Establish solid working relationships with customers and vendors. Foster positive cross-group collaboration.
Oversee the day-to-day operations and timely execution of maintenance and repairs for mechanical and electrical equipment in a data center.
Coordinate with Critical Environment Technician Managers who execute the planned preventative and corrective maintenance of electrical and mechanical equipment.
Utilize CMMS and/or similar maintenance management systems.
Oversee operational monitoring processes and performance of electrical plant and machinery to mitigate risks and safeguard against potential failure/loss.
Coordinate, plan, schedule, and supervise Critical Environment audits and compliance verifications.
Compliance Tasks - Support FedRAMP, Tier II Reporting, and other assessments.
Standards Compliance - Ensure compliance with all local and global standards, and perform spot checks and audits. Monitor progress and inspect completed maintenance and repairs to ensure the required standards are achieved.
Work Observations - Conduct field observations for critical procedures and vendor quality assessments.
Ensure that electrical, mechanical, and/or fire/life safety equipment within the datacenter is operating at peak efficiency.
Ensure compliance with infrastructure operations standardization.
Act as an escalation point for all facilities-related issues within the datacenter, escalating to the CE Operations Manager. This may include assisting technicians in times of emergencies, job training, and providing input/recommendations on electrical/mechanical design parameters.
Manage small-to-medium projects from conception to completion.
Support data center engineers in driving cost/energy efficiency projects, implementing Technical Service Bulletins (TSBs), and integrating them with the maintenance schedule.
Identify and escalate vulnerabilities such as single points of failure, and provide remediation.
Provide recommendations on new data center equipment designs, technologies, and construction methods.
Attend and participate in routine meetings, incident and engineering bridges, and safety briefings.
Act as signatory authority for verification of level of knowledge (LOK) for CE Technician qualifications in the Critical Environment Qualification program.
Create request-for-change tickets and brief the Change Advisory Board to review proposed work and gain approval. Ensure compliance with local and global requirements.
Coordinate repairs for incident and/or hardware repair items. Track completion of repairs following an incident.
Battery Program: Oversee battery health at data centers and ensure continuous operations. Reduce Mean Time to Repair/Mean Time to Mitigate for battery issues and coordinate battery refreshes.
Fuel Program: Order fuel and monitor fuel levels. Establish fuel reorder setpoints and periodicities.
Procedure Program: Ensure quality of Method Statement of Work/Method of Procedure authoring and alignment with standards. Maintain version control in a central repository and renew procedures before their expiration dates.
CE Drill Program: Develop and execute drills. Develop disaster scenario plans (DSP) based on real-life scenarios and anticipate full system response. Assess effectiveness of escalation, evaluate team response, and identify re-training opportunities.
Data Center Work Environment:
Share best practices and assist others in learning processes and procedures.
Promote a culture of safety, security, and compliance in all aspects of datacenter activities.
Realize the impact of change on others.
Embody the One Microsoft culture and values. Lead through change by bringing clarity, generating energy, and delivering success.
Be agile and remove barriers to enable the team to shift priorities quickly without losing productivity.
Support and uplift morale of employees.

 

Job Summary
Company
Start Date
As soon as possible
Employment Term and Type
Regular, Full Time
Required Experience
Open