Facebook Data Center Production Operations Manager in Huntsville, Alabama
Facebook's mission is to give people the power to build community and bring the world closer together. Through our family of apps and services, we're building a different kind of company that connects billions of people around the world, gives them ways to share what matters most to them, and helps bring people closer together. Whether we're creating new products or helping a small business expand its reach, people at Facebook are builders at heart. Our global teams are constantly iterating, solving problems, and working together to empower people around the world to build community and connect in meaningful ways. Together, we can help people build stronger communities - we're just getting started.
Facebook is seeking a forward-thinking, experienced individual to join the Data Center Operations Team. The person should enjoy working in a fast-paced environment where adaptability and flexibility are key to success. This position is full-time and will be based in North Huntsville, Alabama.
We seek an IT professional with management and leadership experience and advanced hands-on technical skills in Server Hardware, Project Management, Quality Management, Data Analytics, Networks, OS repair, Linux and Automation (ideally in a data center environment). Deep and broad knowledge of managing servers in a large-scale distributed environment is a core competency for this role.
The Production Operations Manager is responsible for managing and maintaining server production including uptime, utilization, systemic technical issues and repairs throughout the Data Center.
Establishing and managing a Data Center Operations Team responsible for the maintenance and operation of server hardware and supporting infrastructure at scale
Responsible for the health of server capacity delivering Facebook’s products and services from the data center site, and for ensuring operational delivery through collaboration and partnership with both remote and local peer organizations
Work with peer organizations and regional teams that affect and deliver services to data center operations such as network operations, project management, facilities/maintenance management, logistics, hardware design, automated tooling and supply chain operations in order to successfully maintain data center capacity to support ongoing business growth
Mentoring and developing engineers and technicians such that they can run daily operations with minimal supervision
Build and lead a diverse, world-class data center operations team, developing both the technical capabilities and leadership qualities of engineers and technicians
Collaborating with other Production Operations Managers in data center sites around the globe to evolve and optimize processes and approaches in a globally consistent way to allow Facebook to scale and grow effectively
Creating and driving a culture of ownership, innovation, collaboration, accountability, and safety. Support and contribute thought leadership to the development and implementation of business practices, processes, and automated tooling that support the growth and ongoing management of our global data center IT footprint
Manage server upgrades, integration, automated OS provisioning process, rebuilds and other projects as required. Understand and debug network, hardware, and Linux OS related issues
Identify and support the creation of documentation for the global DC knowledge base. Implement process improvements and inform best practices in data center operations
Predict data center growth and scaling issues before they occur and implement solutions. Maintain a deep understanding and ownership of a hyper-scale computing fleet, using data trending and analysis to identify systemic issues and reporting out globally as required
Drive specifications for tooling and automation that facilitate deployment, monitoring, automated remediation and decommissioning of server hardware at scale
BS, BA or BEng in a technical field or commensurate experience
4+ years of experience managing 5+ technical resources
Knowledge of Linux and hardware systems support in an Internet operations environment
Knowledge of Python, SQL, and/or shell scripting
2+ years of experience managing multiple concurrent projects
Knowledge of enterprise level infrastructure
Knowledge of out-of-band/lights-out server communication methods, such as IPMI and serial console
Project management experience
Experience training, mentoring, and leading other engineers and technicians
4+ years of experience in large-scale data center hardware deployments and building scalable infrastructure
Equal Opportunity: Facebook is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Facebook is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at firstname.lastname@example.org.