ECS is seeking a DevOps Manager to work in our Fairfax, VA office (Hybrid). Please Note: This position is contingent upon contract award.
Job Description:
ECS is seeking talented professionals who love a challenge to join us in building the next-generation Continuous Diagnostics and Mitigation (CDM) cyber data solution. The CDM Program is the Cybersecurity and Infrastructure Security Agency's (CISA) dynamic approach to strengthening the cybersecurity of Federal networks and systems through better awareness and visibility into their security posture and cyber threats. ECS is responsible for designing, building, deploying, operating, and maintaining a complete 'Data Services' solution that includes the collection, normalization, visualization, and sharing of cyber data from more than 100 Federal agencies. The CDM Data Services product is a cloud-hosted solution composed of multiple commercial off-the-shelf (COTS) products, software configuration packages, and custom code that work together as an integrated solution tailored to meet DHS requirements.
We are seeking professionals who thrive in a dynamic, fast-paced, and highly collaborative environment where problem-solving, critical thinking, and a holistic approach to serving the mission are key. Our program operates within the Scaled Agile Framework (SAFe). An aptitude and enthusiasm for continuous learning, improvement, and cybersecurity are a must!
The DevOps Manager role encompasses two closely related disciplines for our program: Release Engineering and Site Reliability Engineering. Release Engineering is accountable for producing a repeatable process for building and deploying solutions. Site Reliability Engineering ensures the reliability, availability, and performance of our critical production environments. The successful candidate will work closely with development and operations teams to implement best practices in DevOps, automate infrastructure, and maintain scalable and resilient systems.
The successful candidate will design, implement, and maintain systems that are resilient, highly available, and performant. They will set up comprehensive monitoring and logging systems using the Elastic Stack, Prometheus, Grafana, and other tools to ensure the continuous performance of services. Additionally, they will respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence.
The DevOps Manager will be responsible for developing and managing infrastructure as code (IaC) using tools like Terraform and CloudFormation. They will also design and implement CI/CD pipelines to enable reliable and repeatable processes for building, packaging, releasing, and deploying software. This work requires close collaboration with software engineers to integrate reliability and observability into the software development lifecycle.
Continuous improvement is a key focus: the DevOps Manager identifies areas for enhancement and drives initiatives to improve system reliability, scalability, and efficiency. Responsibilities include creating and maintaining detailed documentation, providing training to team members on reliability best practices, and ensuring that the team is well-equipped to maintain the high standards set for system performance and reliability. Ensuring that systems adhere to security policies and compliance requirements is also crucial.
Leadership and team management are core aspects of this role. The DevOps Manager will lead, mentor, and manage a team of DevOps Engineers and Site Reliability Engineers, fostering a culture of continuous improvement and professional growth. Regular performance evaluations, constructive feedback, and career development support for team members are essential.
Required Skills: