About Peraton
Peraton is a next-generation national security company that drives missions of consequence spanning the globe and extending to the farthest reaches of the galaxy. As the world’s leading mission capability integrator and transformative enterprise IT provider, we deliver trusted, highly differentiated solutions and technologies to protect our nation and allies. Peraton operates at the critical nexus between traditional and nontraditional threats across all domains: land, sea, space, air, and cyberspace. The company serves as a valued partner to essential government agencies and supports every branch of the U.S. armed forces. Each day, our employees do the can’t be done by solving the most daunting challenges facing our customers. Visit peraton.com to learn how we’re keeping people around the world safe and secure.
Responsibilities
We are hiring! We are looking for a Senior/Lead Cloud Architect/Engineer. This qualified individual will lead initiatives and build processes to reduce the duration, frequency, and impact of issues. They will also spend a portion of their time directing the resolution of high-visibility incidents by leading collaborative deep-dive root cause analysis efforts to establish suggested technical changes in the environment. Using data learned from those incidents, they will drive continuous improvements into the automation, tooling, and processes so that the next event is shorter or avoided entirely.
The Cloud Architect/Engineer will have the technical skills to join a diverse team of AWS Site Reliability Engineers, Automation Engineers, Testing Engineers, Network Engineers, Database Engineers, Cloud Operations Engineers, Tools Engineers, Application Developers, and Security experts.
Key job responsibilities:
- Be a key technical leader who drives the resolution of large-scale, critical, customer-impacting issues in AWS cloud-based environments, including DevSecOps, application development, networking, database, tools, and security, on a 24/7 basis.
- Lead the real-time resolution of critical technical issues and lead the follow-on root cause analysis and ongoing monitoring of the technical environment to proactively prevent potential issues.
- Facilitate customer and multi-stakeholder technical incident/bug fix conference calls, and use automation and tools during the incident to quickly establish response, diagnosis, and mitigation actions that address the event and reduce the outage or impact time.
- Understand the technical and business environment well enough to assess the true impact and risks and to quickly provide executive-level communication regarding the status of an event.
- Understand, in detail, the stakeholders engaged in supporting the enterprise and which teams or resources need to be engaged in the incident troubleshooting to quickly mitigate or resolve the issue.
- Identify and troubleshoot recurring enterprise or cloud platform issues and own initiatives to drive improvements.
- Monitor and manage communications during high-impact events via relevant channels.
- Prioritize, manage, and own emerging and developing customer issues from start to finish.
- Utilize enterprise domain expertise on monitoring and alerting tools & practices.
- Facilitate post-incident reviews to assess response effectiveness, establish success criteria and improve processes.
- Facilitate detailed technical root cause analysis sessions with the customer and multiple stakeholders to document detailed solutions to prevent similar incidents or to reduce the potential impact for subsequent incidents.
- Conduct continuous real-time proactive monitoring of customer metrics.
- Identify areas where standardization of process will aid incident response.
- Ensure standards are maintained within the technical engineering teams.
- Identify inefficiencies in team process and create plans to address them.
- Analyze data to report on issue trends, patterns and insights to inform and shape future technology strategies.
- Create and review documentation as appropriate, such as records of incident events and actions taken, root cause analyses and actions needed, and new standard procedures for mitigations.
- Participate in Agile sprints to evolve business processes and technologies and to drive optional improvements.
Qualifications
Required Qualifications:
- Minimum of 12 years of experience with a Bachelor's degree; an additional 4 years of experience may be accepted in lieu of the degree
- 10+ years of experience building/operating on AWS, networking, IT Security tools, databases and tools
- AWS Solutions Architect Certification
- Hands-on technical expertise in DevSecOps, security, automation, application development, databases, implementation, integration, and release management.
- Must have a strong, outgoing, take-charge leadership presence and communication skills to effectively work with customers at all levels of their organization. This is a key requirement of the position.
- Ability to work with Terraform for infrastructure as code solutions.
- Strong understanding of identity access management roles, permissions, etc.
- Experience using tools in an AWS Cloud environment (CloudWatch, OpenSearch) and security tools (such as Aqua, DLP, etc.).
- Strong hands-on experience with Kubernetes-based containerized applications (AWS EKS).
- Experience with GitLab and/or Jira.
- Experience with tools used for real-time monitoring, alerting, and diagnostics during incidents.
- Experience with reporting tools such as Tableau and SQL preferred
- Coding proficiency in at least one software language (e.g., Python, C, C++, Java, Ruby, or PowerShell).
- Must be a self-starter and able to execute at both a tactical and strategic level – with a strong attention to detail.
- Strong analytical acumen, solid technology experience, business judgment, and an ability to dive deep to solve complex problems.
- Ability to communicate complex technical matters clearly and concisely, both verbally and in writing.
- Ability to manage communication for critical incidents in an impromptu manner and ensure detailed technical solutions are assigned, given timelines, and actioned, resulting in timely resolutions.
- Ability to facilitate multi-stakeholder technical root cause sessions, provide a roadmap for future changes and lead the implementation of those changes.
- Experience with PaaS (Platform as a Service) environments.
- Knowledge of network administration and management principles.
- Familiarity with remote access software and protocols, including SFTP, RDP (Citrix), PCoIP/DCV (AWS Workspaces), and AWS Session Manager.
- Strong analytical skills with attention to detail in problem-solving scenarios.
- Must be a US Citizen
- Must be able to obtain and maintain a Public Trust clearance
Preferred Qualifications:
- AWS Certified Security - Specialty Certification
- Possess a passion and desire for leading the resolution of critical incidents.
Benefits:
At Peraton, our benefits are designed to help keep you at your best beyond the work you do with us daily. We’re fully committed to the growth of our employees. From fully comprehensive medical plans to tuition reimbursement, tuition assistance, and fertility treatment, we are there to support you all the way.
Target Salary Range
$135,000 - $216,000. This represents the typical salary range for this position based on experience and other factors.
EEO
An Equal Opportunity Employer including Disability/Veteran.