What You"ll Do
Design and implement solutions that enhance application reliability, performance, scalability, and resilience.
Build and maintain monitoring, alerting, observability, and telemetry to drive proactive detection and incident analysis.
Lead incident management efforts, perform root cause analysis, and implement action based improvements.
Implement operational workflows using scripting, IaaC, and configuration management tools.
Manage capacity, performance, and scaling solutions to forecast demand and optimize infrastructure.
Collaborate with engineering teams to embed operability, resilience, and security into application architectures.
Build and automate reliable deployments through CI/CD pipelines, release governance, and version control systems.
Maintain clear runbooks, architecture diagrams, and operational documentation that enable efficient production support.
Experience Required
Managing Kubernetes and containerized workloads (EKS, AKS, GKE), including scaling, networking, upgrades, and monitoring.
Experience with cloud platforms (AWS, Azure, or Google Cloud Platform) across compute, storage, networking, IAM, and cost governance.
Using observability and APM tools such as Dynatrace, Splunk, Prometheus, Grafana, Datadog, Elastic/ELK.
Strengthening security and compliance controls in regulated environments (e.g., PCI DSS, SOC 2), including secure management of workloads.
Infrastructure automation experience using Terraform, CloudFormation, Ansible, or similar tools.
Designing and maintaining CI/CD pipelines using Jenkins, GitLab CI, GitHub Actions, or Azure DevOps.
Scripting and automation using Bash, PowerShell, or Python.
Experience in environments of electricity, engineering, or military related background (preferred).
Good to Have
Certifications such as AWS SysAdmin, AWS DevOps Engineer, Google Cloud DevOps Engineer, or CKA.
Experience with legacy applications, IBM iSeries, and/or library systems.
Hands on database operations and performance tuning (Oracle, SQL Server, PostgreSQL).
Prior experience as a major incident commander, stakeholder communicator, or ops lead/coordinator.
Experience with ITIL and ServiceNow (change, incident, and configuration management).
...healthcare provider committed to delivering compassionate, cutting-edge medical care. We are looking for a dedicated and skilled Radiation Oncologist to join our multidisciplinary team, which is known for excellence in oncology services and comprehensive patient care....
...seeking an experienced and hands-on Logistics Supervisor to lead and manage all daily receiving and/or shipping... ..., offsite warehouse, packaging supplies and control of inventory. Directly... ...Associate's or Bachelor's degree in Supply Chain Management, Logistics, or a related...
...manufacturing, join us! Axion Ray is looking for a driven Senior Technical Recruiter to partner with our VP of Engineering in building a world-... ...Strategic Partner: Collaborate with hiring managers and executive leadership to calibrate on talent needs, provide insights,...
...behind the world's most recognizable car battery brands, powering vehicles from leading automakers like Ford, General Motors, Toyota, Honda, and Nissan. With 18,000 employees worldwide, we develop, manufacture, and distribute energy storage solutions while recovering,...
...Capital Development has an open position for a full time Senior Financial and Accounting Analyst. This individual must have good understanding of solar or construction industry and be very comfortable working in a dynamic and demanding environment with challenges and...