Senior Support Engineer (T3)
Description
This is the senior-most individual contributor support role. As the T3 escalation owner, you sit at the intersection of field support, solution engineering, and R&D. You own the hardest cases — those that require deep cross-domain expertise across Kubernetes edge deployments, enterprise network security, Linux infrastructure, and compliance-driven environments.
You won’t just close tickets. You’ll drive architectural resolution, shape playbooks, influence product direction, and be the technical authority customers trust when things go wrong at 2 AM. You’ll work closely with utility and telecom customers, including those in regulated environments governed by NERC CIP, IEC 62443, and SOC 2, and engage directly with vendors and system integrators to unblock delivery.
This is a role for someone who is equally at home in a Kubernetes operator log, a network packet trace, a firewall policy audit, and a customer’s war room.
WHAT YOU'LL DO
Escalation & Incident Ownership
· Own all T3 escalations end-to-end: triage, reproduction, root cause isolation, fix or workaround, and full closure documentation
· Lead war-room bridges for critical customer incidents across complex, multi-party environments involving OEMs, SIs, and customer NOCs
· Deliver Root Cause Analyses (RCAs) within SLA, with actionable preventive measures and clear customer-facing summaries
· Act as the technical backstop for T1/T2 engineers — validate hypotheses, unblock complex cases, and transfer knowledge back down the stack
Kubernetes & Edge Infrastructure
· Operate and troubleshoot production Kubernetes deployments across distributions including k3s, OCP (OpenShift Container Platform), and MicroShift — including air-gapped, edge-deployed nodes in constrained environments
· Own the full Kubernetes lifecycle: workload scheduling, rolling upgrades, rollback strategies, resource management, and cluster health across heterogeneous node types
· Debug storage layer issues: CSI driver failures, persistent volume lifecycle (provisioning, expansion, degradation, recovery), and stateful workload failures across databases and message brokers
· Troubleshoot ingress and internal service networking: layer 7 routing, load balancer behavior, DNS resolution, and certificate management across multi-cluster topologies
· Manage container runtime behavior: image lifecycle, pull policy, multi-architecture manifests, and registry integration in air-gapped and connected environments
· Harden Kubernetes workloads: security context constraints, SELinux policy integration, EDR coexistence, and least-privilege pod configuration
· Support bare-metal provisioning pipelines for edge nodes: OS deployment, storage configuration, and post-install automation across RHEL-family/Ubuntu and mixed-OS environments
· Operate and troubleshoot workloads deployed across hybrid topologies spanning on-premises edge clusters and cloud environments (AWS, GCP); understand core cloud primitives such as VPCs, security groups, IAM, and managed Kubernetes services
Network Security & Infrastructure
· Troubleshoot enterprise firewall deployments (Palo Alto, Check Point, Fortinet): policy evaluation, dynamic identity-based groups, API-driven automation, and traffic visibility gaps
· Own IP networking escalations across the stack: routing protocol behavior (OSPF/BGP), NAT traversal, VLAN segmentation, MTU and fragmentation issues, and inter-zone traffic flows
· Support NAC and identity-aware access control integrations: Cisco ISE, FMC policy synchronization, and device compliance enforcement
· Configure and troubleshoot network traffic capture infrastructure: TAP interfaces, GRE and ERSPAN tunneling, and SPAN/mirror port setups for passive visibility into edge and core traffic
· Troubleshoot MPLS and L2 circuit configurations, overlay networking, and WAN path behavior in hybrid edge/cloud topologies
· Operate declarative OS-level network configuration tooling across mixed-OS node pools in multi-cluster environments
Security & Secrets Management
· Support production-grade secrets management deployments: HashiCorp Vault and External Secrets Operator (ESO) — including unsealing, policy management, and dynamic credential rotation
· Troubleshoot hardware-backed secrets sealing on air-gapped edge nodes using TPM 2.0 and associated tooling
· Work with encrypted object stores and audit logging pipelines tied to compliance mandates
· Support enterprise IdP integrations: Keycloak realm administration, OAuth2 proxy configurations, LDAP/AD federation, and SSO flows
· Engage with compliance-sensitive customers on NERC CIP-007, SOC 2, IEC 62443, and NIST SP 800-53 requirements, including producing audit evidence, access control documentation, and control gap analyses
Observability, Telemetry & Documentation
· Operate and extend OTEL collector pipelines for host-level and application metrics; configure exporters, receivers, and processors across edge and cloud environments
· Build and maintain Prometheus/Grafana alert stacks for edge Kubernetes infrastructure and network components
· Use structured log aggregation platforms (ELK/OpenSearch) to correlate events across distributed, multi-vendor deployments
· Own support content creation: author technical how-to guides, configuration references, troubleshooting trees, and customer-facing documentation for complex, multi-component features
· Maintain and continuously improve the internal knowledge base — ensuring articles stay accurate as the product evolves and new deployment patterns emerge
· Develop playbooks and runbooks that reduce MTTR and enable T1/T2 self-service resolution
Private Cellular — Working Knowledge
· Understand private LTE/5G network topology well enough to scope, triage, and route cellular-related incidents: UE connectivity issues, core component health, and RAN integration behavior
· Support SIM and device identity workflows: provisioning pipelines, lifecycle management, and integration with network access control
· Coordinate with carriers and RAN vendors on interoperability issues; escalate with appropriate technical context
Customer & Cross-Functional Engagement
· Serve as the senior technical point of contact for strategic enterprise customers in utility and telecom verticals
· Deliver crisp, technically authoritative briefings to R&D and Product: reproduction steps, log and packet capture artifacts, relevant configurations, blast radius assessment, and recommended next actions
· Participate in pre-sales and solution engineering engagements as the technical authority on deployment complexity and supportability
· Feed structured bug reports, usability gaps, and field-observed patterns into product and engineering backlogs
Requirements
MANDATORY QUALIFICATIONS
· Kubernetes operations: Production-grade experience with k3s, OCP, or upstream Kubernetes — workload management, CSI, CNI, ingress, upgrades, rollbacks, and security hardening
· Linux systems: Deep RHEL/Rocky/Ubuntu fluency — systemd, SELinux, OS networking stack, storage subsystems, and bare-metal provisioning
· Network security: Hands-on with enterprise firewalls (Palo Alto, Check Point), NAC solutions (Cisco ISE), and IP routing (OSPF/BGP, NAT, VLANs)
· IP networking: Strong protocol-level understanding — routing, switching, overlay networks, DNS, and packet-level analysis using traffic capture tools
· Version control: Working proficiency with Git — branching, pull requests, conflict resolution, and using repositories as the source of truth for configuration, scripts, and documentation
· Customer-facing delivery: Proven experience owning customer escalations, delivering RCAs, and coordinating multi-party resolutions in regulated or mission-critical environments
· Compliance exposure: Working knowledge of at least one of NERC CIP, IEC 62443, SOC 2, or NIST SP 800-53
NICE-TO-HAVE
· Private cellular basics: Sufficient familiarity with LTE/5G core concepts to triage incidents and coordinate with telecom stakeholders
· Secrets management: HashiCorp Vault, OpenBao, CyberArk Conjur, External Secrets Operator
· Cloud platforms (AWS, GCP): Managed Kubernetes, VPC networking, IAM, object storage, and hybrid connectivity
· TAP & tunneling: GRE/ERSPAN configuration and passive traffic capture infrastructure
· Observability tooling: OTEL collector pipeline design, Grafana Alloy, Prometheus alerting
· NAC integrations: Cisco ISE/FMC, pxGrid 2.0
· Air-gapped deployments: OpenShift, k3s, or MicroShift in disconnected environments
· Automation & scripting: Bash, Python
· Event-driven messaging: NATS, Kafka on Kubernetes
· TPM 2.0-based secrets sealing: Hardware-backed unsealing for air-gapped nodes
· eSIM/eUICC: Provisioning standards or SIM lifecycle management exposure