Be wary of WhatsApp messages impersonating Jobline Resources staff and offering job opportunities. Anyone who receives a suspicious message can contact Jobline at +65 6339 7198.

Responsibilities

Own and evolve the Datadog-based observability platform (collection, pipelines, analytics, alerting, dashboards, and SLOs) to deliver real-time visibility and faster incident response.

As a secondary capability, apply asset-discovery knowledge to publish high-quality discovery feeds and support the CMDB team with accurate, timely inventory data. This is a Tier-0 role; administration is performed by full-time employees (FTEs) only.

1)  Datadog platform engineering (primary)
  • Operate Datadog orgs/projects, RBAC, log pipelines/indexes/archives, metrics, traces/APM, Synthetics, RUM, and DBM at enterprise scale.
  • Drive tagging standards and ownership metadata to enable service-aligned dashboards and alert routing.
  • Optimize cost/performance (sampling, routing, tiering/archives, retention, metric cardinality).

2)  Monitoring-as-Code (MaC) & CI/CD (primary)
  • Define monitors, dashboards, SLOs, synthetics, notebooks, service catalog entries, and RBAC as code using Terraform/OpenTofu (Datadog provider) and datadog-ci.
  • Build gated pipelines: linting, query/unit tests, cost/volume guardrails, PII/residency checks, drift detection, and promotion (dev → staging → prod) with automated rollback.
  • Maintain change evidence (who/what/when), versioning, and approvals; rotate tokens/secrets via vault.

3)  Telemetry ingestion & data quality (primary)
  • Engineer unified ingest via Datadog Agent, APIs, and gateways; integrate OpenTelemetry where appropriate.
  • Enforce schema contracts and mandatory tags (e.g., service, env, tier, owner, cost_center); implement validation, deduplication, lineage, and freshness checks.

4)  Asset discovery support for CMDB (secondary)
  • Apply discovery expertise across datacenter/VM, containers/K8s, multi-cloud (AWS/Azure/GCP), network devices, endpoints, and key SaaS.
  • Publish curated discovery feeds (coverage, freshness, deltas) and support reconciliation/exception workflows.

Requirements

  • 6–10+ years in Observability/SRE/Platform Engineering; deep, hands-on expertise with Datadog (logs, metrics, traces/APM, Synthetics, RUM, DBM).
  • Proven Monitoring-as-Code experience with Terraform/OpenTofu (Datadog provider) and datadog-ci; strong Git/GitOps and CI/CD skills (e.g., GitHub Actions/Azure DevOps).
  • Automation proficiency (Python/PowerShell); YAML/JSON schema design; API integration.
  • Experience with tagging schemes, schema/version management, lineage, and cost governance.
  • Exposure to asset discovery patterns and how discovery feeds support CMDB reconciliation.
  • Comfortable operating Tier-0 platforms with audit rigor.

Preferred qualifications
 
  • Mixed-estate exposure (on-prem/VMware, K8s, AWS/Azure/GCP, network, endpoints, SaaS).
  • Building self-service onboarding patterns (API/CLI/portal) with policy gates.
  • SLO/burn-rate alerting and service catalog adoption at scale.