Helfie is on a mission to redefine global preventative healthcare. We’re pioneering an entirely new operating system for human health – empowering billions with instant, affordable health checks and AI-driven insights. Our vision is to build trust, drive organic engagement, and make preventative healthcare a reality for all 8 billion of us.
Join us as we revolutionise the healthcare landscape and make early detection and prevention accessible to everyone.
About the role:
We are seeking a highly skilled, hands‑on Data Engineer to join our Melbourne team and drive the technical delivery of highly scalable distributed solutions for Helfie.ai, our AI‑powered preventative‑healthcare platform.
Position Overview
The Data Engineer will play a pivotal role in realising Helfie.ai's vision of a globally‑scalable preventative‑healthcare platform. In this position, you will be responsible for the architecture, implementation, and maintenance of end‑to‑end data pipelines that support real‑time analytics, AI insights, and robust reporting across mobile, web, and backend services.
You will collaborate closely with cross‑functional teams—including product, engineering, and DevOps—to ensure that data flows are reliable, secure, and compliant with the company’s stringent governance standards. This role is critical for turning raw telemetry and health‑related data into actionable intelligence that drives product innovation and customer success.
Key Responsibilities
- Data Pipeline Design & Implementation – Engineer scalable, fault‑tolerant ETL/ELT workflows that ingest, transform, and load data from diverse sources (mobile app events, web endpoints, microservice logs, external APIs) into consumable formats for downstream analytics, dashboards, and AI models.
- Infrastructure as Code – Build and manage data‑centric infrastructure (Kubernetes‑deployable jobs, storage clusters, messaging queues) using Terraform, Docker, and Helm, aligning with the organisation's cloud‑agnostic approach.
- Observability & Monitoring – Integrate monitoring, logging, and telemetry (Prometheus, Grafana, ELK) to surface pipeline health, performance bottlenecks, and data‑quality metrics, and create automated alerts for operational issues.
- Security & Compliance – Enforce data‑level encryption, access controls, and audit trails in line with the company's DevSecOps practices; collaborate with the security team to ensure compliance with privacy regulations (e.g., GDPR, HIPAA).
- Data Modelling & Governance – Define relational and graph schemas for essential domains (patient records, device telemetry, AI model inputs/outputs), and maintain a living data catalogue that supports self‑service analytics.
- Performance Optimisation – Profile and tune data transfers, aggregate queries, and batch jobs; implement partitioning, indexing, and caching strategies to meet real‑time dashboard and AI inference requirements.
- Collaboration & Knowledge Transfer – Partner with product owners to translate feature requests into data artefacts, and mentor junior engineers on best practices for data engineering, DevOps, and cloud operations.
- Continuous Improvement – Stay abreast of emerging data technologies, evaluate prototypes, and iterate on the data architecture to support new product features and increased data velocity.
Required Skills & Experience
- 5+ years of professional experience as a data engineer or related role in distributed systems.
- Proven expertise in building pipelines with Spark/Beam, Flink, or equivalent big‑data frameworks; hands‑on experience with Kafka, Pulsar, or similar event‑streaming platforms.
- Strong scripting proficiency in Python, Bash, and/or PowerShell; familiarity with DevOps toolchains (Git, CI/CD, container orchestration).
- Demonstrated ability to design and deploy Kubernetes workloads at scale, including job orchestration and resource management.
- Proficiency with IaC (Terraform, Pulumi) and automation of cloud resources; experience with both on‑prem and multi‑cloud environments.
- Solid understanding of data governance principles, including data lineage, provenance, and privacy controls.
- Experience with monitoring/observability stacks (Prometheus, Grafana, ELK) and automated alerting.
- Excellent communication skills with the ability to translate complex data concepts to technical and non‑technical stakeholders.
- Mentoring mindset and a collaborative approach to cross‑functional problem solving.
Preferred Experience
- Advanced knowledge of graph databases (Neo4j, JanusGraph, GraphDB) or real‑time analytics platforms (Apache Flink, Snowflake).
- Familiarity with machine‑learning pipelines and integration of AI models into production workflows.
- Regulatory exposure (healthcare, finance) that necessitates strict auditability and compliance tracking.
- Prior exposure to building data lakes or lakehouse architectures on Amazon S3, Azure Data Lake, or equivalent.
- Experience leading a small engineering squad or acting as a technical lead in a cloud‑native environment.
- Contributions to open‑source projects are viewed favourably.
- Experience working in Agile product teams with a heavy emphasis on DevOps automation, continuous delivery, and metric‑driven decision making.
- Comfort thriving in an inclusive, fast‑paced culture that rewards ownership, experimentation, and continuous learning.
Equal Opportunity Statement
Helfie.ai is an Equal Opportunity Employer. We will not discriminate on the basis of age, disability, sex, race, religion or belief, gender reassignment, marriage or civil partnership, pregnancy, or sexual orientation. We encourage applications from a wide range of candidates, and selection for roles will be based on individual merit alone.