About the Job
Role: Data Engineer, Enterprise Data, Analytics & Innovation, Digital Innovation
Are you passionate about building scalable data infrastructure and enabling innovation through engineering excellence? As a Data Engineer at Vaniam Group, you will own and evolve the foundation of our data systems, ensuring reliability, scalability, and accessibility across our lakehouse and transactional environments. This role sits at the intersection of engineering and innovation, supporting today’s needs while building the foundation for tomorrow’s products.
A Day in the Life
Lakehouse & Pipelines
- Design, build, and operate reliable ETL/ELT pipelines in Python and SQL
- Manage Bronze, Silver, and Gold layers of the Medallion architecture
- Maintain seamless ingestion from MySQL systems into Vaniam Core
- Implement observability, data quality checks, and lineage tracking
Data Modeling & Governance
- Design schemas, tables, and views optimized for analytics, APIs, and products
- Enforce security, privacy, and compliance standards
- Maintain documentation for datasets, pipelines, and processes
Integration of New Data Sources
- Lead integration of third-party, client, and product-generated datasets
- Harmonize and normalize diverse data (scientific, engagement, operational)
- Build repeatable onboarding processes for new data streams
Analytics & Predictive Tools
- Collaborate with innovation, data science, and AI teams
- Support dashboards, APIs, and decision-support tools
- Ensure pipelines meet modeling and deployment needs
Reliability & Optimization
- Monitor execution, storage, and cluster performance
- Troubleshoot and resolve data pipeline issues
- Contribute to CI/CD practices and code reviews
What You Must Have
Education & Experience
- 5+ years in data engineering or ETL roles
- Strong Python & SQL proficiency
- Experience with lakehouse platforms & Medallion architectures
Skills & Competencies
- Spark or PySpark
- Workflow orchestration (Airflow, dbt, etc.)
- Observability/testing frameworks
- Docker & Git-based version control
- Excellent communication & collaboration skills
Nice to Have (Not Required)
- Experience with Databricks & Microsoft Azure
- Knowledge of Delta Lake & data catalogs
- Healthcare, scientific, or engagement data experience
- Experience exposing analytics via APIs or microservices
The Team
You will work closely with data science, AI, product, and innovation teams to prototype and productionize analytics solutions. Together, you’ll transform raw data into client-ready insights that drive measurable impact in oncology and hematology communications.
Why You’ll Love Us
- 100% remote-first environment with local meet-ups
- Positive, diverse, and supportive culture
- Mission-driven: serving clients in cancer and blood diseases
- Growth & learning via Vaniam Group University
- Competitive benefits: medical, dental, vision, 401(k) match, parental leave
- Work-life balance: Flexible Time Off & Volunteer Time Off
- Wellness perks: virtual workouts, discounts, EAP access
Salary Range: $110,000 – $125,000 (based on experience, skills, and location)
About Vaniam Group
Founded in 2007, Vaniam Group is a people-first, purpose-driven network of healthcare and scientific communications agencies. We partner with biopharmaceutical companies to unlock the full potential of oncology and hematology innovations. As a virtual-by-design organization, our global team brings expertise and passion to every project. Learn more: www.VaniamGroup.com.