About Us:
At Predictive Sales AI (PSAI), we’re redefining how technology and intelligence transform digital marketing. Our AI-powered software enables home services businesses to make smarter, faster decisions—fueling growth through automation, prediction, and precision.
We are seeking a Data Science Engineer with strong data engineering and MLOps expertise to build scalable, production-grade ML and data platforms that directly impact customer growth and retention.
Job Overview:
As a Data Science Engineer, you will design and operate the data and machine learning foundations behind PSAI’s predictive products. You will build scalable pipelines and robust warehouse/lakehouse models across CRM, marketing, product events, and external datasets, ensuring reliability, accuracy, and business continuity at scale.
This role requires:
- 4+ years in data-centric engineering
- Proven experience deploying ML models via pipelines
- Deep expertise in Python, SQL, and Azure infrastructure
- Architectural ownership through data contracts and resilient modeling
Key Responsibilities:
- Build scalable batch and near-real-time ingestion pipelines using Azure Data Factory, APIs, event streams, and external connectors.
- Develop ML-ready datasets across CRM, marketing automation platforms, product telemetry, and geospatial data sources.
- Design performant, well-modeled warehouse/lakehouse systems in Azure Synapse or Databricks.
- Train and deploy predictive models (lead scoring, churn prediction, forecasting) through reproducible pipelines.
- Build time-aware, leakage-resistant feature pipelines for production ML use cases.
- Support full MLOps lifecycle using Azure Machine Learning, including experiment tracking, model registry, and deployment.
- Implement automated validation, anomaly detection, reconciliation, and monitoring for pipelines and warehouse models.
- Design and enforce data contracts to prevent upstream schema changes from breaking downstream ML workflows.
- Own pipeline SLAs, alerting, incident response, and durable improvements through postmortems.
- Optimize processing for very large datasets (>100GB) through partitioning, incremental loads, distributed compute, and query tuning.
- Improve cost efficiency across compute/storage in Azure environments.
- Maintain clean, testable, production-ready Python codebases using:
  - Object-oriented patterns
  - Type hinting
  - CI/CD workflows via Azure DevOps
- Package models and pipelines using Docker for consistent deployment across dev/staging/prod.
- Communicate architectural trade-offs and technical debt in business terms to Product, RevOps, and leadership.
- Partner with Engineering on instrumentation and scalable data integration.
- Mentor junior engineers through pairing, code reviews, and documentation best practices.
Desired Traits:
We are looking for an individual who is organized, proactive, and detail-oriented. In this role, you will work closely with teams across the company. In particular, we value:
- Ownership mindset with a reliability-first approach
- Strong SQL and Python skills with high attention to data quality
- Thoughtful system scaling (performance- and cost-aware, maintainable designs)
- Collaborative communication across engineering, RevOps, and analytics
- Clear documentation and support for others through reviews and mentorship
Required Skills and Experience:
- Master’s degree in Data Science, Computer Science, Statistics, Engineering, or a closely related quantitative field preferred.
- 4+ years in data engineering, ML engineering, or data platform development.
- Minimum 2 years deploying ML models into production workflows.
- Experience building pipelines and warehouse systems at scale (>100GB datasets).
- Demonstrated adaptability in fast-changing technical and business environments.
- Python (Expert): pandas, polars, scikit-learn; PyTorch, transformers; production engineering (OOP, testing, typing)
- SQL (Expert): advanced analytics, recursive CTEs, query tuning, Azure Synapse optimization
- Azure Data & ML Stack: Data Factory (ETL/ELT), Azure ML (MLOps), Key Vault, Databricks/Spark, Docker deployment
- Distributed & Large-Scale Compute: Spark, Ray, Dask; GPU acceleration with RAPIDS (a plus)
- Geospatial & Specialized Data: GeoPandas, Shapely, rasterio
- AI Automation & LLMs: LangChain/Semantic Kernel, agentic workflows
- DevOps & CI/CD: Azure DevOps pipelines, Gitflow, rebasing, clean version control
Why Join Us?
- Innovative Environment: Be part of a forward-thinking company that values creativity and encourages the exploration of new ideas.
- Professional Growth: Access opportunities for continuous learning and career advancement within a supportive and dynamic team.
- Comprehensive Benefits: Enjoy a competitive salary, performance-based bonuses, flexible work arrangements, and a robust benefits package.
- Collaborative Culture: Work in a team-oriented environment where collaboration and mutual respect drive our success.
If you're ready to be part of an innovative, growth-oriented team, apply today!