Data Engineer with Expert-level SQL

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Onsite: New York

Job Description

Key Responsibilities

Pipeline Engineering
• Design and maintain high-throughput ingestion pipelines for transaction signals, behavioral events, and third-party identity graphs, including LiveRamp RampID, UID2, GCLID chains, and household device graphs
• Implement identity resolution logic at scale: deterministic matching, probabilistic graph construction, and household- and device-level cluster assembly across 1B+ data points
• Build and maintain data clean room connectors and privacy-preserving data exchange pipelines (AWS Clean Rooms, LiveRamp DCR, Google ADH, or equivalent)
• Develop integrations between activation platforms (email, CDP, DSP) and the identity graph layer, supporting real-time audience push and match-rate monitoring

Data Modeling & Quality
• Design medallion-architecture or equivalent data models optimized for cohort-level LTV/CAC attribution and multi-touch attribution across owned, paid, and clean room channels
• Build automated QC and reconciliation frameworks (deduplication, compliance validation, and data lineage tracking) capable of reducing manual reconciliation cycles from weeks to hours
• Implement PII governance controls at the pipeline layer: redacted ID egress, consent signal propagation, and guardrail validation aligned to GLBA, Fair Lending, UDAAP, and TCPA/CAN-SPAM

Platform Integration
• Integrate LLM-based APIs (e.g., Anthropic Claude, OpenAI, Vertex AI) for AI-powered signal enrichment, audience brief generation, and compliance pre-screening within pipeline workflows
• Build serverless microservices and API bridge layers connecting clean room outputs to activation destinations, using any major serverless or edge compute platform
• Maintain and evolve authentication, email notification, and managed database services supporting platform-facing APIs and client-facing tooling

Required Qualifications
• 5+ years of data engineering experience
• Expert-level SQL across at least one major cloud data warehouse: Snowflake, Google BigQuery, Amazon Redshift, or Azure Synapse
• Proficiency in Python for pipeline development, transformation logic, and data quality automation
• Hands-on experience with at least one clean room technology: AWS Clean Rooms, LiveRamp DCR, Google ADH, InfoSum, or an equivalent privacy-preserving data collaboration platform
• Deep understanding of identity resolution concepts: deterministic matching, probabilistic graph construction, household-level aggregation, and device graph assembly
• Strong PII governance knowledge: data residency, consent frameworks, and financial services regulatory requirements (GLBA, Fair Lending, UDAAP)
• Experience integrating with DSPs, CDPs, or marketing activation platforms at the data layer
• Ability to operate in client-facing consulting delivery contexts, translating business requirements into technical specifications

Preferred Qualifications
• Experience with graph database technologies (Neo4j, Amazon Neptune, or TigerGraph) for identity graph storage and traversal
• Familiarity with LiveRamp Embedded Identity, UID2 token handling, or walled-garden attribution integrations (Google ADH, Meta CAPI, Amazon Attribution)
• Working knowledge of LLM APIs for structured data enrichment and AI-assisted pipeline workflows