Job description
<p style="min-height:1.5em">Our mission is to automate coding. The first step in our journey is to build the best tool for professional programmers, using a combination of inventive research, design, and engineering. Our organization is very flat, and our team is small and talent dense. We particularly like people who are truth-seeking, passionate, and creative. We enjoy spirited debate, crazy ideas, and shipping code.</p><h2><strong>About the Role</strong></h2><p style="min-height:1.5em">Cursor ships daily. Every release leaves signals behind: telemetry, prompts, completions, agent runs, sessions. Those signals power model improvement, evals, and experimentation. Data infrastructure is what turns them into something teams can trust. </p><p style="min-height:1.5em"></p><p style="min-height:1.5em">A lot of systems here started simple so we could move fast. Over time, the constraints change and the “good enough” version becomes the bottleneck. This role owns the full ladder: patch what should be patched, redesign what should be redesigned, ship the replacement, and operate it. </p><p style="min-height:1.5em"></p><p style="min-height:1.5em">Privacy guarantees are part of correctness. What we can retain and use depends on Privacy Mode and org configuration, and getting that wrong breaks a product promise. We choose work by business impact: what blocks product and model teams today, and what will block them next month.</p><p style="min-height:1.5em"></p><p style="min-height:1.5em"><strong>Sample projects include...</strong></p><ul style="min-height:1.5em"><li><p style="min-height:1.5em">A core pipeline started as a pragmatic reuse of infrastructure built for something else. It works, but it cannot guarantee properties downstream consumers now need (for example, point-in-time consistency). You design and ship the replacement while keeping the existing system running.</p></li><li><p style="min-height:1.5em">A new product surface ships without instrumentation. 
You talk to the team, define what needs to be captured, and wire it through before the absence becomes anyone else’s problem.</p></li><li><p style="min-height:1.5em">Eval coverage drops. You trace it to an instrumentation gap introduced weeks ago by a product change nobody flagged. You fix the gap, add a contract so it cannot recur, and ship the dashboard that would have caught it earlier.</p></li><li><p style="min-height:1.5em">Multiple consumers depend on overlapping data. You design schema evolution and validation so changes in one place do not silently degrade the others.</p></li><li><p style="min-height:1.5em">Storage costs rise faster than usage. You decide what is worth keeping, implement retention and compression, and delete what is not.</p></li></ul><p style="min-height:1.5em"></p><h2><strong>What we're looking for</strong></h2><p style="min-height:1.5em">We’re looking for someone who has built real systems at scale and cares about correctness, cost, and ergonomics.</p><p style="min-height:1.5em"></p><p style="min-height:1.5em">Strong signals include:</p><ul style="min-height:1.5em"><li><p style="min-height:1.5em">Deep experience with Spark (Databricks or open-source Spark both count)</p></li><li><p style="min-height:1.5em">Production experience with Ray Data</p></li><li><p style="min-height:1.5em">Hands-on ownership of large data pipelines and storage systems</p></li><li><p style="min-height:1.5em">Comfort debugging performance issues across client instrumentation, streaming, storage, and model-facing workflows, as well as the compute, storage, and networking layers</p></li><li><p style="min-height:1.5em">Clear thinking about data modeling and long-term maintainability</p></li><li><p style="min-height:1.5em">Good judgment about when to patch and when to rebuild</p></li></ul><p style="min-height:1.5em"></p><p style="min-height:1.5em">Nice to have:</p><ul style="min-height:1.5em"><li><p style="min-height:1.5em">Experience running or scaling 
ClickHouse</p></li><li><p style="min-height:1.5em">Familiarity with dbt, Dagster, or similar orchestration and modeling tools</p></li></ul><p style="min-height:1.5em">We're in-person with cozy offices in North Beach, San Francisco and Manhattan, New York, replete with well-stocked libraries.</p><p style="min-height:1.5em"></p><h2>Applying</h2><p style="min-height:1.5em">If there appears to be a fit, we'll reach out to schedule 2-3 short technical interviews. Afterward, we'll schedule an onsite in our office, where you'll work on a small project, discuss ideas, and meet the team.</p><p style="min-height:1.5em"></p>