TL;DR

A Ukrainian CS graduate - diploma in hand, zero practical experience - sat down with Claude a week before a fintech job interview. He asked it to walk him through what the job actually required: SQL fundamentals, reconciliation logic, how a payment ops team runs day to day. He got hired. Then he used Claude Code to automate most of the job itself. This is what that looked like - where it worked, and where it hit a wall.

The gap between the diploma and the job

He graduated from one of Ukraine's top universities with a CS degree. The first year or two he was genuinely interested. Then burnout hit, and the back half of the program was survival grinding - helped along by the fact that COVID loosened all the structure and ChatGPT arrived just in time for his thesis.

What he had: a diploma. What he lacked: anything built, anything shipped, any production code. The credential cleared the HR filter at a fintech. The skills had to come from somewhere else, fast.

A week before the interview he spent several sessions with Claude running through exactly what he'd need - common questions for a data analyst role, what hiring managers actually look for, SQL edge cases, what reconciliation means in a payment ops context. He went in. He got the job. The bigger thing came after he started.

25 payment systems, 15% errors, one analyst

The company moves millions in volume weekly. His job was to keep every transaction reconciled across 25+ payment systems - banks, crypto processors, Stripe, PayPal, and smaller processors most people outside ops have never heard of. Each system exported between 1,000 and 12,000 rows per week. Every row had to match against an internal CRM. Catch refunds. Catch partial settlements. Track affiliate payouts, their fee cuts, late-settlement fines.

The CRM was old and didn't communicate properly with most of the payment systems. Before automation, the discrepancy between what the CRM reported and what actually hit the accounts ran above 15%. He was leaving the office last every night.

He asked Claude Code to write a reconciliation script. His prompt described the inputs (CSV exports with varying column formats), the matching logic (by transaction ID), the edge cases (partial refunds, currency conversion, affiliate fee deductions arriving in a separate batch), and the output format (mismatches flagged with source, expected vs. actual, likely cause).

They went back and forth over a few days. The script normalized formats across all 25 systems, matched records against the CRM, and flagged anything misaligned. A second layer handled chargebacks, chargeback reversals, currency gaps, and fee structures.
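A minimal sketch of what that matching layer could look like. The column names, the two-system `COLUMN_MAPS`, and the flag format are all hypothetical stand-ins; the real script covered 25+ export formats plus refunds, partial settlements, and fee batches:

```python
import csv
from decimal import Decimal

# Hypothetical column maps: each payment system exports different headers
# for the same three facts (transaction ID, amount, currency).
COLUMN_MAPS = {
    "stripe": {"id": "charge_id", "amount": "amount_captured", "currency": "currency"},
    "paypal": {"id": "Transaction ID", "amount": "Gross", "currency": "Currency"},
}

def normalize(path, system):
    """Read one system's CSV export into {txn_id: (amount, currency)}."""
    cols = COLUMN_MAPS[system]
    out = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            out[row[cols["id"]]] = (Decimal(row[cols["amount"]]),
                                    row[cols["currency"]].upper())
    return out

def reconcile(system_rows, crm_rows):
    """Flag mismatches: missing on either side, or amount/currency disagreement."""
    flags = []
    for txn_id, (amount, ccy) in system_rows.items():
        crm = crm_rows.get(txn_id)
        if crm is None:
            flags.append({"id": txn_id, "cause": "missing in CRM"})
        elif crm != (amount, ccy):
            flags.append({"id": txn_id, "expected": crm,
                          "actual": (amount, ccy),
                          "cause": "amount/currency mismatch"})
    # Records the CRM has but the payment system export doesn't
    for txn_id in crm_rows.keys() - system_rows.keys():
        flags.append({"id": txn_id, "cause": "missing in payment system export"})
    return flags
```

The design choice that matters is the normalization step: once every export collapses to the same `{txn_id: (amount, currency)}` shape, adding a 26th system is a dictionary entry, not new matching logic.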

The discrepancy now sits between 1% and 3%. What took 8+ hours daily now takes about an hour. He estimates roughly 35 hours a week returned to him.

He didn't write any of that code. He described what he needed and reviewed what Claude Code produced.

The arb bot - and the prompt that actually worked

After the job was under control, he found a paid tool that scanned for arbitrage opportunities between prediction markets - Polymarket, Kalshi, and others. It cost $150 a month. He decided to build the equivalent himself.

The hard problem wasn't the API pulls or the spread calculations. It was market matching. "Will Trump win Pennsylvania" on Polymarket might be phrased completely differently on Kalshi. Hundreds of thousands of pairs had to be compared, and spreads of 5-15% net of fees disappear fast if you can't identify them quickly.

Sending every pair to an LLM directly was unrealistic - latency and cost would make it useless. So before anything reached the model, he built three pre-filters that eliminated 95% of obvious mismatches:

  • Resolution date within 48 hours
  • Title + description similarity above 60%
  • Category match across a normalized taxonomy (each platform uses its own labels; Claude Code built a map that collapsed them all into a shared set)

Only pairs that cleared all three went to the LLM. The bot was finding 90 to 150 valid arb pairs a day.
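A sketch of those three filters as a single cheap gate. The field names, the `CATEGORY_MAP` entries, and the use of `difflib` for similarity are assumptions for illustration; the real pipeline's similarity measure and taxonomy aren't public:

```python
from datetime import datetime, timedelta
from difflib import SequenceMatcher

# Hypothetical normalized taxonomy: each platform's own labels
# collapse into one shared category set.
CATEGORY_MAP = {
    "us-politics": "politics", "elections": "politics",
    "crypto-prices": "crypto", "digital-assets": "crypto",
}

def similarity(a, b):
    """Cheap text similarity in [0, 1]; a stand-in for whatever the real filter used."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def passes_prefilters(m1, m2):
    """True only if a pair clears all three cheap checks before an LLM sees it."""
    # 1. Resolution dates within 48 hours of each other
    if abs(m1["resolves"] - m2["resolves"]) > timedelta(hours=48):
        return False
    # 2. Title + description similarity above 60%
    text1 = m1["title"] + " " + m1["description"]
    text2 = m2["title"] + " " + m2["description"]
    if similarity(text1, text2) <= 0.60:
        return False
    # 3. Same category after normalization (unmapped labels never match)
    cat1 = CATEGORY_MAP.get(m1["category"])
    cat2 = CATEGORY_MAP.get(m2["category"])
    return cat1 is not None and cat1 == cat2
```

The ordering is deliberate: the date check is a timestamp comparison, the text check is a string scan, and both run long before anything costs an API call.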

The interesting part wasn't the pipeline. It was how he fixed the matching prompt.

His first attempts followed the obvious path: "You are an expert analyst. Compare these two markets and decide if they are the same." The model was confidently wrong too often - matching "Will SpaceX IPO in 2025" with "Will OpenAI IPO in 2025" because both were tech IPO markets in the same year.

He flipped the default assumption. Instead of asking the model to find a match, he forced it to assume every pair was a mismatch and return is_match=true only if it could disprove its own starting position. Same model. Same data. Different default. False positives dropped to near zero.
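The exact wording of his prompt isn't public, but the structure it describes can be sketched as a template. Everything below is illustrative phrasing; only the default-to-mismatch mechanic comes from the source:

```python
# Hypothetical template illustrating the inverted default: the model starts
# from is_match=false and must disprove that position to flip it.
MATCH_PROMPT = """\
You are comparing two prediction markets. Your default position is that they
are NOT the same market. Start from is_match=false.

Market A: {market_a}
Market B: {market_b}

Only return is_match=true if you can disprove your starting position: both
markets must resolve on the same underlying event, under the same resolution
criteria, in the same timeframe. A different entity (SpaceX vs OpenAI), a
different threshold, or a different resolution date means is_match=false.

Respond with JSON: {{"is_match": <bool>, "reason": "<one sentence>"}}
"""

prompt = MATCH_PROMPT.format(
    market_a="Will Trump win Pennsylvania? (Polymarket)",
    market_b="Trump to carry PA in 2024 (Kalshi)",
)
```

The point of the structure is that "both are tech IPOs in 2025" is no longer evidence for a match; it's just a failure to disprove a mismatch, which leaves the default in place.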

This is the part worth holding onto: the code was the easy half. The hard half was figuring out where the model was willing to be wrong, then rewriting the prompt so it had to work against itself instead of toward confirmation.

Where it broke down

Not everything worked.

The biggest miss was a bot for 15-minute Polymarket markets - fast-resolving contracts where entering and exiting profitably requires millisecond execution, model calibration, and risk management that real quant teams spend years developing. Claude Code wrote the code without issue. But code isn't the constraint when you lack the domain knowledge to describe what the code should actually do.

His summary of what that failure taught him: "Claude Code lets you build anything you can describe well. If you can't describe it well, no tool fixes that."

YouTube editing hit the same ceiling. He automated the full production pipeline - Claude wrote scripts from competitor transcripts, ElevenLabs cloned his voice, Claude handled thumbnails and descriptions. The channel reached 400 subscribers with 2-4K views on the stronger videos. But every video still took 6-8 hours of manual editing against 30 minutes of automated work. AI can't cut B-roll to voiceover. He closed it when the interest ran out.

What this adds up to

At the end of a year: stable job, first car, apartment he likes, savings going into side projects. Total tool cost: a Claude subscription, starting at $20 a month.

Five rules for starting from zero:

  1. Describe the task like you're explaining it to someone new. Specific inputs, specific outputs, edge cases named. Vague prompts produce vague code.
  2. Think in steps, not projects. "Write a script that reads a CSV and outputs JSON" works. "Build me a CRM" doesn't.
  3. When something breaks, paste the error back. Claude Code fixes its own output better than it guesses at fresh prompts.
  4. Save the prompts that work. His reconciliation prompt has been reused in three other projects with minor tweaks.
  5. Learn from other people's prompts. One borrowed template can save a week of trial and error.
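For a sense of the scale rule 2 is pointing at, "a script that reads a CSV and outputs JSON" is a task you can describe completely in one sentence. A generic sketch of that task (not his code):

```python
import csv
import json
import sys

def csv_to_json(csv_path):
    """Read a CSV file and return its rows as a list of dicts keyed by header."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))

if __name__ == "__main__":
    # Usage: python csv_to_json.py input.csv > output.json
    json.dump(csv_to_json(sys.argv[1]), sys.stdout, indent=2)
```

A task at this grain has unambiguous inputs, outputs, and failure modes - which is exactly what makes it easy to describe, and therefore easy to delegate.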

The gap between "I have an idea" and "the thing exists" used to require money or engineering time. For a wide category of problems it's now a few evenings and the ability to describe what you want precisely enough. That last part - precision of description - is the skill that hasn't been automated yet.

Source: Claude Code - Anthropic. Prediction market arbitrage background via Trevor Lasn.