Source Generator Design Strategies

What to Fix First When Your Generator Slows Down the Whole Solution

You ship a source generator. Next sprint, the build slows by forty percent. No new files changed. Just the existing generator, now doing more work than it should. This is not a bug report. It is a design failure that compounds every day you ignore it.

This bit matters.

I have seen teams fight this for weeks: add a cache, split the generator, switch to partial methods. Sometimes it helps. Often it makes things worse, because the real bottleneck is rarely the one you think it is. So before you throw memory at it, read this. We will walk through what to fix first, what to leave alone, and when to walk away.

Where the Slowdown Shows Up in Real Work

Build pipeline latency in CI/CD

The most obvious place a slow generator punches you is the CI build. What used to take forty seconds now stretches past four minutes. I have seen teams blame their artifact cache, their test runner, even their network, only to discover that a single source generator was re-running on every dotnet build because it touched an intermediate output file. The symptom is consistent: the build passes quickly on a developer machine but stalls in CI. Why? Local machines have warm caches and fewer parallel jobs. CI runners cold-start everything, and your generator, if it uses an ISyntaxReceiver to walk the entire compilation, re-walks every file, every time. That kills incremental builds. The fix often lives in how you register inputs: declare external data as AdditionalFiles and let the compiler hand it to you, instead of re-reading and re-parsing it yourself. But most teams never look there first.

That is the catch.

IDE responsiveness during development

A different beast entirely. Here the slowdown feels like sand in a gearbox: typing lags, IntelliSense stutters, and the Error List takes seconds to update. The culprit is often a generator that re-runs on every keystroke. Roslyn's design expects generators to be fast enough to run while you type; the moment yours takes longer than about 50ms, the IDE starts showing white space where code should be. I watched a project grind to a halt because one developer added a [Generator] class that queried the file system for every source file, so each keystroke triggered a full directory scan. The trade-off is brutal: you can cache aggressively, but stale caches produce wrong results. The trick is to use an ISyntaxContextReceiver to filter aggressively and skip entire syntax trees unless they contain a specific attribute. That one change cut latency from 300ms to 12ms. Not bad for a weekend fix.

Interaction with source control and incremental builds

What usually breaks first is the incremental build. Your generator runs, outputs a file, and the build system sees that file as changed, so it re-runs the generator. Recursive loop. This manifests as a 'dirty' project that never finishes settling. Most teams skip this: they assume the generator's output is stable, but Roslyn's incremental generator model only skips work when the values flowing through the pipeline compare equal, so you need sensible equality comparers for both inputs and outputs. Forget that, and every build, even with zero source changes, spins the generator. The human cost? Developers start doing full rebuilds out of habit, which defeats the purpose. We fixed this by adding a comparer for our AdditionalText inputs that ignored whitespace and comments. The build went from 'always dirty' to genuinely incremental.
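
A comparer along those lines can be sketched in plain C#. This is a minimal sketch, not the comparer from the anecdote: the class name WhitespaceInsensitiveComparer is hypothetical, it compares text content as strings, and it only ignores whitespace (stripping comments needs a real tokenizer and is left out). The assumption is that the pipeline first projects each AdditionalText to its string content and then passes the comparer to WithComparer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: two texts are equal when they differ only in whitespace.
public sealed class WhitespaceInsensitiveComparer : IEqualityComparer<string>
{
    public static readonly WhitespaceInsensitiveComparer Instance =
        new WhitespaceInsensitiveComparer();

    public bool Equals(string? x, string? y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;

        int i = 0, j = 0;
        while (true)
        {
            // Skip whitespace on both sides, then compare the next real characters.
            while (i < x.Length && char.IsWhiteSpace(x[i])) i++;
            while (j < y.Length && char.IsWhiteSpace(y[j])) j++;
            if (i == x.Length || j == y.Length)
                return i == x.Length && j == y.Length;
            if (x[i++] != y[j++]) return false;
        }
    }

    public int GetHashCode(string obj)
    {
        // Hash only non-whitespace characters so equal inputs hash equally.
        var hash = 17;
        foreach (var c in obj)
            if (!char.IsWhiteSpace(c)) hash = unchecked(hash * 31 + c);
        return hash;
    }
}
```

One caveat with this design: it also ignores whitespace inside string literals, so it only suits inputs where whitespace never carries meaning.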

'The build is clean... no, wait, it's dirty again. Did someone touch the generator?' A conversation I've overheard on three different teams this year.

— typical developer complaint, more usual after a generator update

The catch is that source control amplifies the problem. When two developers commit generator outputs that differ only by timestamps or absolute paths, the merge diff explodes. I have seen pull requests where the generator's emitted code changed 200 lines because of a file-path change in a #line directive. That's not a generator bug; it's a stability failure in output formatting. You lose a day to review noise. The fix is surgical: strip all absolute paths from generated files, use relative paths, and enforce a deterministic output order. Most generator packages ignore this because it's invisible until you merge.
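
The two rules (relative paths, deterministic order) can be sketched without any Roslyn types. Everything named here is hypothetical: DeterministicEmitter, the tuple shape, and the emitted comment format are illustrations, and the path handling is a simple prefix strip rather than a full path resolver.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class DeterministicEmitter
{
    // Turns an absolute path into a project-relative one with forward slashes.
    public static string ToRelativePath(string projectDir, string fullPath)
    {
        var root = projectDir.Replace('\\', '/').TrimEnd('/') + "/";
        var path = fullPath.Replace('\\', '/');
        // Case-insensitive match is an assumption that suits Windows drive letters.
        return path.StartsWith(root, StringComparison.OrdinalIgnoreCase)
            ? path.Substring(root.Length)
            : path.Substring(path.LastIndexOf('/') + 1); // outside the project: file name only
    }

    // Emits members in ordinal name order so output never depends on discovery order.
    public static string Emit(string projectDir, IEnumerable<(string Name, string SourcePath)> members)
    {
        var sb = new StringBuilder();
        foreach (var m in members.OrderBy(m => m.Name, StringComparer.Ordinal))
        {
            sb.Append("// from ").Append(ToRelativePath(projectDir, m.SourcePath)).Append('\n');
            sb.Append("partial class ").Append(m.Name).Append(" { }\n");
        }
        return sb.ToString();
    }
}
```

With this shape, two checkouts in different directories produce byte-identical output for the same inputs, which is the property the merge diff depends on.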

What Most Teams Misunderstand About Generator Performance

Confusing syntax tree traversal cost with memory allocation

Most teams reach for a profiler and immediately blame allocation. I've watched senior devs spend two days optimizing ImmutableArray pooling when the real bottleneck was a quadruple pass over the syntax tree. The two feel similar (both make the CPU fan spin) but they respond to completely different fixes. Allocation is cheap now and expensive later; tree traversal is expensive now and compounds every time you re-enter a node. The test is: does the slowdown appear during the first 200 milliseconds of generation, or does it grow over repeated runs? If it spikes early, you're walking too many nodes. If it creeps, you're fighting the GC.

“We replaced all our dictionaries with arrays and saw zero change. Then we realized we were visiting every descendant three times per file. One memoized lookup fixed it.”

Overlooking pipeline restart frequency

Ignoring the cost of file I/O and directory enumeration

That sounds fine until someone adds twenty MB of localization JSON files. Then the five-millisecond I/O tax becomes a hundred-millisecond tax, and the IDE starts stuttering on every build. Honest mistake: teams tune the CPU path but treat disk access as free. It's not. Directory traversal on a network share? That's a class of slowdown you'll never find in a local benchmark.

Patterns That Consistently Speed Up Slow Generators

Shared Compilation Context: The Obvious Lever Everyone Forgets

Most slow generators reload the entire compilation from scratch, every single time. That's insane. The compiler already parsed, bound, and resolved everything, so why throw that work away? I have seen teams rewrite a generator from scratch only to discover the bottleneck was a compilation reference that was never passed in as a parameter. One team I worked with cut execution from 4.2 seconds to 0.7 seconds just by switching from walking context.Compilation.SyntaxTrees to using the pre-built compilation object in the pipeline. The catch is memory: holding too many references can bloat the process if you don't release cached data after the generator pass. You want a shared cache keyed on the syntax tree identity, not the text content. Get that wrong and you'll cache forever, leaking memory across incremental builds.

Store semantic models lazily: SemanticModelCache.GetOrCreate(tree, compilation). That single line, honestly, saved us eight hours of rebuild time per week on a solution with 1,200 projects. Do not cache the entire compilation result; cache only what your generator touches. Symbols, attributes, type declarations, not the whole Roslyn object graph. The trade-off? You buy raw speed with a slight uptick in code complexity. Worth it.

Incremental Generation: Stop Rebuilding What Hasn't Changed

Source generators that parse every file on every keystroke are why your team's IDE freezes. The fix is brutal and plain: use IncrementalValueProvider<T> and IncrementalValuesProvider<T> to diff inputs. Think of it as a change-detection layer: if the syntax tree hasn't changed, skip the generation. Most teams skip this because the API looks confusing. It is not confusing; it's unfamiliar. What usually breaks first is forgetting that Collect() gathers every item into one collection, so you accidentally hydrate all nodes before the diff check. Wrong order. Place your .Where() filter before any materialization call.

One pattern that consistently works: split your pipeline into extraction (pull only attributes with specific names) and generation (emit source text). The extraction step can run incrementally; the generation step only fires when the extracted data changes. We fixed a generator that took 12 seconds per build by adding exactly three lines: .WithComparer(SymbolEqualityComparer.Default). That's it. Three lines. The rest was noise. However, and this is important, incremental generation fails silently if your EqualityComparer is flawed. Test it by toggling a file and watching the generator log: if the output changes without the input changing, your cache key is broken.
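
The extraction/generation split can be sketched as an incremental generator. This is a sketch under stated assumptions, not the generator from the anecdote: the attribute name Demo.GenerateInfoAttribute, the ClassModel type, and the emitted constant are all hypothetical, and ForAttributeWithMetadataName assumes a Microsoft.CodeAnalysis package recent enough to include it (4.3.1 or later). Instead of a custom comparer, the model is plain data with value equality, which is what lets the pipeline skip unchanged classes.

```csharp
using System;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

// Plain data pulled out of the compilation. Value equality lets the
// pipeline treat an unchanged class as "same input, skip generation".
public sealed class ClassModel : IEquatable<ClassModel>
{
    public ClassModel(string ns, string name) { Namespace = ns; Name = name; }
    public string Namespace { get; }
    public string Name { get; }

    public bool Equals(ClassModel? other) =>
        other is not null && Namespace == other.Namespace && Name == other.Name;
    public override bool Equals(object? obj) => Equals(obj as ClassModel);
    public override int GetHashCode() => (Namespace, Name).GetHashCode();
}

[Generator]
public sealed class MarkerGenerator : IIncrementalGenerator
{
    // Hypothetical marker attribute, assumed to be declared by the consuming project.
    private const string MarkerAttribute = "Demo.GenerateInfoAttribute";

    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Extraction: filter by attribute first, then reduce each hit to strings.
        IncrementalValuesProvider<ClassModel> models =
            context.SyntaxProvider.ForAttributeWithMetadataName(
                MarkerAttribute,
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) =>
                {
                    var ns = ctx.TargetSymbol.ContainingNamespace;
                    return new ClassModel(
                        ns.IsGlobalNamespace ? "" : ns.ToDisplayString(),
                        ctx.TargetSymbol.Name);
                });

        // Generation: runs once per model, and only when that model's value changed.
        context.RegisterSourceOutput(models, static (spc, model) =>
        {
            var hint = model.Namespace.Length == 0 ? model.Name : model.Namespace + "." + model.Name;
            spc.AddSource(hint + ".Info.g.cs", SourceText.From(Render(model), Encoding.UTF8));
        });
    }

    private static string Render(ClassModel m)
    {
        var body = "partial class " + m.Name + "\n{\n    public const string GeneratedName = \""
                   + m.Name + "\";\n}\n";
        return m.Namespace.Length == 0 ? body : "namespace " + m.Namespace + "\n{\n" + body + "}\n";
    }
}
```

The model deliberately holds strings rather than ISymbol or syntax nodes, so nothing in the cached pipeline state keeps an old compilation alive. Nested and generic types are not handled in this sketch.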

Lazy Initialization — Don't Pay for What You Might Not Use

A generator that initializes everything at startup will punish you even on trivial edits. I have seen generators allocate ConcurrentDictionary instances, construct regex objects, and load assembly references, all before the first source file is even inspected. That hurts. Lazy initialization means deferring heavy allocations until the moment your generator actually needs them. A classic pattern: wrap your expensive resource in Lazy<T> with LazyThreadSafetyMode.ExecutionAndPublication. You get thread-safe, one-time initialization. The first generation pass might feel slower by 50 milliseconds. That's fine. The next nine hundred passes are fast.

But lazy isn't free. If your generator runs in a hot loop (e.g., checking thousands of syntax nodes), the Lazy<T> overhead per call adds up. In those cases, switch to a straightforward null check with Interlocked.CompareExchange. Ugly? Yes. Faster? Absolutely. The editorial note here: don't over-optimize before measuring. Use a stopwatch, not a hunch. Benchmark before and after, and if the lazy version isn't at least 15% faster, revert. Most performance attempts fail because teams apply lazy initialization to the wrong thing (like string concatenation) instead of the real culprit (like repeated semantic model lookups).
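
Both variants can be sketched side by side. This is an illustration only: ExpensiveResources, the regex pattern, and the RegexBuilds counter are hypothetical, and the sketch makes no claim about which variant is faster in your generator. That is what the benchmark is for.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading;

public static class ExpensiveResources
{
    // Counts factory runs; present only to make the behaviour observable.
    public static int RegexBuilds;

    // Variant 1: Lazy<T>. The factory runs at most once, even under contention.
    private static readonly Lazy<Regex> LazyIdentifier =
        new Lazy<Regex>(BuildRegex, LazyThreadSafetyMode.ExecutionAndPublication);

    public static Regex Identifier => LazyIdentifier.Value;

    // Variant 2: null check plus Interlocked.CompareExchange.
    private static Regex? _manual;

    public static Regex IdentifierManual
    {
        get
        {
            var existing = Volatile.Read(ref _manual);
            if (existing != null) return existing;

            var created = BuildRegex();
            // Publish only if no other thread got there first; return the winner.
            return Interlocked.CompareExchange(ref _manual, created, null) ?? created;
        }
    }

    private static Regex BuildRegex()
    {
        Interlocked.Increment(ref RegexBuilds);
        return new Regex(@"^[A-Za-z_][A-Za-z0-9_]*$",
            RegexOptions.Compiled | RegexOptions.CultureInvariant);
    }
}
```

The design difference matters more than the syntax: under a race, the CompareExchange variant may run the factory more than once and discard the loser, so it only fits factories that are safe to run twice.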

“We threw lazy caching at every method. Didn't help. Then we cached exactly one thing, the compilation's attribute list, and the build dropped from 8 seconds to 1.2.”

— Lead engineer at a mid-size SaaS shop, after a three-week optimization sprint

That anecdote points to something critical: pick the one expensive call that dominates your timeline. For most generators, it's the attribute discovery pass. Cache that. Lazy-initialize the rest. You'll get 80% of the gain with 20% of the effort. The remaining 20% is diminishing returns, so stop there. Move on.

Why Most Optimization Attempts Fail and Teams Revert

Premature caching that invalidates too often

The most common mistake I see is teams slapping a cache on everything before measuring what actually chokes. They put the syntax tree in a MemoryCache, stash results from the semantic model, and feel clever, until a single file edit nukes the entire cache. Now the generator re-computes everything, including nodes that never changed. The performance gain you got? Gone. Worse: you added branching complexity that now breaks whenever the incremental build pipeline sneezes. That sounds fine until someone touches a .csproj property and the whole cache invalidates three times per keystroke. The pitfall is treating caching as a 'set and forget' optimization. It's not. You need to track what changed and evict only the affected subtree. Most teams skip that part. Then they blame source generators.

Over-engineering with too many partial generators

There's a seductive pattern: 'Let's split this generator into five partial sources: one for DTOs, one for validators, one for serialization stubs.' Immediately you've multiplied the compilation passes, the Initialize overhead, and the chance that one partial generator throws while others silently produce half-baked output. I've watched a team spin for three months on a 'composable generator framework' that, in the end, ran slower than the single-file inline code they'd started with. Why? Each partial generator re-parsed the same compilation, re-walked the same syntax trees, and then they needed a synchronisation layer to stitch outputs back together. The trade-off is real: modularity costs. If your generator is slow, adding more generators, even 'small' ones, usually makes it slower. The fix is ruthless merging, not further decomposition.

“We broke a 400‑line generator into six files. Then each build ran the same analysis six times. We reverted in a week.”

— senior dev, after a failed refactor on a DTO pipeline

Breaking correctness in the name of speed

This one stings. A generator is slow; someone decides to skip the AttributeData lookup or pre-filter symbol lookups with a Regex that's 'good enough'. Suddenly a class with two attributes generates no source, or a nested type is silently dropped. The build succeeds, the test suite passes (because it never covered that edge case), and three weeks later a customer reports missing endpoints. The team reverts to inline code, convinced generators are inherently dangerous. But the real culprit was trading correctness for a 15 % speedup. The anti-pattern here is treating performance as a knob you turn independently of correctness. You can't. If your optimization skips the semantic model (uses string matching on source text, for example) you're one edge case away from a production incident. Honest question: would you rather have a generator that runs in 800 ms and is correct, or one that runs in 120 ms and sometimes drops an interface? Most optimisations fail because the team never defined the correct baseline; they just started cutting corners.

What usually breaks first is the error-recovery path. A generator that processes a malformed syntax tree with a cached symbol table can produce output that compiles, but silently omits members. That's the seam that blows out. When the team finally catches it, the impulse is to burn the whole thing and go back to handwritten boilerplate. The reversion sticks because they don't have a repeatable way to verify both speed and output fidelity. So the investment (the refactoring, the cache layer, the partial generators) is lost. And next time someone mentions source generators, the first reaction is 'we tried it, didn't work.'

The Hidden Cost of Generator Drift Over Time

Accumulated Technical Debt from Quick Fixes

What starts as a reasonable shortcut ('just add this one conditional check') turns into a rat's nest by month six. I have watched teams tell themselves 'we'll clean this up later' while layering new transforms on top of half-baked incremental generators.

The generator still works, but only because every release adds another guard clause, another cached lookup, another branch that nobody fully understands. That feeling when you open the source and see five different caching strategies in one file?

That's the smell. The catch is that each quick fix looks harmless in isolation—ten lines added here, a retry loop there. But compound them over two quarters and you've built a generator that spends 40% of its ticks just deciding which path to take. The worst part: these fixes rarely come with tests, so no one knows when the next one will cause a cascade failure.

Consequences of Skipping Cache Invalidation Logic

Most teams treat cache invalidation as a problem to solve later. That is a specific, repeatable mistake, and I see it in nearly every slow generator I audit. You add a cache because the generator re-checks file timestamps too often. Good. But you skip the expiry logic because 'the files don't change that frequently.' Three months later, the project references dozens of stale cache entries, and the generator spends 2x longer validating what it thinks is fresh data, while the real source files have moved or been renamed. Call them zombie cache entries: entries that still pass a structural check but point to dead content. The fix isn't glamorous (it's a timestamp guard plus a max-age limit) but skipping it means your performance degrades silently until a build engineer starts asking why the generator takes 14 seconds instead of 2. That hurts. And by then, untangling the cached-state corruption takes longer than rewriting the whole component.
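
A timestamp guard plus a max-age limit fits in a few lines. This is a minimal sketch with hypothetical names (GuardedCache, RemoveMissing); it assumes the caller supplies the input's last-write time and a clock, and it is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;

public sealed class GuardedCache<TValue>
{
    private readonly Dictionary<string, (TValue Value, DateTime SourceStampUtc, DateTime CachedAtUtc)> _entries =
        new Dictionary<string, (TValue, DateTime, DateTime)>(StringComparer.Ordinal);
    private readonly TimeSpan _maxAge;
    private readonly Func<DateTime> _utcNow; // injected clock keeps the cache testable

    public GuardedCache(TimeSpan maxAge, Func<DateTime> utcNow)
    {
        _maxAge = maxAge;
        _utcNow = utcNow;
    }

    public int Count => _entries.Count;

    public TValue GetOrAdd(string path, DateTime lastWriteUtc, Func<TValue> compute)
    {
        var now = _utcNow();
        if (_entries.TryGetValue(path, out var entry)
            && entry.SourceStampUtc == lastWriteUtc   // timestamp guard: input unchanged
            && now - entry.CachedAtUtc <= _maxAge)    // max-age limit: entry not too old
        {
            return entry.Value;
        }

        var value = compute();
        _entries[path] = (value, lastWriteUtc, now);
        return value;
    }

    // Evicts entries for inputs that no longer exist (renamed or deleted files).
    public void RemoveMissing(ISet<string> livePaths)
    {
        var dead = new List<string>();
        foreach (var key in _entries.Keys)
            if (!livePaths.Contains(key)) dead.Add(key);
        foreach (var key in dead) _entries.Remove(key);
    }
}
```

RemoveMissing is the part that deals with zombie entries: without it, a renamed input leaves its old entry behind forever, because nothing ever asks for that key again.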

'We deferred cache invalidation for one sprint. That sprint turned into a permanent tax of 8 seconds per rebuild.'

— lead engineer, after a mid-size migration project, reflecting on a two-year-old decision that still surfaces in every performance review

Long-Term Maintenance Burden on New Team Members

Generator creep creates a knowledge debt that hits new hires hardest. A junior developer joins, sees a generator with 14 partial rebuild conditions, and has no mental model for why those conditions exist. They add one more, because that's safer than refactoring the original logic. Wrong call. That extra condition adds a regex check that runs against every source file, even when the project is a 200-liner. Six months later, the generator has accumulated 22 special-case branches, three of which overlap. The original designer left the company. The documentation says 'see inline comments,' but the inline comments say '// FIXME: clean this up after Q3.'

You lose a day every time someone needs to trace why a particular input produces stale output. Multiply that by two or three incidents per month, and that's a hidden cost that dwarfs the initial performance hit. The creep doesn't announce itself; it just makes the generator slightly harder to extend, slightly slower to diagnose, until the team avoids touching it entirely. That avoidance is a stability risk, because the generator stops evolving with the codebase it serves. The only escape is to treat cache and conditional logic as first-class architectural concerns, not as afterthoughts you fix when someone complains. Otherwise, you'll wake up one day with a generator that works perfectly in theory and exhaustingly in practice.

When You Should Not Use a Source Generator at All

When Reflection Beats a Source Generator

I have watched teams pour weeks into a source generator only to realize, too late, that plain old System.Reflection would have been faster. It feels wrong, I know. Reflection is slow, everybody says so. But that's a rule of thumb, not a universal truth. When your code changes every few hours during active development, the generator's upfront analysis and caching get invalidated constantly. You end up re-running the entire pipeline on every keystroke. Reflection, by contrast, pays its cost only at call time, and if you're hitting that path rarely, you'll never notice the overhead. The trade-off is brutal: generators optimize for stable schemas; reflection optimizes for volatile ones. If your DTOs are still being reshaped mid-sprint, skip the generator. Just call PropertyInfo.GetValue until things settle. You can always swap later.

High-Frequency Code Changes That Invalidate the Cache

The catch is that most teams don't notice the invalidation cascade until it's too late. You add one field, the generator re-scans three projects, and suddenly your incremental build is slower than a full rebuild. I've seen this exact scenario: a team shipping an API client generator that re-parsed the entire OpenAPI spec on every field rename. That's not a generator problem; that's a tool-fit problem. The boundary is simple: if your source files change more than once per hour during typical development, a generator will cost more than it saves. Use a T4 template instead, since those run on demand, not on every compile. Or even simpler: hand-write the ten lines of boilerplate. Not elegant. But fast.

Alternatives That Don't Lock You Into Code Gen

Let's be blunt: sometimes you want the convenience of generated code without the compile-time tax. T4 templates handle that well: they're a manual trigger, so you control when the heavy lifting happens. Another option is a standalone CLI tool that spits out .cs files ahead of time, committed to source control. That sounds retrograde, but it removes the generator from your hot path entirely. No pipeline, no cache invalidation, no debugger pauses. The downside? You lose the 'always fresh' guarantee. But if your schema changes weekly, not daily, that's fine. Honestly, the worst pattern I encounter is the hybrid that tries to be both: a generator that caches partially, re-analyzes aggressively, and still forces rebuilds on unrelated edits. That hurts.

'A source generator is a permanent optimization for a temporary schema—until the schema moves, and the generator becomes the constraint.'

— overheard at a .NET meetup, after a demo that took seventeen seconds to restore

What about runtime interception libraries? If you need serialization or mapping that adapts to schema drift, consider DispatchProxy or System.Reflection.Emit at startup. They generate code in memory, not on disk, and it dies when the process ends. No cache invalidation, no obj folder bloat. The cost is a few hundred milliseconds on first call, which is worth it when your schema changes weekly. The pitfall is debugging: stack traces become opaque. But that's a trade-off you can measure, not a dogma to accept. A rhetorical question to end on: would you rather have a generator that works perfectly on Monday and breaks your Wednesday sprint, or a script you run once per release? Most teams, after the third revert, choose the script.

Frequently Asked Questions on Generator Performance

Should I cache syntax trees across compilations?

Short answer: yes, but only if you really understand what you're caching. I've seen teams wrap a ConcurrentDictionary around every compilation object they touch, then wonder why memory goes sideways. Syntax trees are cheap to re-parse when the host already keeps them in memory. The real win is caching semantic models for specific nodes, not the whole tree. That said, cache invalidation is brutal. Miss one changed tree and your generator quietly serves stale output for hours. The catch: the Roslyn APIs don't signal 'this tree changed' cleanly. You'll need a version counter or hash per tree, and you must evict aggressively. Without that, you're trading correctness for a speed bump.

Most teams skip this: measure the actual hit rate. If your cache returns a stale model for 1 in 200 requests, that's a bug, not a performance win. I've been burned by this myself: a generator cached a ClassDeclarationSyntax that silently held a reference to a deleted file. The solution compiled fine; the generated code just stopped working.

How do I measure generator time accurately?

Stop guessing from build duration. The host time-slices generators, so your 200ms spike gets drowned in the noise of project references and NuGet restore. Instead, add a Stopwatch inside your Execute method and report the result through context.ReportDiagnostic with Hidden severity. That diagnostic surfaces only in the IDE or during a build with the right verbosity. Honestly, this is the simplest way I know to see per-invocation overhead without instrumenting the compiler itself. What usually breaks first is the warm-up invocation: that first run parses everything, so your median time looks fine but the P95 blows up. Log that outlier separately.

We fixed this by dumping timestamps to a file via a conditional #if DEBUG block. One team we advised saw their generator actually taking 1.4s on the first call after a solution open; their previous 'measurement' was averaging across 50 runs and missing the cold start. Measure cold and warm separately. Don't average them.
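
Keeping the cold run apart from the warm ones takes very little code. The sketch below is hypothetical (GeneratorTimings is not part of any library) and leaves out where the numbers go, whether a hidden diagnostic or a debug-only file. It is also not thread-safe, so it assumes one recorder per generator instance.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public sealed class GeneratorTimings
{
    private readonly List<double> _warmMs = new List<double>();

    // First recorded pass, kept out of the warm samples on purpose.
    public double? ColdMs { get; private set; }

    public IReadOnlyList<double> WarmMs => _warmMs;

    // Times one generator pass and files it as cold or warm.
    public T Measure<T>(Func<T> pass)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return pass();
        }
        finally
        {
            Record(stopwatch.Elapsed.TotalMilliseconds);
        }
    }

    public void Record(double elapsedMs)
    {
        if (ColdMs is null) ColdMs = elapsedMs;   // the first pass is the cold start
        else _warmMs.Add(elapsedMs);              // everything after it is warm
    }
}
```

A typical use is to wrap the body of the generator's main pass in Measure and report ColdMs and the WarmMs samples as two separate figures.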

Is it safe to use ConcurrentDictionary in a generator?

Safe in threading terms, terrifying in lifetime terms. A static ConcurrentDictionary in a generator lives across compilations inside the same host process. That means state from a previous build leaks into the next one unless you reset it.

I've debugged a generator that cached stale ISymbol instances—the symbols pointed to phantom assemblies. The fix: scope your dictionary to the GeneratorExecutionContext or clear it on every Initialize call. The pitfall is thinking 'it's just a cache.' It's a shared mutable bag that your generator swims in during the next edit-compile cycle.

'We put a ConcurrentDictionary in our generator and forgot to clear it. Three months later, the generated code referenced types from an old NuGet package version we'd removed.'

— principal engineer, after a production incident post-mortem

If you must use one, key it to the compilation's identity hash, and never store Compilation or SemanticModel instances. Store only the parsed data you calculate from them. That way, when the hash changes, the old entries die naturally.
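
One way to make "old entries die naturally" concrete is a cache that remembers which compilation identity its contents belong to. This is a sketch with hypothetical names (GenerationScopedCache, generationKey). How the identity key is derived is left to the caller and is an assumption here, for example a hash over the assembly name and reference versions.

```csharp
using System;
using System.Collections.Generic;

// Holds plain data for exactly one compilation identity at a time.
public sealed class GenerationScopedCache<TData>
{
    private readonly object _gate = new object();
    private string? _generationKey;
    private Dictionary<string, TData> _data = new Dictionary<string, TData>(StringComparer.Ordinal);

    public int Count
    {
        get { lock (_gate) return _data.Count; }
    }

    public TData GetOrAdd(string generationKey, string itemKey, Func<TData> compute)
    {
        lock (_gate)
        {
            if (!string.Equals(_generationKey, generationKey, StringComparison.Ordinal))
            {
                // New compilation identity: drop everything computed for the old one.
                _data = new Dictionary<string, TData>(StringComparer.Ordinal);
                _generationKey = generationKey;
            }

            if (!_data.TryGetValue(itemKey, out var value))
            {
                value = compute();
                _data[itemKey] = value;
            }
            return value;
        }
    }
}
```

TData is meant to be strings, numbers, or small immutable models. Putting a symbol or semantic model in it would bring back the leak this design is trying to avoid.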

The alternative: use ImmutableDictionary and rebuild it each pass. Faster? Not always. But it won't rot.

Next Steps to Stabilize Your Generator

Profile first, optimize second

You have read seven sections of hard-won lessons. Now do nothing until you know where the pain actually lives. I have lost count of how many teams started by rewriting a perfectly fine generator because the output felt slow, only to discover the bottleneck was a synchronous network call in an unrelated resolver. Grab a profiler, run a realistic workload, and watch where the CPU spins. The top three offenders in every slow generator I have seen: excessive string allocations, repeated file I/O inside tight loops, and, surprisingly often, a single Regex instantiated on every invocation. Wrong order. That hurts.

The catch is that most built-in profiling tools lie to you on incremental builds. You need to measure the hot path twice: once with a clean build cache, once with everything cached. If the second run still crawls, you have a structural inefficiency, not a cold-start problem. A friend recently spent a week optimizing a generator that turned out to be fast: the downstream consumer was formatting every output through a heavy YAML serializer. Profile first, or you'll optimize a ghost.

Start with the biggest I/O or allocation cost

Generators that touch the file system more than once per source file are already losing. Each File.ReadAllText or directory enumeration adds milliseconds that compound across hundreds of compilation units. The fix is brutal: buffer everything you need in a single pass, or better yet, accept that some data should come from the compilation context, not from disk. Most teams skip this: they see a slow generator and reach for parallelism or caching, when the real fix is to read once and hold the result in memory. That sounds fine until you realise holding too much memory causes GC pressure that stalls the whole pipeline. Trade-off: you swap I/O latency for allocation latency, and you must measure which side actually hurts your build.

A concrete example from my own work: a generator that emitted serialization code for a large DTO library. The slowest part was not the code generation itself; it was loading a JSON schema from disk for every DTO. We fixed it by loading the schema once into a static dictionary, keyed by the assembly identity. Not a clever optimisation. Just the obvious one, delayed because we were chasing algorithmic complexity instead of the actual I/O count. Start with the thing that makes the profiler spike. Everything else is decoration.

Validate with a benchmark suite before deployment

You've made changes. Now prove they survive a Monday morning. I advocate for a small, automated suite that compiles a representative project (think: one hundred generated files, a mix of structs and classes, nullable enabled) and measures wall-clock time over five runs. Revert if the median regresses by more than 5 %. Most teams skip this, deploy the 'optimised' generator, and then spend two weeks reverting commits because a minor change doubled incremental-build time on a colleague's machine. That hurts more than the original slowdown.
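
The revert rule itself is easy to automate. A minimal sketch, assuming the suite already collects wall-clock samples in milliseconds. BenchmarkGate and its method names are hypothetical, and the 5 % tolerance is the figure from the paragraph above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BenchmarkGate
{
    // Median of the samples; the middle pair is averaged for even counts.
    public static double Median(IReadOnlyList<double> samples)
    {
        if (samples.Count == 0) throw new ArgumentException("No samples.", nameof(samples));
        var sorted = samples.OrderBy(s => s).ToArray();
        var mid = sorted.Length / 2;
        return sorted.Length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    // True when the candidate's median is slower than the baseline's by more than the tolerance.
    public static bool Regressed(
        IReadOnlyList<double> baselineMs,
        IReadOnlyList<double> candidateMs,
        double tolerance = 0.05)
    {
        return Median(candidateMs) > Median(baselineMs) * (1 + tolerance);
    }
}
```

Using the median rather than the mean keeps one noisy run, such as a cold disk or a busy CI agent, from deciding the outcome on its own.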

'The worst optimizations are the ones that work in isolation but collapse under real-world concurrency.'

— overheard at a compiler-engineering meetup, and exactly right.

Your benchmark must include parallel builds. Generators that are fine in single-threaded testing suddenly degrade when four projects in a solution try to invoke them simultaneously, because of shared static state, lock contention, or a ConcurrentDictionary that wasn't actually concurrent. Test with the exact conditions your CI will impose. One team I know had a generator that passed all local benchmarks but failed catastrophically on an Azure DevOps agent with a lower memory limit.

The suite caught it on the third iteration; they patched before merge. Without the suite, they would have reverted within a week.

That is the pattern: measure, fix, validate, repeat. No shortcuts.
