Bernard Ngandu - Why OFFSET Failed Us: Implementing Keyset Pagination End‑to‑End with PostgreSQL, Drizzle, tRPC, and Next.js 16

The day we crossed the hundred‑thousand‑row mark was the day the cracks began to show. Our once‑snappy feed—built on a blissfully simple LIMIT … OFFSET … query—started behaving erratically. As new rows were inserted faster than our users could scroll, pages duplicated items, others mysteriously disappeared, and the loading time ballooned. Eventually we realised the culprit wasn’t a rogue cache or a missing index. It was the very pagination strategy we’d been leaning on since day one. This post is a first‑person account of how we migrated from offset‑based pagination to a more robust cursor‑based keyset approach, the database tuning that made it performant, and the full‑stack integration with Next.js 16, tRPC and TanStack Query that let us expose it through infinite scrolling.

When LIMIT/OFFSET Breaks Down

Offset pagination looks elegant on paper. You pick a page size, compute an offset from the page number and apply it after an ORDER BY clause:

SELECT *
FROM items
ORDER BY created_at DESC, id DESC
LIMIT 20 OFFSET 60;

For small datasets it works fine, but there are two fundamental problems that only manifest at scale:

Performance degrades with large offsets. PostgreSQL must scan and discard OFFSET rows before returning anything. A test query with LIMIT 10 OFFSET 100 000 forces the database to discard one hundred thousand rows just to return ten results. Measurements in a one‑million‑row table show latency increasing roughly linearly with the offset: a query with an offset of 500 000 is hundreds of times slower than one with an offset of 0.
Results become inconsistent when the underlying data changes. When you fetch page 1 (OFFSET 0) and then page 2 (OFFSET pageSize), any insert or delete between those queries shifts the entire result set. New rows can push existing rows forward, causing duplicates or omissions. An article on optimising SQL pagination notes that rows may be duplicated or skipped entirely when using consecutive OFFSET queries and there are concurrent inserts or deletes. The database also has to generate the entire result set to apply the offset, which is expensive for complex joins.

We felt both issues acutely. As our ingestion pipeline accelerated, the feed would occasionally repeat articles or skip a batch entirely. Meanwhile our metrics showed that each extra page was slower than the previous one. It was time to admit that OFFSET was the wrong tool for a live, continuously growing dataset.

Keyset (Cursor‑Based) Pagination

Keyset pagination, sometimes called cursor pagination, eliminates the concept of “page number” altogether. Instead of skipping arbitrary rows, you resume from the last item of the previous page. You choose a stable sort order—typically a timestamp plus a unique identifier—capture those fields from the last row returned, encode them as a cursor and use them in the WHERE clause of the next query. For example:

-- first page
SELECT *
FROM items
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- subsequent page
SELECT *
FROM items
WHERE (created_at, id) < (cursor.created_at, cursor.id)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Because you’re always ordering by deterministic columns, new rows inserted after the current cursor will never appear in previous pages. According to the Drizzle ORM guide, cursor‑based pagination offers consistent query results without skipped or duplicated rows and is more efficient than offset because it doesn’t need to scan and skip previous rows. The same guide warns that you can’t jump directly to an arbitrary page and that the WHERE clause becomes more complex. These trade‑offs are acceptable for a feed that users consume sequentially.

We opted for a two‑column keyset: published_at (a timestamp) and id (a monotonic UUID v7). Each cursor is base64‑encoded JSON { date, id } so it’s opaque to the client yet contains enough information for the backend to resume precisely. On the client we treat the cursor as an opaque string; the server is solely responsible for encoding and decoding it.

eyJwdWJsaXNoZWRfYXQiOiAiMjAyNS0xMi0xMiAxMjoxMjoxMiIsICJpZCI6ICI4MmVlY2FkNC1hMWM5LTQyNWEtOTQ1Zi1hMDM2NGE4MjM5YzEifQ==

{
   "published_at": "2025-12-12 12:12:12",
   "id": "82eecad4-a1c9-425a-945f-a0364a8239c1"
}

Database Optimisation and Indexes

Switching to keyset pagination exposed some heavy lifting we’d postponed: indexes and query plans. When your ORDER BY is published_at DESC, id DESC, you need a composite index on those columns in the exact order and direction:

CREATE INDEX idx_items_published_at_id
ON items (published_at DESC, id DESC);

Without a matching index, PostgreSQL can’t avoid a full scan. The Drizzle docs emphasise that the sort columns must be properly indexed for cursor‑based pagination to work efficiently. We also maintained a separate tsvector column for full‑text search and a GIN index on it so that search filters didn’t degrade the pagination query.

Beyond indexing, we had to think about composite keys. If two articles share the same published_at timestamp, ordering purely by timestamp will not be deterministic. The tie‑breaker id ensures a stable order. The WHERE clause for fetching the next page becomes (published_at < lastDate) OR (published_at = lastDate AND id < lastId). Many call this approach keyset pagination because the combination {published_at, id} uniquely identifies each row.

Building the Cursor Layer

The cursor layer sits between the API and the database. Its responsibilities include:

Decoding the incoming cursor. When the client passes a base64 string, we decode it into an object { date, id }. If no cursor is provided, we fetch the first page.
Normalising the limit. We cap the page size to a sensible maximum (say 100) to protect the database.
Generating the WHERE clause. Using the decoded cursor, we build the keyset filter (published_at < date) OR (published_at = date AND id < id). If no cursor is present, the filter is omitted.
Fetching limit + 1 rows. The extra row lets us determine if there is a next page without making an expensive COUNT(*) query. After slicing off the extra row, we return the remaining rows as the current page. If the extra row exists, we encode its {date, id} as the nextCursor.

These steps live in a utility module (we called ours pagination.ts) and are reused throughout the database layer. Keeping cursor logic in one place reduces the surface area for bugs.

Implementing the Query in Drizzle ORM

Drizzle makes it easy to write type‑safe SQL. Here’s a simplified version of our getArticles query:

export async function getArticles(db: Database, params: GetArticlesParams) {
  const pagination = buildPaginationState(params);
  const filters = buildFilters(params, pagination);

  const rows = await applyFilters(
    db
      .select({ ...articles, source: { ...sources } })
      .from(articles)
      .innerJoin(sources, eq(articles.sourceId, sources.id))
      .orderBy(desc(articles.publishedAt), desc(articles.id)),
    filters,
  ).limit(pagination.limit + 1);

  return buildPaginatedResult(rows, pagination, { date: 'publishedAt', id: 'id' });
}

The buildFilters function composes optional filters like category, sentiment, search and sourceId along with the keyset predicate. We order by publishedAt DESC, id DESC, request limit + 1 rows and let buildPaginatedResult slice and encode the cursor.

On the ingestion side, our crawlers POST new articles through an authenticated Hono route. The payload is validated against a Zod schema (createArticleSchema) and then persisted with Drizzle. Using Zod at both the API and database boundaries ensures that the same constraints are enforced everywhere.

Exposing the Feed via tRPC

We expose our database queries through a tRPC router. Each procedure is declared with a Zod schema for input validation and returns a typed result. Our articlesRouter looked like this:

export const articlesRouter = createTRPCRouter({
  list: protectedProcedure
    .input(getArticlesSchema)
    .query(({ ctx, input }) => getArticles(ctx.db, input)),
  create: protectedProcedure
    .input(createArticleSchema)
    .mutation(({ ctx, input }) => createArticle(ctx.db, input)),
  // other procedures
});

Because tRPC infers the output type from getArticles, our React components can import RouterOutputs["articles"]["list"]["items"][number] and know exactly what fields to expect. The cursor string is transported as-is; SuperJSON serialises Date objects without converting them to strings. The tRPC docs emphasise that the procedure used with useInfiniteQuery must accept a cursor input—this matches our schema’s cursor?: string | null field.

Server‑Side Prefetching with Next.js 16

Next.js 16’s server components allow us to prefetch tRPC queries during the initial request. In our server component we call prefetch before rendering:

export async function ArticlesPage() {
  prefetch(trpc.articles.list.infiniteQueryOptions({ limit: 12 }));

  return (
    <HydrateClient>
        <ArticlesFeed />
    </HydrateClient>
  );
}

The dehydrated state is passed into HydrateClient so that the client can reuse the initial data without an extra round trip. This pattern makes the first paint instantaneous and ensures that cursors and dates are hydrated correctly.

Streaming Pages with TanStack React Query

On the client we wrap the tRPC client with TanStack’s useInfiniteQuery hook:

const trpc = useTRPC();
const query = useInfiniteQuery(
    trpc.articles.list.infiniteQueryOptions(
      {
        limit: 12,
      },
      {
        getNextPageParam: (lastPage) => (lastPage.meta.hasNext ? lastPage.meta.nextCursor : null),
        initialCursor: null,
      },
    ),
  );

const articles = React.useMemo(
    () => query.data?.pages.flatMap((page) => page.items) ?? [],
    [query.data],
);

useInfiniteQuery manages the pagination state for us. When the user clicks “Load more,” we simply call query.fetchNextPage(). The hook inspects the meta object returned by the server and forwards the opaque nextCursor. TanStack Query caches each cursor separately, so changing filters resets the pagination automatically. It also deduplicates requests and tracks loading states for free.

What Changed When We Switched

After the migration, two improvements were immediately visible:

Stable results. New articles arriving at the top of the feed never cause duplicates or missing entries. Each page is anchored to the last published_at and id values seen, so inserts, updates or deletes between requests don’t shuffle the deck. ReadySet’s discussion on cursor pagination notes that it remains efficient even for large offsets and produces stable results even if new records are inserted or deleted between page requests.
Predictable performance. We no longer pay an O(n) penalty when users scroll deep into the feed. Because the query uses an indexed keyset, it jumps directly to the next page instead of discarding thousands of rows. The same ReadySet analysis points out that cursor‑based pagination avoids scanning and discarding rows, reducing database load and resource consumption.

There were trade‑offs. Users can’t jump to “page 37” because pages are no longer addressed by offset. Our pagination helper had to handle more edge cases: what happens if the last page has fewer items? How do we handle deletions? We resolved these by always returning meta.hasNext and letting the client disable the “Load more” button when it’s false. For deletions we accept that there may be occasional gaps—an acceptable trade‑off for a live feed.

Takeaways and Next Steps

Our experience taught us that pagination isn’t a UI concern—it’s an end‑to‑end architectural concern. Once our dataset started behaving like a conveyor belt rather than a bookshelf, the offset strategy fell apart. Migrating to cursor‑based pagination required coordination between the database schema, the query helpers, the API layer, and the frontend. But the payoff—a consistent, smooth feed that scales with our ingestion rate—was worth it.

If you’re considering a similar migration, start by:

Choosing your keyset. Pick columns that uniquely identify each row and reflect your desired ordering—usually a timestamp and a primary key.
Creating the right indexes. Without a composite index that matches your keyset and sort direction, keyset pagination will be slow.
Encapsulating pagination logic. Centralise cursor encoding/decoding and the WHERE clause generation in a helper module.
Using shared schemas. Define Zod schemas (or equivalent) once and reuse them across your backend and frontend. This prevents type drift and ensures the cursor is always passed correctly.
Leaning on your tooling. Drizzle, tRPC, and TanStack Query all provide primitives for keyset pagination. Let them do the heavy lifting.

Offset pagination isn’t bad—it’s quick to implement and works fine for small, static datasets. But once your application is ingesting data in real time and your pages number in the hundreds, it’s time to upgrade. Keyset pagination may take some extra thought, but it rewards you with consistency and performance.

Happy coding !

Why OFFSET Failed Us: Implementing Keyset Pagination End‑to‑End with PostgreSQL, Drizzle, tRPC, and Next.js 16

When LIMIT/OFFSET Breaks Down

Database Optimisation and Indexes

Building the Cursor Layer

Implementing the Query in Drizzle ORM

Exposing the Feed via tRPC

Server‑Side Prefetching with Next.js 16

Streaming Pages with TanStack React Query

What Changed When We Switched

Takeaways and Next Steps

Related writing

Monorepo UI Sharing with Next.js 16 and shadcn/ui (Bun + Turborepo)

How to Change Algorithms in Symfony without Code Modifications: The Strategy Pattern

Decoupling your application's User Model from Symfony's Security System