06 December 2021 MS SQL Server Robert Muehsig

Let’s say we have a SQL table and want to retrieve 10 rows randomly - how would you do that? Although I have been working with SQL for x years, I have never encountered that problem. The solution however is quite “simple” (at least if you don’t be picky how we define “randomness” and if you try this on millions of rows):

ORDER BY NEWID()

The most boring way is to use the ORDER BY NEWID() clause:

SELECT TOP 10 FROM Products ORDER BY NEWID()

This works, but if you do that on “large” datasets you might hit performance problems (e.g. more on that here)

TABLESAMPE

The SQL Server implements the Tablesample clause which was new to me. It seems to perform much bettern then the ORDER BY NEWID() clause, but behaves a bit weird. With this clause you can specify the “sample” from a table. The size of the sample can be specified as PERCENT or ROWS (which are then converted to percent internally).

Syntax:

SELECT TOP 10 FROM Products TABLESAMPLE (25 Percent)
SELECT TOP 10 FROM Products TABLESAMPLE (100 ROWS)

The weird part is that the given number might not match the number of rows of your result. You might got more or less results and if our tablesample is too small you might even got nothing in return. There are some clever ways to work around this (e.g. using the TOP 100 statement with a much larger tablesample clause to get a guaranteed result set), but it feels “strange”. If you hit limitations with the first solution you might want to read more on this blog or in the Microsoft Docs.

Stackoverflow

Of course there is a great Stackoverflow thread with even wilder solutions.

Hope this helps!


Written by Robert Muehsig

Software Developer - from Saxony, Germany - working on primedocs.io. Microsoft MVP & Web Geek.
Other Projects: KnowYourStack.com | ExpensiveMeeting | EinKofferVollerReisen.de