Where to put DISTINCT in a CTE?

CTE's are surprisingly useful for paging data, but I'm running into a problem where I have to add a distinct clause. The idea is to pull topic records for a user's forum posts, but only one topic record even if there are multiple posts in the topic. Here's what I have so far:

CREATE PROCEDURE pf_PagedTopicsByUser(
@StartRow int,
@PageSize int,
@UserID int
)

AS

DECLARE @Counter int
SET @Counter = (@StartRow + @PageSize - 1)

SET ROWCOUNT @Counter;

WITH Entries AS (
SELECT ROW_NUMBER() OVER (ORDER BY IsPinned DESC, LastPostTime DESC)
AS Row, pf_Topic.TopicID, pf_Topic.ForumID, pf_Topic.Title, pf_Topic.ReplyCount, pf_Topic.ViewCount,
pf_Topic.StartedByUserID, pf_Topic.StartedByName, pf_Topic.LastPostUserID, pf_Topic.LastPostName,
pf_Topic.LastPostTime, pf_Topic.IsClosed, pf_Topic.IsPinned, pf_Topic.IsDeleted, pf_Topic.IsIndexed
FROM pf_Topic JOIN pf_Post ON pf_Topic.TopicID = pf_Post.TopicID
WHERE pf_Post.UserID = @UserID AND pf_Topic.IsDeleted = 0)

SELECT TopicID, Entries.ForumID, Entries.Title, ReplyCount, ViewCount,
StartedByUserID, StartedByName, LastPostUserID, Entries.LastPostName,
Entries.LastPostTime, IsClosed, IsPinned, IsDeleted, IsIndexed, pf_Forum.Title
FROM Entries JOIN pf_Forum ON Entries.ForumID = pf_Forum.ForumID
WHERE Row between
@StartRow and @StartRow + @PageSize - 1

SET ROWCOUNT 0

GO


The problem is that you can't put the DISTINCT in the last SELECT, because then you won't return an entire "page" of results. It would also appear that you can't use DISTINCT in the CTE declaration.

Any suggestions


Answer this question

Where to put DISTINCT in a CTE?

  • Juraj Borza

    Any takers on this one

  • Progalex

    I have one more question about this query. Is there some way that I can figure out how many rows were in the total result set I understand that CTE's are even more temporary than temp objects, so I suspect this isn't very easy. I tried @@ROWCOUNT, but that returns the number of the page size.

  • Biocide

    Like Rob says, you could do the count(*) from CTE, but I wouldn't unless it is a necessity (doing the paging will be cost enough, without adding the count pass.) I would just go with the paradigm where you keep adding pages until you hit the end, like search engines have gone to.

  • Old Jeffrey Zhao

    Oh, and the reason this got missed is that this thread is marked as having been answered already. Sorry.


  • Dave12349

    That doesn't work. It still only returns the number of records, i.e., if the page size is 20, it returns 20.

  • Chrispragash

    Something like this using a second CTE:

    WITH FirstEntries AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Topic ORDER BY IsPinned DESC, LastPostTime DESC)
    AS GroupRow
    , pf_Topic.TopicID, pf_Topic.ForumID, pf_Topic.Title, pf_Topic.ReplyCount, pf_Topic.ViewCount,
    pf_Topic.StartedByUserID, pf_Topic.StartedByName, pf_Topic.LastPostUserID, pf_Topic.LastPostName,
    pf_Topic.LastPostTime, pf_Topic.IsClosed, pf_Topic.IsPinned, pf_Topic.IsDeleted, pf_Topic.IsIndexed
    FROM pf_Topic JOIN pf_Post ON pf_Topic.TopicID = pf_Post.TopicID
    WHERE pf_Post.UserID = @UserID AND pf_Topic.IsDeleted = 0),

    Entries as (
    SELECT *,ROW_NUMBER() OVER (ORDER BY IsPinned DESC, LastPostTime DESC) AS Row
    FROM FirstEntries
    WHERE GroupRow = 1)


    SELECT TopicID, Entries.ForumID, Entries.Title, ReplyCount, ViewCount,
    StartedByUserID, StartedByName, LastPostUserID, Entries.LastPostName,
    Entries.LastPostTime, IsClosed, IsPinned, IsDeleted, IsIndexed, pf_Forum.Title
    FROM Entries JOIN pf_Forum ON Entries.ForumID = pf_Forum.ForumID
    WHERE Row between
    @StartRow and @StartRow + @PageSize - 1



  • Tang Meister

    I don't entirely understand. Where does that go in relation to the original query

    Thank you for your help.

  • xavito

    But the CTE quits after the last record. So if the page is 2 and the the page size is 20, you'll get 40 every time.

  • TTris

    Awesome! You absolutely nailed it. Thank you SO much. You've made my day entirely.

    One little change, the column after the PARTITION BY should be pf_Topic.TopicID, otherwise, perfect!

  • Tdah

    Just try using: select count(*) from CTE


  • helghe

    You could add another layer of CTE and another ROW_NUMBER:

    WITH FirstEntries AS ( --untested, so you might have to tweak just a bit to get the first
    --row you want

    SELECT ROW_NUMBER() OVER (PARTITION BY Topic ORDER BY IsPinned DESC, LastPostTime DESC)
    AS GroupRow, ...

    Then add to the WHERE clause

    WHERE GroupRow = 1

    That will work. Then do the paging query.



  • Mirko Messori

    Try having something like:

    COUNT(*) OVER()

    in your select clause, before you've done your filtering. Like this (hopefully you can see the green stuff):

    WITH FirstEntries AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Topic ORDER BY IsPinned DESC, LastPostTime DESC)
    AS GroupRow, pf_Topic.TopicID, pf_Topic.ForumID, pf_Topic.Title, pf_Topic.ReplyCount, pf_Topic.ViewCount,
    pf_Topic.StartedByUserID, pf_Topic.StartedByName, pf_Topic.LastPostUserID, pf_Topic.LastPostName,
    pf_Topic.LastPostTime, pf_Topic.IsClosed, pf_Topic.IsPinned, pf_Topic.IsDeleted, pf_Topic.IsIndexed
    FROM pf_Topic JOIN pf_Post ON pf_Topic.TopicID = pf_Post.TopicID
    WHERE pf_Post.UserID = @UserID AND pf_Topic.IsDeleted = 0),

    Entries as (
    SELECT *,ROW_NUMBER() OVER (ORDER BY IsPinned DESC, LastPostTime DESC) AS Row, COUNT(*) OVER () as cnt
    FROM FirstEntries
    WHERE GroupRow = 1)

    SELECT TopicID, Entries.ForumID, Entries.Title, ReplyCount, ViewCount,
    StartedByUserID, StartedByName, LastPostUserID, Entries.LastPostName,
    Entries.LastPostTime, IsClosed, IsPinned, IsDeleted, IsIndexed, pf_Forum.Title, cnt
    FROM Entries JOIN pf_Forum ON Entries.ForumID = pf_Forum.ForumID
    WHERE Row between
    @StartRow and @StartRow + @PageSize - 1


    --Hope this helps...

    Rob


  • Where to put DISTINCT in a CTE?