Puzzler: Distinct counts on aggregate data using cube
Author Message
Puzzler: Distinct counts on aggregate data using cube

Hi all,

I have an odd need.  I have constructed a simple sample of what I am trying
to do.  It turns out that my sample is a bit stupid but here goes.

First, run creation sql at the bottom of this message to create the tables
and data that you need for my example.

Now, look at the results of the following.  Notice Goerge bought 2 jackets,
Bob 2 pants and 2 socks:

SELECT CustomerName, ItemName, COUNT(*)
FROM Purchase
JOIN Item ON Purchase.Item = Item.ItemId
JOIN Customer ON Purchase.Customer = Customer.CustomerId
GROUP BY CustomerName, ItemName

CustomerName ItemName
------------ ---------- -----------
Dan          gloves     1
Dan          hat        1
Goerge       hat        1
Goerge       jacket     2
Bob          pants      2
Dan          pants      1
Bob          shirt      1
Bob          socks      2

This query shows the total number of each item that was sold:

SELECT CASE WHEN GROUPING(ItemName) = 1 THEN 'All Items' ELSE ItemName END
AS ItemName, COUNT(*) AS NumberSold
FROM Purchase
JOIN Item ON Purchase.Item = Item.ItemId
GROUP BY ItemName WITH CUBE

ItemName   NumberSold
---------- -----------
gloves     1
hat        2
jacket     2
pants      3
shirt      1
socks      2
All Items  11

We can see how many distinct cutomers purchased each item with the
following:

SELECT ItemName, COUNT(DISTINCT Customer) AS NumberSold
FROM Purchase
JOIN Item ON Purchase.Item = Item.ItemId
GROUP BY ItemName

ItemName   NumberSold
---------- -----------
gloves     1
hat        2
jacket     1
pants      2
shirt      1
socks      1

What I need to do is find out both with one query.  I would like to be able
to do something like the following:

SELECT ItemName, CustomerName, COUNT(*) AS NumberSold, COUNT(DISTINCT
CustomerId)
FROM Purchase
JOIN Item ON Purchase.Item = Item.ItemId
JOIN Customer ON Purchase.Customer = Customer.CustomerId
GROUP BY ItemName, CustomerName WITH CUBE

This is not possible.  Taking into account that I need a lightning fast
query as I am dealing with a lot of data.  Also, my group by's contain a
much larger number of columns so even if a huge number of unions would
perform, they would be very inconvinient to build.

Matthew Field

****************************************************************
THE FOLLOWING IS THE SQL NEED TO RUN THE EXAMPLE QUERIES
****************************************************************

if exists (select * from sysobjects where id =
object_id(N'[dbo].[Customer]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[Customer]
GO

if exists (select * from sysobjects where id = object_id(N'[dbo].[Item]')
and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[Item]
GO

if exists (select * from sysobjects where id =
object_id(N'[dbo].[Purchase]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[Purchase]
GO

CREATE TABLE [dbo].[Customer] (
[CustomerId] [int] NOT NULL ,
[CustomerName] [char] (10) NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Item] (
[ItemId] [int] NOT NULL ,
[ItemName] [char] (10) NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[Purchase] (
[Item] [int] NOT NULL ,
[Customer] [int] NOT NULL
) ON [PRIMARY]
GO

INSERT INTO Item VALUES (1, 'socks     ')
INSERT INTO Item VALUES (2, 'shirt     ')
INSERT INTO Item VALUES (3, 'pants     ')
INSERT INTO Item VALUES (4, 'jacket    ')
INSERT INTO Item VALUES (5, 'hat       ')
INSERT INTO Item VALUES (6, 'gloves    ')

INSERT INTO Customer VALUES (1, 'Bob       ')
INSERT INTO Customer VALUES (2, 'Dan       ')
INSERT INTO Customer VALUES (3, 'Frank     ')
INSERT INTO Customer VALUES (4, 'Goerge    ')
INSERT INTO Customer VALUES (5, 'Biff      ')

INSERT INTO Purchase VALUES (1, 1)
INSERT INTO Purchase VALUES (2, 1)
INSERT INTO Purchase VALUES (3, 1)
INSERT INTO Purchase VALUES (3, 2)
INSERT INTO Purchase VALUES (6, 2)
INSERT INTO Purchase VALUES (5, 2)
INSERT INTO Purchase VALUES (5, 4)
INSERT INTO Purchase VALUES (4, 4)
INSERT INTO Purchase VALUES (4, 4)
INSERT INTO Purchase VALUES (3, 1)
INSERT INTO Purchase VALUES (1, 1)

Sun, 23 Nov 2003 04:40:53 GMT
Puzzler: Distinct counts on aggregate data using cube

Please give an example of the result set you would like.

David

Quote:
> Hi all,

> I have an odd need.  I have constructed a simple sample of what I am
trying
> to do.  It turns out that my sample is a bit stupid but here goes.

> First, run creation sql at the bottom of this message to create the tables
> and data that you need for my example.

> Now, look at the results of the following.  Notice Goerge bought 2
jackets,
> Bob 2 pants and 2 socks:

> SELECT CustomerName, ItemName, COUNT(*)
> FROM Purchase
> JOIN Item ON Purchase.Item = Item.ItemId
> JOIN Customer ON Purchase.Customer = Customer.CustomerId
> GROUP BY CustomerName, ItemName

> CustomerName ItemName
> ------------ ---------- -----------
> Dan          gloves     1
> Dan          hat        1
> Goerge       hat        1
> Goerge       jacket     2
> Bob          pants      2
> Dan          pants      1
> Bob          shirt      1
> Bob          socks      2

> This query shows the total number of each item that was sold:

> SELECT CASE WHEN GROUPING(ItemName) = 1 THEN 'All Items' ELSE ItemName END
> AS ItemName, COUNT(*) AS NumberSold
> FROM Purchase
> JOIN Item ON Purchase.Item = Item.ItemId
> GROUP BY ItemName WITH CUBE

> ItemName   NumberSold
> ---------- -----------
> gloves     1
> hat        2
> jacket     2
> pants      3
> shirt      1
> socks      2
> All Items  11

> We can see how many distinct cutomers purchased each item with the
> following:

> SELECT ItemName, COUNT(DISTINCT Customer) AS NumberSold
> FROM Purchase
> JOIN Item ON Purchase.Item = Item.ItemId
> GROUP BY ItemName

> ItemName   NumberSold
> ---------- -----------
> gloves     1
> hat        2
> jacket     1
> pants      2
> shirt      1
> socks      1

> What I need to do is find out both with one query.  I would like to be
able
> to do something like the following:

> SELECT ItemName, CustomerName, COUNT(*) AS NumberSold, COUNT(DISTINCT
> CustomerId)
> FROM Purchase
> JOIN Item ON Purchase.Item = Item.ItemId
> JOIN Customer ON Purchase.Customer = Customer.CustomerId
> GROUP BY ItemName, CustomerName WITH CUBE

> This is not possible.  Taking into account that I need a lightning fast
> query as I am dealing with a lot of data.  Also, my group by's contain a
> much larger number of columns so even if a huge number of unions would
> perform, they would be very inconvinient to build.

> Matthew Field

> ****************************************************************
> THE FOLLOWING IS THE SQL NEED TO RUN THE EXAMPLE QUERIES
> ****************************************************************

> if exists (select * from sysobjects where id =
> object_id(N'[dbo].[Customer]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
> drop table [dbo].[Customer]
> GO

> if exists (select * from sysobjects where id = object_id(N'[dbo].[Item]')
> and OBJECTPROPERTY(id, N'IsUserTable') = 1)
> drop table [dbo].[Item]
> GO

> if exists (select * from sysobjects where id =
> object_id(N'[dbo].[Purchase]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
> drop table [dbo].[Purchase]
> GO

> CREATE TABLE [dbo].[Customer] (
>  [CustomerId] [int] NOT NULL ,
>  [CustomerName] [char] (10) NOT NULL
> ) ON [PRIMARY]
> GO

> CREATE TABLE [dbo].[Item] (
>  [ItemId] [int] NOT NULL ,
>  [ItemName] [char] (10) NOT NULL
> ) ON [PRIMARY]
> GO

> CREATE TABLE [dbo].[Purchase] (
>  [Item] [int] NOT NULL ,
>  [Customer] [int] NOT NULL
> ) ON [PRIMARY]
> GO

> INSERT INTO Item VALUES (1, 'socks     ')
> INSERT INTO Item VALUES (2, 'shirt     ')
> INSERT INTO Item VALUES (3, 'pants     ')
> INSERT INTO Item VALUES (4, 'jacket    ')
> INSERT INTO Item VALUES (5, 'hat       ')
> INSERT INTO Item VALUES (6, 'gloves    ')

> INSERT INTO Customer VALUES (1, 'Bob       ')
> INSERT INTO Customer VALUES (2, 'Dan       ')
> INSERT INTO Customer VALUES (3, 'Frank     ')
> INSERT INTO Customer VALUES (4, 'Goerge    ')
> INSERT INTO Customer VALUES (5, 'Biff      ')

> INSERT INTO Purchase VALUES (1, 1)
> INSERT INTO Purchase VALUES (2, 1)
> INSERT INTO Purchase VALUES (3, 1)
> INSERT INTO Purchase VALUES (3, 2)
> INSERT INTO Purchase VALUES (6, 2)
> INSERT INTO Purchase VALUES (5, 2)
> INSERT INTO Purchase VALUES (5, 4)
> INSERT INTO Purchase VALUES (4, 4)
> INSERT INTO Purchase VALUES (4, 4)
> INSERT INTO Purchase VALUES (3, 1)
> INSERT INTO Purchase VALUES (1, 1)

Sun, 23 Nov 2003 05:39:19 GMT
Puzzler: Distinct counts on aggregate data using cube
Interestingly the RAC crosstab and pivoting utility

'Purchase join Item on Purchase.Item = Item.ItemId

'select ItemName,
cast(dbo.racrowfval(rowfunct,1) as varchar(3)) as [distinct cnt],
funct,_pvtcols_ from rac order by rd'

ItemName distinct cnt funct  Totals  Bob   Dan   Goerge
--------     ------------ ----- ------ ------ ------ ------
gloves       1                 cnt   1                   1
hat            2                 cnt   2                   1        1
jacket       1                 cnt   2                             2
pants        2                 cnt   3         2        1
shirt          1                 cnt   1         1
socks       1                 cnt   2         2
Totals      3                  cnt   11      5         3      3

stevie
Check out rac at:
www.angelfire.com/ny4/rac/
Inquire for new s2k beta.

Sun, 23 Nov 2003 06:46:53 GMT
Puzzler: Distinct counts on aggregate data using cube
The result set that I would like would be this:

Item               Customer          NumberSold
----------     -------------         -----------
All Items      All Customers         8
All Items      Bob                       5
All Items      Dan                       3
All Items      Goerge                    3
gloves          All Customers             1
gloves          Dan                       1
hat               All Customers         2
hat               Dan                       1
hat               Goerge                    1
jacket           All Customers         1
jacket           Goerge                    2
pants            All Customers         2
pants            Bob                           2
pants            Dan                           1
shirt              All Customers             1
shirt              Bob                           1
socks            All Customers             1
socks             Bob                           2

Notice that the sock total is the number of unique users that purchased
socks.  This number is 1 even though the total number of socks purchased was
2 because Bob bought 2 pair.  This is the result set that I believe should
be produced if the following query could execute with COUNT(*) replaced with
COUNT(DISTINCT CustomerId):

SELECT CASE WHEN GROUPING(ItemName) = 1 THEN 'All Items' ELSE ItemName END
AS Item,
CASE WHEN GROUPING(CustomerName) = 1 THEN 'All Customers' ELSE CustomerName
END AS Customer,
COUNT(*) AS NumberSold
FROM Purchase
JOIN Item ON Purchase.Item = Item.ItemId
JOIN Customer ON Purchase.Customer = Customer.CustomerId
GROUP BY ItemName, CustomerName WITH CUBE
ORDER BY ItemName, CustomerName

Quote:
> Please give an example of the result set you would like.

> David

> > Hi all,

> > I have an odd need.  I have constructed a simple sample of what I am
> trying
> > to do.  It turns out that my sample is a bit stupid but here goes.

> > First, run creation sql at the bottom of this message to create the
tables
> > and data that you need for my example.

> > Now, look at the results of the following.  Notice Goerge bought 2
> jackets,
> > Bob 2 pants and 2 socks:

> > SELECT CustomerName, ItemName, COUNT(*)
> > FROM Purchase
> > JOIN Item ON Purchase.Item = Item.ItemId
> > JOIN Customer ON Purchase.Customer = Customer.CustomerId
> > GROUP BY CustomerName, ItemName

> > CustomerName ItemName
> > ------------ ---------- -----------
> > Dan          gloves     1
> > Dan          hat        1
> > Goerge       hat        1
> > Goerge       jacket     2
> > Bob          pants      2
> > Dan          pants      1
> > Bob          shirt      1
> > Bob          socks      2

> > This query shows the total number of each item that was sold:

> > SELECT CASE WHEN GROUPING(ItemName) = 1 THEN 'All Items' ELSE ItemName
END
> > AS ItemName, COUNT(*) AS NumberSold
> > FROM Purchase
> > JOIN Item ON Purchase.Item = Item.ItemId
> > GROUP BY ItemName WITH CUBE

> > ItemName   NumberSold
> > ---------- -----------
> > gloves     1
> > hat        2
> > jacket     2
> > pants      3
> > shirt      1
> > socks      2
> > All Items  11

> > We can see how many distinct cutomers purchased each item with the
> > following:

> > SELECT ItemName, COUNT(DISTINCT Customer) AS NumberSold
> > FROM Purchase
> > JOIN Item ON Purchase.Item = Item.ItemId
> > GROUP BY ItemName

> > ItemName   NumberSold
> > ---------- -----------
> > gloves     1
> > hat        2
> > jacket     1
> > pants      2
> > shirt      1
> > socks      1

> > What I need to do is find out both with one query.  I would like to be
> able
> > to do something like the following:

> > SELECT ItemName, CustomerName, COUNT(*) AS NumberSold, COUNT(DISTINCT
> > CustomerId)
> > FROM Purchase
> > JOIN Item ON Purchase.Item = Item.ItemId
> > JOIN Customer ON Purchase.Customer = Customer.CustomerId
> > GROUP BY ItemName, CustomerName WITH CUBE

> > This is not possible.  Taking into account that I need a lightning fast
> > query as I am dealing with a lot of data.  Also, my group by's contain a
> > much larger number of columns so even if a huge number of unions would
> > perform, they would be very inconvinient to build.

> > Matthew Field

> > ****************************************************************
> > THE FOLLOWING IS THE SQL NEED TO RUN THE EXAMPLE QUERIES
> > ****************************************************************

> > if exists (select * from sysobjects where id =
> > object_id(N'[dbo].[Customer]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
> > drop table [dbo].[Customer]
> > GO

> > if exists (select * from sysobjects where id =

object_id(N'[dbo].[Item]')

- Show quoted text -

Quote:
> > and OBJECTPROPERTY(id, N'IsUserTable') = 1)
> > drop table [dbo].[Item]
> > GO

> > if exists (select * from sysobjects where id =
> > object_id(N'[dbo].[Purchase]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
> > drop table [dbo].[Purchase]
> > GO

> > CREATE TABLE [dbo].[Customer] (
> >  [CustomerId] [int] NOT NULL ,
> >  [CustomerName] [char] (10) NOT NULL
> > ) ON [PRIMARY]
> > GO

> > CREATE TABLE [dbo].[Item] (
> >  [ItemId] [int] NOT NULL ,
> >  [ItemName] [char] (10) NOT NULL
> > ) ON [PRIMARY]
> > GO

> > CREATE TABLE [dbo].[Purchase] (
> >  [Item] [int] NOT NULL ,
> >  [Customer] [int] NOT NULL
> > ) ON [PRIMARY]
> > GO

> > INSERT INTO Item VALUES (1, 'socks     ')
> > INSERT INTO Item VALUES (2, 'shirt     ')
> > INSERT INTO Item VALUES (3, 'pants     ')
> > INSERT INTO Item VALUES (4, 'jacket    ')
> > INSERT INTO Item VALUES (5, 'hat       ')
> > INSERT INTO Item VALUES (6, 'gloves    ')

> > INSERT INTO Customer VALUES (1, 'Bob       ')
> > INSERT INTO Customer VALUES (2, 'Dan       ')
> > INSERT INTO Customer VALUES (3, 'Frank     ')
> > INSERT INTO Customer VALUES (4, 'Goerge    ')
> > INSERT INTO Customer VALUES (5, 'Biff      ')

> > INSERT INTO Purchase VALUES (1, 1)
> > INSERT INTO Purchase VALUES (2, 1)
> > INSERT INTO Purchase VALUES (3, 1)
> > INSERT INTO Purchase VALUES (3, 2)
> > INSERT INTO Purchase VALUES (6, 2)
> > INSERT INTO Purchase VALUES (5, 2)
> > INSERT INTO Purchase VALUES (5, 4)
> > INSERT INTO Purchase VALUES (4, 4)
> > INSERT INTO Purchase VALUES (4, 4)
> > INSERT INTO Purchase VALUES (3, 1)
> > INSERT INTO Purchase VALUES (1, 1)

Sun, 23 Nov 2003 21:47:51 GMT
Puzzler: Distinct counts on aggregate data using cube
I have dowloaded your tool and the exec that you have listed below does not

parameters.  What am I doing wrong?

Matt

Quote:
> Interestingly the RAC crosstab and pivoting utility

> 'Purchase join Item on Purchase.Item = Item.ItemId
> join Customer on Purchase.Customer = Customer.CustomerId',

> 'select ItemName,
> cast(dbo.racrowfval(rowfunct,1) as varchar(3)) as [distinct cnt],
> funct,_pvtcols_ from rac order by rd'

> ItemName distinct cnt funct  Totals  Bob   Dan   Goerge
> --------     ------------ ----- ------ ------ ------ ------
> gloves       1                 cnt   1                   1
> hat            2                 cnt   2                   1        1
> jacket       1                 cnt   2                             2
> pants        2                 cnt   3         2        1
> shirt          1                 cnt   1         1
> socks       1                 cnt   2         2
> Totals      3                  cnt   11      5         3      3

> stevie
> Check out rac at:
> www.angelfire.com/ny4/rac/
> Inquire for new s2k beta.

Sun, 23 Nov 2003 22:40:22 GMT
Puzzler: Distinct counts on aggregate data using cube

Quote:
> I have dowloaded your tool and the exec that you have listed below does
not

> parameters.  What am I doing wrong?

This is based on the new s2k beta.I will e-mail the script to you.

stevie

Sun, 23 Nov 2003 23:35:30 GMT

 Page 1 of 1 [ 6 post ]

Relevant Pages