low cardinality columns and indexing. 
Author Message
 low cardinality columns and indexing.
I am using the standard edition of oracle.  I realize that bitmap indexes
aren't supported.

I have a table that can contain around 50 million records and can be looked
up by column
X.  X has a cardinality of 20.  i.e. it can contain 20 distinct values.  A
lookup by 'X' can
will be done often.

Does it make sense to not use an index at all.  Or is a Btree index better
then nothing.



Sun, 23 May 2004 23:58:04 GMT
 low cardinality columns and indexing.



Quote:
>I am using the standard edition of oracle.  I realize that bitmap indexes
>aren't supported.

>I have a table that can contain around 50 million records and can be looked
>up by column
>X.  X has a cardinality of 20.  i.e. it can contain 20 distinct values.  A
>lookup by 'X' can
>will be done often.

>Does it make sense to not use an index at all.  Or is a Btree index better
>then nothing.

Isn't this a perfect case for INDEX COMPRESSion?

Anyway,

CREATE TABLE Moo (Cow NUMBER);

BEGIN
 FOR A IN 1..20 LOOP
   FOR B IN 1..1000 LOOP
     INSERT INTO Moo VALUES (A);
   END LOOP;
 END LOOP;
END;
/

COMMIT;

SELECT COUNT(*) FROM Moo WHERE Cow = 5;
SELECT COUNT(*) FROM Moo WHERE Cow = 6;
SELECT COUNT(*) FROM Moo WHERE Cow = 7;

For any of them:

First time:   Elapsed: 00:00:00.03
After that:  Elapsed: 00:00:00.01

CREATE INDEX A ON Moo(Cow);

SELECT COUNT(*) FROM Moo WHERE Cow = 8;
SELECT COUNT(*) FROM Moo WHERE Cow = 9;
SELECT COUNT(*) FROM Moo WHERE Cow = 10;

For any of them:
First time: Elapsed: 00:00:00.01
After that: Elapsed: 00:00:00.00

I'd say the index helps. That's with only 20,000 records. You may just
want to test yourself on a larger set.

Brian



Mon, 24 May 2004 01:21:31 GMT
 low cardinality columns and indexing.
I will suggest that you store these 50 millions rows into a
partitioned table by using each cardinality as the key for each
partition.
Quote:

> I am using the standard edition of oracle.  I realize that bitmap indexes
> aren't supported.

> I have a table that can contain around 50 million records and can be looked
> up by column
> X.  X has a cardinality of 20.  i.e. it can contain 20 distinct values.  A
> lookup by 'X' can
> will be done often.

> Does it make sense to not use an index at all.  Or is a Btree index better
> then nothing.



Mon, 24 May 2004 04:19:37 GMT
 low cardinality columns and indexing.
But this becomes non-maintanable if you have, say, 5 columns in a table that
you can do lookups on that have a low cardinality.


Quote:
> I will suggest that you store these 50 millions rows into a
> partitioned table by using each cardinality as the key for each
> partition.




Quote:
> > I am using the standard edition of oracle.  I realize that bitmap
indexes
> > aren't supported.

> > I have a table that can contain around 50 million records and can be
looked
> > up by column
> > X.  X has a cardinality of 20.  i.e. it can contain 20 distinct values.
A
> > lookup by 'X' can
> > will be done often.

> > Does it make sense to not use an index at all.  Or is a Btree index
better
> > then nothing.



Mon, 24 May 2004 05:06:24 GMT
 low cardinality columns and indexing.
That depends on how those values are distributed and how the rows are
distributed in your blocks.  Depending on your block size, you could
have one of your 'X' rows in nearly every block.  Thus, you would do a
range scan of the index and then virtually do a full tablescan as you
would still read in every block.  Now, if you have skewed values, that
is your column X has some values that take up most of the rows, then
an index would help when going after the values that do not exist in
most of the blocks (this is where histograms and the cost based
optimizer come in handy!)  Unfortunately, you've stumbled upon the
biggest problem with using cardinality as a benchmark for the
usefulness of an index.  A classic example is indexing by gender.  The
cardinality will (generally) be 2.  Which would lead you to think that
an index wouldn't be a good idea.  In the general population, this is
true.  Odds are, you will fetch all of the blocks anyway in a query
using gender.  Now, take the case of VMI or the Citidel.  The
cardinality is still two, but you and I both know that 50% of the
school isn't made up of women cadets.  In that case, an index would
REALLY help when you were trying to access female cadet information,
but hurt you when accessing male cadet information.

I apologize for being a bit long winded, but you can't rely solely on
cardinality when making index choices.  You need to know the nature of
your data and the distribution of values in the data set.

HTH

Dan Peacock
DBA
Wolverine World Wide

Quote:

> I am using the standard edition of oracle.  I realize that bitmap indexes
> aren't supported.

> I have a table that can contain around 50 million records and can be looked
> up by column
> X.  X has a cardinality of 20.  i.e. it can contain 20 distinct values.  A
> lookup by 'X' can
> will be done often.

> Does it make sense to not use an index at all.  Or is a Btree index better
> then nothing.



Mon, 24 May 2004 05:44:08 GMT
 low cardinality columns and indexing.
Yes, i understand your point.  I'm assumming that our column 'X' will be
distributed
most evenly.  But you're right, if there is an almost-unique value in a low
cardinality column,
then maybe an index is worth doing.


Quote:
> That depends on how those values are distributed and how the rows are
> distributed in your blocks.  Depending on your block size, you could
> have one of your 'X' rows in nearly every block.  Thus, you would do a
> range scan of the index and then virtually do a full tablescan as you
> would still read in every block.  Now, if you have skewed values, that
> is your column X has some values that take up most of the rows, then
> an index would help when going after the values that do not exist in
> most of the blocks (this is where histograms and the cost based
> optimizer come in handy!)  Unfortunately, you've stumbled upon the
> biggest problem with using cardinality as a benchmark for the
> usefulness of an index.  A classic example is indexing by gender.  The
> cardinality will (generally) be 2.  Which would lead you to think that
> an index wouldn't be a good idea.  In the general population, this is
> true.  Odds are, you will fetch all of the blocks anyway in a query
> using gender.  Now, take the case of VMI or the Citidel.  The
> cardinality is still two, but you and I both know that 50% of the
> school isn't made up of women cadets.  In that case, an index would
> REALLY help when you were trying to access female cadet information,
> but hurt you when accessing male cadet information.

> I apologize for being a bit long winded, but you can't rely solely on
> cardinality when making index choices.  You need to know the nature of
> your data and the distribution of values in the data set.

> HTH

> Dan Peacock
> DBA
> Wolverine World Wide




- Show quoted text -

Quote:
> > I am using the standard edition of oracle.  I realize that bitmap
indexes
> > aren't supported.

> > I have a table that can contain around 50 million records and can be
looked
> > up by column
> > X.  X has a cardinality of 20.  i.e. it can contain 20 distinct values.
A
> > lookup by 'X' can
> > will be done often.

> > Does it make sense to not use an index at all.  Or is a Btree index
better
> > then nothing.



Mon, 24 May 2004 07:07:43 GMT
 
 [ 6 post ] 

 Relevant Pages 

1. When to use bitmap index based on cardinality

2. low fat column?

3. Indexes with low selectivity

4. Get Lowest Column Value in a Row

5. Low Performance due to unused indexes

6. Low Performance Due to unused index

7. indexes with low selectivity

8. Low hit ratio, but only for indexes

9. Usage of indexes with greater/lower then operators

10. Column name in lower case in 8.1

11. Usage of indexes with greater/lower then operators

12. I need to add a new record with a low index number to a Paradox table


 
Powered by phpBB® Forum Software