request for database bug 'war stories' 

Hello--

I'm looking for anecdotes describing debugging experiences with
database systems. In particular, I want to hear about how you've
solved particularly difficult bugs that were a real headache in a real
system. The bugs could have occurred in any aspect of a database:
problems with the user interface, the SQL, processing efficiency, data
model, stored procedures/triggers/constraints, etc. I'm interested in
both solved and unsolved bugs.

I'd like to hear about how you solved the problem--did you use any
supporting tools, did you have a systematic approach for homing in on
the bug, did a solution suddenly come to you out of nowhere? What made
this problem particularly memorable? Indeed, was a solution ever
found?

A brief, stream of consciousness style response (right now!) would be
wonderful, and much better than a carefully worked out story (in a few
days). I can then get back to you with further questions if necessary.

I'm collecting these database bug war stories as part of a research
project that is examining the types of difficult database problems
that database professionals (at all levels) experience. These war
stories will be used for analysis only, and no names or other
identifying characteristics of your tales will be stated in the final
(or any other) report.

Thank you for your help!
Sally Jo Cunningham

Dept. of Computer Science
University of Waikato           phone:  64-7-838-4402
Private Bag 3105                fax:    64-7-838-5095
Hamilton, New Zealand



Fri, 12 Mar 2004 13:27:50 GMT

I can share a silly incident that happened with one of my friends.

They set up SQL Server 7.0 a couple of months back. Server was doing great
and everything was fine. Then they wanted to set up replication between two
SQL Servers. They opened the Enterprise Manager and started clicking on the
replication options, but no wizards were coming up!!! and surprisingly no
error messages were thrown. Out of frustration, they tried stuff like
replacing Enterprise Manager files by copying them from another SQL Server,
reinstalling Enterprise Manager etc. They were all set to reinstall SQL
Server as well. Just before they went ahead with the last resort, he called
me. I asked him to recollect if the replication components were unchecked
while installing SQL Server. He couldn't. So, I asked him to go ahead and
run the setup, and obviously, replication components were not selected. So,
we selected replication components alone, installed them, and now the
replication wizards come up in Enterprise Manager.

The bug here is that the replication options are NOT disabled in Enterprise
Manager, even though replication was not installed. Further, you can click on
those replication menu items, but you don't get any errors. Later we figured
that Microsoft didn't want to disable those options in Enterprise Manager,
because it affects the discoverability of features :-)
--
HTH,
Vyas
SQL Server FAQ, articles, code samples,

http://vyaskn.tripod.com/





Fri, 12 Mar 2004 14:57:07 GMT
Hi Sally,

Interesting question.

It has been my experience that these types of problems are usually something
simple that happens sporadically.

For example, one system I work on was developed by a consultant (I'll call
him Bob). Bob set up this system to track all changes in audit tables
(tables which store every change, insert and update, to the database). One
day I discovered a nightly process was not running. Bob said, "I don't know,
the stored procedure always runs in the Query Analyzer."

After several days of this, I somehow determined the problem. The audit
tables had a field for the user that updated the database. This field was 10
characters long. When the nightly job ran it was trying to put the full
name of the SQL Server Agent account in this field. This name (including
domain) was well over 10 characters. So not surprisingly the procedure
failed.
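
The failure mode in this story (a procedure that works when a person runs it but fails under a service account with a longer name) can be reproduced in miniature. The following is only a sketch under stated assumptions: it uses SQLite through Python rather than SQL Server, the table name `audit_log`, the column `changed_by`, and both account names are hypothetical, and a CHECK constraint stands in for the 10-character field because SQLite does not enforce column widths.

```python
import sqlite3

# Hypothetical audit table. SQLite ignores declared column widths, so a
# CHECK constraint stands in for the 10-character user field in the story.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit_log ("
    " changed_by TEXT NOT NULL CHECK (length(changed_by) <= 10),"
    " action TEXT NOT NULL)"
)

def record_change(user, action):
    """Insert one audit row; return True on success, False if rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO audit_log (changed_by, action) VALUES (?, ?)",
                (user, action),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# A short interactive login fits; a domain-qualified service account does not.
interactive_ok = record_change("bob", "UPDATE")
agent_ok = record_change("CORPDOMAIN\\sqlagentsvc", "UPDATE")
row_count = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

The point of the sketch is that the same statement succeeds or fails depending only on who runs it, which is why testing the procedure by hand never showed the problem.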

Bob wrote another system. Its function was to do various sales reporting.
After about a year (after Bob left), the users complained that the Last Year
figures were not being calculated correctly. When Bob wrote the system,
there was no "Last Year": the data was not yet available. So he wrote all
these procedures without testing them with data (so 0 was a plausible
answer).

I rewrote much of that system. (There were several other problems).

One day, I discovered that TempDB was getting to be massive (like 8-10 GB
when our largest database was about 6 GB). After weeks of trying to figure
out what was happening, including talking to Microsoft as I thought there was
a bug in SQL Server, I wrote a procedure to track the size of TempDB. I soon
realized it was growing primarily when it was building the temporary holding
tables for the reporting system Bob wrote.
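
The tracking idea (sample the size at intervals, note what was running, then look for the biggest jump) can be sketched as follows. This is not the procedure from the story, which is not shown in the post; it is a minimal Python illustration, and the `SizeSample` type, the labels, and every number in `samples` are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class SizeSample:
    label: str      # what was running when the sample was taken (hypothetical)
    size_mb: float  # database size at that moment

def largest_growth(samples):
    """Return (label, growth_mb) for the step with the biggest size increase."""
    best_label, best_growth = None, 0.0
    for prev, cur in zip(samples, samples[1:]):
        growth = cur.size_mb - prev.size_mb
        if growth > best_growth:
            best_label, best_growth = cur.label, growth
    return best_label, best_growth

# Illustrative samples only; these are not figures from the story.
samples = [
    SizeSample("idle", 200.0),
    SizeSample("nightly import", 260.0),
    SizeSample("report holding tables", 7900.0),
    SizeSample("idle", 7900.0),
]
culprit, growth = largest_growth(samples)
```

Pairing each sample with a label for the running job is what turns a size log into a pointer at the responsible code.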

Looking more closely at this code I realized some major inefficiencies in
the way it was doing things. The main one was that it was creating
temporary tables which included records for sales down to the ticket level
of sales detail (every sale in our 380 retail stores).

HTH

--
-Dick Christoph

http://www1.minn.net/~dchristo





Fri, 12 Mar 2004 20:09:56 GMT
Hi Sally

Not long after leaving Hamilton, NZ (my home town) in '85 I had the
misfortune to work for a very successful company in London, England.

I say misfortune because had the company not been quite so successful
the database would have been fit for purpose. As it turned out it was
taking them 3 days to generate their weekly report (to tape) and 5
days to do the 4-weekly period report. The worst thing was that the
report runs occupied the only available backup device - back in the
days of 1/2" tape. The revenue generated by the reports was vitally
important to the company - so important that the decision had been
made to forego the backup that was usually performed prior to each
report-run. Not only that but the heads were never cleaned - even
though, as you will see, the tape unit was *very* heavily used and
vitally important to the company.

My first involvement with the system was on a Thursday morning. They
had had something called "Group Format Errors" (essentially database
corruption caused by ???) which meant they had to abort the weekly
report-run in order to attempt to restore from the latest backup tape
and re-enter the data captured since then. Problem. The backup tape
had parity errors. Bigger problem - that tape had been cut over a week
beforehand and the one prior to it was two weeks old. Oh what fun we
had! It was all smiles and laughter I can tell you - except that I
would be lying big-time.

The root cause of all of this mayhem was the database design and also
the lack of database management expertise of any kind. Poor
operational management was a peripheral problem ;). The files had been
originally created when there were about a dozen customers and about a
hundred product-lines to be tracked in 120 stores. Over time these
figures grew rapidly but the files were never "re-sized". Hence almost
all of the data was in linked, or overflow, space. It is in just this
sort of environment that GFEs breed. Additionally, the system had
been designed and written without any form of indexing whatsoever - so
that a great deal of time was spent selecting and sorting files with
large keys.

The first thing I did, after restoring from the latest (but very old)
backup tape was resize the files. With all of the data now in
contiguous file space every step immediately became quicker. While the
operators were hard at work updating the data I identified and built
the most effective indexes. I then cleaned the heads on the tape drive
and backed up the system. We were then able to rerun the weekly report
in 18 hours. So far so good. Now came the real challenge. Over the
next 30 months I rewrote the software and redesigned the database -
with the effect that they were able to produce what was essentially
the period report every week - in around 5 hours.

The most rewarding part of redesigning the database was in separating
the OLTP stuff from the OLAP and writing bridging software that gave
all of the users what they wanted. The people hammering away at their
keyboards updating product profiles got the speed they needed, the OCR
was running its socks off capturing the data that had been sent in
and management was able to pull up the latest data in boardroom
presentations. I kinda wish they had stuck with Pick and sometimes
wonder how they're getting on with the DEC/Vax they replaced it with.
I had just managed to turn them around and get them to appreciate what
could be done on a multidimensional database and they had already
committed too heavily to the DEC/Vax development. Still - it was fun.

Cheers
Mike.




Fri, 12 Mar 2004 22:01:12 GMT
A few tidbits of my years of experience:

Using a database system called MUMPS, which I had never heard of before
being hired as the DP department (the WHOLE department), I taught myself
MUMPS well enough to find that the previous programmer had written a number
of core programs for a large donor list tracking system with a nasty flaw:
when a new "branch" was created in the database, the specific record that
created the branch could never be accessed again.  Because of the way MUMPS
uses a tree structure to store its data, and the number and depth of
branches in this particular database, about 15% of the database was either
(hopefully) duplicated, or (worse for the organization) never accessed again.

Another time in a Multivalue Database, two record key values were written
using a system delimiter.  This caused, ultimately, three data records to be
created and you couldn't ever be sure which record you were going to get in
reports or entry screens.  Took me 3 days to figure out the first time it
happened then another week to find the program that was causing it.

Then there was the time that an OS failure caused duplicate records to be
written into a file that used the same record key.  I ended up using a hex
editor to hack the database and repair the damage.

Another thing that happened was I filled up an old 80 Meg DISK Pack (old
multi platter DEC system) and because of the age of the O/S, it wrote file
pointers of where to return to before it wrote where it was coming from.
Effectively, I wrote off the end of the DISK Pack and couldn't get back.
Took me a MONTH to hack the system and never did recover the whole database.

Oh, and let's not forget the ONLY time I've ever had to hit "The Panic
Button."  A/C workers turned off the power to the wrong air conditioner and
our system came within 4 degrees of the vendor's "S{*filter*}It" temperature.





Fri, 12 Mar 2004 23:37:22 GMT
You'd probably get better stories if we could send it via email...
Then you could repost for us anonymously....  

I'm a little hesitant to send a more recent story publicly...  ;-)

Once upon a time me and another guy invested 6 months of our lives,
and our job security, in what is now called a CRM system. We wrote it
on a "multiuser" database.

Symptoms were that it worked fine for me and my buddy testing, and
even for 2-3 users, but roll it out to the sales department, and
corrupted indexes within 2 hours.

Rebuild all indexes, died again randomly. Usually a "history" file,
but on great occasion could get it to die on other files.

Stress.

Eventually figured out that record and file locking didn't exactly work
on this great newly released database engine.

Proved it, and called the vendor. "YUR THE ONLY ONE IN HOLE WORLD".
yeah right.
Created a 100 record table. Wrote one program that updated a key
field, and started sequentially at the top. Wrote another program that
started at the bottom. Fired them off simultaneously, and got 100
percent failure rate.
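
The shape of that reproduction (a small table, one writer working from the top, another from the bottom, both started at once) can be sketched in Python. This is an analogy under stated assumptions, not the original programs: the "table" is an in-memory list, the per-record `threading.Lock` objects stand in for the engine's record locking, and the names `run_test` and `damaged` are hypothetical. With working locks, every record should end up updated exactly twice.

```python
import threading

RECORDS = 100  # same table size as in the story

def run_test(lock_factory):
    """Run two opposing writers; return indexes of records with lost updates."""
    table = [0] * RECORDS
    locks = [lock_factory() for _ in range(RECORDS)]

    def worker(order):
        for i in order:
            with locks[i]:          # record-level lock around read-modify-write
                value = table[i]
                table[i] = value + 1

    top_down = threading.Thread(target=worker, args=(range(RECORDS),))
    bottom_up = threading.Thread(
        target=worker, args=(range(RECORDS - 1, -1, -1),)
    )
    top_down.start()
    bottom_up.start()
    top_down.join()
    bottom_up.join()
    # Each record should have been incremented once by each writer.
    return [i for i, v in enumerate(table) if v != 2]

damaged = run_test(threading.Lock)
```

A test this small is easy to hand to a vendor: it has no application logic in it, so any damage it produces can only come from the locking layer.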

Posted to a public bulletin board and got to be the newest beta tester
for the fix!!!!!!! Bad news was that the new beta was MUCH slower, but
at least it worked.  ;-)

This was loong loong ago, on a system now gone, by a vendor out of
business.

Most problems are little stuff. Figure out what the problem isn't, and
purty soon, no matter how unlikely, what is left is at least involved
in the cause of the problem.

-Doug Miller



Sat, 13 Mar 2004 06:24:14 GMT

Hmm, the first one that comes to mind...

I had been consulting for about 10 years... a lot with one particular
company that sold a specialized system for a certain type of business.  This
software could run on Oracle or SQL Server.

ALL my experience at that point had been on SQL Server.

So, at this point I've mostly stopped consulting and am now working for a
new company as an employee....

I get a phone call from the old client of mine...

"We need you to fly to X and fix a customer's database on Monday can you do
it?"

I look at my watch... it's 4:30 PM... on Friday.

"Well, what sort of system?"

"Oracle."

"Sorry, can't help you, I don't know a thing about Oracle."

"That's ok... we'll fly you out anyway."

"No, seriously, you don't understand, I wouldn't even know how to start up
the server."

"That's ok, we'll fly you out anyway."

"No, seriously.... well, ok.. fine... two day minimum.  That means if I get
there Monday morning and fix it in 5 minutes and fly back the same day you
still pay me 2 days."

"Sure."

"Oh and it's X per diem (50% higher than my old rate with them)."

"Sure."

So, Saturday I get an Oracle Press book on Backup and Recovery and proceed
to teach myself a crash course in Oracle.

Sunday I hop on the plane.  Well, not really.  Notice National is closed.
Note, "I have to fly through National".  Manage to get the last seat on the
last plane to my destination.

At Pittsburgh, go through de-icing... twice.  And get refueled while waiting
for de-icing.  Finally take off from Pittsburgh at the time I was originally
scheduled to land at my destination.

Finally get to my destination at 3:45 AM.  Realize I have to be on site at
the customer's at 7:00 AM.

I can tell this is shaping to be a winner of a trip.

Get to the customer site, surprisingly awake.

Customer fills me in... frazzled.... 3 of her other servers are also toast
due to various reasons.

Get in... figure out one table is corrupt.

Easy.. redo logs... no go.

Ok, easy... backups.

"Backups?  Hmm, I think we have one from last week..."

"Well, that's great you'll lose 4 days of production at least..."

"I'm desperate, try it!"

Well, I try it.  What I discover is they have a tape that they would
routinely insert into the tape backup machine and just as routinely remove
and put in storage, never testing to see if an actual backup was made.
Guess what, it wasn't.
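
The lesson here (a backup that has never been checked is not a backup) lends itself to a small automated check. The sketch below is an assumption-laden illustration, not what was done on site: it treats the backup as a plain file copy, the function names `sha256_of` and `backup_is_usable` and all file names are hypothetical, and a checksum comparison only makes sense when the source is not changing. For a real database, a test restore is the stronger check.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of(path):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_is_usable(source, backup):
    """A backup counts only if it exists, is non-empty, and matches the source."""
    if not os.path.isfile(backup) or os.path.getsize(backup) == 0:
        return False
    return sha256_of(source) == sha256_of(backup)

# Demonstration with throwaway files: one real copy, one "tape" that was
# inserted and removed without anything ever being written to it.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "data.db")
good = os.path.join(workdir, "good.bak")
blank = os.path.join(workdir, "blank.bak")
with open(source, "wb") as f:
    f.write(b"production rows" * 1000)
shutil.copyfile(source, good)
open(blank, "wb").close()

good_ok = backup_is_usable(source, good)
blank_ok = backup_is_usable(source, blank)
shutil.rmtree(workdir)
```

Running a check like this after every backup job would have flagged the empty tape on the first night rather than on the day it was needed.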

So... finally in a fit of desperation, I try to dump out the table to a temp
table.

With a little work I get all the records but one!

Drop bad table, rename the temp one, start up the system... success.

Start up the production system... it's working like a charm.  Has some
records backlogged, but nothing it can't cope with.

It's not even 2:00 PM on Monday.

Tell them they can start processing and I'll be back in the morning to
double-check everything.

Things check out the next day and I manage to catch an earlier than
scheduled flight home.

All in all, a fun 48 hours.

Not bad for a guy that knows no Oracle. :-)

(And no, I never want to do that again.)



Sat, 13 Mar 2004 09:32:49 GMT
Could it be the Vendor?

I was the data warehouse architect and lead (and only)  report developer for
a utility in NY State.  Our users were performing ad hoc queries using a
vendor's front-end interface.  All things were going well until one of our
users asked why he couldn't print graphs generated by this interface in
color.  What good is a 15 section pie chart that only prints in grey scale?

We checked his PC, we checked his printer, we checked and re-installed his
interface.  Still no luck.  I couldn't actually print color reports myself
at my office because we only have black and white printers.  But it said
right there in the manual and on-line help that all you needed to do was
check the "Print in Color" option and voilà! it should print in color.

Opened a case with the vendor's Tech Support.  It took them nearly a month
to finally get back to me with an answer.  It seems that in their entire
enterprise (which is world wide) there are NO COLOR PRINTERS.  This little
feature was never actually tested by the vendor!  It was logged as a bug and
the next release fixed it.



Sun, 14 Mar 2004 18:58:52 GMT
Thank you all for your stories!  They are very helpful to me in this
project, and are a hoot to read as well. Not many research projects
have such interesting data!

If you have any other stories that you'd like to send me direct, my

Cheers,
Sally Jo




Tue, 16 Mar 2004 13:57:56 GMT

Quote:

>> I'd like to hear about how you solved the problem--did you use any
>> supporting tools, did you have a systematic approach for homing in on
>> the bug, did a solution suddenly come to you out of nowhere? What made
>> this problem particularly memorable?

So far, the stories I've read have not really included the answers to
these questions, have they? They all seem to state the problem and
then jump immediately to the solution without detailing the steps they
took during the course of attempting to solve it. This is, of
course, difficult. People tend to forget the "false starts" and "wild
goose chases" and focus instead on the solution to the problem.  Are
you really getting what you need from these stories?

I still think you would get much more material by just browsing the
messages from the lists you've posted this to. For example, I just
posted a message with the subject "Intermittent timeouts inserting
into temp table" on the comp.databases.ms-sqlserver list. It is a
request for assistance, so it has no solution (yet), but it does
detail some of the steps I've taken to try to identify the problem.
Again, however, it does not include a couple of the false starts I
took, mainly because I didn't think of them while writing it and
because it was already long enough.

Hope this helps,
Bob Barrows



Tue, 16 Mar 2004 21:21:40 GMT
 