Decoding the binary format of Ingres copy-out data
Author |
Message |
Sean Phel #1 / 8
|
 Decoding the binary format of Ingres copy-out data
Hi, I'm wondering if there's any material available on how Ingres encodes the data files generated by "copy out" when the "-c" option is not used to generate clear text. We have about 10 years worth of archive data stored in this format, and we've moved the database to Oracle. The catch is that we need the ability to de-archive that old data and bring it into the new Oracle database. Therefore, I need to build a utility to convert these data files into something we can load. Does anyone have any knowledge about the binary format for copy out? Thanks in advance, Sean Phelan
|
Tue, 02 Nov 2004 05:07:23 GMT |
|
 |
Martin Bow #2 / 8
|
 Decoding the binary format of Ingres copy-out data
Hi Sean, I suspect that these are dumb questions, but... 1. Other than the data files do you still have the associated copy.in scripts? 2. It is a common practice to do an unloaddb on the databases. Do you have any output from that? Note that in the past ten years if you have been upgrading your Ingres Installation regurly the output format from unloaddb has altered. 3. Do you have any backup tapes of the old (presumably trashed) ingres installation that would hold viable checkpoints? How far back in time do these go? 4. Do you still have a running Ingres Installation? What version is it and have you been upgrading the version regurly? 5. Do you have a record of changes made to the structures of the Ingres database tables?. ie added column x dataatype Y to table fred on this date? If you can answer yes to any of these questions then you may be in luck. What you could do is load the data back into an Ingres database and from that generate files in a more appropriate format for Oracle. The recovery of the data over this time scale may be very difficult and require a lot of different techniques for each individual case. For instance , you may have to restore an old version of Ingres, just to recover a checkpoint file that recreates the database to a point in time close to the time stamp on the copydb data files. From this database some experimentation may solve which data file belongs to which table. Of course their are no garuntees. Martin Bowes Quote: > Hi, > I'm wondering if there's any material available on how Ingres encodes > the data files generated by "copy out" when the "-c" option is not > used to generate clear text. > We have about 10 years worth of archive data stored in this format, > and we've moved the database to Oracle. > The catch is that we need the ability to de-archive that old data and > bring it into the new Oracle database. > Therefore, I need to build a utility to convert these data files into > something we can load. > Does anyone have any knowledge about the binary format for copy out? > Thanks in advance, > Sean Phelan
-- Random Earthworm Jim Quote #20: PsyCrow - Maximum Suckage!
|
Tue, 02 Nov 2004 07:15:03 GMT |
|
 |
Karl & Betty Schend #3 / 8
|
 Decoding the binary format of Ingres copy-out data
Quote: >I'm wondering if there's any material available on how Ingres encodes >the data files generated by "copy out" when the "-c" option is not >used to generate clear text. >We have about 10 years worth of archive data stored in this format, >and we've moved the database to Oracle.
Are you still on the same platform? if your hardware architecture has changed, you may be in trouble. Anyway, the binary output is simply the binary data, each column following with no padding. integer1's are output in a single byte, i2's in two bytes, etc. char(n) takes n bytes with trailing space padding. varchar(n) has a 2-byte leading length followed by n bytes. (all N bytes are always output although only "length" bytes are significant.) A "with null" column is followed by a single byte null indicator, 0 if not null and 1 if null. dates are magical and your best bet is to reinstall Ingres somewhere if you need to translate dates. If you can just skip them, they are 12 bytes in binary. MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will be output as floating 110.0). I'm not sure about decimal but if your data is that old you probably don't have any decimal columns. --
K/B Computer Associates www.kbcomputer.com Ingres, Unix, VMS Consulting and Training
|
Tue, 02 Nov 2004 08:08:14 GMT |
|
 |
Kevin Robinso #4 / 8
|
 Decoding the binary format of Ingres copy-out data
Yes, A C Strcuture should get you around the problem. You'll have to make sure you only read the exact nuumner of bytes into the program and be carefull of things like nulls and the like. I thinj the C instruction is fread(*file,sizeof(struct) ir something like it. A s a test you could set a table up of exactly the same format as the houst table and copy the data back in with () and see what happens. Again beware of nulls and table structures. kevin
Quote:
> >I'm wondering if there's any material available on how Ingres encodes > >the data files generated by "copy out" when the "-c" option is not > >used to generate clear text. > >We have about 10 years worth of archive data stored in this format, > >and we've moved the database to Oracle. > Are you still on the same platform? if your hardware architecture > has changed, you may be in trouble. Anyway, the binary output is > simply the binary data, each column following with no padding. > integer1's are output in a single byte, i2's in two bytes, etc. > char(n) takes n bytes with trailing space padding. > varchar(n) has a 2-byte leading length followed by n bytes. (all N > bytes are always output although only "length" bytes are significant.) > A "with null" column is followed by a single byte null indicator, > 0 if not null and 1 if null. dates are magical and your best bet is > to reinstall Ingres somewhere if you need to translate dates. If you > can just skip them, they are 12 bytes in binary. > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will > be output as floating 110.0). I'm not sure about decimal but if your > data is that old you probably don't have any decimal columns. > --
> K/B Computer Associates www.kbcomputer.com > Ingres, Unix, VMS Consulting and Training
|
Sun, 07 Nov 2004 05:05:31 GMT |
|
 |
Bob McCallist #5 / 8
|
 Decoding the binary format of Ingres copy-out data
Quote:
> > Are you still on the same platform? if your hardware architecture > > has changed, you may be in trouble. Anyway, the binary output is > > simply the binary data, each column following with no padding. > > integer1's are output in a single byte, i2's in two bytes, etc. > > char(n) takes n bytes with trailing space padding. > > varchar(n) has a 2-byte leading length followed by n bytes. (all N > > bytes are always output although only "length" bytes are significant.) > > A "with null" column is followed by a single byte null indicator, > > 0 if not null and 1 if null. dates are magical and your best bet is > > to reinstall Ingres somewhere if you need to translate dates. If you > > can just skip them, they are 12 bytes in binary. > > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will > > be output as floating 110.0). I'm not sure about decimal but if your > > data is that old you probably don't have any decimal columns. > > --
> > K/B Computer Associates www.kbcomputer.com > > Ingres, Unix, VMS Consulting and Training
We wrote a program several years ago to switch the byte order of integer datatypes etc in unloaded ingres data files. If you end up needing the code, just let me know. I think it used the copy.in script as a reference so you may or may not be able to use it.
|
Tue, 09 Nov 2004 04:52:10 GMT |
|
 |
KDP #6 / 8
|
 Decoding the binary format of Ingres copy-out data
If I remember right dates look all messed up but they are not impossible. I spent many hours looking at binary dates in binary ways until I saw they are really not all that binary. If I remember right they are made up of several word length fields in a strange order. For example 29-June-2002 will look like 1432 16448 29 5 2002 33. Notice the 29 the 5 and 2002 it's that dumb. The tricky part is the other fields, they change depending on the data! Some have times some do not and, again it's been a while, but I remember that there is a flag that marks the those with times with a 0 and without times is a 1. The offset between Greenwich and the location where the client application was running is also stored. The time I think is in milliseconds since some odd initial date that never made sense to me, maybe it is the initiation date for GMT or something. The most difficult part for me was getting the date part was just three words that were simple integers holding the simple values after that the rest fell into place. The real problem is knowing the ORIGINAL sctructure of the table. If you don't have that it's a big mess!! If you have a really big pile of old data and it's worth the pain I could get out some old code that can read all that binanry stuff and even figure out the sctrure dynamically by referencing a copy of the system catalogs. If you don't have many tables and they didn't change much over their life it's not worth it but what you are trying to do could become impossible to do by hand, if that is the case let me know. Later, KP
Quote: > Yes, > A C Strcuture should get you around the problem. You'll have to make sure > you only read the exact nuumner of bytes into the program and be carefull of > things like nulls and the like. I thinj the C instruction is > fread(*file,sizeof(struct) ir something like it. > A s a test you could set a table up of exactly the same format as the houst > table and copy the data back in with () and see what happens. Again beware > of nulls and table structures. > kevin
> > >I'm wondering if there's any material available on how Ingres encodes > > >the data files generated by "copy out" when the "-c" option is not > > >used to generate clear text. > > >We have about 10 years worth of archive data stored in this format, > > >and we've moved the database to Oracle. > > Are you still on the same platform? if your hardware architecture > > has changed, you may be in trouble. Anyway, the binary output is > > simply the binary data, each column following with no padding. > > integer1's are output in a single byte, i2's in two bytes, etc. > > char(n) takes n bytes with trailing space padding. > > varchar(n) has a 2-byte leading length followed by n bytes. (all N > > bytes are always output although only "length" bytes are significant.) > > A "with null" column is followed by a single byte null indicator, > > 0 if not null and 1 if null. dates are magical and your best bet is > > to reinstall Ingres somewhere if you need to translate dates. If you > > can just skip them, they are 12 bytes in binary. > > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will > > be output as floating 110.0). I'm not sure about decimal but if your > > data is that old you probably don't have any decimal columns. > > --
> > K/B Computer Associates www.kbcomputer.com > > Ingres, Unix, VMS Consulting and Training
-----------== Posted via Newsfeed.Com - Uncensored Usenet News ==---------- http://www.newsfeed.com The #1 Newsgroup Service in the World! -----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers =-----
|
Mon, 15 Nov 2004 09:47:33 GMT |
|
 |
Kevin Robinso #7 / 8
|
 Decoding the binary format of Ingres copy-out data
for date format , look at he ca ingres support pagese, they have got an article on how they work interally
Quote: > If I remember right dates look all messed up but they are not impossible. I > spent many hours looking at binary dates in binary ways until I saw they are > really not all that binary. > If I remember right they are made up of several word length fields in a > strange order. For example 29-June-2002 will look like 1432 16448 29 > 5 2002 33. Notice the 29 the 5 and 2002 it's that dumb. The tricky > part is the other fields, they change depending on the data! Some have > times some do not and, again it's been a while, but I remember that there is > a flag that marks the those with times with a 0 and without times is a 1. > The offset between Greenwich and the location where the client application > was running is also stored. The time I think is in milliseconds since some > odd initial date that never made sense to me, maybe it is the initiation > date for GMT or something. The most difficult part for me was getting the > date part was just three words that were simple integers holding the simple > values after that the rest fell into place. > The real problem is knowing the ORIGINAL sctructure of the table. If you > don't have that it's a big mess!! If you have a really big pile of old data > and it's worth the pain I could get out some old code that can read all that > binanry stuff and even figure out the sctrure dynamically by referencing a > copy of the system catalogs. If you don't have many tables and they didn't > change much over their life it's not worth it but what you are trying to do > could become impossible to do by hand, if that is the case let me know. > Later, > KP
message
> > Yes, > > A C Strcuture should get you around the problem. You'll have to make sure > > you only read the exact nuumner of bytes into the program and be carefull > of > > things like nulls and the like. I thinj the C instruction is > > fread(*file,sizeof(struct) ir something like it. > > A s a test you could set a table up of exactly the same format as the > houst > > table and copy the data back in with () and see what happens. Again beware > > of nulls and table structures. > > kevin
> > > >I'm wondering if there's any material available on how Ingres encodes > > > >the data files generated by "copy out" when the "-c" option is not > > > >used to generate clear text. > > > >We have about 10 years worth of archive data stored in this format, > > > >and we've moved the database to Oracle. > > > Are you still on the same platform? if your hardware architecture > > > has changed, you may be in trouble. Anyway, the binary output is > > > simply the binary data, each column following with no padding. > > > integer1's are output in a single byte, i2's in two bytes, etc. > > > char(n) takes n bytes with trailing space padding. > > > varchar(n) has a 2-byte leading length followed by n bytes. (all N > > > bytes are always output although only "length" bytes are significant.) > > > A "with null" column is followed by a single byte null indicator, > > > 0 if not null and 1 if null. dates are magical and your best bet is > > > to reinstall Ingres somewhere if you need to translate dates. If you > > > can just skip them, they are 12 bytes in binary. > > > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will > > > be output as floating 110.0). I'm not sure about decimal but if your > > > data is that old you probably don't have any decimal columns. > > > --
> > > K/B Computer Associates www.kbcomputer.com > > > Ingres, Unix, VMS Consulting and Training > -----------== Posted via Newsfeed.Com - Uncensored Usenet News ==---------- > http://www.newsfeed.com The #1 Newsgroup Service in the World! > -----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers
=-----
|
Thu, 18 Nov 2004 05:55:02 GMT |
|
 |
Mark Murph #8 / 8
|
 Decoding the binary format of Ingres copy-out data
The Ingres internal date format is described in http://support.ca.com/techbases/ingres/4040.html Quote:
> for date format , look at he ca ingres support pagese, they have got an > article on how they work interally
> > If I remember right dates look all messed up but they are not impossible. > I > > spent many hours looking at binary dates in binary ways until I saw they > are > > really not all that binary. > > If I remember right they are made up of several word length fields in a > > strange order. For example 29-June-2002 will look like 1432 16448 29 > > 5 2002 33. Notice the 29 the 5 and 2002 it's that dumb. The tricky > > part is the other fields, they change depending on the data! Some have > > times some do not and, again it's been a while, but I remember that there > is > > a flag that marks the those with times with a 0 and without times is a 1. > > The offset between Greenwich and the location where the client application > > was running is also stored. The time I think is in milliseconds since > some > > odd initial date that never made sense to me, maybe it is the initiation > > date for GMT or something. The most difficult part for me was getting the > > date part was just three words that were simple integers holding the > simple > > values after that the rest fell into place. > > The real problem is knowing the ORIGINAL sctructure of the table. If you > > don't have that it's a big mess!! If you have a really big pile of old > data > > and it's worth the pain I could get out some old code that can read all > that > > binanry stuff and even figure out the sctrure dynamically by referencing a > > copy of the system catalogs. If you don't have many tables and they > didn't > > change much over their life it's not worth it but what you are trying to > do > > could become impossible to do by hand, if that is the case let me know. > > Later, > > KP
> message
> > > Yes, > > > A C Strcuture should get you around the problem. You'll have to make > sure > > > you only read the exact nuumner of bytes into the program and be > carefull > > of > > > things like nulls and the like. I thinj the C instruction is > > > fread(*file,sizeof(struct) ir something like it. > > > A s a test you could set a table up of exactly the same format as the > > houst > > > table and copy the data back in with () and see what happens. Again > beware > > > of nulls and table structures. > > > kevin
> > > > >I'm wondering if there's any material available on how Ingres encodes > > > > >the data files generated by "copy out" when the "-c" option is not > > > > >used to generate clear text. > > > > >We have about 10 years worth of archive data stored in this format, > > > > >and we've moved the database to Oracle. > > > > Are you still on the same platform? if your hardware architecture > > > > has changed, you may be in trouble. Anyway, the binary output is > > > > simply the binary data, each column following with no padding. > > > > integer1's are output in a single byte, i2's in two bytes, etc. > > > > char(n) takes n bytes with trailing space padding. > > > > varchar(n) has a 2-byte leading length followed by n bytes. (all N > > > > bytes are always output although only "length" bytes are significant.) > > > > A "with null" column is followed by a single byte null indicator, > > > > 0 if not null and 1 if null. dates are magical and your best bet is > > > > to reinstall Ingres somewhere if you need to translate dates. If you > > > > can just skip them, they are 12 bytes in binary. > > > > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 > will > > > > be output as floating 110.0). I'm not sure about decimal but if your > > > > data is that old you probably don't have any decimal columns. > > > > --
> > > > K/B Computer Associates www.kbcomputer.com > > > > Ingres, Unix, VMS Consulting and Training > > -----------== Posted via Newsfeed.Com - Uncensored Usenet News > ==---------- > > http://www.newsfeed.com The #1 Newsgroup Service in the World! > > -----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers > =-----
|
Thu, 18 Nov 2004 21:14:38 GMT |
|
|
|