Decoding the binary format of Ingres copy-out data 
Author Message
 Decoding the binary format of Ingres copy-out data

Hi,

I'm wondering if there's any material available on how Ingres encodes
the data files generated by "copy out" when the "-c" option is not
used to generate clear text.

We have about 10 years worth of archive data stored in this format,
and we've moved the database to Oracle.

The catch is that we need the ability to de-archive that old data and
bring it into the new Oracle database.

Therefore, I need to build a utility to convert these data files into
something we can load.

Does anyone have any knowledge about the binary format for copy out?

Thanks in advance,
Sean Phelan



Tue, 02 Nov 2004 05:07:23 GMT
 Decoding the binary format of Ingres copy-out data

Hi Sean,

    I suspect that these are dumb questions, but...
    1.    Other than the data files do you still have the associated copy.in
        scripts?

    2.  It is a common practice to do an unloaddb on the databases. Do you have
        any output from that? Note that in the past ten years if you have been
        upgrading your Ingres Installation regurly the output format from
        unloaddb has altered.

    3.  Do you have any backup tapes of the old (presumably trashed) ingres
        installation that would hold viable checkpoints? How far back in time do
        these go?

    4.  Do you still have a running Ingres Installation? What version is it
        and have you been upgrading the version regurly?

    5.  Do you have a record of changes made to the structures of the Ingres
        database tables?. ie added column x dataatype Y to table fred on this
        date?

    If you can answer yes to any of these questions then you may be in luck.
    What you could do is load the data back into an Ingres database  and from
    that generate files in a more appropriate format for Oracle.

    The recovery of the data over this time scale may be very difficult and
    require a lot of different techniques for each individual case. For instance
    , you may have to restore an old version of Ingres, just to recover a
    checkpoint file that recreates the database to a point in time close to the
    time stamp on the copydb data files. From this database some experimentation
    may solve which data file belongs to which table. Of course their are no
    garuntees.

    Martin Bowes

Quote:

> Hi,

> I'm wondering if there's any material available on how Ingres encodes
> the data files generated by "copy out" when the "-c" option is not
> used to generate clear text.

> We have about 10 years worth of archive data stored in this format,
> and we've moved the database to Oracle.

> The catch is that we need the ability to de-archive that old data and
> bring it into the new Oracle database.

> Therefore, I need to build a utility to convert these data files into
> something we can load.

> Does anyone have any knowledge about the binary format for copy out?

> Thanks in advance,
> Sean Phelan

--
Random Earthworm Jim Quote #20:
PsyCrow - Maximum Suckage!


Tue, 02 Nov 2004 07:15:03 GMT
 Decoding the binary format of Ingres copy-out data

Quote:

>I'm wondering if there's any material available on how Ingres encodes
>the data files generated by "copy out" when the "-c" option is not
>used to generate clear text.

>We have about 10 years worth of archive data stored in this format,
>and we've moved the database to Oracle.

Are you still on the same platform?  if your hardware architecture
has changed, you may be in trouble.  Anyway, the binary output is
simply the binary data, each column following with no padding.
integer1's are output in a single byte, i2's in two bytes, etc.
char(n) takes n bytes with trailing space padding.
varchar(n) has a 2-byte leading length followed by n bytes.  (all N
bytes are always output although only "length" bytes are significant.)
A "with null" column is followed by a single byte null indicator,
0 if not null and 1 if null.  dates are magical and your best bet is
to reinstall Ingres somewhere if you need to translate dates.  If you
can just skip them, they are 12 bytes in binary.
MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will
be output as floating 110.0).  I'm not sure about decimal but if your
data is that old you probably don't have any decimal columns.

--

K/B Computer Associates       www.kbcomputer.com
Ingres, Unix, VMS             Consulting and Training



Tue, 02 Nov 2004 08:08:14 GMT
 Decoding the binary format of Ingres copy-out data
Yes,

A C Strcuture should get you around the problem. You'll have to make sure
you only read the exact nuumner of bytes into the program and be carefull of
things like nulls and the like. I thinj the C instruction is
fread(*file,sizeof(struct) ir something like it.

A s a test you could set a table up of exactly the same format as the houst
table and copy the data back in with () and see what happens. Again beware
of nulls and table structures.

kevin



Quote:

> >I'm wondering if there's any material available on how Ingres encodes
> >the data files generated by "copy out" when the "-c" option is not
> >used to generate clear text.

> >We have about 10 years worth of archive data stored in this format,
> >and we've moved the database to Oracle.

> Are you still on the same platform?  if your hardware architecture
> has changed, you may be in trouble.  Anyway, the binary output is
> simply the binary data, each column following with no padding.
> integer1's are output in a single byte, i2's in two bytes, etc.
> char(n) takes n bytes with trailing space padding.
> varchar(n) has a 2-byte leading length followed by n bytes.  (all N
> bytes are always output although only "length" bytes are significant.)
> A "with null" column is followed by a single byte null indicator,
> 0 if not null and 1 if null.  dates are magical and your best bet is
> to reinstall Ingres somewhere if you need to translate dates.  If you
> can just skip them, they are 12 bytes in binary.
> MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will
> be output as floating 110.0).  I'm not sure about decimal but if your
> data is that old you probably don't have any decimal columns.

> --

> K/B Computer Associates       www.kbcomputer.com
> Ingres, Unix, VMS             Consulting and Training



Sun, 07 Nov 2004 05:05:31 GMT
 Decoding the binary format of Ingres copy-out data

Quote:


> > Are you still on the same platform?  if your hardware architecture
> > has changed, you may be in trouble.  Anyway, the binary output is
> > simply the binary data, each column following with no padding.
> > integer1's are output in a single byte, i2's in two bytes, etc.
> > char(n) takes n bytes with trailing space padding.
> > varchar(n) has a 2-byte leading length followed by n bytes.  (all N
> > bytes are always output although only "length" bytes are significant.)
> > A "with null" column is followed by a single byte null indicator,
> > 0 if not null and 1 if null.  dates are magical and your best bet is
> > to reinstall Ingres somewhere if you need to translate dates.  If you
> > can just skip them, they are 12 bytes in binary.
> > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will
> > be output as floating 110.0).  I'm not sure about decimal but if your
> > data is that old you probably don't have any decimal columns.

> > --

> > K/B Computer Associates       www.kbcomputer.com
> > Ingres, Unix, VMS             Consulting and Training

We wrote a program several years ago to switch the byte order of
integer datatypes etc in unloaded ingres data files.  If you end up
needing the code, just let me know.  I think it used the copy.in
script as a reference so you may or may not be able to use it.


Tue, 09 Nov 2004 04:52:10 GMT
 Decoding the binary format of Ingres copy-out data
If I remember right dates look all messed up but they are not impossible.  I
spent many hours looking at binary dates in binary ways until I saw they are
really not all that binary.

If I remember right they are made up of several word length fields in a
strange order.  For example 29-June-2002 will look like 1432 16448    29
5  2002    33.  Notice the 29 the 5 and 2002 it's that dumb.  The tricky
part is the other fields, they change depending on the data!  Some have
times some do not and, again it's been a while, but I remember that there is
a flag that marks the those with times with a 0 and without times is a 1.
The offset between Greenwich and the location where the client application
was running is also stored.  The time I think is in milliseconds since some
odd initial date that never made sense to me, maybe it is the initiation
date for GMT or something.  The most difficult part for me was getting the
date part was just three words that were simple integers holding the simple
values after that the rest fell into place.

The real problem is knowing the ORIGINAL sctructure of the table.  If you
don't have that it's a big mess!!  If you have a really big pile of old data
and it's worth the pain I could get out some old code that can read all that
binanry stuff and even figure out the sctrure dynamically by referencing a
copy of the system catalogs.  If you don't have many tables and they didn't
change much over their life it's not worth it but what you are trying to do
could become impossible to do by hand, if that is the case let me know.

Later,
KP


Quote:
> Yes,

> A C Strcuture should get you around the problem. You'll have to make sure
> you only read the exact nuumner of bytes into the program and be carefull
of
> things like nulls and the like. I thinj the C instruction is
> fread(*file,sizeof(struct) ir something like it.

> A s a test you could set a table up of exactly the same format as the
houst
> table and copy the data back in with () and see what happens. Again beware
> of nulls and table structures.

> kevin




> > >I'm wondering if there's any material available on how Ingres encodes
> > >the data files generated by "copy out" when the "-c" option is not
> > >used to generate clear text.

> > >We have about 10 years worth of archive data stored in this format,
> > >and we've moved the database to Oracle.

> > Are you still on the same platform?  if your hardware architecture
> > has changed, you may be in trouble.  Anyway, the binary output is
> > simply the binary data, each column following with no padding.
> > integer1's are output in a single byte, i2's in two bytes, etc.
> > char(n) takes n bytes with trailing space padding.
> > varchar(n) has a 2-byte leading length followed by n bytes.  (all N
> > bytes are always output although only "length" bytes are significant.)
> > A "with null" column is followed by a single byte null indicator,
> > 0 if not null and 1 if null.  dates are magical and your best bet is
> > to reinstall Ingres somewhere if you need to translate dates.  If you
> > can just skip them, they are 12 bytes in binary.
> > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1 will
> > be output as floating 110.0).  I'm not sure about decimal but if your
> > data is that old you probably don't have any decimal columns.

> > --

> > K/B Computer Associates       www.kbcomputer.com
> > Ingres, Unix, VMS             Consulting and Training

-----------== Posted via Newsfeed.Com - Uncensored Usenet News ==----------
   http://www.newsfeed.com       The #1 Newsgroup Service in the World!
-----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers =-----


Mon, 15 Nov 2004 09:47:33 GMT
 Decoding the binary format of Ingres copy-out data
for date format , look at he ca ingres support pagese, they have got an
article on how they work interally


Quote:
> If I remember right dates look all messed up but they are not impossible.
I
> spent many hours looking at binary dates in binary ways until I saw they
are
> really not all that binary.

> If I remember right they are made up of several word length fields in a
> strange order.  For example 29-June-2002 will look like 1432 16448    29
> 5  2002    33.  Notice the 29 the 5 and 2002 it's that dumb.  The tricky
> part is the other fields, they change depending on the data!  Some have
> times some do not and, again it's been a while, but I remember that there
is
> a flag that marks the those with times with a 0 and without times is a 1.
> The offset between Greenwich and the location where the client application
> was running is also stored.  The time I think is in milliseconds since
some
> odd initial date that never made sense to me, maybe it is the initiation
> date for GMT or something.  The most difficult part for me was getting the
> date part was just three words that were simple integers holding the
simple
> values after that the rest fell into place.

> The real problem is knowing the ORIGINAL sctructure of the table.  If you
> don't have that it's a big mess!!  If you have a really big pile of old
data
> and it's worth the pain I could get out some old code that can read all
that
> binanry stuff and even figure out the sctrure dynamically by referencing a
> copy of the system catalogs.  If you don't have many tables and they
didn't
> change much over their life it's not worth it but what you are trying to
do
> could become impossible to do by hand, if that is the case let me know.

> Later,
> KP


message

> > Yes,

> > A C Strcuture should get you around the problem. You'll have to make
sure
> > you only read the exact nuumner of bytes into the program and be
carefull
> of
> > things like nulls and the like. I thinj the C instruction is
> > fread(*file,sizeof(struct) ir something like it.

> > A s a test you could set a table up of exactly the same format as the
> houst
> > table and copy the data back in with () and see what happens. Again
beware
> > of nulls and table structures.

> > kevin




> > > >I'm wondering if there's any material available on how Ingres encodes
> > > >the data files generated by "copy out" when the "-c" option is not
> > > >used to generate clear text.

> > > >We have about 10 years worth of archive data stored in this format,
> > > >and we've moved the database to Oracle.

> > > Are you still on the same platform?  if your hardware architecture
> > > has changed, you may be in trouble.  Anyway, the binary output is
> > > simply the binary data, each column following with no padding.
> > > integer1's are output in a single byte, i2's in two bytes, etc.
> > > char(n) takes n bytes with trailing space padding.
> > > varchar(n) has a 2-byte leading length followed by n bytes.  (all N
> > > bytes are always output although only "length" bytes are significant.)
> > > A "with null" column is followed by a single byte null indicator,
> > > 0 if not null and 1 if null.  dates are magical and your best bet is
> > > to reinstall Ingres somewhere if you need to translate dates.  If you
> > > can just skip them, they are 12 bytes in binary.
> > > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1
will
> > > be output as floating 110.0).  I'm not sure about decimal but if your
> > > data is that old you probably don't have any decimal columns.

> > > --

> > > K/B Computer Associates       www.kbcomputer.com
> > > Ingres, Unix, VMS             Consulting and Training

> -----------== Posted via Newsfeed.Com - Uncensored Usenet News
==----------
>    http://www.newsfeed.com       The #1 Newsgroup Service in the World!
> -----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers

=-----


Thu, 18 Nov 2004 05:55:02 GMT
 Decoding the binary format of Ingres copy-out data
The Ingres internal date format is described in
http://support.ca.com/techbases/ingres/4040.html
Quote:

> for date format , look at he ca ingres support pagese, they have got an
> article on how they work interally



> > If I remember right dates look all messed up but they are not impossible.
> I
> > spent many hours looking at binary dates in binary ways until I saw they
> are
> > really not all that binary.

> > If I remember right they are made up of several word length fields in a
> > strange order.  For example 29-June-2002 will look like 1432 16448    29
> > 5  2002    33.  Notice the 29 the 5 and 2002 it's that dumb.  The tricky
> > part is the other fields, they change depending on the data!  Some have
> > times some do not and, again it's been a while, but I remember that there
> is
> > a flag that marks the those with times with a 0 and without times is a 1.
> > The offset between Greenwich and the location where the client application
> > was running is also stored.  The time I think is in milliseconds since
> some
> > odd initial date that never made sense to me, maybe it is the initiation
> > date for GMT or something.  The most difficult part for me was getting the
> > date part was just three words that were simple integers holding the
> simple
> > values after that the rest fell into place.

> > The real problem is knowing the ORIGINAL sctructure of the table.  If you
> > don't have that it's a big mess!!  If you have a really big pile of old
> data
> > and it's worth the pain I could get out some old code that can read all
> that
> > binanry stuff and even figure out the sctrure dynamically by referencing a
> > copy of the system catalogs.  If you don't have many tables and they
> didn't
> > change much over their life it's not worth it but what you are trying to
> do
> > could become impossible to do by hand, if that is the case let me know.

> > Later,
> > KP


> message

> > > Yes,

> > > A C Strcuture should get you around the problem. You'll have to make
> sure
> > > you only read the exact nuumner of bytes into the program and be
> carefull
> > of
> > > things like nulls and the like. I thinj the C instruction is
> > > fread(*file,sizeof(struct) ir something like it.

> > > A s a test you could set a table up of exactly the same format as the
> > houst
> > > table and copy the data back in with () and see what happens. Again
> beware
> > > of nulls and table structures.

> > > kevin




> > > > >I'm wondering if there's any material available on how Ingres encodes
> > > > >the data files generated by "copy out" when the "-c" option is not
> > > > >used to generate clear text.

> > > > >We have about 10 years worth of archive data stored in this format,
> > > > >and we've moved the database to Oracle.

> > > > Are you still on the same platform?  if your hardware architecture
> > > > has changed, you may be in trouble.  Anyway, the binary output is
> > > > simply the binary data, each column following with no padding.
> > > > integer1's are output in a single byte, i2's in two bytes, etc.
> > > > char(n) takes n bytes with trailing space padding.
> > > > varchar(n) has a 2-byte leading length followed by n bytes.  (all N
> > > > bytes are always output although only "length" bytes are significant.)
> > > > A "with null" column is followed by a single byte null indicator,
> > > > 0 if not null and 1 if null.  dates are magical and your best bet is
> > > > to reinstall Ingres somewhere if you need to translate dates.  If you
> > > > can just skip them, they are 12 bytes in binary.
> > > > MONEY data goes out as floating point, but scaled by 100 (i.e. 1.1
> will
> > > > be output as floating 110.0).  I'm not sure about decimal but if your
> > > > data is that old you probably don't have any decimal columns.

> > > > --

> > > > K/B Computer Associates       www.kbcomputer.com
> > > > Ingres, Unix, VMS             Consulting and Training

> > -----------== Posted via Newsfeed.Com - Uncensored Usenet News
> ==----------
> >    http://www.newsfeed.com       The #1 Newsgroup Service in the World!
> > -----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers
> =-----



Thu, 18 Nov 2004 21:14:38 GMT
 
 [ 8 post ] 

 Relevant Pages 

1. Postgresql binary copy file format

2. COPY BINARY file format proposal

3. COPY BINARY file format proposal

4. COPY BINARY file format proposal

5. binary data; date format mask

6. Help with SQL Insert Statement of Binary Formatted data

7. decoding binary base64

8. Decoding Binary Base64

9. COPY ... INTO command and output data format

10. COPY ... INTO command and output data format

11. Saving Binary Data to a Paradox Binary Field via Code

12. help copying data into an Ingres database


 
Powered by phpBB® Forum Software