Text parsing 
Author Message
 Text parsing

Hello,

I'm having trouble parsing through a big text field in my db.  The text
field is actually the body of an email that looks like this:

-----------------------------------------------
SUBMISSION:

Name: John
Last: Doe
Company: This Company
Address: 123 Main Street
City: Hometown
State: ID
Zip: 48839
Country: United States

Notes: This is the notes field.
-----------------------------------------------

Does anybody know how I can parse these fields out so I can populate their
respective fields in the db?  Any help is greatly appreciated.

Lee
--
Remove "NOSPAM" to personally email me.



Mon, 07 Feb 2005 22:32:10 GMT
 Text parsing


Quote:
> Hello,

> I'm having trouble parsing through a big text field in my db.  The text
> field is actually the body of an email that looks like this:

> -----------------------------------------------
> SUBMISSION:

> Name: John
> Last: Doe
> Company: This Company
> Address: 123 Main Street
> City: Hometown
> State: ID
> Zip: 48839
> Country: United States

> Notes: This is the notes field.
> -----------------------------------------------

> Does anybody know how I can parse these fields out so I can populate their
> respective fields in the db?  Any help is greatly appreciated.

> Lee
> --
> Remove "NOSPAM" to personally email me.

It's a matter of using the text functions until you get what you want.  I
suggest making liberal use of the Middle function.

To extract Company, for instance:

Middle(Text, Position(Text,"Company:",1,1)+9,
Position(Text,"Address:",1,1)-Position(Text,"Company:",1,1)-10)

That is, Take the middle text starting from the 1st instance of ":"
(counting from the 1st letter of the text) plus 9 characters ("Company:" and
the space), the length of which string is the same as the number of
characters between the 1st instance of "Address:" and the 1st instance of
"Company:" (both counting from the 1st letter of the text), less 10
characters (the number of characters in "Address:", plus the space and the
return).

This function returns "This Company" from your text.

Someone probably has a more elegant solution, but this works.

Good luck.

Michael Johnson



Tue, 08 Feb 2005 00:23:36 GMT
 Text parsing
Name: John
Last: Doe
Company: This Company
Address: 123 Main Street
City: Hometown
State: ID
Zip: 48839
Country: United States

Notes: This is the notes field.

Text parsing uses markers of some sort to locate the desired text. If your
text actually includes the strings as you have posted them, i.e., "Name: ",
"Last: ", etc. then THEY are your markers. By 'markers', I mean use the
Middle() function with the Position() function, and figure out the
Position() of the marker. For 'Company', for instance, where 'Body' is the
subject field:

Middle(
   Body,
   Position(Body, "Company", 1,1) + 9,
   (Position(Body, "Address", 1,1) -1)-
   (Position(Body, "Company", 1,1) + 9)
           )

The markers are 'Company' and 'Address': you want the text between them,
minus the returns.

The first argument, "Body" simply states which field contains the text to be
parsed.

The next argument, "Position(Body, "Company", 1,1) + 9", states where to
start extracting, which, in this instance, is 9 characters after the first
letter of the word 'Company'.

The third argument:

(Position(Body, "Address", 1,1) -1)-
(Position(Body, "Company", 1,1) + 9)

...states the length of the text to be extracted as the beginning of the
next line minus 1 (the carriage return) minus the same start point given in
the second arg.

Carriage returns are another very useful marker. The Position() function's
last argument is the iteration of the string you want to use, and any list
contains many iterations of a carriage return. Knowing that, you can parse
text lists by counting returns.

--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance


Quote:
> Hello,

> I'm having trouble parsing through a big text field in my db.  The text
> field is actually the body of an email that looks like this:

> -----------------------------------------------
> SUBMISSION:

> Name: John
> Last: Doe
> Company: This Company
> Address: 123 Main Street
> City: Hometown
> State: ID
> Zip: 48839
> Country: United States

> Notes: This is the notes field.
> -----------------------------------------------

> Does anybody know how I can parse these fields out so I can populate their
> respective fields in the db?  Any help is greatly appreciated.

> Lee
> --
> Remove "NOSPAM" to personally email me.



Tue, 08 Feb 2005 02:48:43 GMT
 Text parsing
That works great except I can't seem to get the Notes: field.  The notes can
wrap to the next line and then after that next line of text is just a string
of "-" dashes.  Do I have to position it to exactly how many dashes there
are (74, yeah I counted!) or how is this done?  Again, this is exactly how
the text of the email comes in as:

--------------------------------------------------------------------------
SUBMISSION:

Name: John
Last: Doe
Company: This Company
Address: 123 Main Street
City: Hometown
State: ID
Zip: 48839
Country: United States

Notes: This is the notes field that may
or may not wrap to the next line.
--------------------------------------------------------------------------

Thanks again!
Lee
--
Remove "NOSPAM" to personally email me.


Quote:
> Name: John
> Last: Doe
> Company: This Company
> Address: 123 Main Street
> City: Hometown
> State: ID
> Zip: 48839
> Country: United States

> Notes: This is the notes field.

> Text parsing uses markers of some sort to locate the desired text. If your
> text actually includes the strings as you have posted them, i.e., "Name:
",
> "Last: ", etc. then THEY are your markers. By 'markers', I mean use the
> Middle() function with the Position() function, and figure out the
> Position() of the marker. For 'Company', for instance, where 'Body' is the
> subject field:

> Middle(
>    Body,
>    Position(Body, "Company", 1,1) + 9,
>    (Position(Body, "Address", 1,1) -1)-
>    (Position(Body, "Company", 1,1) + 9)
>            )

> The markers are 'Company' and 'Address': you want the text between them,
> minus the returns.

> The first argument, "Body" simply states which field contains the text to
be
> parsed.

> The next argument, "Position(Body, "Company", 1,1) + 9", states where to
> start extracting, which, in this instance, is 9 characters after the first
> letter of the word 'Company'.

> The third argument:

> (Position(Body, "Address", 1,1) -1)-
> (Position(Body, "Company", 1,1) + 9)

> ...states the length of the text to be extracted as the beginning of the
> next line minus 1 (the carriage return) minus the same start point given
in
> the second arg.

> Carriage returns are another very useful marker. The Position() function's
> last argument is the iteration of the string you want to use, and any list
> contains many iterations of a carriage return. Knowing that, you can parse
> text lists by counting returns.

> --
> John Weinshel
> Datagrace
> Vashon Island, WA
> (206) 463-1634
> Associate Member, Filemaker Solutions Alliance



> > Hello,

> > I'm having trouble parsing through a big text field in my db.  The text
> > field is actually the body of an email that looks like this:

> > -----------------------------------------------
> > SUBMISSION:

> > Name: John
> > Last: Doe
> > Company: This Company
> > Address: 123 Main Street
> > City: Hometown
> > State: ID
> > Zip: 48839
> > Country: United States

> > Notes: This is the notes field.
> > -----------------------------------------------

> > Does anybody know how I can parse these fields out so I can populate
their
> > respective fields in the db?  Any help is greatly appreciated.

> > Lee
> > --
> > Remove "NOSPAM" to personally email me.



Tue, 08 Feb 2005 04:40:11 GMT
 Text parsing


Quote:
> That works great except I can't seem to get the Notes: field.  The notes
can
> wrap to the next line and then after that next line of text is just a
string
> of "-" dashes.  Do I have to position it to exactly how many dashes there
> are (74, yeah I counted!) or how is this done?  Again, this is exactly how
> the text of the email comes in as:

> --------------------------------------------------------------------------
> SUBMISSION:

> Name: John
> Last: Doe
> Company: This Company
> Address: 123 Main Street
> City: Hometown
> State: ID
> Zip: 48839
> Country: United States

> Notes: This is the notes field that may
> or may not wrap to the next line.
> --------------------------------------------------------------------------

> Thanks again!
> Lee
> --

Might try some substitute functions to make things prettier, then in your
Middle function, instead of using the two position functions to determine
length, give a healthy number of characters for the allowable length of
Notes (200?).  Use the Substitute function to get rid of the paragraph
formatting.

 Substitute
    (
    Middle
        (
        Substitute( Body,
"--------------------------------------------------------------------------"
, "" ),
        Position( Body, "Notes:",1,1) + 7,
        200
        ),
    <paragraph symbol>,
     " "
    )

Michael Johnson



Tue, 08 Feb 2005 05:06:51 GMT
 Text parsing
That works great except I can't seem to get the Notes: field.  The notes can
wrap to the next line and then after that next line of text is just a string
of "-" dashes.  Do I have to position it to exactly how many dashes there
are (74, yeah I counted!) or how is this done?  Again, this is exactly how
the text of the email comes in as:

--------------------------------------------------------------------------
SUBMISSION:

Name: John
Last: Doe
Company: This Company
Address: 123 Main Street
City: Hometown
State: ID
Zip: 48839
Country: United States

Notes: This is the notes field that may
or may not wrap to the next line.
--------------------------------------------------------------------------

Thanks again!
Lee
--
Remove "NOSPAM" to personally email me.


Quote:
> Name: John
> Last: Doe
> Company: This Company
> Address: 123 Main Street
> City: Hometown
> State: ID
> Zip: 48839
> Country: United States

> Notes: This is the notes field.

> Text parsing uses markers of some sort to locate the desired text. If your
> text actually includes the strings as you have posted them, i.e., "Name:
",
> "Last: ", etc. then THEY are your markers. By 'markers', I mean use the
> Middle() function with the Position() function, and figure out the
> Position() of the marker. For 'Company', for instance, where 'Body' is the
> subject field:

> Middle(
>    Body,
>    Position(Body, "Company", 1,1) + 9,
>    (Position(Body, "Address", 1,1) -1)-
>    (Position(Body, "Company", 1,1) + 9)
>            )

> The markers are 'Company' and 'Address': you want the text between them,
> minus the returns.

> The first argument, "Body" simply states which field contains the text to
be
> parsed.

> The next argument, "Position(Body, "Company", 1,1) + 9", states where to
> start extracting, which, in this instance, is 9 characters after the first
> letter of the word 'Company'.

> The third argument:

> (Position(Body, "Address", 1,1) -1)-
> (Position(Body, "Company", 1,1) + 9)

> ...states the length of the text to be extracted as the beginning of the
> next line minus 1 (the carriage return) minus the same start point given
in
> the second arg.

> Carriage returns are another very useful marker. The Position() function's
> last argument is the iteration of the string you want to use, and any list
> contains many iterations of a carriage return. Knowing that, you can parse
> text lists by counting returns.

> --
> John Weinshel
> Datagrace
> Vashon Island, WA
> (206) 463-1634
> Associate Member, Filemaker Solutions Alliance



> > Hello,

> > I'm having trouble parsing through a big text field in my db.  The text
> > field is actually the body of an email that looks like this:

> > -----------------------------------------------
> > SUBMISSION:

> > Name: John
> > Last: Doe
> > Company: This Company
> > Address: 123 Main Street
> > City: Hometown
> > State: ID
> > Zip: 48839
> > Country: United States

> > Notes: This is the notes field.
> > -----------------------------------------------

> > Does anybody know how I can parse these fields out so I can populate
their
> > respective fields in the db?  Any help is greatly appreciated.

> > Lee
> > --
> > Remove "NOSPAM" to personally email me.



Tue, 08 Feb 2005 04:40:11 GMT
 Text parsing
As you work your way towards understanding the various text functions, you
should be able to answer similar questions yourself.

In this case, the last item in the list has no second marker for its end
point, so you want to use something else, which is the very fact it is the
last item, allowing you to use Right() instead of Middle():

Right(
Body,
Length(Body)-
6-
Position(Body, "Notes", 1,1)
        )

Where you know there will be a bunch of unwanted characters, use the
Substitute() function to remove them.

--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance


Quote:
> That works great except I can't seem to get the Notes: field.  The notes
can
> wrap to the next line and then after that next line of text is just a
string
> of "-" dashes.  Do I have to position it to exactly how many dashes there
> are (74, yeah I counted!) or how is this done?  Again, this is exactly how
> the text of the email comes in as:

> --------------------------------------------------------------------------
> SUBMISSION:

> Name: John
> Last: Doe
> Company: This Company
> Address: 123 Main Street
> City: Hometown
> State: ID
> Zip: 48839
> Country: United States

> Notes: This is the notes field that may
> or may not wrap to the next line.
> --------------------------------------------------------------------------

> Thanks again!
> Lee
> --
> Remove "NOSPAM" to personally email me.



> > Name: John
> > Last: Doe
> > Company: This Company
> > Address: 123 Main Street
> > City: Hometown
> > State: ID
> > Zip: 48839
> > Country: United States

> > Notes: This is the notes field.

> > Text parsing uses markers of some sort to locate the desired text. If
your
> > text actually includes the strings as you have posted them, i.e., "Name:
> ",
> > "Last: ", etc. then THEY are your markers. By 'markers', I mean use the
> > Middle() function with the Position() function, and figure out the
> > Position() of the marker. For 'Company', for instance, where 'Body' is
the
> > subject field:

> > Middle(
> >    Body,
> >    Position(Body, "Company", 1,1) + 9,
> >    (Position(Body, "Address", 1,1) -1)-
> >    (Position(Body, "Company", 1,1) + 9)
> >            )

> > The markers are 'Company' and 'Address': you want the text between them,
> > minus the returns.

> > The first argument, "Body" simply states which field contains the text
to
> be
> > parsed.

> > The next argument, "Position(Body, "Company", 1,1) + 9", states where to
> > start extracting, which, in this instance, is 9 characters after the
first
> > letter of the word 'Company'.

> > The third argument:

> > (Position(Body, "Address", 1,1) -1)-
> > (Position(Body, "Company", 1,1) + 9)

> > ...states the length of the text to be extracted as the beginning of the
> > next line minus 1 (the carriage return) minus the same start point given
> in
> > the second arg.

> > Carriage returns are another very useful marker. The Position()
function's
> > last argument is the iteration of the string you want to use, and any
list
> > contains many iterations of a carriage return. Knowing that, you can
parse
> > text lists by counting returns.

> > --
> > John Weinshel
> > Datagrace
> > Vashon Island, WA
> > (206) 463-1634
> > Associate Member, Filemaker Solutions Alliance



> > > Hello,

> > > I'm having trouble parsing through a big text field in my db.  The
text
> > > field is actually the body of an email that looks like this:

> > > -----------------------------------------------
> > > SUBMISSION:

> > > Name: John
> > > Last: Doe
> > > Company: This Company
> > > Address: 123 Main Street
> > > City: Hometown
> > > State: ID
> > > Zip: 48839
> > > Country: United States

> > > Notes: This is the notes field.
> > > -----------------------------------------------

> > > Does anybody know how I can parse these fields out so I can populate
> their
> > > respective fields in the db?  Any help is greatly appreciated.

> > > Lee
> > > --
> > > Remove "NOSPAM" to personally email me.



Tue, 08 Feb 2005 06:08:34 GMT
 
 [ 7 post ] 

 Relevant Pages 

1. Reply Text - parsing Text

2. SQL text parse util

3. SQL Text Parsing

4. C# RegEx text parsing???

5. SQL text parse util

6. CMS TextMap - Visual text parsing/processing system

7. text parsing on multiple line field - questiom?

8. Newbie text parsing question

9. Text Parsing Calculation Question...

10. C# RegEx text parsing???

11. Need examples of a text parsing program.

12. Parsing a text data type variable or field


 
Powered by phpBB® Forum Software