PDF index file converted into a SQL database
Message-ID:<hinr3k$2lvd$1@saria.nerim.net>
Subject:PDF index file converted into a SQL database?
Date:Thu, 14 Jan 2010 20:25:27 +0100
Hello,
When Acrobat's search engine is run on PDF files, it generates an
IDX index file. Is it possible to convert it into a SQL database?
Thanks a lot for any information about this.
Daniel
Paris
Message-ID:<v7ydnZFa-cEu79LWnZ2dnUVZ_uGdnZ2d@posted.localnet>
Subject:Re: PDF index file converted into a SQL database?
Date:Thu, 14 Jan 2010 20:38:27 +0100
At Thu, 14 Jan 2010 20:25:27 +0100 "Daniel" <daniel.frydman_no_spam@metacrawler.com> wrote:
>
> Hello,
> When Acrobat's search engine is run on PDF files, it generates an
> IDX index file. Is it possible to convert it into a SQL database?
> Thanks a lot for any information about this.
What *exactly* is an 'IDX index file'? If this is some sort of text
file, it should be possible to convert this (with tools like Perl or
awk) to a set of SQL statements, which in turn can be fed to a SQL
client program.
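A minimal sketch of that idea, in Python rather than Perl or awk. It
assumes the file turns out to be plain text with one tab-separated
term/document pair per line. That layout, the table name `pdf_index`,
and the column names are assumptions made for illustration, not anything
known about the IDX format.

```python
import io

def sql_quote(value: str) -> str:
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"

def lines_to_sql(lines, table="pdf_index"):
    """Yield one INSERT statement per tab-separated 'term<TAB>document' line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        term, document = line.split("\t", 1)
        yield (f"INSERT INTO {table} (term, document) "
               f"VALUES ({sql_quote(term)}, {sql_quote(document)});")

# Hypothetical sample input standing in for a text export of the index.
sample = io.StringIO("railroad\tmanual.pdf\nO'Brien\tletters.pdf\n")
statements = list(lines_to_sql(sample))
for stmt in statements:
    print(stmt)
```

The printed statements could then be saved to a file and fed to whatever
SQL client program is in use.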
--
Robert Heller -- 978-544-6933
Deepwoods Software -- Download the Model Railroad System
http://www.deepsoft.com/ -- Binaries for Linux and MS-Windows
heller@deepsoft.com -- http://www.deepsoft.com/ModelRailroadSystem/
Message-ID:<hipm2u$8c1$1@saria.nerim.net>
Subject:Re: PDF index file converted into a SQL database?
Date:Fri, 15 Jan 2010 13:11:58 +0100
The IDX file is the output format of Acrobat's index function. Opened as
a text file, its content is full of Acrobat-specific encoded data.
That's the problem: how to convert an Acrobat-specific file into a standard
data file suitable for a SQL database?
Daniel
Message-ID:<PuWdnS4oiL2T783WnZ2dnUVZ_hmdnZ2d@posted.localnet>
Subject:Re: PDF index file converted into a SQL database?
Date:Fri, 15 Jan 2010 14:47:58 +0100
At Fri, 15 Jan 2010 13:11:58 +0100 "Daniel" <daniel.frydman_no_spam@metacrawler.com> wrote:
>
> The IDX file is the output format of Acrobat's index function. Opened as
> a text file, its content is full of Acrobat-specific encoded data.
> That's the problem: how to convert an Acrobat-specific file into a standard
> data file suitable for a SQL database?
You'll need to figure out what Adobe is doing and then write a program
(e.g. a Perl script or something) that converts it to SQL.
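Once the format is understood, the loading half is the easy part. Here is
a rough sketch in Python using the standard `sqlite3` module. The
`(term, document, page)` record shape and the `pdf_index` table are
hypothetical, chosen only to show the loading step; what the IDX file
really stores would have to be worked out first.

```python
import sqlite3

def load_records(conn, records):
    """Create the table if needed and insert (term, document, page) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_index ("
        "term TEXT NOT NULL, document TEXT NOT NULL, page INTEGER)"
    )
    # Parameterized inserts avoid quoting problems in terms or file names.
    conn.executemany(
        "INSERT INTO pdf_index (term, document, page) VALUES (?, ?, ?)",
        records,
    )
    conn.commit()

# Hypothetical records, as a decoder for the IDX data might produce them.
records = [("railroad", "manual.pdf", 3), ("signal", "manual.pdf", 7)]
conn = sqlite3.connect(":memory:")
load_records(conn, records)
rows = conn.execute(
    "SELECT term, page FROM pdf_index ORDER BY page"
).fetchall()
print(rows)
```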
--
Robert Heller -- 978-544-6933
Deepwoods Software -- Download the Model Railroad System
http://www.deepsoft.com/ -- Binaries for Linux and MS-Windows
heller@deepsoft.com -- http://www.deepsoft.com/ModelRailroadSystem/
Message-ID:<7rgm5mFqbtU2@mid.individual.net>
Subject:Re: PDF index file converted into a SQL database?
Date:Sun, 17 Jan 2010 15:46:14 +0100
Daniel wrote:
> The IDX file is the output format of Acrobat's index function. Opened as
> a text file, its content is full of Acrobat-specific encoded data.
Is it a text file or some proprietary binary format?
If it's text, can you post a short chunk of it so we can see?
> That's the problem: how to convert an Acrobat-specific file into a standard
> data file suitable for a SQL database?
That's not the problem: any decent programming language can do this.
The problem is knowing what the content of the .idx file is and what it
means. Unless you have access to Adobe's specification of this (maybe
it's part of the PDF Spec; I don't know), then all the conversions in
the world won't help...
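A quick way to settle the text-or-binary question is to look at the first
bytes of the file. The following Python sketch reports what fraction of
the bytes are printable ASCII and prints a small hex dump. The sample
bytes are made up for illustration and are not taken from a real IDX
file.

```python
import string

PRINTABLE = set(string.printable.encode("ascii"))

def printable_ratio(data: bytes) -> float:
    """Return the fraction of bytes that are printable ASCII (1.0 if empty)."""
    if not data:
        return 1.0
    return sum(b in PRINTABLE for b in data) / len(data)

def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex, and ASCII columns, like a small hexdump."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)

# Made-up bytes standing in for the start of an index file; to inspect a
# real file, read its first bytes with open(path, "rb").read(256) instead.
sample = b"IDX\x00\x01\x02hello\xff\xfe"
print(f"printable: {printable_ratio(sample):.0%}")
print(hex_dump(sample))
```

A ratio well below 100% suggests a binary format, in which case posting
the hex dump is more useful than pasting the raw characters.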
///Peter
Message-ID:<hj7hql$1u2l$1@saria.nerim.net>
Subject:Re: PDF index file converted into a SQL database?
Date:Wed, 20 Jan 2010 19:25:24 +0100
> Is it a text file or some proprietary binary format?
It's a proprietary format, I'm afraid. When I open it as a text file,
there are a lot of unrecognized characters (squares).
That's why I'm looking for software whose publisher had access to Adobe's
specification.
Daniel
Message-ID:<14lfojsjoe19k.nsffchtqawxq.dlg@40tude.net>
Subject:Re: PDF index file converted into a SQL database?
Date:Wed, 20 Jan 2010 19:40:33 +0100
Daniel wrote:
>> Is it a text file or some proprietary binary format?
> It's a proprietary format, I'm afraid. When I open it as a text file,
> there are a lot of unrecognized characters (squares).
>
> That's why I'm looking for software whose publisher had access to Adobe's
> specification.
What about this one: "PDF Manager":
http://www.aks-labs.com/solutions/pdf-manager/index-pdf-file.htm
Robert