extraction - Can Apache Tika (or any JVM library) produce ordered extract data referencing images, as well as text, e.g. from a PDF?

- August 15, 2012

Basically, I want to be able to create a set of 'slides' based on extracts from documents such as PDFs or Word docs, programmatically.

For this, I need to know [roughly] where in the text the embedded images are placed; as such, just dumping the image resources out to disk wouldn't help*.

I'm a Java dev, so I don't fear code ;-)

*Unless, of course, there are references to the images within the [Tika] extract output, at the appropriate position(s) or line(s).
