C# Programming
10-08-2009, 03:40 AM
I'm working on a utility that reads files and extracts the 'words' as part of an indexing project. I've got a lot of formats covered, but there are a few, like PDF and DOC, that I'm having trouble with. Since I'm doing this in C#, I'll ask here.
Has anyone tried to mine the text from these formats, and if so, was it possible without an intermediary file? Ideally I would like to be able to pass the file into a StreamReader-derived class and read the text out of the other end.
Any ideas, guys and gals?
Panic, Chaos, Destruction.
My work here is done.