java - How to use Apache Tika to extract the "From", "To", and "Subject" fields by using the Apache Metadata class?
I'm using the Apache Tika library, and its Metadata class, to extract the "From", "To", and "Subject" fields from an Outlook Exchange file (an email file, i.e. a .msg file).
I know I need to use the Metadata class, but I'm having a little trouble using it.
Here is my code so far:
    import java.io.File;
    import java.io.*;
    import java.util.Arrays;
    import org.apache.tika.Tika;
    /* more Tika imports */

    public class ExtractFromEmail {
        public static void main(final String[] args) throws IOException, TikaException, SAXException {
            File file = new File("message_1980.msg");
            AutoDetectParser parser = new AutoDetectParser();
            BodyContentHandler handler = new BodyContentHandler(-1);
            Metadata tikaMetadata = new Metadata();
            Property prop = new Property("message_from");
            String fromField = tikaMetadata.get(prop);
            // use pattern
            InputStream input = TikaInputStream.get(file, tikaMetadata);
            parser.parse(input, handler, tikaMetadata, new ParseContext());
            String other = tikaMetadata.MESSAGE_FROM;
            System.out.println(fromField);
        }
    }
I get the following error when I run the code:
    ExtractFromEmail.java:30: error: no suitable constructor found for Property(String)
        Property prop = new Property("message.message_from");
Thanks.
I tried this with tika-1.14:
    File file = new File("src/main/resources/unicode.msg");
    AutoDetectParser parser = new AutoDetectParser();
    BodyContentHandler handler = new BodyContentHandler();
    Metadata tikaMetadata = new Metadata();
    InputStream input = TikaInputStream.get(file, tikaMetadata);
    // Parse first: the parser is what fills tikaMetadata.
    parser.parse(input, handler, tikaMetadata, new ParseContext());
    // Only now look the field up, using the key constant as the name.
    String messageFrom = tikaMetadata.MESSAGE_FROM;
    String fromField = tikaMetadata.get(messageFrom);
    System.out.println(fromField);
and it works. The problem was that you tried to read the metadata before parsing the message. In addition, I believe the line
    Property prop = new Property("message_from");
is useless and incorrect.
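To make the ordering point concrete, here is a minimal sketch that does not use Tika at all. Everything in it is a hypothetical stand-in: a plain HashMap plays the role of the Metadata object, fakeParse plays the role of parser.parse(...), and the key strings and sample addresses are assumptions chosen for illustration, not values read from a real .msg file. It only shows that a lookup made before the "parse" step finds nothing, while the same lookup afterwards finds the values.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class HeaderOrderDemo {

    // Hypothetical stand-in for parser.parse(...): this is the step that
    // actually populates the metadata store. The keys and values are made up.
    static void fakeParse(Map<String, String> metadata) {
        metadata.put("Message-From", "alice@example.com");
        metadata.put("Message-To", "bob@example.com");
        metadata.put("subject", "Hello");
    }

    // Looks up each key through the supplied function and keeps the key order.
    static Map<String, String> readHeaders(Function<String, String> lookup, String... keys) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String key : keys) {
            result.put(key, lookup.apply(key)); // null if the key is not set yet
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        String[] keys = {"Message-From", "Message-To", "subject"};

        // Too early: nothing has filled the store, so every value is null.
        System.out.println("before parse: " + readHeaders(metadata::get, keys));

        fakeParse(metadata);

        // After the parse step the same lookups return the stored values.
        System.out.println("after parse:  " + readHeaders(metadata::get, keys));
    }
}
```

With Tika itself, the same helper could presumably be fed tikaMetadata::get once parser.parse(...) has returned, since Metadata.get(String) takes a name and returns a String. Which key names the .msg parser actually fills in depends on the Tika version, so it is worth printing tikaMetadata.names() after parsing to check.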