Tim Allison
Tim has been working in content/metadata extraction (and evaluation), advanced search and relevance tuning for nearly 20 years. Tim is the founder of Rhapsode Consulting LLC, and he currently works as a data scientist at NASA's Jet Propulsion Laboratory. Tim is a member of the Apache Software Foundation (ASF), the chair/VP of Apache Tika, and a committer on Apache OpenNLP (2020), Apache Lucene/Solr (2018), Apache PDFBox (2016) and Apache POI (2013). Tim holds a Ph.D. in Classical Studies, and in a former life, he was a professor of Latin and Greek.
Tim Allison's Publications
Presentation September 3, 2022
This is a follow-on talk from our 2021 PDF Days presentation on the File Observatory. Our team built the File Observatory …
Presentation June 8, 2021
PDFs in the wild offer a bewildering amount of variation in syntax, features and structure. For those building parsers or …
Article November 12, 2020
The original PDF Issue Tracker corpus generated a lot of interest from the PDF technical community; now version 2 of …
Presentation July 10, 2020
Apache Tika is widely used as a critical enabling technology for search in Apache Solr and other search systems. This …