Structure Recognition for Information Retrieval and Layout
Article, Presentation preview, April 11, 2018
About Joris Schellekens, borb
The views expressed in this article are those of the author(s) and do not reflect the policies or positions of the PDF Association.
Presenter
Joris is a 29-year-old software developer at iText, a global IT firm with a leadership position in PDF creation. Joris’ background is in machine learning, NLP, mathematics, graphs and NP-complete problems. After working in the supply-chain industry, he set his sights on document processing and workflow automation. At iText he focuses mostly on innovative research projects.
Session Description
Tables, lists and other structural elements are found in many digital articles. These elements let authors present information in a structured manner and communicate and summarize key results and main facts. They let readers get a quick overview of the presented information, compare items and put them into context. Knowing the physical boundaries of paragraphs can aid screen readers for visually impaired users. Having a concept of tables will help any document-processing flow. And, aside from serving as pure input, structure is a key component when performing conversion. This talk is about bridging the gap between high-level concepts and low-level document formats.
Check out the detailed programme: https://pdfa.org/pdf-days-europe-2018-schedule-of-sessions/
Watch the recording: https://youtu.be/62kZ6MQhLZk?si=9LV8_ePIDuhjMARq



