
Document Image Analysis for Electronic Business Documents

With advances in information technology and the widespread adoption of the Internet, document creation and management have been shifting steadily from paper to digital media since the mid-1990s. Screen-rendered documents are becoming the new standard in the business world. Such documents are richer in information than their paper counterparts, and they are easier to search and store. Their complexity, however, calls for new techniques for automated document analysis and management.


This project explores new directions for the analysis of electronic documents rendered in image formats. Our current research is focused on reverse engineering of business documents, change detection, aesthetic analysis, and smart libraries.


Team:

  • Alexandra Branzan Albu (PI)
  • Jeremy Svendsen (PhD student): 3D chart recognition and characterization
  • Anissa Agah St-Pierre (MASc student): Aesthetic analysis of business documents from colour and geometric standpoints
  • Melissa Cote (Postdoctoral Fellow): Development of smart libraries for business documents


Sponsors:

SAP-ARC (Academic Research Collaborations)

NSERC


Publications:


J. Svendsen and A. Branzan Albu, "Document Segmentation via Oblique Cuts", accepted to SPIE Electronic Imaging, February 2013.


  Primitive detection for a bar chart: 1) title; 2) vertical index name; 3) vertical indices; 4) horizontal index name; 5) horizontal indices; 6) individual bar.