Generating document templates that are robust to structural variations

Image Number 3 for United States Patent #7668942.

A template, or wrapper tree, for a document such as a web page is generalized from the bottom up, that is, from the leaves toward the root of the template's logical tree structure. At a given level of the tree, sub-trees are clustered and the clustered sub-trees are generalized; the process is then repeated at the next higher level, yielding a generalized template or wrapper tree. One way to do this is to generate a nested-pattern regular expression from the sub-tree clusters, merge sub-trees according to that expression, and replace the sub-trees at the given level of the template's tree-based regular expression with the merged sub-trees. The procedure continues level by level, progressing from leaf toward root, until the wrapper or tree-based regular expression that represents the template is fully generalized.
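The bottom-up procedure described above can be illustrated with a small Python sketch. It is not the patented method: the abstract does not say how sub-trees are clustered or merged, so the sketch assumes that consecutive siblings with the same tag form a cluster and that merging aligns children by tag. The names `Node`, `merge`, `generalize`, and `to_regex`, and the bracketed output notation, are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from functools import reduce
from itertools import groupby


@dataclass
class Node:
    tag: str
    children: list[Node] = field(default_factory=list)
    quant: str = ""  # "", "?", "+", or "*" (regex-style quantifier)


def make_optional(node: Node) -> Node:
    # A sub-tree missing from one cluster member becomes optional.
    quant = {"": "?", "+": "*"}.get(node.quant, node.quant)
    return Node(node.tag, node.children, quant)


def combine(q1: str, q2: str) -> str:
    # Pick a quantifier that covers both inputs.
    if q1 == q2:
        return q1
    non_empty = {q1, q2} - {""}
    return non_empty.pop() if len(non_empty) == 1 else "*"


def merge(a: Node, b: Node) -> Node:
    # ASSUMPTION: children are aligned by tag (first occurrence wins);
    # unmatched children on either side are kept as optional.
    b_by_tag: dict[str, Node] = {}
    for child in b.children:
        b_by_tag.setdefault(child.tag, child)
    merged, seen = [], set()
    for child in a.children:
        match = b_by_tag.get(child.tag)
        if match is not None and child.tag not in seen:
            merged.append(merge(child, match))
            seen.add(child.tag)
        else:
            merged.append(make_optional(child))
    for child in b.children:
        if child.tag not in seen:
            merged.append(make_optional(child))
    return Node(a.tag, merged, combine(a.quant, b.quant))


def generalize(node: Node) -> Node:
    # Generalize the lower level first, then cluster and merge at this level.
    children = [generalize(child) for child in node.children]
    result = []
    # ASSUMPTION: a cluster is a run of consecutive siblings sharing a tag.
    for _, run in groupby(children, key=lambda c: c.tag):
        cluster = list(run)
        merged = reduce(merge, cluster)
        if len(cluster) > 1:
            merged.quant = "+" if merged.quant in ("", "+") else "*"
        result.append(merged)
    return Node(node.tag, result, node.quant)


def to_regex(node: Node) -> str:
    # Render the tree as a bracketed, regex-like string.
    inner = " ".join(to_regex(child) for child in node.children)
    body = f"{node.tag}[{inner}]" if inner else node.tag
    return body + node.quant
```

For example, a list whose items differ only in an extra `span` collapses to a single repeated item with an optional child:

```python
page = Node("ul", [
    Node("li", [Node("a")]),
    Node("li", [Node("a"), Node("span")]),
    Node("li", [Node("a")]),
])
print(to_regex(generalize(page)))  # ul[li[a span?]+]
```

Processing children before their parent is what makes the pass bottom-up: by the time a level is clustered, each of its sub-trees has already been generalized.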
