Word documents and content_type_norm #199

@tokee

If we search for content_type_ext:doc AND content_type:"application/msword" in the Danish Netarchive Search, the facet for content_type_norm shows:

  • other : 3577875
  • word : 17606

Only about 0.5% of these records (17606 of 3595481) are normalised to word, so there seems to be a problem with deriving the normalised content type for Word documents.

Maybe a more general approach would be to search for all records that have other as normalised content type and facet on the different content type fields, to see if there are more heavy hitters that are not handled?
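A minimal sketch of such a facet request, assuming the index is served by a standard Solr select handler. The endpoint URL and the helper name build_facet_url are hypothetical; only the field names content_type_norm, content_type and content_type_ext are taken from this issue. The code only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint (assumption): replace with the real collection URL.
SOLR_SELECT = "http://localhost:8983/solr/netarchive/select"


def build_facet_url(query, facet_fields, limit=20, base=SOLR_SELECT):
    """Build a Solr select URL that returns only facet counts for `query`."""
    params = [
        ("q", query),
        ("rows", "0"),                # no documents, only facet counts
        ("facet", "true"),
        ("facet.limit", str(limit)),  # top N values per facet field
        ("facet.mincount", "1"),      # skip values with zero hits
        ("wt", "json"),
    ]
    # One facet.field parameter per field to inspect.
    params += [("facet.field", field) for field in facet_fields]
    return base + "?" + urlencode(params)


# All records normalised to "other", faceted on the raw content type fields.
url = build_facet_url(
    "content_type_norm:other",
    ["content_type", "content_type_ext"],
)
print(url)
```

The top values in each facet would then show which raw content types and extensions most often end up as other. Any further content type fields in the schema can be appended to the field list.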
