Default Field Map


The SearchStax Site Search solution offers a Crawler add-on for Enterprise clients. The Crawler indexes the pages of your website starting with a single root node.

The crawler maps information about a web page to Solr schema fields in the Site Search index. This page presents reference information about the crawler’s default field mappings. See Crawler for instructions on adding custom fields.
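
The mapping idea can be illustrated with a small sketch. It is not the crawler's implementation: it uses Python's standard library on a hypothetical, well-formed sample page, approximates default selectors such as //h1/text() by reading the text of matching elements, and assumes the padded-path format described in the table on this page. The function names and the sample URL are made up for illustration; the output keys use the Custom App field names.

```python
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real-world HTML is often not valid
# XML and would need a proper HTML parser.
SAMPLE_PAGE = """<html>
  <head><title>Pricing</title></head>
  <body>
    <h1>Plans</h1>
    <h2>Basic</h2>
    <h2>Enterprise</h2>
  </body>
</html>"""


def padded_path(url: str) -> str:
    """Return the part of the URL after the domain with each "/" padded
    with spaces. The exact padding used by the crawler is an assumption."""
    return urlparse(url).path.replace("/", " / ")


def extract_fields(url: str, page: str) -> dict:
    """Build a dictionary of field values keyed by Custom App field names."""
    root = ET.fromstring(page)

    def texts(tag: str) -> list:
        # Approximates the default selector //<tag>/text() for simple
        # elements that contain only text.
        return [el.text.strip() for el in root.iter(tag)
                if el.text and el.text.strip()]

    return {
        "url": url,
        "headings1": texts("h1"),
        "headings2": texts("h2"),
        "paths": padded_path(url),
    }


if __name__ == "__main__":
    print(extract_fields("https://example.com/docs/crawler", SAMPLE_PAGE))
```

Each key in the returned dictionary corresponds to one row of the default mapping table; on Drupal or Sitecore the same values would be stored under the field names in those columns.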

Default Field Mappings

When you create a new crawler, the default fields are set up automatically, with the following definitions for each document type.

  • Mappings that cannot be changed are not displayed in the field mapping table.
  • All other fields are displayed. To change a mapping, delete the field and create a new one.
  • The following rich text formats are supported: .pdf, .docx, .xlsx, .pptx, .txt, .rtf.
  • Note that the field labels visible in the crawler settings are shown in the Drupal, Sitecore, and Custom App columns of the table.

| Field | Applies to (HTML, Rich Text) | Field Category | Field Data Type | Can Change Mapping? | Drupal | Sitecore | Custom App | Page Property / default selector |
|---|---|---|---|---|---|---|---|---|
| Unique ID | HTML, Rich Text | System Field | String | Yes | id | _uniqueid | id | |
| URL | HTML, Rich Text | System Field | String | Yes | ss_url | url_s | url | |
| Document Type: html, txt, pdf… | HTML, Rich Text | System Field | String | Yes | ss_document_type | document_type_s | document_type | document_type |
| Title | HTML, Rich Text | System Field | Text | Yes | tm_X3b_en_title | title_txts_en | title | |
| Content (text extracted from the document; for Rich Text this is a system field, for HTML it is optional and configured by the user) | HTML, Rich Text | System Field | Text | Yes | tm_X3b_en_body | pagecontent_txts_en | content | content##//text() |
| Description | HTML | Meta | Text | Yes | tm_X3b_en_description | renderedcontent_txts_en | description | description |
| Timestamp when the document was crawled | HTML, Rich Text | System Field | Date | Yes | timestamp | displaydate_dts | date | |
| Part of the URL after the domain, where each / is padded with spaces ( / ) | HTML, Rich Text | System Field | Text | Yes | tm_X3b_en_paths | paths_txts_en | paths | |
| Keywords | HTML | XPath | Text | Yes | tm_X3b_en__keywords | keywords_txts_en | keywords | keywords |
| Heading Level 1 | HTML | XPath | Text | Yes | tm_X3b_en__headings1 | headings1_txts_en | headings1 | //h1/text() |
| Heading Level 2 | HTML | XPath | Text | Yes | tm_X3b_en__headings2 | headings2_txts_en | headings2 | //h2/text() |
| Heading Level 3 | HTML | XPath | Text | Yes | tm_X3b_en__headings3 | headings3_txts_en | headings3 | //h3/text() |
| Heading Level 4 | HTML | XPath | Text | Yes | tm_X3b_en__headings4 | headings4_txts_en | headings4 | //h4/text() |
| Crawler Definition ID | HTML, Rich Text | Crawler Internal Fields, not mapped, added automatically | String | No | ss_exif_crawl_definition_id | exif_crawl_definition_id_s | exif_crawl_definition_id | |
| Crawl Run ID (crawl job) | HTML, Rich Text | Crawler Internal Fields, not mapped, added automatically | String | No | ss_exif_crawlid | exif_crawlid_s | exif_crawlid | |
| Crawler tenant ID (SearchStax customer ID) | HTML, Rich Text | Crawler Internal Fields, not mapped, added automatically | String | No | ss_exif_tenant_id | exif_tenant_id_s | exif_tenant_id | |
| Crawler application ID (corresponds to studio app, but not the same) | HTML, Rich Text | Crawler Internal Fields, not mapped, added automatically | String | No | ss_exif_appid | exif_appid_s | exif_appid | |
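
Because the same logical field is stored under a different Solr field name on each platform, client code that queries the index may benefit from a small lookup. The following is a minimal sketch under stated assumptions: the field names are copied from the table above, while the dictionary layout, the platform keys, and the function names are hypothetical and not part of any SearchStax API. The query string uses the standard Solr field:"phrase" syntax.

```python
# Field names copied from the default mapping table. The structure and the
# platform keys ("drupal", "sitecore", "custom_app") are illustrative.
DEFAULT_FIELDS = {
    "url": {
        "drupal": "ss_url",
        "sitecore": "url_s",
        "custom_app": "url",
    },
    "title": {
        "drupal": "tm_X3b_en_title",
        "sitecore": "title_txts_en",
        "custom_app": "title",
    },
    "content": {
        "drupal": "tm_X3b_en_body",
        "sitecore": "pagecontent_txts_en",
        "custom_app": "content",
    },
    "description": {
        "drupal": "tm_X3b_en_description",
        "sitecore": "renderedcontent_txts_en",
        "custom_app": "description",
    },
}


def solr_field(logical_name: str, platform: str) -> str:
    """Return the default Solr field name for a logical field on a platform."""
    try:
        return DEFAULT_FIELDS[logical_name][platform]
    except KeyError:
        raise ValueError(
            f"no default mapping for {logical_name!r} on {platform!r}"
        ) from None


def phrase_query(logical_name: str, platform: str, phrase: str) -> str:
    """Build a Solr phrase query against the mapped field."""
    # Escape backslashes and double quotes so the phrase stays one token.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{solr_field(logical_name, platform)}:"{escaped}"'


if __name__ == "__main__":
    print(phrase_query("title", "sitecore", "site search"))
```

If a default mapping has been deleted and recreated under a different name, the lookup table would need to be updated to match the crawler settings.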

Questions?

Do not hesitate to contact the SearchStax Support Desk.