Engine Creation Overview

The Engine Creation Wizard walks you through creating an engine and describes each step in detail. There are a few key steps and several optional configurations.

Key Steps

  1. Click on Create New Keyword Engine

  2. Type in a name of at least 5 characters for the Keyword Engine

  3. Set Crawler Target

  4. Publish

Crawler Name

We recommend creating an engine name that reflects the site name and search tool, especially when Parametric Search is also used. For example, "EETech Keyword Search."

Engine names may include letters, numbers, and symbols.

If your user experience requires multiple engines, engine names may be equivalent, but including a unique location or setup date in each name helps distinguish them from each other.

Crawler Target - Indexable Domains

Define one or more sites to crawl. The crawler starts with the site(s) entered and follows links until all references are exhausted. It does not follow links that lead to external domains.
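The scoping rule above can be sketched as follows. This is a minimal illustration, not the crawler's actual implementation: the function name is_in_scope, the example.com domains, and the exact-host matching rule are all assumptions.

```python
from urllib.parse import urljoin, urlparse


def is_in_scope(link: str, page_url: str, indexable_domains: set[str]) -> bool:
    """Return True if a link found on page_url stays inside the indexable domains."""
    # Resolve relative links against the page they were found on.
    absolute = urljoin(page_url, link)
    host = urlparse(absolute).hostname or ""
    # Assumption: a link is in scope only if its host exactly matches a target domain.
    return host.lower() in indexable_domains


targets = {"www.example.com"}
print(is_in_scope("/products/123", "https://www.example.com/", targets))
print(is_in_scope("https://other.example.org/page", "https://www.example.com/", targets))
```

Under this assumption, links without a host (for example mailto: links) are also treated as out of scope.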

Publish

When you click "Publish", the engine is created in the background. The operation is synchronous, so a loading icon blocks the screen until it finishes. You are then returned to the dashboard, where you can view the new engine's details. An example is shown in the image below:

Optional Configurations

URL Filtering

Pages within the target domain that the engine owner doesn't want crawled can be blocked from the crawler. Upload a CSV file listing the links to block. Regular expressions are permitted and recommended for blocking unwanted categories and sections of a site.
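Before uploading a blocklist, it can help to preview which URLs its patterns would catch. The sketch below assumes a hypothetical file layout (one URL or regular expression per row, in the first column) and assumes a pattern must match the whole URL; the product's actual file layout and matching rules may differ.

```python
import csv
import io
import re

# Hypothetical blocklist content: one full URL or regular expression per row.
BLOCKLIST_CSV = """\
https://www.example.com/login
https://www\\.example\\.com/archive/.*
https://www\\.example\\.com/tag/[^/]+/?
"""


def load_patterns(csv_text: str) -> list[re.Pattern]:
    """Compile the first column of each non-empty CSV row as a regular expression."""
    rows = csv.reader(io.StringIO(csv_text))
    return [re.compile(row[0].strip()) for row in rows if row and row[0].strip()]


def is_blocked(url: str, patterns: list[re.Pattern]) -> bool:
    """Assumption: a URL is blocked when any pattern matches the whole URL."""
    return any(p.fullmatch(url) for p in patterns)


patterns = load_patterns(BLOCKLIST_CSV)
print(is_blocked("https://www.example.com/archive/2019/post", patterns))
print(is_blocked("https://www.example.com/products/1", patterns))
```

Note that a pattern containing a comma would be split across CSV columns unless it is quoted.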

Crawler Assistance

Page Load Delay

Slow page loads may affect the results collected by the crawler and indexed for the Keyword Search engine. This usually shows up as missing content in search: a page may appear as a result but lack data in its description, or pages may be missing from search altogether. Use longer delays to improve results in these cases.

This is particularly effective when crawling staged sites that take longer to load.

Available options are:

  • 100 ms (default)

  • 500 ms

  • 1 s

  • 2 s

  • 4 s

  • 8 s
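One way to choose among these options is to measure how long a typical page takes to render (for example, in browser developer tools) and pick the smallest delay that covers it. The helper below is a sketch of that heuristic; the function name and the selection rule are assumptions, not product behavior.

```python
# Delay options listed above, in milliseconds.
DELAY_OPTIONS_MS = [100, 500, 1000, 2000, 4000, 8000]


def suggest_delay_ms(observed_load_ms: float) -> int:
    """Return the smallest listed delay that covers the observed load time."""
    for option in DELAY_OPTIONS_MS:
        if option >= observed_load_ms:
            return option
    # Slower than every option: fall back to the longest delay available.
    return DELAY_OPTIONS_MS[-1]


print(suggest_delay_ms(1200))  # 2000
```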

Target and Exclude Specific Content

Use CSS selectors to guide the crawler toward or away from on-page content. Only one selector may be used for targeting content. The target selector hints at where the main page content is located, so the crawler can skip everything outside it.

In addition, excluding specific elements, such as headers, footers, and navigation, can further improve results.

CSS selectors include HTML elements, IDs, and classes.

The goal of targeting and excluding content is to index only the data that matches your page. The crawled data is indexed and becomes the keywords that users' queries are matched against. If some indexed keywords don't reflect the page, results may appear for unrelated queries, which hurts the search experience.
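The effect of targeting and excluding can be illustrated with a small sketch built on Python's standard HTML parser. It assumes a target selector of "#main" and the element selectors "header", "nav", and "footer" as exclusions. It only understands an ID target and tag-name exclusions, not full CSS selector syntax, and it is not how the crawler itself is implemented.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ContentExtractor(HTMLParser):
    """Collect text inside one target element, skipping excluded elements."""

    def __init__(self, target_id: str, excluded_tags: set[str]):
        super().__init__()
        self.target_id = target_id
        self.excluded_tags = excluded_tags
        self.target_depth = 0    # > 0 while inside the target element
        self.excluded_depth = 0  # > 0 while inside an excluded element
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.target_depth:
            self.target_depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.target_depth = 1
        if self.excluded_depth:
            self.excluded_depth += 1
        elif tag in self.excluded_tags:
            self.excluded_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.target_depth:
            self.target_depth -= 1
        if self.excluded_depth:
            self.excluded_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside the target and outside every exclusion.
        if self.target_depth and not self.excluded_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str, target_id: str, excluded_tags: set[str]) -> str:
    parser = ContentExtractor(target_id, excluded_tags)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Running extract_text on a page whose "#main" element contains a heading, a nav block, and a paragraph would keep the heading and paragraph text and drop the nav text, along with any header or footer outside "#main".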

Crawler Frequency

By default, the rate at which a crawler will request a page is set to 'auto.' This configuration uses an algorithm to set frequencies to recrawl pages based on how often the content on them changes.

There are also options to set a regular crawl interval.
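The algorithm behind 'auto' is not described here. As a rough intuition only, one common approach to change-based recrawling shortens the interval for a page whose content changed since the last crawl and lengthens it otherwise. The sketch below shows that idea; the function name, the halving and doubling rule, and the bounds are all assumptions and should not be read as the product's actual algorithm.

```python
# Hypothetical bounds on how often a page may be recrawled.
MIN_INTERVAL_HOURS = 1.0
MAX_INTERVAL_HOURS = 24.0 * 30


def next_interval_hours(current_hours: float, content_changed: bool) -> float:
    """Halve the interval after a change, double it otherwise, within bounds."""
    proposed = current_hours / 2 if content_changed else current_hours * 2
    return max(MIN_INTERVAL_HOURS, min(MAX_INTERVAL_HOURS, proposed))


print(next_interval_hours(24, content_changed=True))   # 12.0
print(next_interval_hours(24, content_changed=False))  # 48.0
```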

Influence Rules

The Influence page includes two tools for changing the order of the results returned for a query.

Result Matching

Define a wildcard pattern that matches the URLs you want to apply a Weight Multiplier to.

Weight Multiplier

Results with matching URLs will have their scores multiplied by the weight. A weight of 1 has no effect and a weight of 100 has a large effect. In most cases, a weight between 2 and 10 is enough to cause pages with matching URLs to show up as top results.
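The interaction between a wildcard pattern and a weight can be sketched as below. The example assumes shell-style wildcards where * matches any run of characters, and assumes that only the first matching rule applies to a result; the URLs and scores are made up for illustration.

```python
from fnmatch import fnmatchcase


def apply_weights(
    results: list[tuple[str, float]],
    rules: list[tuple[str, float]],
) -> list[tuple[str, float]]:
    """Multiply each score by the weight of the first matching rule, then re-sort."""
    weighted = []
    for url, score in results:
        for pattern, weight in rules:
            if fnmatchcase(url, pattern):
                score *= weight
                break
        weighted.append((url, score))
    # Highest score first, as in a ranked result list.
    return sorted(weighted, key=lambda item: item[1], reverse=True)


results = [
    ("https://www.example.com/blog/op-amps", 2.0),
    ("https://www.example.com/docs/op-amps", 1.0),
]
rules = [("https://www.example.com/docs/*", 5.0)]
print(apply_weights(results, rules))
```

Here the docs page starts with the lower score, but a weight of 5 lifts it above the blog page. A weight of 1 would leave the order unchanged.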

Result Ranking

There might be URLs that you want promoted in the search results. You can upload a list of URLs for matching queries. These URLs will be placed in order at the top of the results for the associated queries.
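The promotion behavior described above can be sketched as follows. The function name, the dictionary layout for the uploaded list, and the case-insensitive query matching are assumptions made for illustration.

```python
def promote(query: str, organic: list[str], promoted: dict[str, list[str]]) -> list[str]:
    """Place promoted URLs first, in their listed order, then the remaining results."""
    # Assumption: queries are matched case-insensitively after trimming whitespace.
    pinned = promoted.get(query.strip().lower(), [])
    # Drop promoted URLs from the organic list so they are not shown twice.
    remaining = [url for url in organic if url not in pinned]
    return pinned + remaining


promoted = {
    "datasheet": [
        "https://www.example.com/datasheets",
        "https://www.example.com/docs",
    ]
}
organic = ["https://www.example.com/blog", "https://www.example.com/docs"]
print(promote("Datasheet", organic, promoted))
```

Queries without an entry in the promoted list return the organic results unchanged.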

URL Rewriting

By configuring URL rewrite rules, the links returned for search results can be altered on the fly. This is a specialized tool used only between development stages of a site. It maps URLs across domains or subdomains as they change.

An alternative option is to create a new Keyword Search engine per development stage.

Usually, engines are created at each stage to test how well the crawler indexes existing site content with the chosen configurations.
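To illustrate what a URL rewrite rule does, the sketch below maps result links from a staging host to a production host while keeping the path, query, and fragment. The host names and the rule format are hypothetical; the product's own rule syntax is not shown here.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical rule: results indexed on staging should link to production.
HOST_REWRITES = {"staging.example.com": "www.example.com"}


def rewrite_url(url: str, host_rewrites: dict[str, str]) -> str:
    """Swap the host of a result URL when a rule exists; keep everything else."""
    parts = urlsplit(url)
    new_host = host_rewrites.get(parts.netloc.lower())
    if new_host is None:
        # No rule for this host: return the link untouched.
        return url
    return urlunsplit(parts._replace(netloc=new_host))


print(rewrite_url("https://staging.example.com/docs/page?id=3#top", HOST_REWRITES))
```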

Synonyms

Synonyms cover terms that users search for often but that return no results. Adding synonyms for these terms creates a better experience by leading users to valuable results. TXT and CSV file formats are accepted.
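As an illustration of how synonyms help, the sketch below expands a query into a group of equivalent terms. The file layout (one group of equivalent terms per CSV row) is an assumption for this example; check the upload dialog for the layout the product actually expects.

```python
import csv
import io

# Hypothetical synonym file: each row lists terms to be treated as equivalent.
SYNONYMS_CSV = """\
op-amp,operational amplifier,opamp
pcb,printed circuit board
"""


def load_synonyms(csv_text: str) -> dict[str, set[str]]:
    """Map every term to the full group of terms it appears with."""
    lookup: dict[str, set[str]] = {}
    for row in csv.reader(io.StringIO(csv_text)):
        group = {term.strip().lower() for term in row if term.strip()}
        for term in group:
            lookup[term] = group
    return lookup


def expand_query(query: str, lookup: dict[str, set[str]]) -> set[str]:
    """Return the query plus any synonyms, so a search can try all of them."""
    normalized = query.strip().lower()
    return lookup.get(normalized, {normalized})


lookup = load_synonyms(SYNONYMS_CSV)
print(sorted(expand_query("OpAmp", lookup)))
```

A query for "opamp", which might otherwise return nothing, can then also be matched against "op-amp" and "operational amplifier".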

Character Mappings

Technical content often has characters that are shorthand for other things. Use this field to define character mappings.
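A character mapping can be pictured as a simple substitution applied to text before it is indexed or searched. The mappings below are hypothetical examples, not defaults shipped with the product.

```python
# Hypothetical mappings from shorthand characters to searchable text.
CHARACTER_MAP = str.maketrans({
    "\u00b5": "u",     # micro sign
    "\u03a9": "ohm",   # Greek capital omega, used for ohms
    "\u00b0": " deg",  # degree sign
})


def normalize(text: str) -> str:
    """Replace mapped characters so shorthand and spelled-out forms index alike."""
    return text.translate(CHARACTER_MAP)


print(normalize("10 \u00b5F capacitor, 4.7 k\u03a9 resistor"))
```

With these mappings, a user typing "uF" or "kohm" can match content written with the original symbols.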
