Using the preferred format

Important

The catalog format outlined in this guide is specific to the Proof Schedule™ process. For an overview of the catalog format for full integrations, please refer to the Product catalog guide under the Integrating with Constructor section.

Our preferred catalog data format consists of two or three CSV files:

  • Items (products) - typically defined within an items.csv file
    Products shown as results on product listing pages (search, browse)
  • Item groups (categories) - typically defined within an item_groups.csv file
    Hierarchy definition of groups for organizing items (products)
  • Variations - optionally defined within a variations.csv file
    Not all product catalogs have variations. Variations are child items that can be shown as swatches on top-level results on product listing pages

We recommend reviewing the Catalog data concepts guide for a detailed explanation of the concepts related to product catalogs.

Some fields are marked as required (✔), as they are essential to delivering Proof Schedule results. Other fields may be optional, but note that our AI produces stronger learnings when fed more data about your items, variations and item groups.

Facet and item group fields are especially high-leverage product data for optimizing product discovery for end users.

Items (products)

Item definitions are typically encapsulated in a UTF-8 encoded CSV file named items.csv with the data defined below:

| Field name | Required | Description |
| --- | --- | --- |
| id | ✔ | A unique item ID that is available for the beacon to track (in the DOM or on the window object) across your search results, browse results, and product detail pages. Limited to 250 characters. |
| item_name | ✔ | The name to display in results. Limited to 250 characters. |
| image_url | ✔ | An image URL for the item. We recommend image dimensions of at least 400 pixels per side. |
| facets | ✔ | These fields most often correspond to the filters that are displayed on product listing pages. Each facet should ideally be in a separate column. Multiple values (for instance, if an item has multiple colors) can be separated by a pipe character (`\|`). |
| group_ids | ✔ | A pipe-separated list of group IDs (categories) this item is associated with. Each item may belong to one group, multiple groups, or no groups at all. These must correspond to the IDs uploaded in the item groups (category) file. |
| url |  | The page URL a user is taken to after selecting an item in autocomplete, search, browse, or recommendation results. |
| description |  | The item description. Optional for Proof Schedules, but helpful for Attribute Enrichment. Limited to 1,000 characters. |
| keywords |  | Key terms or phrases that help users find the item, separated by pipe characters (`\|`). |

Item (product) data can also include variations, such as when a shirt has multiple colors or sizes. In this case, the general shirt product would appear in the item feed (items.csv) and its variations would be referenced in a variation feed (variations.csv).
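The field requirements for the items file can be checked before upload. The following is a minimal sketch, not an official validator: the function name `validate_items` is hypothetical, the `facet:` column prefix is assumed from the example header, and the length limits are the ones stated in the table.

```python
import csv
import io

# Required columns and length limits as stated in the items table.
# The function name and the "facet:" prefix check are assumptions.
REQUIRED_COLUMNS = ("id", "item_name", "image_url", "group_ids")
MAX_LENGTHS = {"id": 250, "item_name": 250, "description": 1000}


def validate_items(csv_text):
    """Return a list of problems found in the text of an items CSV file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    for column in REQUIRED_COLUMNS:
        if column not in header:
            problems.append(f"missing required column: {column}")
    if not any(name.startswith("facet:") for name in header):
        problems.append("no facet:<name> columns found")

    seen_ids = set()
    # Row numbers count records, with the header as row 1.
    for row_number, row in enumerate(reader, start=2):
        if None in row:
            # DictReader stores surplus cells under the key None,
            # which usually points to an unquoted comma.
            problems.append(f"row {row_number}: more fields than header columns")
        item_id = row.get("id") or ""
        if not item_id:
            problems.append(f"row {row_number}: empty id")
        elif item_id in seen_ids:
            problems.append(f"row {row_number}: duplicate id {item_id!r}")
        seen_ids.add(item_id)
        for column, limit in MAX_LENGTHS.items():
            value = row.get(column) or ""
            if len(value) > limit:
                problems.append(f"row {row_number}: {column} exceeds {limit} characters")
    return problems
```

An empty list means none of these checks failed. It does not guarantee that the file will be accepted.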

Why is supplying required fields important for a Proof Schedule?

  • id links behavioral tracking data from the beacon with the data provided in this catalog
  • group_ids link items to browse pages
  • group_ids & facets improve result rankings when fed to our AI
  • item_name, image_url, url, & description are critical for building a Proof Schedule demo
  • item_name, facets, & keywords make items searchable

Example items file

id,item_name,image_url,facet:product_type,facet:material,facet:price,group_ids,url,description,keywords
cotton-t-shirt,Cotton T-Shirt,https://constructorio-integrations.s3.amazonaws.com/tikus-threads/2022-06-29/WOVEN-CASUAL-SHIRT_BUTTON-DOWN-WOVEN-SHIRT_BSH01757SB1918_3_category.jpg,Shirts|T-Shirts,Cotton,18,tops-athletic|tops-casual,https://constructor.com/,"Treat yourself to a comfy upgrade with this Short Sleeve Shirt from Etchell's Emporium. This short-sleeve T-shirt comes with a classic crew-neck, giving you style and comfort that can easily be paired with a variety of bottoms.",gym|casual|athletic|workout|comfort|simple

An example items.csv file can also be viewed and downloaded from GitHub.
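Free-text fields such as description often contain commas, so it is usually safer to generate the file with a CSV library than by joining strings. The sketch below uses Python's standard `csv` module, which quotes such cells automatically, and joins multi-value fields with a pipe. The names `FIELDNAMES`, `to_cell`, and `build_items_csv` are hypothetical, and the column set is only an illustration based on the example header.

```python
import csv
import io

# Illustrative column set modeled on the example items file (an assumption,
# not a required layout).
FIELDNAMES = [
    "id", "item_name", "image_url",
    "facet:product_type", "facet:material",
    "group_ids", "url", "description", "keywords",
]


def to_cell(value):
    """Render one value as a CSV cell, pipe-joining lists and tuples."""
    if isinstance(value, (list, tuple)):
        return "|".join(str(part) for part in value)
    return "" if value is None else str(value)


def build_items_csv(items):
    """Return CSV text for a list of item dictionaries."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Missing optional fields become empty cells.
        writer.writerow({name: to_cell(item.get(name)) for name in FIELDNAMES})
    return buffer.getvalue()
```

To produce a UTF-8 encoded file, the returned text can be written with `open("items.csv", "w", encoding="utf-8", newline="")`.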

Item groups (categories)

Items (products) are organized into groups - sometimes referred to as "categories." Users generally view these groups of items on browse result pages.

Each row in this file represents a single group (and its corresponding browse page). A valid group hierarchy has a single top-level group (generally named "All") with a blank parent_id, and every other group references a parent group.
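The hierarchy rule above (one top-level group with a blank parent_id, every other group pointing at a defined parent) can be checked mechanically. This is a minimal sketch under those stated rules. The function name `validate_group_hierarchy` is hypothetical, and the duplicate-ID and cycle checks are additional assumptions about what a valid hierarchy implies.

```python
import csv
import io


def validate_group_hierarchy(csv_text):
    """Return a list of problems found in the text of an item groups CSV file."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    parent_of = {}
    problems = []

    for row in rows:
        group_id = row.get("id") or ""
        if group_id in parent_of:
            problems.append(f"duplicate group id {group_id!r}")
        parent_of[group_id] = row.get("parent_id") or ""

    # Exactly one group may have a blank parent_id.
    roots = [group_id for group_id, parent in parent_of.items() if not parent]
    if len(roots) != 1:
        problems.append(f"expected exactly one top-level group, found {len(roots)}")

    # Every non-blank parent_id must be defined in the same file.
    for group_id, parent in parent_of.items():
        if parent and parent not in parent_of:
            problems.append(f"group {group_id!r} references unknown parent {parent!r}")

    # Walk upward from each group; revisiting a group means there is a cycle.
    for group_id in parent_of:
        seen = set()
        current = group_id
        while current:
            if current in seen:
                problems.append(f"cycle detected starting at group {group_id!r}")
                break
            seen.add(current)
            current = parent_of.get(current, "")
    return problems
```

Rows may appear in any order, since parents are resolved only after the whole file has been read.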

Item group definitions are encapsulated in a UTF-8 encoded CSV file named item_groups.csv with the data defined below:

| Field name | Required | Description |
| --- | --- | --- |
| id | ✔ | A unique group ID that is available for the beacon to track (in the DOM or on the window object) for each browse result page. Limited to 250 characters. |
| parent_id | ✔ | The ID for the item group this group belongs to (must be defined in the same file). Left blank only for the top-level group. |
| name | ✔ | The name of the browse page. Limited to 250 characters. |
| data:url |  | Browse page URL where this group of items is displayed to end users in production. |

Example item groups file

parent_id,id,name,data:url
,all,All,/browse/all
all,jackets,Jackets,/browse/jackets
all,tops,Tops,/browse/tops
tops,tops-athletic,Athletic Tops,/browse/tops/athletic
tops,tops-casual,Casual Tops,/browse/tops/casual
all,bottoms,Bottoms,/browse/bottoms

An example item_groups.csv file can be viewed and downloaded from GitHub.
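Because every group_ids value in the items file must correspond to an id in the item groups file, a cross-file check can catch mismatches early. The following sketch assumes both files are already loaded as text. The function name `find_unknown_group_ids` is hypothetical.

```python
import csv
import io


def find_unknown_group_ids(items_csv_text, groups_csv_text):
    """Map each item id to the group IDs it references that are not defined."""
    defined = {
        row["id"] for row in csv.DictReader(io.StringIO(groups_csv_text))
    }
    unknown = {}
    for row in csv.DictReader(io.StringIO(items_csv_text)):
        cell = row.get("group_ids") or ""
        # group_ids is a pipe-separated list; an empty cell means no groups.
        referenced = [part.strip() for part in cell.split("|") if part.strip()]
        missing = [group_id for group_id in referenced if group_id not in defined]
        if missing:
            unknown[row["id"]] = missing
    return unknown
```

An empty dictionary means every referenced group is defined.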

Variations

A variations file is optional and does not apply to every catalog, but it is useful when the data for an item is best split across multiple variations. Variation definitions are encapsulated in a UTF-8 encoded CSV file named variations.csv with the data defined below:

| Field name | Required | Description |
| --- | --- | --- |
| variation_id | ✔ | A unique ID for the item variation. Limited to 250 characters. |
| item_id | ✔ | The ID for the item this variation belongs to (must be defined in items.csv). |
| image_url | ✔ | The image URL for the item variation. We recommend image dimensions of at least 400 pixels per side. |
| facets | ✔ | These fields most often correspond to the filters that are displayed on product listing pages. Each facet should ideally be in a separate column. Multiple values (for instance, if an item has multiple colors) can be separated by a pipe character (`\|`). |
| item_name |  | The name to display in results. When provided, it will be used in favor of the item_name defined within the items.csv file. Limited to 250 characters. |
| url |  | The page URL a user is taken to after selecting an item in autocomplete, search, browse, or recommendation results. |
| description |  | The item variation description. Limited to 1,000 characters. |
| keywords |  | Key terms or phrases that help users find the item variation, separated by pipe characters (`\|`). |

A shirt that comes in multiple colors would have a single item in the items file with multiple variations defined in the variations file. Each variation has different values for the facet:color and image_url fields, but the same value for the item_id field.

Example variations file

variation_id,item_id,item_name,image_url,facet:color,url
cotton-t-shirt-red,cotton-t-shirt,Cotton T-Shirt - Red,https://constructor.com/media/cotton-t-shirt.jpg,Red,https://constructor.com/products/cotton-t-shirt-red
cotton-t-shirt-blue,cotton-t-shirt,Cotton T-Shirt - Blue,https://constructor.com/media/cotton-t-shirt-blue.jpg,Blue,https://constructor.com/products/cotton-t-shirt-blue

An example variations.csv file can be viewed and downloaded from GitHub.
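Since each variation's item_id must be defined in items.csv, it can help to group variations under their parent items and flag any that point at a missing item. This is a rough sketch under that rule. The function name `group_variations` and the returned structure are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_variations(items_csv_text, variations_csv_text):
    """Return (variation IDs grouped by item ID, variation IDs with no item)."""
    item_ids = {
        row["id"] for row in csv.DictReader(io.StringIO(items_csv_text))
    }
    by_item = defaultdict(list)
    orphans = []
    for row in csv.DictReader(io.StringIO(variations_csv_text)):
        if row["item_id"] in item_ids:
            by_item[row["item_id"]].append(row["variation_id"])
        else:
            # The parent item is not defined in the items file.
            orphans.append(row["variation_id"])
    return dict(by_item), orphans
```

For the example files in this guide, both variations would be grouped under cotton-t-shirt and the orphan list would be empty.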