Using the preferred format
Important
The catalog format outlined in this guide is specific to the Proof Schedule® process. For an overview of the catalog format for full integrations, please refer to the Product catalog guide under the Integrating with Constructor section.
Our preferred catalog data format consists of two or three CSV files:
- Items (products) - typically defined within an items.csv file
  Products shown as results on product listing pages (search, browse)
- Item groups (categories) - typically defined within an item_groups.csv file
  Hierarchy definition of groups for organizing items (products)
- Variations - optionally defined within a variations.csv file
  Not all product catalogs have variations. Variations refer to child items that can be shown as swatches of top level results on product listing pages
It is recommended to review the Catalog data concepts guide for a detailed explanation of the different concepts related to product catalogs.
Some fields are marked as required (✓), as they are essential to delivering Proof Schedule® results. Other fields may be optional, but note that our AI produces stronger learnings when fed more data about your items, variations and item groups.
The two most essential kinds of product data for optimizing product discovery for end users are:
- Item groups, which are defined within an item_groups.csv file and assigned to items in their group_ids fields.
- Facets, which are special metadata that has to be specified at the item and/or variation level in a metadata:[name] field, and then configured to be used as a facet. Please refer to our guide on global facet configuration for more information.
Items (products)
Item definitions are typically encapsulated in a UTF-8 encoded CSV file named items.csv with the data defined below:
Field name | Required | Description |
---|---|---|
id | ✓ | A unique item ID that is available for the beacon to track (in the DOM or on the window object) across your search results, browse results, and product detail pages. Limited to 250 characters. |
item_name | ✓ | The name to display in results. Limited to 250 characters. |
image_url | ✓ | An image URL for the item. We recommend image dimensions of at least 400 pixels per side. |
metadata:[name] | ✓ | Custom metadata indicating data points like inventory levels, promotional details, and other product-specific information. Each field can be individually configured to be searchable, displayable (that is, returned with the search results), or both. The fields should be named in the form metadata:[name], for example: metadata:price. There is not a pre-defined list of metadata names; rather, they can be arbitrary according to what is most useful. If the values contain JSON data, use the prefix metadata:json:[name], for example: metadata:json:notes. The maximum length of the names of metadata fields is 1,000 characters. There is no limit on the size of the values, but please work with your integration contact if you plan to upload large values. Please refer to the Facets & metadata guide for more information. |
group_ids | ✓ | A pipe-separated list of group IDs (categories) this item is associated with. Each item may belong to one group, multiple groups, or no groups at all. These must correspond to the IDs uploaded in the item groups (category) file. |
url | | The page URL a user is taken to after selecting an item in autocomplete, search, browse, or recommendation results. |
description | | The item description - optional for Proof Schedule® but helpful for Attribute Enrichment. Limited to 1,000 characters. |
keywords | | Key terms or phrases that help users find the item, separated by pipe characters (\|). |
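To illustrate the metadata:[name] and metadata:json:[name] naming described in the table, the following Python sketch sorts the column names of an items header into core fields, plain metadata, and JSON metadata. The function name classify_columns is hypothetical and is not part of any Constructor tooling; it only demonstrates the prefix convention.

```python
def classify_columns(header):
    """Split CSV column names into core, metadata, and JSON metadata fields."""
    core, metadata, json_metadata = [], [], []
    for column in header:
        if column.startswith("metadata:json:"):
            # Check the longer prefix first, since it also starts with "metadata:".
            json_metadata.append(column[len("metadata:json:"):])
        elif column.startswith("metadata:"):
            metadata.append(column[len("metadata:"):])
        else:
            core.append(column)
    return core, metadata, json_metadata


header = ["id", "item_name", "group_ids", "metadata:price", "metadata:json:notes"]
core, metadata, json_metadata = classify_columns(header)
print(core)           # ['id', 'item_name', 'group_ids']
print(metadata)       # ['price']
print(json_metadata)  # ['notes']
```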
Item (product) data can also include variations, such as when a shirt has multiple colors or sizes. In this case, the general shirt product would appear in the item feed (items.csv) and its variations would be referenced in a variation feed (variations.csv).
Why is supplying required fields important for a Proof Schedule®?
- id links behavioral tracking data from the beacon with the data provided in this catalog
- group_ids link items to browse pages
- group_ids & facets (ingested as metadata:[name] and then configured) improve result rankings when fed to our AI
- item_name, image_url, url, & description are critical for building a Proof Schedule® demo
- item_name, facets (ingested as metadata:[name] and then configured), & keywords make items searchable
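Because the required fields carry this much weight, it can help to check an items file before uploading it. The sketch below is one possible pre-upload check in Python, not an official validator. The function name validate_items and the exact rules are assumptions: it treats the required columns as mandatory headers, expects at least one metadata: column, requires non-empty id, item_name, and image_url values, and applies the character limits from the table. It allows empty group_ids values, since an item may belong to no groups.

```python
import csv
import io

REQUIRED_COLUMNS = ("id", "item_name", "image_url", "group_ids")
NON_EMPTY_FIELDS = ("id", "item_name", "image_url")
MAX_LENGTHS = {"id": 250, "item_name": 250, "description": 1000}


def validate_items(csv_text):
    """Return a list of problems found in the text of an items CSV file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in REQUIRED_COLUMNS if name not in header]
    if not any(name.startswith("metadata:") for name in header):
        problems.append("missing column: metadata:[name]")
    seen_ids = set()
    for row_number, row in enumerate(reader, start=1):  # counts data rows, not file lines
        for field in NON_EMPTY_FIELDS:
            if not (row.get(field) or "").strip():
                problems.append(f"row {row_number}: empty {field}")
        for field, limit in MAX_LENGTHS.items():
            if len(row.get(field) or "") > limit:
                problems.append(f"row {row_number}: {field} longer than {limit} characters")
        item_id = (row.get("id") or "").strip()
        if item_id and item_id in seen_ids:
            problems.append(f"row {row_number}: duplicate id {item_id!r}")
        seen_ids.add(item_id)
    return problems


# Hypothetical sample data: the second row repeats an id and has no item_name.
sample = (
    "id,item_name,image_url,group_ids,metadata:price\n"
    "cotton-t-shirt,Cotton T-Shirt,https://example.com/a.jpg,tops-casual,18.00\n"
    "cotton-t-shirt,,https://example.com/b.jpg,tops-casual,20.00\n"
)
for problem in validate_items(sample):
    print(problem)
```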
Example items file
id,item_name,url,image_url,group_ids,metadata:product_type,metadata:json:material,keywords,description,metadata:price
cotton-t-shirt,Cotton T-Shirt,https://constructor.com/,https://constructorio-integrations.s3.amazonaws.com/tikus-threads/2022-06-29/WOVEN-CASUAL-SHIRT_BUTTON-DOWN-WOVEN-SHIRT_BSH01757SB1918_3_category.jpg,tops-athletic|tops-casual,Shirts|T-Shirts,"Cotton","[""gym"",""casual"",""athletic"",""workout"",""comfort"",""simple""]","Treat yourself to a comfy upgrade with this Short Sleeve Shirt from Etchell's Emporium. This short-sleeve T-shirt comes with a classic crew-neck, giving you style and comfort that can easily be paired with a variety of bottoms.",18.00
An example items.csv file can also be viewed and downloaded from GitHub.
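If the items file is generated by a script, a CSV library can handle the quoting that JSON values need (note the doubled quotes in the example above). The following Python sketch is a minimal illustration using the standard csv and json modules. The function build_items_csv, the sample values, and the example.com URL are hypothetical. Joining list values with pipes is an assumption based on how group_ids and keywords are described in the table.

```python
import csv
import io
import json


def build_items_csv(items):
    """Serialize item dicts to CSV text, encoding list and JSON metadata values."""
    fieldnames = list(items[0])  # assumes every item uses the same keys
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for item in items:
        row = {}
        for name, value in item.items():
            if name.startswith("metadata:json:"):
                row[name] = json.dumps(value)  # the csv module escapes the quotes
            elif isinstance(value, (list, tuple)):
                row[name] = "|".join(value)  # pipe-separated list
            else:
                row[name] = value
        writer.writerow(row)
    return buffer.getvalue()


items = [
    {
        "id": "cotton-t-shirt",
        "item_name": "Cotton T-Shirt",
        "image_url": "https://example.com/images/cotton-t-shirt.jpg",
        "group_ids": ["tops-athletic", "tops-casual"],
        "metadata:price": "18.00",
        "metadata:json:notes": {"fit": "regular"},
    }
]
print(build_items_csv(items))
```

Reading the output back with csv.DictReader and json.loads recovers the original JSON value, which is a quick way to confirm that the quoting round-trips.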
Item groups (categories)
Items (products) are organized into groups - sometimes referred to as "categories." Users generally view these groups of items on browse result pages.
Each row in this file represents a single group (and corresponding browse page). A valid group hierarchy has a single top level group (generally named "All") that has a blank parent_id, and all other groups reference a parent group.
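The hierarchy rule above can be checked mechanically. The Python sketch below is one possible check, with a hypothetical function name validate_groups: it confirms that exactly one group has a blank parent_id and that every other parent_id refers to a group defined in the same file. It does not detect cycles between groups.

```python
import csv
import io


def validate_groups(csv_text):
    """Check for a single top-level group and that every parent_id is defined."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    group_ids = {row["id"] for row in rows}
    roots = [row["id"] for row in rows if not (row.get("parent_id") or "").strip()]
    problems = []
    if len(roots) != 1:
        problems.append(f"expected exactly one top-level group, found {len(roots)}")
    for row in rows:
        parent = (row.get("parent_id") or "").strip()
        if parent and parent not in group_ids:
            problems.append(f"group {row['id']!r} references unknown parent {parent!r}")
    return problems


# Hypothetical sample: "sneakers" points at a parent that is never defined.
groups = (
    "parent_id,id,name\n"
    ",all,All\n"
    "all,tops,Tops\n"
    "tops,tops-casual,Casual Tops\n"
    "shoes,sneakers,Sneakers\n"
)
print(validate_groups(groups))
```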
Item group definitions are encapsulated in a UTF-8 encoded CSV file named item_groups.csv with the data defined below:
Field name | Required | Description |
---|---|---|
id | ✓ | A unique group ID that is available for the beacon to track (in the DOM or on the window object) for each browse result page. Limited to 250 characters. |
parent_id | ✓ | The ID for the item group this group belongs to (must be defined in the same file). |
name | ✓ | The name of the browse page. Limited to 250 characters. |
data:url | | Browse page URL where this group of items is displayed to end users in production. |
Example item groups file
parent_id,id,name,data:url
,all,All,/browse/all
all,jackets,Jackets,/browse/jackets
all,tops,Tops,/browse/tops
tops,tops-athletic,Athletic Tops,/browse/tops/athletic
tops,tops-casual,Casual Tops,/browse/tops/casual
all,bottoms,Bottoms,/browse/bottoms
An example item_groups.csv file can be viewed and downloaded from GitHub.
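Since the group_ids values in the items file must correspond to IDs in the item groups file, a cross-check between the two files can catch mismatches early. The Python sketch below is a rough illustration. The function name unknown_group_ids and the sample data are hypothetical.

```python
import csv
import io


def unknown_group_ids(items_csv, groups_csv):
    """Map each item id to the group IDs it references that are not defined."""
    defined = {row["id"] for row in csv.DictReader(io.StringIO(groups_csv))}
    missing = {}
    for row in csv.DictReader(io.StringIO(items_csv)):
        # group_ids is pipe-separated and may be empty for items with no groups.
        referenced = [g for g in (row.get("group_ids") or "").split("|") if g]
        unknown = [g for g in referenced if g not in defined]
        if unknown:
            missing[row["id"]] = unknown
    return missing


groups_csv = "parent_id,id,name\n,all,All\nall,tops,Tops\n"
items_csv = "id,group_ids\ncotton-t-shirt,tops|sale\nplain-socks,\n"
print(unknown_group_ids(items_csv, groups_csv))  # {'cotton-t-shirt': ['sale']}
```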
Variations
Defining a variations file is optional and may not apply to all catalogs, though it may be helpful to separate the data for an item definition into multiple variations. Variation definitions are encapsulated in a UTF-8 encoded CSV file named variations.csv with the data defined below:
Field name | Required | Description |
---|---|---|
variation_id | ✓ | A unique ID for the item variation. Limited to 250 characters. |
item_id | ✓ | The ID for the item this variation belongs to (must be defined in items.csv). |
image_url | ✓ | The image URL for the item variation. We recommend image dimensions of at least 400 pixels per side. |
metadata:[name] | ✓ | Custom metadata indicating data points like inventory levels, promotional details, and other product-specific information. Each field can be individually configured to be searchable, displayable (that is, returned with the search results), or both. The fields should be named in the form metadata:[name], for example: metadata:price. There is not a pre-defined list of metadata names; rather, they can be arbitrary according to what is most useful. If the values contain JSON data, use the prefix metadata:json:[name], for example: metadata:json:notes. Note that the maximum length of the names of the metadata fields is 1,000 characters. There is no limit on the size of the values, but please work with your integration contact if you plan to upload large values. Please refer to the Facets & metadata guide for more information. |
item_name | | The name to display in results. When provided, it will be used in favor of the item_name defined within the items.csv file. Limited to 250 characters. |
url | | The page URL a user is taken to after selecting an item in autocomplete, search, browse, or recommendation results. |
description | | The item variation description. Limited to 1,000 characters. |
keywords | | Key terms or phrases that help users find the item variation, separated by pipe characters (\|). |
A shirt that comes in multiple colors would have a single item in the items file with multiple variations defined in the variations file. Each variation has different values for the metadata:color and image_url fields, but the same value for the item_id field.
Example variations file
variation_id,item_id,item_name,image_url,metadata:color,url
cotton-t-shirt-red,cotton-t-shirt,Cotton T-Shirt - Red,https://constructor.com/media/cotton-t-shirt.jpg,Red,https://constructor.com/products/cotton-t-shirt-red
cotton-t-shirt-blue,cotton-t-shirt,Cotton T-Shirt - Blue,https://constructor.com/media/cotton-t-shirt-blue.jpg,Blue,https://constructor.com/products/cotton-t-shirt-blue
An example variations.csv file can be viewed and downloaded from GitHub.
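Because every variation must reference an item defined in items.csv, the two files can be cross-checked before upload. The Python sketch below groups variation IDs under their parent item and collects any variations whose item_id is not defined. The function name group_variations and the sample data are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_variations(items_csv, variations_csv):
    """Return (variation IDs per item, variation IDs with an unknown item_id)."""
    item_ids = {row["id"] for row in csv.DictReader(io.StringIO(items_csv))}
    by_item = defaultdict(list)
    orphans = []
    for row in csv.DictReader(io.StringIO(variations_csv)):
        if row["item_id"] in item_ids:
            by_item[row["item_id"]].append(row["variation_id"])
        else:
            orphans.append(row["variation_id"])
    return dict(by_item), orphans


items_csv = "id,item_name\ncotton-t-shirt,Cotton T-Shirt\n"
variations_csv = (
    "variation_id,item_id,metadata:color\n"
    "cotton-t-shirt-red,cotton-t-shirt,Red\n"
    "cotton-t-shirt-blue,cotton-t-shirt,Blue\n"
    "wool-hat-grey,wool-hat,Grey\n"
)
by_item, orphans = group_variations(items_csv, variations_csv)
print(by_item)
print(orphans)
```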