API Documentation – Adopt Partners

Overview

Pendo Reseller Adopt End Users (AEUs) may want to utilize the API in order to push data from their Reseller Adopt Accounts to another source.

Only Reseller Adopt Subscription Admins have the ability to create API keys. They can create API keys by navigating to Settings > Integrations. If the Integrations tab is unavailable, the API has not been enabled for your Account. Please contact your Pendo Reseller Adopt Partner to inquire about API access.

Note: If you sign into your account from somewhere other than adopt.pendo.io you will need to update the endpoint URLs to match the custom hostname.

Aggregation API Documentation

The Aggregation API can be reached via the following endpoint:

https://adopt.pendo.io/api/v1/aggregation

Aggregations are a query language for accessing your data. They take sources of data and apply operators to do computations, and are written in JSON.

It is worth noting, the aggregations endpoint is not intended to be a bulk export feature, so breaking up aggregations by different time ranges may be necessary if you hit sizing or timeout limits.

Data querying is performed using a flexible aggregation pipeline modeled after MongoDB. Each step of the pipeline takes rows as its input and output zero or more rows. The basic unit is a row which contains named fields (possible nested) fields. At a high level, an aggregation consists of

A row source and time specification
A pipeline for row processing

All input data rows for the specified time period are fed into the aggregation pipeline. The rows which those pipeline yields are the result of the aggregation; a pipeline may result in no rows or in many rows depending on the pipeline and the input.

If the time specification expands into 24 separate time periods (one for each hour of a day, perhaps) then 24 independent pipelines are run, and the aggregation will have 24 sets of results. Each of those results in independent of the others.

Nested Fields

Example structure

{ "name": { "first": "bugs", "last": "bunny" } }

Wherever a field name is needed, a dotted list of field names can be used to specify nested objects.

Nested specifiers can be used to create new fields as well; any missing intermediate objects will be initialized.

In the example to the right, the field name.last has the value bunny.

Data Stores

Data Stores
Visitors	General information on all visitors ever seen.
Events	Each interaction with your application by a visitor. This includes click, page load, metadata, and guide events. The definitions of pages and guides.

Expressions

A number of pipeline operators support expressions. Expressions follow basic C expression syntax along with function calls. Supported operators are ==, !=, &&, ||, <, >, <=, >=, +, -, /, *, %, and !.

All numeric value are converted to float's for computational purposes, and C order of operation is used.

Parenthesis () may be used for grouping expressions.

The comparison operators can be used to compare strings, the logical operators and comparison work on boolean values, and the equality operators can compare two lists.

For the || and && operators, null is treated equivalently to false.

The == and != operators treat null as a distinct value (so null == null but null != false) and any other operator which has null as a parameter returns null as its value (so 1 + null == null). Looking up a field that does not exist in a row results in null.

Values may be extracted from list by using normal JavaScript style indexes such as listField[index]; sublists can be created using JavaScript style slices such as listField[start:end]. Additionally negative indexes can be used to look up items from the end; listField[-2] will return the next to last item in the list. This follows Python style syntax.

Arrays of values are supported, such as ["a","b","c"] or [1,1+2,sum(a)]. All array elements must be of the same type. You may select an element from the array using normal Javascript style indexes (i.e. ["a","b","c"][2]) as defined in the preceding paragraph.

Expression primitives are numbers (with or without a decimal point), strings enclosed in double quotation marks, true, false, and field names (using dotted notation to specify nested fields). The currently supported functions are listed below.

Functions
`ceil(val)`	returns the ceiling of numeric value val `null` if `val == null` `error` if `val` is non-numeric/non-null
`contains(listValue, item)`	returns `true` if `item` is in the specified list
`contains(haystackString, needleString)`	returns `true` if the `needle` is contained in the haystack (both as strings)
`floor(val)`	returns the floor of numeric value `val` `null` if `val == null` error if `val` is non-numeric/non-null
`formatTime(goFormatString, timeInMillis)`	formats the given time using the go-style go format string examples.
`hash(string)`	returns a 32 bit hash value for the given string
`if(testExpression, ifTrue, [ifFalse])`	returns the `ifTrue` value if `testExpression` is `true` returns the `ifFalse` value if `testExpression` is `false`
`isNull(field)`	returns `true` if `field` is null or does not exist returns `false` otherwise
`len(listField)`	returns the length of the specified list
`list(field1, field2, field3)`	returns a list containing the named fields
`listInRange(listField, minValue, maxValue)`	returns a list containing the elements in `listField` which are between `minValue` and `maxValue` (inclusive)
`map()`	explained in greater detail below
`now()`	returns the current time in milliseconds past the epoch
`reverse(list)`	returns the specified `list`, reversed
`round(val)`	returns the the numeric value `val` rounded to the nearest integer returns `null` if `val == null` returns `error` if `val` is non-numeric/non-null
`split(s, sep)`	splits the string `s` into a list of strings using string `sep` as the field separator
`startOfPeriod(periodName, timeStamp)`	returns the first `timestamp` in the period which includes the passed `timestamp` `period` can be one of "hourly", "daily", "weekly", or "monthly"
`startsWith(haystack, needle)`	returns `true` if the string `haystack` begins with the string `needle` returns `false` if `haystack` is `nil`
`sum(listValue)`	returns the `sum` of the items in the specified list, ignoring non-numeric elements

The `map()` function

Two parameter function

map(users, users.name)

When applied to the row

{ 
  "users": [
    { "id": 5, "name": "mickey" }, 
    { "id": "2", "name": "minnie" }
  ] 
}

Example response

["mickey", "minnie"]

The map() function comes in two variants, which take either two or three parameters. The simpler form is map(listField, expression), and it applies the given expression to each item in the specified list, and returns a list of the results.

The listField must specify a list of objects. This means a list of integers will not work.

> Three parameter function

map(x, names, len(x))

When applied to the row

{ "names": ["bugs", "daffy"] }

Example response

[ 4, 5 ]

The three parameter form of map() is more general and works on lists of any type. While it works on any type, it's important to note that it processes lists of objects slower than the two parameter form.

The first parameter is an identifier which the third parameter (the expression) can use to reference the particular value being mapped.

Both versions will return null if the list parameter is null.

Source specification

Definition

{ 
  "source": { 
    "sourceType": {SOURCE PARAMS} 
  }, 
  "timeSeries": { 
    "first": timeInMsOrExpression, 
    "count": numberOfPeriods, 
    "period": periodName 
  } 
}

The source key must be specified as the first step in the pipeline unless spawn is used; then no source can exist. This tells the aggregation what data to feed into the pipeline.

Details
sourceType	Source of data for the aggregation.

Row source specification

Think of row sources as tables, to borrow a term from relational databases. Once you understand what’s available in the row sources, you’ll be well on your way to knowing what’s possible overall.

When you return row source data, you can slice and dice it in a number of ways.

The short answer to “what can we access with the API” is: all the major entities, day/hour summaries for page usage, and event level data for guide activity and polls.

The sourceType specifies what kind of data is being aggregated. Each specific type has specific parameters. If no parameters are provided, the empty {} can be replaced with null.

Row sources specify what types of rows should go into the pipeline, and are normally parameterized. The time period for the rows is not specified here; it is part of the time series specification instead.

Sources
`events`	Returns summary data for all events on the system in a time period. Accepts an event class or list of event classes `{ "eventClass" : "web"\|"ios"\|["web","ios"] }` to return, "web" by default.
`guideEvents`	Returns all guide events (`guideSeen`/`dismissed`/`advanced`) for requested time period. Can be limited by `guideId` or `guideId` and `guideStepId`.
`guidesSeen`	Returns `firstSeenAt`, `lastSeenAt`, `lastState`, and `seenCount` for each visitor that saw the requested guide(s) in a time period. Parameter must be `{ "guideId" : guideId, "guideStepId" : guideStepId }`; leaving out guide (blank/null/unsupplied) shows rows for all steps on all guides, and leaving out step shows all steps on the requested guide.
`pageEvents`	Covers all pages unless `pageId` is provided to narrow the results to a single pageId. Returns results for requested time period. Accepts an event class or list of event classes `{ "eventClass" : "web"\|"ios"\|["web","ios"] }` to return, "web" by default.
`pollEvents`	Returns all poll responses for requested time period. Can be limited by `guideId` or `guideId` and `pollId`.
`pollsSeen`	Returns poll responses in a time period. If visitor has multiple responses to the same poll, only the most recent is returned. Parameter must be `{ "guideId" : guideId, "pollId" : pollId }`
`visitors`	All visitors with their metadata. Parameter may be `{ "identified" : true }` to eliminate anonymous visitors

Time Series

Example time series for July 10, 2015 through July 12, 2015

{ 
  "timeSeries": { 
    "period": "dayRange", 
    "first": 1436553419000, 
    "count": 3 
  } 
}

Example time series for three hours ago to (and including) the current hour.

{ 
  "timeSeries": { 
    "period": "hourRange", 
    "first": "now() - 3 * 60601000", 
    "last": "now()" 
  }
}

The timeSeries consists of the length of each time period and a timestamp for the first period in milliseconds since the epoch. If first is a string it will be evaluated as an expression to derive a timestamp.

It also requires either a count of the number of periods to include in the aggregation or an expression which evaluates to the last time period to include in the aggregation.

Details
period	`dayRange` or `hourRange`
first	The timestamp in milliseconds after the epoch for the start of period.
count	Number of periods to include in the aggregation

Full time periods are always evaluated. Even if the first time does not coincide with the beginning of a period, all of the data for the period which contains time first will be used.

For the dayRange period, the day aligns with the time zone for your account. This does mean that there will occasionally by dayRange periods which include 23 or 25 hours of data because of the switch to and from daylight savings time. If this is undesirable, hourRange may be used, though there will be a significant performance impact when large amounts of data are being evaluated.

Events - grouped

The following row sources require a date range (relative or absolute) and period (day). They return a row of data for each unique combination of day/hour, visitorId, accountId, server name, and IP address.

Attributes
`events`	All recorded click and pageview events (tagged or untagged)
`pageEvents`	All recorded pageviews matching tagged pages

The `events` row source

Definition

{
    "source": {
        "events": null,
        "timeSeries": {
            "first": "{timeInMsOrExpression}",
            "count": "{numberOfPeriods}",
            "period": "{periodName}"
        }
    }
}

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"events":null,"timeSeries":{"first":"1506977216000","count":-30,"period":"dayRange"}}}]}}'

Example response

{
    "results": [
        {
            "accountId": "Kappa Trust",
            "visitorId": "59",
            "numEvents": 12,
            "numMinutes": 1,
            "server": "www.some-domain.com",
            "remoteIp": "59.89.251.103",
            "parameters": null,
            "userAgent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
            "day": 1506830400000,
            "pageId": "allevents"
        }
    ],
    "startTime": 1504324800000
}

Returns a row for each unique combination of day/hour, visitorId, accountId, server name, IP address, and user agent.
Encompasses all events, whether tagged or untagged (note the pageId “allevents”)
Useful for measuring total time, days active, etc.
Optionally pass an eventClass of web or iOS to specify which event bucket to return.

The `pageEvents` row source

Definition

{
    "source": {
        "pageEvents": {
            "pageId": "{pageId}"
        },
        "timeSeries": {
            "first": "{timeInMsOrExpression}",
            "count": "{numberOfPeriods}",
            "period": "{periodName}"
        }
    }
}

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"pageEvents":{"pageId":"aJ0KOADQZVL5h9zQhiQ0q26PNTM"},"timeSeries":{"first":"1506977216000","count":-30,"period":"dayRange"}}}]}}'

Example response

{
    "results": [
        {
            "accountId": "Acme Corp",
            "visitorId": "10",
            "numEvents": 1,
            "numMinutes": 1,
            "server": "www.some-domain.com",
            "remoteIp": "110.227.245.175",
            "parameters": null,
            "userAgent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36",
            "day": 1506484800000,
            "pageId": "aJ0KOADQZVL5h9zQhiQ0q26PNTM"
        }
    ]
}

Includes events which can be matched to tagged page rules
Returns a row for each unique combination of page ID, day/hour, visitorId, accountId, server name, IP address, and user agent.
Accepted parameters are null for all page data or pageId:PAGE_ID for a specific page.

Events - ungrouped

For some events, we expose ungrouped events. These are visible for a specified time period.

Sources
`guideEvents`	Interactions your visitors have with guides
`guidesSeen`	Summary of when visitors see guides
`guidesSeenEver`	Summary of when visitors see guides
`pollEvents`	Interactions your visitors have with polls
`pollsSeen`	Summary of when and how visitors respond to polls
`pollsSeenEver`	Summary of when and how visitors respond to polls

The `guideEvents` row source

Definition

{
    "source": {
        "guideEvents": null,
        "timeSeries": {
            "first": "{timeInMsOrExpression}",
            "count": "{numberOfPeriods}",
            "period": "{periodName}"
        }
    }
}

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"guideEvents":null,"timeSeries":{"first":"1506977216000","count":-10,"period":"dayRange"}}}]}}'

Example response

    { "accountIds": [ "Acme" ], "browserTime": 1462796392383, "country": "US", "type": "guideAdvanced", "guideId": "He3BLNt44i0xwO5AEJVIijEswro", "guideStepId": "4TaMV6TLY1JmKIFkprqOcQllU8k", "latitude": 35.823483, "loadTime": 0, "longitude": -78.825562, "region": "nc", "remoteIp": "209.136.209.34", "serverName": "your.app.com", "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/601.5.17 (KHTML, like Gecko) Version/9.1 Safari/601.5.17", "visitorId": "johndoe@acme.com", "accountId": "Acme" }

Returns one event row for every guide event that occurs. This is not a summary or bucket structure like previous examples.
Type field values include guideAdvanced, guideSeen, and guideDismissed
Accepted parameters include guideId and guideStepId to drill down into a specific guide or step.
pollEvents behaves similiarly and provides one row for every poll response that occurs. You can specify pollId as a parameter to that source.

The `guideSeen` row source

Definition

{
    "source": {
        "guidesSeen": {
            "guideId": "{guideId}",
            "guideStepId": "{guideStepId}"
        },
        "timeSeries": {
            "first": "{timeInMsOrExpression}",
            "count": "{numberOfPeriods}",
            "period": "{periodName}"
        }
    }
}

Exmaple request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"guidesSeen":{"guideId":"nWtgylnMqWeB4DynO5duF9ako-M","guideStepId":"mHO55y6WJvJ0CnsmUgAtiGtgobA"},"timeSeries":{"first":"1506977216000","count":-10,"period":"dayRange"}}}]}}'

Example response

{
    "firstSeenAt": 1489673759222,
    "guideId": "nWtgylnMqWeB4DynO5duF9ako-M",
    "guideStepId": "mHO55y6WJvJ0CnsmUgAtiGtgobA",
    "lastAdvancedAutoAt": 0,
    "lastDismissedAutoAt": 0,
    "lastSeenAt": 1489674347693,
    "lastState": "advanced",
    "seenCount": 5,
    "visitorId": "avery"
}

Returns a row for each unique combination of Visitor ID, guide ID, and step ID
You can additionally supply guideId and guideStepId to drill into a specific guide or step.
The guidesSeenEver row source can also be used for the same output, but is time insensitive and does not require a timeSeries.

The `pollEvents` row source

Definition

{
    "source": {
        "pollEvents": null,
        "timeSeries": {
            "first": "{timeInMsOrExpression}",
            "count": "{numberOfPeriods}",
            "period": "{periodName}"
        }
    }
}

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"pollEvents":null,"timeSeries":{"first":"1506977216000","count":-10,"period":"dayRange"}}}]}}'

Example response

{
    "accountIds": [
        "Acme"
    ],
    "browserTime": 1462796392383,
    "country": "US",
    "type": "pollResponse",
    "guideId": "He3BLNt44i0xwO5A201IijEswro",
    "latitude": 35.823483,
    "loadTime": 0,
    "longitude": -78.825562,
    "pollId": "z7y94y62v3c",
    "region": "nc",
    "remoteIp": "209.136.209.34",
    "serverName": "your.app.com",
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
    "visitorId": "user@acme.com",
    "accountId": "Acme",
    "pollResponse": "You guys rock!"
}

Returns one row for every poll response that occurs.
Accepted parameters include guideId and pollId to drill down into a specific poll.

The `pollsSeen` row source

Definition

{
    "source": {
        "pollsSeen": {
            "guideId": "{guideId}",
            "pollId": "{pollId}"
        },
        "timeSeries": {
            "first": "{timeInMsOrExpression}",
            "count": "{numberOfPeriods}",
            "period": "{periodName}"
        }
    }
}

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"pollsSeen":{"guideId":"1wz09ZEAFp-HVl78Iw8oxpwbRZA","pollId":"3isny1qp7t2"},"timeSeries":{"first":"1506977216000","count":-10,"period":"dayRange"}}}]}}'

Example response

{
    "accountId": "1",
    "guideId": "1wz09ZEAFp-HVl78Iw8oxpwbRZA",
    "pollId": "3isny1qp7t2",
    "response": 0,
    "time": 1500305598629,
    "visitorId": "1"
}

Returns poll responses in a time period
Only the most recent poll response for a visitor is displayed i.e. if a visitor has responded to the same poll multiple times, pollsSeen will only return most recent response. To return all responses, use pollEvents.
You must supply guideId and pollId as parameters to drill into specific poll events
The pollsSeenEver row source can also be used for the same output, but is time insensitive and does not require a timeSeries.

Entities

Entities are the nouns of Aggregations.

Entities
`guides`	In-app messages you show your visitors
`pages`	Sets of rules that identify individual pages in your product (defined by a URL rule)
`visitors`	Your product’s users

The `guides` row source

Definition

{ "source": { "guides": null } }

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"guides":null}}]}}'

Example response

{
    "attributes": { ... },
    "audience": [ ... ],
    "createdAt": 1440449676899,
    "createdByUser": {
        "first": "Someone",
        "id": "5269565705551872",
        "last": "Lastname",
        "role": 8,
        "username": "someone@yourcompany.com"
    },
    "expiresAfter": 1440472740000,
    "id": "BF_p40WvYWC2o81pvsdt1kjcIkI",
    "isMultiStep": false,
    "lastUpdatedAt": 1460734696429,
    "lastUpdatedByUser": {
        "first": "",
        "id": "4985730428305408",
        "last": "",
        "role": 8,
        "username": "someone@yourcompany.com"
    },
    "launchMethod": "auto",
    "name": "2015-08-24 - Maintenance tonight.",
    "state": "disabled",
    "stateHistory": [ ... ],
    "steps": [
        {
            "advanceMethod": "button",
            "attributes": {
                "height": 395,
                "width": 440
            },
            "content": "",
            "createdByUser": {
                "first": "Someone",
                "id": "5269565705551872",
                "last": "Lastname",
                "role": 8,
                "username": "someone@yourcompany.com"
            },
            "elementPathRule": "",
            "guideId": "BF_p40WvYWC2o81pvsdt1kjcIkI",
            "id": "Itfe6aDRNtgx9J1XisQwaCxDS-A",
            "lastUpdatedAt": 1460734696429,
            "lastUpdatedByUser": {
                "first": "",
                "id": "4985730428305408",
                "last": "",
                "role": 8,
                "username": "someone@yourcompany.com"
            },
            "name": "",
            "rank": 5000000,
            "resetAt": 0,
            "type": "lightbox"
        }
    ]
}

steps include a single item for lightboxes, tooltips, banners, etc. and multiple items for walkthroughs.

The `pages` row source

Definition

{ "source": { "pages": null } }

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"pages":null}}]}}'

Example response

{
    "results": [
        {
            "color": "gray",
            "createdAt": 1505938241596,
            "createdByUser": {
                "first": "Adam",
                "id": "5197072786522112",
                "last": "Lohner",
                "role": 8,
                "username": "somePerson@email.com"
            },
            "dirty": false,
            "group": {
                "color": ".groupColor01",
                "createdAt": 1505937447295,
                "createdByUser": {
                    "first": "",
                    "id": "",
                    "last": "",
                    "role": 0,
                    "username": ""
                },
                "description": "",
                "id": "DEFAULTGROUP_01_",
                "items": [

                ],
                "lastUpdatedAt": 1505938327653,
                "lastUpdatedByUser": {
                    "first": "Adam",
                    "id": "5197072786522112",
                    "last": "Lohner",
                    "role": 8,
                    "username": "somePerson@email.com"
                },
                "length": 6,
                "name": "Dashboard"
            },
            "id": "aJ0KOADQZVL5h9zQhiQ0q26PNTM",
            "kind": "Page",
            "lastUpdatedAt": 1505938241596,
            "lastUpdatedByUser": {
                "first": "Adam",
                "id": "5197072786522112",
                "last": "Lohner",
                "role": 8,
                "username": "somePerson@email.com"
            },
            "name": "Dashboard",
            "rootVersionId": "aJ0KOADQZVL5h9zQhiQ0q26PNTM",
            "rules": [
                {
                    "designerHint": "http://some-domain.com/",
                    "parsedRule": "^https?://[^/]/?(?:;[^#])?(?:\?[^#])?(?:#.)?$",
                    "rule": "//*"
                }
            ],
            "rulesjson": "",
            "stableVersionId": "aJ0KOADQZVL5h9zQhiQ0q26PNTM-20170920201041.596687476",
            "validThrough": 1506981600000
        }
    ]
}

Tagged pages. Returns one row for each tagged page.

Visitors

Definition

{ "source": { "visitors": null } }

Example request

curl -X POST 
https://adopt.pendo.io/api/v1/aggregation 
-H 'content-type: application/json' 
-H 'x-pendo-integration-key: [add_your_integration_key_here]' 
-d '{"response":{"mimeType":"application/json"},"request":{"pipeline":[{"source":{"visitors":null}}]}}'

Example response

{
    "results": [
        {
            "metadata": {
                "agent": {
                    "email": "ac@Cras.co.uk",
                    "ipaddress": "184.169.45.4",
                    "language": "en_US"
                },
                "auto": {
                    "accountid": "Epsilon united",
                    "accountids": [
                        "Epsilon united"
                    ],
                    "firstvisit": 1506965912312,
                    "id": "84",
                    "idhash": 2012371633,
                    "lastbrowsername": "Chrome",
                    "lastbrowserversion": "61.0.3163",
                    "lastoperatingsystem": "Mac OS X",
                    "lastservername": "some-domain.com",
                    "lastupdated": 1506976690418,
                    "lastuseragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
                    "lastvisit": 1506976690112,
                    "nid": 84
                }
            },
            "visitorId": "84"
        }
    ]
}

The visitors source returns one row per visitor ever seen and contains various metadata about the visitor. Each visitor has a unique visitorId.

Accepted parameters are null or {"identified":true} to eliminate anonymous visitors.

Metadata
auto	data automatically calculated
custom	custom field data
salesforce	data synced from Salesforce
segment	data synced from Segment

Pipeline

Example sytnax #1

{ "cat": null }

Example sytnax #2

{ "field": "firstName", "operator": "==", "value": "John" }

Each step is of the form { "name" : parameterObject }.

The source rows are transformed into result rows by the pipeline; the final rows are the result of the aggregation step. Pipelines can be arbitrarily long, and are represented as an ordered list of individual steps.

Operators

Operators
`cat`	Takes no parameters, and returns each row without changing it.
`count`	Counts the number of input rows and returns a single row of the form `{ "count" : 456 }`
`accumulate`	Keeps a running total of more than one field, and includes the current value of the total in each row. Parameters are of the forum `{ "totalField1" : "sourceField1" }`
`eval`	Inserts new fields based on one or more expressions. Format is an `{ "fieldName1" : "expression" }`
`identified`	Eliminates rows whose named field looks like an anonymous visitorId `{ "identified" : "visitorId" }` where visitorId is the field name to check.
`limit`	Returns the first N rows and discards the rest `{ "limit" : 5 }`
`select`	Builds a new row (while discarding the old one) containing one or more fields. Format is an `{ "fieldName1" : "expression" }`
`set`	Sets fields to constant values `{ "set" : { "field1" : 5, "field2.nested" : 20 } }`
`sort`	Sorts the rows by the specified field list `{ "sort" : [ "fieldName", "anotherFieldName" ] }` Prepend `a -` to the front of a field name to reverse sort by that field.
`unmarshal`	Expands an embedded JSON string into sub-objects.
`useragent`	Expands a user agent string into a parsed set of strings.

The `spawn` operation

Definition

{
    "spawn": [
        [
            {
                "source": {
                    "events": null,
                    "timeSeries ": {
                        "period": "dayRange",
                        "first": 1404745200000,
                        "count": 7
                    }
                }
            },
            {
                "cat": null
            }
        ],
        [
            {
                "source": {
                    "events": null,
                    "timeSeries ": {
                        "period": "month",
                        "first": 1404745200000,
                        "count": 7
                    }
                }
            },
            {
                "count": null
            }
        ]
    ]
}

The spawn operator behaves like fork, but each nested pipeline must specify its own source. This allows multiple sources to be used to create a single stream of rows which are processed by the remainder of the pipe.

Only sources which cause a single pipeline to be instantiated may be used; this means dayRange, hourRange, or any source whose count is 1 is available.

The timeSeries for the source is part of the source specification for the spawn pipeline.

The `switch` operation

Definition

{
    "switch": {
        "newField": {
            "existingField": [
                {
                    "value": "small",
                    "<": 10
                },
                {
                    "value": "medium",
                    ">=": 10,
                    "<": 20
                },
                {
                    "value": "large",
                    ">=": 20,
                    "<": 50
                }
            ]
        }
    }
}

The switch operation allows a field to checked against any number of conditions, and have a new field set in the row based on which condition the original field matches.

For each row the existingField will be checked to see which set of conditions its value matches.

It must match all of the conditions for newField to be set to the value indicated. If no conditions match, or if existingField is not set at all, the row is skipped.

For the above input, the row { "existingField" : 15 } will become { "existingField" : 15, "newField" : "medium" }

The `filter` operation

Definition

{ "filter": "fieldName == 5 || len(anotherField) > 3" }

The filter pipeline operator evaluates an expression and eliminates rows that for which the expression is false.

The `fork` operation

The following calculates the total time for each accountId and visitorId

{
    "fork": [
        [
            {
                "group": {
                    "group": [
                        "visitorId"
                    ],
                    "fields": [
                        {
                            "time": {
                                "sum": "numMinutes"
                            }
                        }
                    ]
                }
            }
        ],
        [
            {
                "group": {
                    "group": [
                        "accountId"
                    ],
                    "fields": [
                        {
                            "time": {
                                "sum": "numMinutes"
                            }
                        }
                    ]
                }
            }
        ]
    ]
}

The fork operator allows the pipeline to be split into multiple pipelines, with each of the child pipelines receiving all of the rows for that step.

Note that the same row is sent to all of the child pipelines; it is not a copy of the row and any changes to the row made in one pipeline will be visible in all pipelines.

The rows which result from each pipeline are all sent through the remainder of the parent pipeline. Forks may be nested, however this has not been well-tested (aka - no promises).

Empty pipelines (which would pass through the rows without modifying them) are not permitted, but a pipeline with a single cat operation performs identically.

The `join` operation

Definition

{ "join": { "fields": [ "firstName", "lastNme" ], "width": 3 } }

The join operator merges multiple rows together based on one or more keys. If there are field conflicts, the values from rows encountered later in the join will win. The optional widthargument allows the user to tell join how many rows are expected to make up the final result. Once that many rows with the same join key have been seen, the merged row is passed to the next operation in the pipeline. If more rows are later seen with that join key, they are treated as a new join operation. Any join keys which have had less than width rows are processed once all input rows have been seen.

The `group` operation

Example request to calculate how much time each identified visitor has spent on the site over 30 days

{
    "requestId": "last30daysVisitorTime",
    "timeSeries": {
        "period": "dayRange",
        "count": 30,
        "first": 1404745200000
    },
    "source": {
        "events": null
    },
    "pipeline": [
        {
            "identified": "visitorId"
        },
        {
            "group": {
                "group": [
                    "visitorId"
                ],
                "fields": [
                    {
                        "minutes": {
                            "sum": "numMinutes"
                        }
                    }
                ]
            }
        }
    ]
}

NOTE: The count value of 180 has to be a long enough period to cover the entire time the guide has been in existence.

The group operator supports flexible grouping (similar to GROUP BY in SQL) and simple aggregations of fields. It supports the following parameters:

Parameters
`group`	List of fields whose values match in each grouping
`fields`	Set of fields whose values are aggregated across the group
`stringSort`	Result rows will be sorted by the group key. The group key is treated as a string for sort purposes (so "2" > "10"). This provides a stable ordering for the result.

Each resulting row consists of all of the fields specified by both the group and fields specification.

Aggregation fields consist of field name to use in the output row, an aggregation name, and a field name from the input row. For example, to count the rows in a group use { "outputField" : { "count" : "inputField" } } will count the number of rows which match (here the field name must be specified, but is irrelevant). The following row aggregators are supported:

Aggregators
`count`	Counts the number of rows in the group. If a field name is it counts the unique occurrences of that field.
`listAverage`	Average the individual values in a list arguement and return a list of those averages.
`max`	Returns the largest value of the specified (numeric) field.
`min`	Returns the smallest value of the specified (numeric) field.
`sum`	Adds up the values of the specified field.
`first`	Returns the first value of the specified field (note: "first" is only defined as deterministically as the input order is).

Count the number of unique visitorIds use the following pipeline:

[
    {
        "group": {
            "group": [
                "visitorId"
            ]
        }
    },
    {
        "visitorCount": {
            "count": null
        }
    }
]

To get the number of events for each visitorIduse:

[
    {
        "group": {
            "group": [
                "visitorId"
            ],
            "fields": {
                "eventCount": {
                    "count": "event"
                }
            }
        }
    }
]

The `reduce` operation

Count the number of rows and add up the minutes fields in the rows:

[
    {
        "reduce": {
            "eventCount": {
                "count": null
            },
            "totalMinutes": {
                "sum": "numMinutes"
            }
        }
    }
]

The reduce operator collects all rows into a single row, and performs aggregations on fields. The aggregators and format are identical to the fields parameter for the group pipeline operation.

Example request to calculate the total number of events and the total number of minutes for a thirty day period.

{
    "requestId": "last30DaysEventAndMinuteTotals",
    "timeSeries": {
        "period": "dayRange",
        "count": 30,
        "first": 1404745200000
    },
    "source": {
        "events": null
    },
    "pipeline": [
        {
            "reduce": [
                {
                    "events": {
                        "sum": "numEvents"
                    }
                },
                {
                    "minutes": {
                        "sum": "numMinutes"
                    }
                }
            ]
        }
    ]
}

Count the number of unique visitors in a dataset

{
    "requestId": "numberOfUniqueVisitors",
    "timeSeries": {
        "period": "dayRange",
        "count": 30,
        "first": 1404745200000
    },
    "source": {
        "events": null
    },
    "pipeline": [
        {
            "reduce": [
                {
                    "reduce": {
                        "visitors": {
                            "count": "visitorId"
                        }
                    }
                }
            ]
        }
    ]
}

Example request using group and reduce to count the number of visitors in a set of events and the total number of minutes used by those visitor:

{
    "requestId": "lastDaysEventAndMinuteTotals",
    "timeSeries": {
        "period": "dayRange",
        "count": 30,
        "first": 1404745200000
    },
    "source": {
        "events": null
    },
    "pipeline": [
        {
            "group": {
                "group": [
                    "visitorId"
                ],
                "fields": {
                    "visitorMinutes": {
                        "sum": "numMinutes"
                    }
                }
            },
            "reduce": {
                "visitorCount": {
                    "count": "numVisitors"
                },
                "totalMinutes": {
                    "sum": "visitorMinutes"
                }
            }
        }
    ]
}

The `segment` operation

Definition

{ "segment": { "id": "id" } }

Segments are used to preselect groups of visitors. Preexisting segments filter rows via the segment operator.

An input row is passed if the visitorId in the input row match an item in the segment.

The `bulkExpand` operation

Expand visitors into a subdocument called info

{ "bulkExpand": { "info": { "visitor": "visitorId" } } }

The following data types support expand: account and visitor.

The `unwind` operation

To unwind the following:

{ 
  "name": "fred", 
  "items": [
    { "v": "first" }, 
    { "v": "second" }, 
    { "v": "third" } 
  ] 
}

Try:

{ 
  "unwind": { 
    "field": "items", 
    "index": "itemIndex" 
  } 
}

Response would yield:

[{ 
  "name": "fred", 
  "items": { "v": "first" }, 
  "itemIndex": 0 
}, { 
  "name": "fred", 
  "items": { "v": "second" }, 
  "itemIndex": 1 
}, { 
  "name": "fred", 
  "items": { "v": "third" }, 
  "itemIndex": 2 
}]

The unwind operator expands a list of items into separate rows. All other items are preserved, and an index may optionally be added.

Add "prefix" : true and you will get:

[
  { 
    "name": "fred", 
    "items": [{ "v": "first" }], 
    "itemIndex": 0 
  }, { 
    "name": "fred", 
    "items": [{ "v": "first" }, { "v": "second" }], 
    "itemIndex": 1 
  }, { 
    "name": "fred", 
    "items": [{ "v": "first" }, { "v": "second" }, { "v": "third" }], 
    "itemIndex": 2 
  }
]

Alternatively, if everything to the left of each items is important, unwind can preserve those items.

If the list to unwind is empty then no rows will be emitted unless { "keepEmpty" : true } is specified, in which case the original row will be preserved.

The `useragent` operation

Definition

{ "useragent": { "output": "input" } }

The useragent operation parses the user agent string in the field input and renders it into a object in output.

Output
`name`	the browser's name (e.g. "MSIE", "Chrome", etc.)
`version`	the version of the browser
`os`	the operating system (e.g. "Windows", "Mac OS X", "GNU/Linux", etc.)
`type`	the type of user agent (e.g. "Browser", "Crawler", etc.)

Multi-App

Using the pages source as an example, without querying for a specific app, the source looks like this:

{ "source": { "pages": null } }

To execute Aggregations specific to an app, you'll need to change the source value within the aggregation from null - which is an indicator to return all records for all apps associated with the sub - to include an appId value.

Including the appId scopes the results to the data associated with just that application within the subscription.

For all multi-app Aggregations, use the primary app's Integration API key for the value for x-pendo-integration-key in the Header.

appId

To query on a specific app, source will look like this:

{
    "source": {
        "pages": {
            "appId": 5629499534213120
        }
    }
}

PendoAdopt_apiDocumentation.md
40 KB Download

How Can We Help?

Overview

Aggregation API Documentation

Nested Fields

Data Stores

Expressions

The map() function

Source specification

Row source specification

Time Series

Events - grouped

The events row source

The pageEvents row source

Events - ungrouped

The guideEvents row source

The guideSeen row source

The pollEvents row source

The pollsSeen row source

Entities

The guides row source

The pages row source

Visitors

Pipeline

Operators

The spawn operation

The switch operation

The filter operation

The fork operation

The join operation

The group operation

The reduce operation

The segment operation

The bulkExpand operation

The unwind operation

The useragent operation

Multi-App

appId

Related articles

The `map()` function

The `events` row source

The `pageEvents` row source

The `guideEvents` row source

The `guideSeen` row source

The `pollEvents` row source

The `pollsSeen` row source

The `guides` row source

The `pages` row source

The `spawn` operation

The `switch` operation

The `filter` operation

The `fork` operation

The `join` operation

The `group` operation

The `reduce` operation

The `segment` operation

The `bulkExpand` operation

The `unwind` operation

The `useragent` operation