Day 11 - Sales Attribution
On February we allowed visitors on the website to sign up for a 14-day trial. At that time we decided to get rid of Hubspot because:
- Hubspot.js payload. Every lighthouse audit is complaining about the sheer size of that bundle (which includes attribution, chat, forms...)
- Segment.com and Hubspot don't work together. If you want all your tracking calls and identify traits to show up in Hubspot (preferably as custom fields) you need to upgrade to a Hubspot enterprise subscription. $$$
- Attribution has now been fixed. We enjoyed hubspot reports on where a visitor originally came from and which sources helped in the final conversion.
We needed something simpler that we could extend so decided to use airtable for a backend to power the forms on the website (demo, contact form, trial form).
🔗Airtable as backend
All forms on the website are now submitted to an /api/form endpoint. The most common form on the website is the request a trial flow:
The good part about creating the license first (Prezly Admin API) is that we receive a customer_id and a user_id which can later be used to link sales data back to the original form submission. In a way, you could say that Airtable has the mapping information to link visitor sessions (anonymousId, userId) to Prezly database information.
Turns out Airtable is super friendly to anyone that wants to organise those form submissions in a way that is useful for them. Take this view for example:
Which is an overview of expired customers that have healthy activity.
🔗Extra data pushed into Airtable
Without going into details on how we did it (Lambda functions, Airtable automation and webhooks) here is the information we additionally store into the form submissions base:
- userId, licenseId: see above
- Channel Attribution Data:
- First Attribution Channel
- Last Attribution Channel
- Nr Attributions: How many channels contributed
- Qualification: Manual qualification low, medium, high.
- Health Score: Score calculated by Vitally around activity (nr stories, nr session, ...)
- Customer Status: Cancelled, Trial, Customer
- MRR: Monthly Recurring Revenue
It would be nice to get some of that information back into segment.com (and mixpanel) so I create additional reports (already did some in day 7) answering questions:
- Which channels bring in high-quality prospects?
- How much MRR did Twitter bring in in the last 3 months?
- How many channels on average contribute to a sale?
🔗The Code
The first thing we do is add the handlers
Notice how I add a GET and POST endpoint? They are both doing the same thing but the GET endpoint is easier to use to debug/test and backfill the sales data for the last X months of trials.
The code to fetch data from airtable + fire identify calls is wrapped in a method trackUserSales.
I also added some variables to the .env files so make sure to have a look (or check out main) to make sure you have all config variables. Using a quick 'sls deploy' should publish that function.
🔗Feed historic
To overwrite/fill in the history of our website data I have copied the list of all userIds from airtable. Other ways to find userIds:
- Segment Audience builder > Download CSV (only userId)
- Create mixpanel segment > Download > Remove columns
- Use the data warehouse to find the userId for all users that did a specific event
That gave me a list of a few thousand userIds (I went with 3 months) which I stored in 'trial_userids.csv'. Using some bash I called the GET endpoint to backfill that data.
The code to trigger the calls (one call for every line in CSV):
> cat trials_alltime.csv| while read line; do https --print=b --ignore-stdin [LAMBDA_ENDPOIN]/prod/segment/identify/user/$line/sales; done
After I ran that command I could see that sales attribution data in mixpanel:
🔗Real-time sales attribution
With the data now being backfilled, I need a way for this to automatically be triggered. I was planning to use Segment.com Personas > Journey for that with the following criteria:
- After requesting a trials
- T+24hours
- T+3 days (qualification will be done)
- T+7 days (trial usage data, health score)
- T+14 days (trial over, final health score and usage)
- T+21 days (trials that activate within a week after expiry)
- Based on lastSyncedSales + 30 days ago
That combination will make sure that we get pretty accurate reporting on new trials while the continuous sync (based on lastSyncedSales) will ensure that even older trials get refreshed every 30 days.
So I set this up using a Journey that looks like this
I am recycling the webhook from Day 10 which I modified to support the new steps (by key). The code now looks like this:
I set up a similar journey based on the lastSyncedSales property to ensure old trials are kept up to date at least once every month.
Tomorrow I will have a look at the different questions in this post, and if they can now be answered in Mixpanel just using this new data.
Other articles in the series
05/07/2021
Day 11 - Sales Attribution
03/07/2021
Day 10 - Six months later
03/06/2020
Day 9 - Dealing with tracking/ad blockers
18/05/2020
Day 8 - Feeding in sales data
06/05/2020
Day 7 - Reporting on visitor sources
01/05/2020
Day 6 - Feeding source attribution data back to Segment.com
27/04/2020
Day 5 - Feed old events
24/04/2020
Day 4 - Run in production + API
22/04/2020
Day 3 - Cleanup & Identify Visitor Source
21/04/2020
Day 2 - Capture segment events
20/04/2020
Day 1 - The Masterplan
19/04/2020
Solving marketing attribution (using segment)