Day 6 - Feeding source attribution data back to Segment.com
So in the first 5 days we have created and deployed some Lambda functions that:
- store incoming Segment events in a DynamoDB table (webhook)
- store the mapping between anonymousId and userId in a DynamoDB table (webhook)
- process the URL and HTTP referrer to detect where the traffic comes from, using this GitHub library
- allow us to query (for debugging purposes) that data by userId or anonymousId
Additionally, I have imported a 3-year archive of page() and identify() calls so we don't have to start from scratch.
Next up: Feeding this data back to segment.com.
Triggering the right segment events
Over time we want to trigger those events in real time, but I will start by providing an endpoint to trigger them manually so we can check in Segment, Mixpanel and Customer.io that everything is coming in.
In the first version of the code below I was using the Segment analytics Node SDK (https://github.com/segmentio/analytics-node). It worked fine when run from the command line (node script.js), but in the Lambda/async architecture I found that some calls were not being sent. Then I found this bug, so I switched to the HTTP tracking API, specifically the batch API, to feed in my events.
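A minimal sketch of posting to the batch endpoint, not the actual file from this project. It assumes Node 18+ (for the global fetch) and a hypothetical SEGMENT_WRITE_KEY environment variable; the helper names are made up for illustration. Awaiting the request is one way to make sure it completes before the Lambda handler returns.

```javascript
// Sketch only: send identify/track messages through Segment's HTTP batch endpoint.
// SEGMENT_WRITE_KEY is an assumed (hypothetical) environment variable name.
const SEGMENT_BATCH_URL = 'https://api.segment.io/v1/batch';

// Segment uses HTTP Basic auth: the write key is the username, the password is empty.
function authHeader(writeKey) {
  return 'Basic ' + Buffer.from(writeKey + ':').toString('base64');
}

// The batch body is an object with a `batch` array of messages,
// each carrying its own `type` ('identify', 'track', ...).
function buildBatch(messages) {
  return { batch: messages };
}

// Await this inside the Lambda handler so the request finishes before the handler returns.
async function sendBatch(messages, writeKey = process.env.SEGMENT_WRITE_KEY) {
  const res = await fetch(SEGMENT_BATCH_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: authHeader(writeKey),
    },
    body: JSON.stringify(buildBatch(messages)),
  });
  if (!res.ok) {
    throw new Error(`Segment batch request failed with status ${res.status}`);
  }
  return res.status;
}
```

A call would look like `await sendBatch([{ type: 'identify', userId: 'u1', traits: { plan: 'pro' } }])`.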
For now we’ll start with 2 attribution models: first touch and last touch. To let the different reporting tools use that information in their reports, we will write user traits using identify().
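To make the two models concrete, here is a rough sketch of picking the first and last touch from a list of detected visits. The shape of a visit (a timestamp plus a source object) is an assumption for illustration, not the structure stored in DynamoDB.

```javascript
// Sketch only: choose first-touch and last-touch visits by timestamp.
// `visits` is assumed to look like [{ timestamp: '2020-01-01T00:00:00Z', source: {...} }].
function pickTouches(visits) {
  if (visits.length === 0) {
    return { first: null, last: null };
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...visits].sort(
    (a, b) => new Date(a.timestamp) - new Date(b.timestamp)
  );
  return { first: sorted[0], last: sorted[sorted.length - 1] };
}
```

With a single visit, first and last touch are the same visit, which is what you would expect for a brand new visitor.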
The information coming out of the visitor source detection (see Day 3 — Identify User Source) looks like this
I'm not sure whether Segment supports nested properties; I tried it in the past and didn't get it to work. A quick Google search taught me that it’s better to flatten the properties. As we’re sending these properties for both first and last touch, the end result will look something like this
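A minimal sketch of the flattening step, assuming the source_first_ and source_last_ prefixes used by the identify endpoint. The field names in the example (type, referrer.host) are hypothetical and only there to show the key naming.

```javascript
// Sketch only: flatten a nested object into prefix_key_subkey properties.
function flatten(obj, prefix) {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = `${prefix}_${key}`;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      // Recurse into nested objects, extending the prefix.
      Object.assign(out, flatten(value, name));
    } else {
      out[name] = value;
    }
  }
  return out;
}

// Combine first-touch and last-touch data into one flat traits object.
function buildTraits(first, last) {
  return {
    ...flatten(first, 'source_first'),
    ...flatten(last, 'source_last'),
  };
}
```

For example, `buildTraits({ type: 'search', referrer: { host: 'google.com' } }, { type: 'direct' })` yields the keys source_first_type, source_first_referrer_host and source_last_type.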
This is the file that does most of the heavy lifting. Not too proud about this file but I told you I’m not the best JS programmer. It works though 😊
I have refactored the codebase a bit to use dotenv in combination with serverless. Here are some of the changes:
Create an .env file
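Once the .env file exists, the handlers can read its values from process.env. A rough sketch, assuming the dotenv package and two hypothetical variable names (SEGMENT_WRITE_KEY and SEGMENT_EVENT_NAME); your own file may use different names.

```javascript
// Sketch only: load .env into process.env when dotenv is available.
// Inside Lambda the variables usually come from the function configuration instead.
try {
  require('dotenv').config();
} catch (err) {
  // dotenv is not installed; fall back to the real environment.
}

// SEGMENT_WRITE_KEY and SEGMENT_EVENT_NAME are assumed names, not taken from the project.
function loadConfig(env = process.env) {
  const writeKey = env.SEGMENT_WRITE_KEY;
  if (!writeKey) {
    throw new Error('SEGMENT_WRITE_KEY is not set');
  }
  return {
    writeKey,
    // Default to the event name used for the track calls in this article.
    eventName: env.SEGMENT_EVENT_NAME || 'Source Identified',
  };
}
```

Failing early on a missing write key tends to be easier to debug than a rejected request to Segment later on.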
Update your serverless.yml file
Notice how different handlers are now in different serverless.yml files. Create the following file:
To make the segment part work create the following handlers in these 4 files:
A quick sls deploy will result in a few new endpoints:
Let me explain the endpoints:
- POST $ENDPOINT_PROD/events → Segment sends every event payload here and it is stored in DynamoDB
- GET $ENDPOINT_PROD/segment/identify/user/{id} → Fire an identify() call with source_first_* and source_last_* properties
- GET $ENDPOINT_PROD/segment/track/anonymous/{id} → Fire track('Source Identified') calls with flattened visitor detection properties for anonymous users
- GET $ENDPOINT_PROD/segment/track/user/{id} → Fire track('Source Identified') calls with flattened visitor detection properties for identified visitors
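To give an idea of what sits behind the identify endpoint, here is a rough sketch of a handler for an API Gateway proxy event. The loadTraits and sendBatch functions are hypothetical dependencies passed in for illustration (a DynamoDB lookup and a Segment batch call, respectively); this is not the handler from this project.

```javascript
// Sketch only: handler factory for GET /segment/identify/user/{id}.
// `loadTraits(userId)` and `sendBatch(messages)` are hypothetical injected dependencies.
function makeIdentifyHandler({ loadTraits, sendBatch }) {
  return async function handler(event) {
    // API Gateway proxy events expose path parameters on event.pathParameters.
    const userId = event.pathParameters && event.pathParameters.id;
    if (!userId) {
      return { statusCode: 400, body: JSON.stringify({ error: 'missing id' }) };
    }
    const traits = await loadTraits(userId);
    // Await the send so the request completes before the handler returns.
    await sendBatch([{ type: 'identify', userId, traits }]);
    return { statusCode: 200, body: JSON.stringify({ userId, traits }) };
  };
}
```

Injecting the two dependencies keeps the handler easy to exercise locally with stubs before deploying.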
Here is a quick command to update your endpoint so you can run curl/HTTPie commands after deploying or destroying the serverless application
Note that I changed the variable name from BASE_DOMAIN to ENDPOINT_PROD.
Testing User Properties (with attribution data)
Let’s trigger the identify for the user with the new attribution data:
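A small sketch of building that request URL in Node, assuming ENDPOINT_PROD is available as an environment variable and Node 18+ for the global fetch. The triggerUrl helper is made up for illustration.

```javascript
// Sketch only: build the URL for one of the /segment/... trigger endpoints.
// call: 'identify' or 'track'; idType: 'user' or 'anonymous'.
function triggerUrl(base, call, idType, id) {
  const trimmed = base.replace(/\/+$/, ''); // drop trailing slashes
  return `${trimmed}/segment/${call}/${idType}/${encodeURIComponent(id)}`;
}

// Fire the identify() trigger for one user (requires a deployed endpoint).
async function triggerIdentify(userId, base = process.env.ENDPOINT_PROD) {
  const res = await fetch(triggerUrl(base, 'identify', 'user', userId));
  return res.status;
}
```

The same helper covers the track endpoints, for example `triggerUrl(base, 'track', 'anonymous', id)`.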
Watching the segment event debugger you’ll see identify() calls coming in:
If all is well you will see that information appear in any integrations you have linked up. In our case CustomerIO and Mixpanel:
Source Identified Events
For other reports (Funnels, Flow Diagram, …) it will be useful to fire an event whenever the visitor source is detected.
Let’s trigger those calls for the user with the new attribution data:
Watch the Segment debugger and see new events coming in (you can specify the event name in .env)
This is possible because I have imported all old events. Every event carries a historic timestamp property, so Segment records it at the time of the original visit rather than at the time it is sent. More information in Segment's Importing Historic Events documentation.
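A minimal sketch of building such a track message with a historic timestamp. The shape of the stored event (userId or anonymousId plus a timestamp) is an assumption for illustration; the timestamp field itself is the standard Segment field for the time an event occurred.

```javascript
// Sketch only: build a track message that is dated at the original visit.
// `original` is assumed to look like { userId?, anonymousId?, timestamp }.
function sourceIdentifiedEvent(original, properties, eventName = 'Source Identified') {
  const message = {
    type: 'track',
    event: eventName,
    properties,
    // ISO 8601 timestamp of the original visit, not the time of sending.
    timestamp: new Date(original.timestamp).toISOString(),
  };
  // Segment needs either a userId or an anonymousId on every message.
  if (original.userId) {
    message.userId = original.userId;
  } else {
    message.anonymousId = original.anonymousId;
  }
  return message;
}
```

The resulting messages can be collected into an array and sent in one batch request.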
This will make a profile in Mixpanel look like this. Notice both the event information on the left, and the user trait information on the right.
That’s it. Tomorrow we’ll look at how to fire those track and identify calls for the last months' visits and how to report on this newly available information.
Other articles in the series
- 05/07/2021: Day 11 - Sales Attribution
- 03/07/2021: Day 10 - Six months later
- 03/06/2020: Day 9 - Dealing with tracking/ad blockers
- 18/05/2020: Day 8 - Feeding in sales data
- 06/05/2020: Day 7 - Reporting on visitor sources
- 01/05/2020: Day 6 - Feeding source attribution data back to Segment.com
- 27/04/2020: Day 5 - Feed old events
- 24/04/2020: Day 4 - Run in production + API
- 22/04/2020: Day 3 - Cleanup & Identify Visitor Source
- 21/04/2020: Day 2 - Capture segment events
- 20/04/2020: Day 1 - The Masterplan
- 19/04/2020: Solving marketing attribution (using segment)