In online analytics, we usually analyze forward in time: we define the starting event in the user journey and examine the events or actions that followed. The same logic underlies cohort analysis, which identifies behavioral patterns or trends within a predefined period of time, starting from the first user event and analyzing the actions that follow.
Cohort analysis on User Retention or… Churn
Cohort analysis looks at how recently users performed a recurring action, such as a visit or a page view. It is ideal for analyzing user retention or its opposite, churn.
For retention, you would measure the recency of visits to the site or of purchases. For churn, you measure the same thing, but look at when visits, plays or purchases stopped recurring for a specific cohort of users.
All of this is about predictions and trends: you pick the one specific event that the sequence STARTS WITH and analyze forward in time to the FOLLOWED BY events.
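To make the forward direction concrete, here is a minimal Python sketch of a STARTS WITH / FOLLOWED BY cohort. It is an illustration only, not how CoolaData computes cohorts: the function name `forward_cohort`, the `(user_id, event_name, day)` tuple layout and the event names in the usage example are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import date


def forward_cohort(events, starts_with, followed_by):
    """Return {cohort_day: {day_offset: distinct user count}}.

    `events` is a hypothetical list of (user_id, event_name, day) tuples.
    """
    # The first occurrence of the starting event puts a user into a cohort.
    first_start = {}
    for user, name, day in events:
        if name == starts_with and (user not in first_start or day < first_start[user]):
            first_start[user] = day

    # Bucket each following event by the number of days since the start.
    buckets = defaultdict(lambda: defaultdict(set))
    for user, name, day in events:
        start = first_start.get(user)
        if name == followed_by and start is not None and day >= start:
            buckets[start][(day - start).days].add(user)

    return {
        cohort: {offset: len(users) for offset, users in sorted(by_offset.items())}
        for cohort, by_offset in sorted(buckets.items())
    }
```

For example, calling `forward_cohort(events, "signup", "visit")` groups users by the day they signed up and counts how many distinct users came back 0, 1, 2... days later. Users who never performed the starting event are left out.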
But if I need to start the analysis from the last event, like the purchase moment, how can I do that?
Suppose I want to look at all the users who purchased a product and examine how many days before the purchase they saw the promotion ad. Or, in the case of casual gaming, I want to look at users who converted to paying customers and test my assumption that completing the previous levels within the same day led to the conversion. To answer such questions, you need to analyze backwards.
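The promotion ad question can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name `days_before_goal` and the `(user_id, event_name, day)` tuple layout are hypothetical, each user is anchored on their first goal event, and only the most recent prior event on or before that day is counted.

```python
from collections import Counter
from datetime import date


def days_before_goal(events, goal, prior):
    """Map day gap -> number of users, looking back from each user's first goal event.

    `events` is a hypothetical list of (user_id, event_name, day) tuples.
    """
    # Anchor every user on the day of their first goal event (e.g. a purchase).
    goal_day = {}
    for user, name, day in events:
        if name == goal and (user not in goal_day or day < goal_day[user]):
            goal_day[user] = day

    # Keep the most recent prior event (e.g. an ad view) on or before the goal day.
    last_prior = {}
    for user, name, day in events:
        end = goal_day.get(user)
        if name == prior and end is not None and day <= end:
            if user not in last_prior or day > last_prior[user]:
                last_prior[user] = day

    return Counter((goal_day[user] - day).days for user, day in last_prior.items())
```

Users who never reached the goal, or who only saw the ad after purchasing, do not appear in the result, which is exactly what distinguishes this from a forward analysis.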
Analyzing backwards from the winning goal
As a behavioral analyst who uses cohorts often, I repeatedly find myself needing to flip the analysis backwards: to look at the sequence from the point where the final action occurred, and then identify the patterns of behavior that preceded it. Think of an event like a city marathon. To analyze the success of the runners who completed the marathon, you must start with the ending event, crossing the finish line, and look back at the events that preceded it to find patterns in the runners' behavior. You might look at when they joined the running club or how persistent they were in attending it. You could cohort the runners by the running club they belong to, their age or gender, but the idea is to start analyzing from the winning goal, backwards.
Two extensions were added to our agile CQL query language to enable the reverse cohort. The query starts with the ENDING event and then defines the event that PRECEDED it.
This is what a reverse cohort query (in this case analyzing GitHub behavior) looks like:
SELECT cohort_name, cohort_id, cohort_size, bucket_id, COUNT(DISTINCT user_id)
FROM cooladata
WHERE date_range(last 7 days)
CLUSTER COHORT BY 1 DAYS EACH
ENDING WITH event_name = 'PushEvent'
PRECEDED BY event_name = 'PullRequestEvent'
BUCKET BY 1 DAYS ALL
GROUP BY cohort_name, cohort_id, cohort_size, bucket_id
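For readers who do not use CQL, the following Python sketch shows one plausible reading of what such a query returns. It is not CoolaData's implementation, and several choices are assumptions: the function name `reverse_cohort`, the `(user_id, event_name, day)` tuple layout, anchoring each user on their last ending event, and the `max_days` look-back window.

```python
from collections import defaultdict
from datetime import date


def reverse_cohort(events, ending_with, preceded_by, max_days=7):
    """Return rows of (cohort_day, cohort_size, bucket_id, distinct_users).

    `events` is a hypothetical list of (user_id, event_name, day) tuples.
    """
    # Assumption: a user belongs to the cohort of the day of their last ending event.
    end_day = {}
    for user, name, day in events:
        if name == ending_with and (user not in end_day or day > end_day[user]):
            end_day[user] = day

    cohort_members = defaultdict(set)
    for user, day in end_day.items():
        cohort_members[day].add(user)

    # bucket_id = number of days between the preceding event and the ending event.
    buckets = defaultdict(set)
    for user, name, day in events:
        end = end_day.get(user)
        if name == preceded_by and end is not None:
            gap = (end - day).days
            if 0 <= gap <= max_days:
                buckets[(end, gap)].add(user)

    return [
        (cohort, len(cohort_members[cohort]), gap, len(users))
        for (cohort, gap), users in sorted(buckets.items())
    ]
```

Calling `reverse_cohort(events, "PushEvent", "PullRequestEvent")` yields one row per cohort day and bucket. Dividing the distinct user count by the cohort size gives the share of the cohort that performed the preceding event that many days earlier.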
Reverse Cohort Visualization
A reverse cohort visualization looks just like a regular cohort visualization, but follows an inverted logic.
For example, in this case, taken from an online trading site, there was a mass of withdrawals on a specific day, and we wanted to examine whether the reason was deals that were closed in the days before. As the cohort shows, the number of withdrawals is much higher than the number of users who closed deals on the preceding days. This cohort urged the trading business to look for other reasons for the withdrawal panic.
Analyzing the events that led to a purchase
eCommerce companies often want to start from the purchase event and analyze backwards the events that led to it. Take this example from a home appliances eCommerce site that sells seven different brands of washing machines. In this reverse cohort analysis, they wanted to examine the correlation between product page views and the final conversion to a purchase. Column 0 shows the users who viewed Brand A product pages on the same day as the purchase; for 86% of them, it led to a purchase. With Brand E, on the other hand, which had triple the number of views on the day before, only 12% ended up purchasing. The reverse cohort analysis shows that Brand E is an attractive product that gains a lot of views but does not convert as well as Brand A.
Answering such business questions, which start from the final goal and go backwards to previous events, is only possible with reverse cohort analysis.
Try it yourself, with the CoolaData advanced behavioral analytics solution.