Things to consider for Fractional Revenue Attribution

Revenue attribution is the hottest topic these days. The proliferation of online media requires reshuffling marketing spend across many more spend categories. Traditional funnel-engineering work is useful, but it is static and does not address a few key issues:

1) The transient nature of marketing spend effectiveness, which comes and goes with changing keywords, banners, and offers

2) It does not address the problem in a customer-centric manner (in fact, orders are placed by customers who clicked on a keyword or received a catalog)

The new marketing spend effectiveness paradigm involves understanding the causal relationship between marketing and sales at the transaction level, using statistical methods to attribute revenue fractionally. There are five elements at play:

  1. Order of events: what sequencing (order) of actions leads to sales transactions
  2. Combined effects: what the joint effects of marketing touches are
  3. Frequency: how many touches are required to convert a prospect into a buyer
  4. Time decay: how the effect of marketing on sales decays as time passes
  5. Effectiveness: the relative efficacy of each vehicle is different (e.g., a banner view does not have the same effectiveness as a 52-page catalog)

How this problem is expressed in mathematical terms, and how it is solved, is quite sophisticated, and I cannot get into it here since this is our core IP at Agilone.
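To make the general idea concrete without touching the proprietary formulation, here is a minimal, generic sketch of time-decayed fractional attribution. It only illustrates elements 4 and 5 above; the vehicle weights, the 14-day half-life, and the `Touch` structure are assumptions made up for illustration, not the Agilone model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List

# Hypothetical per-vehicle effectiveness weights (element 5); values are illustrative only.
VEHICLE_WEIGHT = {"banner_view": 0.2, "email_click": 0.6, "paid_search": 0.8, "catalog": 1.0}

HALF_LIFE_DAYS = 14.0  # assumed exponential time-decay half-life (element 4)


@dataclass
class Touch:
    vehicle: str          # e.g. "banner_view"
    timestamp: datetime   # when the customer was touched


def fractional_attribution(touches: List[Touch], order_ts: datetime) -> Dict[str, float]:
    """Split one order's revenue credit across the marketing touches that preceded it.

    Each touch gets a raw score = vehicle weight * exponential time decay,
    and the scores are normalized so the fractions sum to 1.
    """
    scored = []
    for t in touches:
        age_days = (order_ts - t.timestamp).total_seconds() / 86400.0
        if age_days < 0:
            continue  # ignore touches that happened after the order
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
        scored.append((t.vehicle, VEHICLE_WEIGHT.get(t.vehicle, 0.1) * decay))

    total = sum(score for _, score in scored)
    if total == 0:
        return {}

    fractions: Dict[str, float] = {}
    for vehicle, score in scored:
        fractions[vehicle] = fractions.get(vehicle, 0.0) + score / total
    return fractions
```

With these assumed weights and half-life, a paid-search click the day before an order takes most of the credit away from a catalog mailed a month earlier, which is the kind of behavior the time-decay and effectiveness elements are meant to capture.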

Once an attribution can be made, the next issue is how to measure the effects of overspending, which I will get into in the next post. The inherent problem in fractional attribution is accounting for the fact that increasing marketing spend on one vehicle will most likely reduce the measured effectiveness of the other existing spend elements, even when there is no true causal effect.

Big Data question: what to save for how long?

As tools in the big data world emerge and mature, the question is how much of the data to save in high versus low resolution. The answer depends on the uses of this data. Recently, I had lunch with someone from Yahoo who was doing modeling on full-resolution data and claimed that you need big-data tools (Hadoop, Mahout) to build predictive models.

The problem of predictive algorithms requiring more data only arises if the number of predictive independent variables is large. A higher number of variables requires larger datasets to train classification models (see the curse of dimensionality, coined by Richard Bellman, the godfather of dynamic programming). In any case, big data tooling gives us 1-2 orders of magnitude more processing power, which only allows for a few more variables, since the volume of data required increases exponentially with each new variable.
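A back-of-the-envelope sketch of that exponential relationship (the 10 bins per variable and 30 observations per cell are assumptions chosen purely for illustration):

```python
# Curse-of-dimensionality arithmetic: if each predictive variable is discretized
# into `bins` levels and we want `samples_per_cell` observations per cell on
# average, the required dataset grows exponentially with the number of variables.
def required_rows(num_variables: int, bins: int = 10, samples_per_cell: int = 30) -> int:
    return samples_per_cell * bins ** num_variables


for d in (3, 5, 7):
    print(d, "variables ->", f"{required_rows(d):,}", "rows")
# 3 variables -> 30,000 rows
# 5 variables -> 3,000,000 rows
# 7 variables -> 300,000,000 rows
```

Under these assumptions, two extra variables consume the entire two orders of magnitude of extra capacity that big-data tooling buys you.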

Perhaps the more important questions to ask are why we need the data and how much of it we need for what we are trying to do. We provide marketing analytics to our clients, so our focus is marketing. In the case of mining web analytics logs, there are four main uses:

  1. Revenue Attribution
  2. Modeling
  3. Triggering marketing actions
  4. Building temporal statistics on customer actions

These four uses differ along two dimensions in how data must be saved:

  1. Length of time (retention)
  2. Resolution

Here is a simple depiction of the uses by resolution and data retention.
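As a purely hypothetical sketch of what such a mapping could look like in code, the retention windows and resolutions below are placeholder assumptions, not recommendations:

```python
# Hypothetical retention policy for web analytics logs, organized along the two
# dimensions above. All windows and resolutions are placeholder assumptions.
RETENTION_POLICY = {
    "revenue_attribution":  {"resolution": "event-level",        "retention_days": 90},
    "modeling":             {"resolution": "sampled/aggregated", "retention_days": 365},
    "trigger_marketing":    {"resolution": "event-level",        "retention_days": 30},
    "temporal_statistics":  {"resolution": "daily rollups",      "retention_days": 730},
}
```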

Determining how much to keep after the initial 90 days or so depends on the modeling uses. If the models being built have a natural 3-4% response rate, you need to keep data amounting to approximately double that rate, so that negative-outcome events are properly represented (in effect, you are oversampling the success events). This level of data retention is enough for most propensity and event modeling exercises, since the resulting dataset is still quite large.
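A minimal sketch of that sampling step, assuming the events live in a pandas DataFrame with a binary `responded` column; the column name and the default negative-to-positive ratio are assumptions for illustration:

```python
import pandas as pd


def build_training_sample(events: pd.DataFrame,
                          neg_to_pos_ratio: float = 1.0,
                          seed: int = 42) -> pd.DataFrame:
    """Keep every rare success event plus a capped sample of negatives.

    With a natural 3-4% response rate, keeping roughly as many negatives as
    positives retains about double that fraction of the data and effectively
    oversamples the success events while still representing negative outcomes.
    """
    positives = events[events["responded"] == 1]
    negatives = events[events["responded"] == 0]
    n_keep = min(len(negatives), int(len(positives) * neg_to_pos_ratio))
    negatives_sample = negatives.sample(n=n_keep, random_state=seed)
    # Shuffle the combined sample so downstream model training sees a mixed order.
    return pd.concat([positives, negatives_sample]).sample(frac=1.0, random_state=seed)
```

If calibrated probabilities are needed downstream, the model scores would of course have to be corrected for this sampling ratio afterwards.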