Data mining made easy with DMF
Data Mining has a certain reputation...
Data mining has the reputation of not being for the faint-hearted. If you don't have a university degree in statistics, then don't even think about it. Also, data mining software seems to fall into two categories: good but expensive, or cheap but unreliable and unsupported.
I was looking for a data mining tool which was exceptionally easy to use, very robust and in the low/medium cost bracket.
Why Data Mining?
In IBM's excellent free 550-page Redbook, Dynamic Warehousing: Data Mining Made Easy, the authors suggest a short list of typical questions that businesses are asking today and that a data mining approach could help answer:
1. What do my customers look like?
2. Which customers should I target in a promotion?
3. Which products should I use for the promotion?
4. How should I lay out my new stores?
5. Which products should I replenish in anticipation of a promotion?
6. Which of my customers are most likely to churn?
7. How can I improve customer loyalty?
8. What is the most likely item that a customer will purchase next?
9. Who is most likely to have another heart attack?
10. What is the likelihood of a part failure?
11. When one part fails, what other part(s) are most likely to fail soon?
12. How can I identify high-potential prospects (lead generation)?
So data mining has become increasingly relevant in today's harsh economic climate as enterprises seek every extra bit of competitive advantage to differentiate themselves from their competition.
Enter stage left ... DMF
DMF (Data Mining Fox) is developed by a German start-up company, EasyDataMining. I spoke with the founder and lead developer, Erich Steiner, who told me that it was his ambition to create a data mining system which did not require a PhD in mathematics (which, incidentally, Erich has)!
Normally data mining requires you to define a number of settings and choose which algorithm you think would be appropriate for the specific task. DMF is different in that it removes all this and attempts to pick the right settings and algorithms for the task at hand automatically, without you having to tell it.
DMF in a Nutshell
So what does DMF actually allow you to do?
Let's say you have an Excel spreadsheet extracted from your live data. The screenshot below shows how DMF allows you to pick a column in your data known as the dependent variable. For example, Customer Status with the values "Live" or "Cancelled".
You can then identify any number of other columns, e.g. Customer Account Balance, Customer Interest Rate and Customer Type which you think might have some kind of relationship with the dependent column. These are the independent variables.
Using the most appropriate statistical methods, DMF will automatically prepare a report showing to what extent a relationship exists between any combination of the independent variables and the dependent variable, and how reliable these relationships would be for predicting future behavior.
In other words with DMF you can use existing data to uncover relationships which can be used to predict the future.
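To make the idea of dependent and independent variables concrete, here is a minimal Python sketch. It is not DMF's method (the article does not describe the algorithms DMF selects); it simply tabulates how often the dependent variable takes the value "Cancelled" for each value of an independent variable. The rows, the column names and the rate_by_value helper are all hypothetical, invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for a spreadsheet export (not real data).
rows = [
    {"status": "Cancelled", "customer_type": "Retail", "balance_band": "Low"},
    {"status": "Cancelled", "customer_type": "Retail", "balance_band": "Low"},
    {"status": "Live", "customer_type": "Retail", "balance_band": "High"},
    {"status": "Live", "customer_type": "Business", "balance_band": "High"},
    {"status": "Live", "customer_type": "Business", "balance_band": "High"},
    {"status": "Cancelled", "customer_type": "Business", "balance_band": "Low"},
    {"status": "Live", "customer_type": "Retail", "balance_band": "High"},
    {"status": "Live", "customer_type": "Business", "balance_band": "Low"},
]


def rate_by_value(rows, independent, dependent="status", positive="Cancelled"):
    """Return {value: (share of rows with the positive outcome, row count)}."""
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for row in rows:
        tally = counts[row[independent]]
        tally[0] += row[dependent] == positive
        tally[1] += 1
    return {value: (pos / total, total) for value, (pos, total) in counts.items()}


# Baseline: the cancellation rate across all rows.
overall = sum(r["status"] == "Cancelled" for r in rows) / len(rows)
print(f"overall cancellation rate: {overall:.2f}")

# Compare each independent variable's values against that baseline.
for column in ("customer_type", "balance_band"):
    for value, (rate, n) in sorted(rate_by_value(rows, column).items()):
        print(f"{column}={value}: {rate:.2f} (n={n})")
```

A value whose rate sits well above or below the overall rate hints at a relationship worth investigating. A real tool would also need to test whether such differences are statistically reliable, which this sketch does not attempt.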
Example DMF Projects
The DMF website includes a full range of case studies.
I asked Erich to talk me through two customer projects where DMF has been particularly successful:
Example 1: Online Visitor Buying Prediction
An online services company used DMF to estimate the probability of web visitors buying its premium product, based on an analysis of a number of independent variables. Before using DMF, the company only had a broad indication of the overall probability. With DMF, it was able to identify much more specific probabilities, including one particular combination of visitor variables for which the probability of the visitor buying was as high as 60%! This enabled the company to focus its advertising efforts on the right group of visitors and to adjust its pricing depending on the buying probability.
Example 2: Retail Customer Churn Prevention
A major retail company used DMF to predict which customers would churn over the next two years unless the company intervened. They had access to two-year-old customer data which, taken together with current customer data, was used to train the DMF algorithm on which customer characteristics are linked to churn. They were then able to apply the algorithm looking forward with a good degree of confidence in the results. DMF found customer segments with a churn probability of over 60%, whereas the overall churn probability was about 9%. Current customers with a high churn probability could then be targeted with special offers to prevent them from churning.
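The general shape of this kind of project (learn from historical outcomes, then score today's customers) can be sketched in a few lines of Python. This is only a frequency-table illustration under stated assumptions, not the algorithm DMF uses. The segment attributes (contract, tenure band), the customer ids, the thresholds and both helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical history: (contract, tenure_band, churned_within_two_years).
history = (
    [("monthly", "new", True)] * 3 + [("monthly", "new", False)]
    + [("annual", "long", True)] + [("annual", "long", False)] * 5
    + [("monthly", "long", True), ("monthly", "long", False)]
)


def fit_segment_rates(history, min_size=3):
    """Estimate the churn rate per segment, skipping segments with too few customers."""
    totals, churned = Counter(), Counter()
    for contract, tenure, did_churn in history:
        totals[(contract, tenure)] += 1
        churned[(contract, tenure)] += did_churn
    return {seg: churned[seg] / n for seg, n in totals.items() if n >= min_size}


def flag_at_risk(customers, rates, threshold=0.5):
    """Return ids of current customers whose segment churn rate meets the threshold."""
    return [
        cid
        for cid, contract, tenure in customers
        if rates.get((contract, tenure), 0.0) >= threshold
    ]


# Learn segment rates from the past, then apply them to today's customers.
rates = fit_segment_rates(history)
current = [
    ("C1", "monthly", "new"),
    ("C2", "annual", "long"),
    ("C3", "monthly", "long"),  # segment too small in history, so never flagged
]
print(flag_at_risk(current, rates))
```

The min_size guard reflects the concern about reliability: a segment with only a handful of historical customers gives a churn rate that is too noisy to act on.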
So what really differentiates DMF?
I asked Erich what else differentiates DMF, in addition to it being novice-friendly, robust and competitively priced.
"The thing that customers love about DMF is that once you have a good dataset in Excel format it is amazingly quick for DMF to start identifying relationships in the data and making predictions. For example, the initial work to identify the factors making customers purchase online or making retail customers churn only took a couple of days. Obviously moving forward and fully integrating with live customer data is more involved, but finding the initial relationships is amazingly fast."
DMF comes in a fully functional free version which is limited to 300 records. The full version is currently 1000 Euro, but this is under review, along with the possibility of removing the 300 record restriction from the free version. A copy of the free version can be downloaded from here.
Conclusions about DMF
DMF is a unique product from an impressive German start-up company which does what it says on the tin. With very little effort or mathematical expertise DMF allows you to easily extract valuable predictions about future customer (and other) behaviors in time to do something about them!
It's definitely worth checking out DMF if you have a requirement for data mining but you don't have the skills or the time to invest in understanding the science behind the discipline. The company also offers the technical expertise and facilities to integrate DMF functionality inside customer application software in order to automate data mining tasks on a real-time basis.
It's also worth getting in touch if you are a consulting organisation working in data-rich environments such as retail, banking and telecoms, as DMF are now entering discussions with potential international delivery partners.
For more information about DMF
To find out more about DMF visit the website at http://www.easydatamining.com/
A 28-page PDF mini user guide, which introduces the full DMF functionality, is also available to download.
About Ken Thompson
Ken Thompson delivers keynote conference speeches, workshop facilitation and in-house consultancy in four key business areas:
- Creating High Performing Teams in enterprises including Virtual and Mobile Teams (based on the Bioteams Book)
- Establishing effective Collaborative Business Networks enabling companies to co-operate effectively in areas such as sales and product development (based on the book - The Networked Enterprise)
- How to use the latest social media technologies, including blogging and online communities, to promote an enterprise, brand, organisation or event
- Development of graphical on-line interactive Business Dashboards and What-if Simulators for organisations to support Performance Improvement, Strategy Development and Executive Team Development.