

How to Buy Data Mining

This article introduces the applications, risks, rewards, capabilities and limitations of data mining. Readers are presented with a framework for avoiding costly project pitfalls in predictive analytics and for making the most of the valuable information hidden within their existing transactional data.
Written Mar 10, 2009.

 

Data Mining?!  I'll Take Three!
How does someone purchase an intangible, cryptic, seemingly immeasurable technology? Beyond the inherent up-front risks of engaging in what is essentially a discovery process, just identifying a starting point can be intimidating and mystifying. Despite its elusive nature, data mining technology has outgrown the flash-in-the-pan "miracle tool" stigma, with widespread and sustained success stories highlighted in mainstream publications, along with recurring case studies of improved operational efficiencies, enhanced business intelligence and residual payback. For any organization with annual revenues of more than $50 million, employing data mining technology is not a matter of whether, but when.

Data mining has been seeping into mainstream business applications for more than two decades. Numerous case studies can be found with a simple Internet or publication search. Its progress is unstoppable, propelled by sustained value justifications, yet stunted by the complexities of development, interpretation, integration and adoption. This article will suggest how to properly approach the starting line and how to implement a purposefully flexible framework for establishing an efficient and effective organizational data mining process.

Cutting Through The Buzz
Let's first make sure that we're on the same page when talking about data mining. It is not wholly incorrect to label data mining as retrospective searches on a large database for specific criteria, otherwise known as online analytical processing (OLAP) or SQL queries. An example of an OLAP or SQL query would be searching a large repository to identify females between the ages of 28 and 45 from New York, New Jersey and Delaware with incomes between $65,000 and $90,000 who purchased blue slacks between July 1 and August 15. For this query, we know the exact question to ask of the database. This practice typically explores just 5 to 15 percent of a large database.
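To make the retrospective case concrete, the blue-slacks question can be sketched as a single SQL query run from Python's built-in sqlite3 module. This is only an illustration: the `purchases` table, its column names, the sample rows and the year 2008 are all assumptions invented for the sketch, not details from any real system.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with a hypothetical purchases table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases ("
        "customer_id INTEGER, gender TEXT, age INTEGER, state TEXT, "
        "income INTEGER, item TEXT, purchase_date TEXT)"
    )
    rows = [
        (1, "F", 34, "NY", 72000, "blue slacks", "2008-07-10"),  # matches
        (2, "F", 50, "NJ", 80000, "blue slacks", "2008-07-20"),  # outside age range
        (3, "M", 30, "DE", 70000, "blue slacks", "2008-08-01"),  # not female
        (4, "F", 29, "DE", 66000, "blue slacks", "2008-08-15"),  # matches (last day)
        (5, "F", 40, "NY", 85000, "red scarf", "2008-07-15"),    # different item
        (6, "F", 31, "NJ", 75000, "blue slacks", "2008-09-01"),  # outside date window
    ]
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def find_matching_customers(conn):
    """Run the retrospective query: every criterion is known in advance."""
    query = """
        SELECT customer_id
        FROM purchases
        WHERE gender = 'F'
          AND age BETWEEN 28 AND 45
          AND state IN ('NY', 'NJ', 'DE')
          AND income BETWEEN 65000 AND 90000
          AND item = 'blue slacks'
          AND purchase_date BETWEEN '2008-07-01' AND '2008-08-15'
        ORDER BY customer_id
    """
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    print(find_matching_customers(build_sample_db()))
```

Every condition in the WHERE clause has to be supplied by the person asking, which is exactly what makes this kind of search retrospective rather than predictive.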

For the purposes of this article, data mining shall refer to computer-aided pattern discovery of previously unknown interrelationships and recurrences across seemingly unrelated attributes in order to predict actions, behaviors and outcomes. Simply put, when referring to data mining in this article, we are looking at prediction derived from information hidden within large volumes of data rather than retrospection drawn from an OLAP or SQL query.
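By contrast, a predictive approach lets the data suggest which attribute matters. The sketch below uses a deliberately tiny "one rule" learner: it checks each attribute of some historical records, keeps the one whose values best predict the outcome, and applies that rule to a new record. The records, attribute names and function names are hypothetical, and real predictive modeling tools are far more sophisticated; this is only meant to show the difference in kind from a fixed query.

```python
from collections import Counter, defaultdict

# Hypothetical historical records; "bought" is the outcome to predict.
HISTORY = [
    {"region": "east", "newsletter": "yes", "bought": "yes"},
    {"region": "east", "newsletter": "no", "bought": "no"},
    {"region": "west", "newsletter": "yes", "bought": "yes"},
    {"region": "west", "newsletter": "no", "bought": "no"},
    {"region": "east", "newsletter": "yes", "bought": "yes"},
    {"region": "west", "newsletter": "no", "bought": "yes"},
]


def train_one_rule(records, target):
    """Return (attribute, rule) for the attribute that best predicts target."""
    best = None
    attributes = [name for name in records[0] if name != target]
    for attr in attributes:
        # Count outcomes for each value of this attribute.
        counts = defaultdict(Counter)
        for rec in records:
            counts[rec[attr]][rec[target]] += 1
        # The rule maps each value to its most common outcome.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for rec in records if rule[rec[attr]] != rec[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best[0], best[1]


def predict(model, record, default=None):
    """Apply the learned rule to a new record."""
    attr, rule = model
    return rule.get(record[attr], default)


if __name__ == "__main__":
    model = train_one_rule(HISTORY, "bought")
    print(model[0])
    print(predict(model, {"region": "west", "newsletter": "yes"}))
```

Nobody told the program that newsletter subscription was the attribute to look at; it was selected because it produced the fewest errors on the historical records.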

To provide context going forward, it is important to recognize and relate much of the popular terminology that is thrown about. Data mining technology is not new. Methods for automating pattern discovery and prediction have existed for decades. Despite a considerable level of hype and strategic misuse, data mining has not only persevered but also matured and adapted for practical use in the business world. How could a community that is so data-rich, yet information-poor and profit-driven, abandon a tool that can validate its own ability to predict customer behavior?

Alongside the technology, terminology has evolved over the last four decades. Names from 40 years ago are still recognizable as common phrases today. In the '70s and '80s, names such as artificial intelligence and machine learning that implied the computer had its own consciousness were somewhat oversold (perhaps even "over-souled"). The names of various data mining building blocks such as neural networks, genetic algorithms and evolutionary computing deservedly carry Darwinist tones of natural selection, as the underlying mathematics emulates biological processes. From a mathematician's perspective, these processes may be viewed as statistics on steroids.

From the '90s through the early '00s, the technology was commonly referred to as data mining and knowledge discovery. However, because the term data mining often refers to both OLAP and pattern discovery, usage is rapidly shifting toward far more descriptive and accurate nomenclature such as predictive modeling and predictive analytics. In fact, this will probably be one of the last articles I write using the label data mining, which is too mainstream to abandon just yet.

Just What is Data Mining?
Is data mining considered a service? Is it hardware? Software? A scored file? A system or a process? A customized solution? There does not seem to be a consensus, which makes data mining all the harder to visualize, define, manage ... and purchase. Two people may discuss data mining and have entirely different concepts in mind. Of course, all of the previously mentioned descriptions are technically correct. While the business community may appropriately view data mining as a productive, value-driven solution, that perspective focuses on the destination, not the journey. If credit were given to the best definition of data mining, process would score the point.

Viewing data mining as a process encompasses all the hard and soft resources, and implies a structured yet ongoing approach to an evolving optimization problem. When viewed as a process, data mining projects may be planned and implemented in a procedural way that all but ensures success. Expectations should be set accordingly: never expect a "final answer," and never anticipate a single pass. When implemented properly, productive results should be expected early and continually improved.

------------------------------------------------------------------------

Copyright for this article is maintained by Information Management with reproduction license extended to The Modeling Agency.  You may view the remainder of the article on-line at Information Management or download a .pdf version:

Jump to the on-line version at Information Management,

or

Download an Adobe Acrobat (".pdf") reprint
 

Learn more about the author, Eric King.


Article tags

  • data mining
  • predictive modeling
  • predictive analytics
  • data mining training
  • predictive analytics consulting
  • crm analytics
  • business intelligence
  • information discovery
