Data Mining with Microsoft ® SQL Server™ 2000
Technical Reference

Our Pages Are Best Viewed At 800 x 600 Resolution

Return to Main Menu

Back One Page

Place Order by Mail

Contact Us

Search

Book Catagories

Professional Computing

   Certification
   Computer
   Science
   Database & ERP
   Internet
   Management
   Information Systems
   Networking
   Operating Systems
   PC Hardware
   Programming
   Security
   Telecommunications
   Video & Audio
   Web Developement

Computer Science
Academic Disciplines

Intro to Computer Science
Introduction to Programming
Data Structures
Algorithms/Advanced Data Structures
Artificial Intelligence
Compilers
Computer-Organization/Architecture
Computer Graphics
Human-Computer Interaction
Database
Internet and World Wide Web
Electronic Commerce
Mathematics for Computer Scientists
Operating Systems
Networking
Programming Languages
Software Engineering
Theory of Computation
Signals and Systems
Miscellaneous


 
 

Author: Claude Seidman 

ISBN: 0-7356-1271-4 
Pages: 384 
Disk: N/A 
Stolin-Softwares Price: $39.99
Release: 05/09/2001 
Level: Int/Adv 

About the Book 

Learn how to turn mountains of raw data into useful information with this guide.

The amount of information stored in corporate databases is exploding exponentially. Data mining—finding meaningful patterns in all that data—can give any organization a competitive advantage. This book is the in-depth reference from Microsoft® for anyone who wants to take full advantage of the powerful data-mining features in SQL Server™ 2000. It examines the SQL Server 2000 Analysis Services architecture and shows how data mining fits into its complete suite of information-extraction technologies. Then it demonstrates how to structure and mine large databases with the algorithms included with SQL Server 2000 to find nuggets of useful information. It even shows how to create a practice data-mining model using data downloaded from a database. Coverage includes:

• INTRODUCTION TO DATA MINING: What data mining is and isn’t, plus important principles and definitions behind data-mining methodologies, including the role of data-mining models, statistics, and algorithms
• SQL SERVER 2000 ARCHITECTURE: How data mining fits into the SQL Server 2000 Analysis Services architecture and how it builds on the SQL Server 2000 relational database and its embedded online analytical processing (OLAP) engine
• DATA-MINING METHODS: How to choose the best data-mining method for the job—decision trees or clustering 
• EASE OF USE FEATURES: How to use the Mining Model Wizard and the OLAP Mining Model Editor to simplify creating, training, and processing a model
• PROGRAMMING THE DATA-MINING SERVICES: How to use data-mining models and Data Transformation Services, PivotTable® Services, decision-support objects (DSO), PERL, Visual Basic®, Scripting Edition, XML, and other tools and languages to work with the data-mining engine

Related Books


Microsoft® SQL Server™ 2000 Resource Kit
Microsoft® SQL Server™ 2000 Administrator's Pocket Consultant
Microsoft® SQL Server™ 2000 Administrator's Companion

Table of Contents


Acknowledgments  xi xi 
Introduction  xiii xiii 
PART I INTRODUCING DATA MINING   
1 Understanding Data Mining   3 
    What Is Data Mining? 3 
    Why Use Data Mining? 4 
    How Data Mining Is Currently Used 6 
    Defining the Terms 7 
    Data Mining Methodology 9 
        Analyzing the Problem 10 
        Extracting and Cleansing the Data 10 
        Validating the Data 10 
        Creating and Training the Model 10 
        Querying the Data Mining Model Data 10 
        Maintaining the Validity of the Data-Mining Model 10 
    Overview of Microsoft Data Mining 11 
        Data Mining vs. OLAP 11 
        Data-Mining Models 11 
        Data-Mining Algorithms 12 
        Using SQL Server Syntax to Data Mine 14 
    Summary 14 
2 Microsoft SQL Server Analysis Services Architecture 15 
    Introduction to OLAP 16 
        MOLAP 18 
        ROLAP 18 
        HOLAP 19 
    Server Architecture 20 
        Data Mining Services Within Analysis Services 20 
    Client Architecture 21 
        PivotTable Service 22 
        OLE DB 23 
        Decision Support Objects (DSO) 24 
        Multidimensional Expressions (MDX) 25 
        Prediction Joins 25 
    Summary 26 
3 Data Storage Models    27 
    Why Data Mining Needs a Data Warehouse 27 
        Maintaining Data Integrity 28 
    Reporting Against OLTP Data Can Be Hazardous to Your Performance 31 
    Data Warehousing Architecture for Data Mining 33 
        Creating the Warehouse from OLTP Data 33 
        Optimizing Data for Mining 36 
        Physical Data Mining Structure 42 
        Three-Tier Architecture 43 
    Relational Data Warehouse 43 
        Advantages of Relational Data Storage 44 
        Building Supporting Tables for Data Mining 45 
    OLAP cubes 46 
        How Data Mining Uses OLAP Structures 46 
        Advantages of OLAP Storage 47 
        When OLAP Is Not Appropriate for Data Mining 49 
    Summary 49 
4 Approaches to Data Mining   51 
    Directed Data Mining 51 
    Undirected Data Mining 52 
        Data Mining vs. Statistics 52 
        Learning from Historical Data 57 
        Predicting the Future 59 
    Training Data-Mining Models 61 
        Evaluating the Models and Avoiding Errors 62 
    Summary 65 
PART II DATA-MINING METHODS   
5 Microsoft Decision Trees   69 
    Creating the Model 69 
        Analysis Manager 70 
    Visualizing the Model 87 
        Dependency Network Browser 94 
        Inside the Decision Tree Algorithm 97 
        How Predictions Are Derived 109 
        Navigating the Tree 109 
        Navigation vs. Rules 112 
        When to Use Decision Trees 113 
    Summary 114 
6 Creating Decision Trees with OLAP    115 
    Creating the Model 115 
        Select Source Type 116 
        Select Source Cube and Data-Mining Technique 116 
        Select Case 118 
        Select Predicted Entity 119 
        Select Training Data 121 
        Select Dimension and Virtual Cube 121 
        Completing the Data-Mining Model 123 
    OLAP Mining Model Editor 125 
        Content Detail Pane 126 
        Structure Panel 126 
        Prediction Tree List 126 
    Analyzing Data with the OLAP Data-Mining Model 126 
        Using the Generated Virtual Cube 128 
        Using the Generated Dimension 129 
    Summary 133 
7 Microsoft Clustering    135 
    The Search for Order 136 
    Looking for Ways to Understand Data 136 
    Clustering as an Undirected Data-Mining Technique 137 
    How Clustering Works 138 
        Overview of the Algorithm 138 
        The K-Means Method Clustering Algorithm 138 
        What Is Being Measured Exactly? 142 
        Clustering Factors 142 
        Measuring "Closeness" 143 
    When to Use Clustering 146 
        Visualize Relationships 146 
        Highlight Anomalies 146 
        Create Samples for Other Data-Mining Efforts 148 
        Weaknesses of Clustering 148 
    Creating a Data-Mining Model Using Clustering 149 
        Select Source Type 150 
        Select the Table or Tables for Your Mining Model 150 
        Select the Data-Mining Technique 151 
        Edit Joins 152 
        Select the Case Key Column for Your Mining Model 152 
        Select the Input and Predictable Columns 152 
    Viewing the Model 154 
        Organization of the Cluster Nodes 154 
        Order of the Cluster Nodes 156 
    Analyzing the Data 156 
    Summary 158 
PART III CREATING DATA–MINING APPLICATIONS WITH CODE   
8 Using Microsoft Data Transformation Services (DTS) 161 
    What Is DTS? 162 
    DTS Tasks 162 
        Transform 162 
        Bulk Insert 163 
        Data Driven Query 163 
        Execute Package 164 
    Connections 167 
        Sources 167 
        Configuring a Connection 168 
    DTS Package Workflow 169 
        DTS Package Steps 169 
        Precedence Constraints 170 
    DTS Designer 171 
        Opening the DTS Designer 171 
        Saving a DTS Package 172 
    dtsrun Utility 174 
    Using DTS to Create a Data-Mining Model 177 
        Preparing the SQL Server Environment 178 
        Creating the Package 182 
    Summary 208 
9 Using Decision Support Objects (DSO)    209 
    Scripting vs. Visual Basic 210 
        The Server Object 211 
        The Database Object 219 
    Creating the Relational Data-Mining Model Using DSO 221 
    Creating the OLAP Data-Mining Model Using DSO 230 
        The DataSource Object 232 
        Data-Mining Model (Decision Support Objects) 233 
    Adding a New Data Source 233 
    Analysis Server Roles 234 
        Data-Mining Model Roles 235 
    Summary 236 
10 Understanding Data-Mining Structures 237 
    The Structure of the Data-Mining Model Case 237 
        Data-Mining Models Look Like Tables 237 
    Using Code to Browse Data-Mining Models 238 
    Using the Schema Rowsets  243 
        MINING_MODELS Schema Rowset 243 
        MINING_COLUMNS Schema Rowset 249 
        MINING_MODEL_CONTENT Schema Rowset 259 
        MINING_SERVICES Schema Rowset 262 
        SERVICE_PARAMETERS Schema Rowset 266 
        MODEL_CONTENT_PMML Schema Rowset 268 
    Summary  269 
11 Data Mining Using PivotTable Service  271 
    Redistributing Components 272 
    Installing and Registering Components 273 
        File Locations  274 
        Installation Registry Settings 275 
        Redistribution Setup Programs 275 
    Connecting to the PivotTable Service  276 
        Connect to Analysis Services Using PivotTable Service 276 
        Connect to Analysis Services Using HTTP 280 
    Building a Local Data-Mining Model 280 
        Storage of Local Mining Models 284 
        SELECT INTO Statement 286 
        INSERT INTO Statement 286 
        OPENROWSET Syntax 287 
        Nested Tables and the SHAPE Statement 289 
    Using XML in Data Mining 290 
        The PMML Standard 290 
    Summary 296 
12 Data-Mining Queries 297 
    Components of a Prediction Query 297 
        The Basic Prediction Query 298 
        Specifying the Test Case Source  298 
        Specifying Columns 300 
        The PREDICTION JOIN Clause 300 
        Using Functions as Columns 304 
        Using Tabular Values as Columns 304 
        The WHERE Clause 306 
        Prediction Functions 307 
        Predict 307 
        PredictProbability 308 
        PredictSupport 308 
        PredictVariance 309 
        PredictStdev 310 
        PredictProbabilityVariance 310 
        PredictProbabilityStdev 310 
        PredictHistogram 310 
        TopCount 313 
        TopSum 313 
        TopPercent 314 
        RangeMin 314 
        RangeMid 314 
        RangeMax 314 
        PredictScore 314 
        PredictNodeId 315 
    Prediction Queries with Clustering Models 315 
        Cluster 315 
        ClusterProbability 316 
        ClusterDistance 316 
    Using DTS to Run Prediction Queries 317 
    Summary 322 
APPENDIX     325 
GLOSSARY   349 
INDEX   359 



Have a special request? Send inquires to Customer Service

 

 Business Software | Operating Systems & Servers | Development Tools | Internet Technologies |  Home Productivity
Reference Software | Microsoft Press | Home Page