Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Setting Up a Data Source Name

The key to using ODBC is to know (or to set up) the data source name (DSN) for your databases. Setting up DSNs is outside the scope of Rattle, being a configuration task performed through your operating system. Under GNU/Linux, for example, using the unixodbc package, the ODBC drivers are typically registered in the file /etc/odbcinst.ini and the system DSNs are defined in /etc/odbc.ini. Under MS/Windows the control panel provides access to a DSN tool.
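To make the idea of a DSN concrete, the following minimal sketch shows what an entry in /etc/odbc.ini might look like and how the DSN names can be read from it. It is written in Python purely for illustration and is not part of Rattle. The DSN name (warehouse), the driver name and the connection keys are hypothetical; the keys actually required depend on the ODBC driver in use.

```python
import configparser

# A hypothetical odbc.ini fragment. The DSN name, driver name and keys are
# illustrative only; real entries depend on the installed ODBC driver.
SAMPLE_ODBC_INI = """\
[ODBC Data Sources]
warehouse = Hypothetical warehouse DSN

[warehouse]
Driver = ExampleDriver
Servername = dbhost.example.com
Database = sales
"""


def list_dsns(ini_text):
    """Return the DSN names defined in the text of an odbc.ini style file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # Every section other than the optional index section names one DSN.
    return [name for name in parser.sections() if name != "ODBC Data Sources"]


print(list_dsns(SAMPLE_ODBC_INI))
```

In this sketch the name warehouse is what would be typed into Rattle's DSN text entry.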

Within Rattle we specify a known DSN by typing the name into the text entry. Once that is done, we press the Enter key and Rattle will attempt to connect. This may require a username and password to be supplied. For a Teradata Warehouse connection you will be presented with a dialog box.

Figure 4.6: Netezza ODBC connection
If the connection is successful we will find a list of available tables in the Table combobox.

We can choose a Table, and also include a limit on the number of rows that we wish to load into Rattle. This allows us to get a smaller sample of the data for testing purposes before loading a large dataset. If the Row Limit is set to 0 then all of the rows from the table are retrieved. Unfortunately there is no SQL standard for limiting the number of rows returned from a query. For the Teradata and Netezza warehouses the SQL keyword is LIMIT and this is what is used by Rattle.
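The effect of the Row Limit on the query sent to the database can be sketched as follows. This is an illustration in Python under stated assumptions, not Rattle's actual code: the function name build_query and the table name customers are hypothetical, and the table name is inserted into the query without any quoting or validation.

```python
def build_query(table, row_limit=0):
    """Build a SELECT statement, appending LIMIT only for a positive limit."""
    if row_limit < 0:
        raise ValueError("row_limit must be 0 or a positive integer")
    query = f"SELECT * FROM {table}"
    # A row limit of 0 means no restriction: all rows are retrieved.
    if row_limit > 0:
        query += f" LIMIT {int(row_limit)}"
    return query


print(build_query("customers", 1000))  # SELECT * FROM customers LIMIT 1000
print(build_query("customers"))        # SELECT * FROM customers
```

A database that uses a different keyword for limiting rows would need a different final clause, which is why the choice of LIMIT matters here.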

Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010