MySQL User Guide Part VI Tutorials

Back to MySQL User Guide

Contents

[edit] Part Part VI. Tutorials

[edit] Other MySQL Utilities

[edit] Using <mysqlutility>

[edit] Migrating a Spreadsheet to MySQL *Complete*

[edit] Introduction

The most common "database" format, especially for small- to medium-sized businesses, is the spreadsheet. The reason for this is fairly obvious -- no special skills are required either for design or for data entry. Not only that, a spreadsheet may well be the best format for presenting and maintaining some kinds of information. If the file is not complicated, it's easy to get a quick overview of the data and sorting on a specific field is usually just a matter of clicking a column heading.

However, as the volume or complexity of information increases, this format becomes more and more cumbersome. Information becomes more difficult to retrieve and you run into the kinds of problems usually associated with flat-table databases -- data duplication, for example.

This chapter deals with migrating a spreadsheet to a MySQL database. The solution presented here is operating system (OS) neutral; it works on Mac, Windows, or any Unix-like OS.

Excel is probably the most commonly used spreadsheet format but the procedure described here applies to any spreadsheet. The only requirement is that the spreadsheet data be exported as a text file so that it can be imported into MySQL.

To help facilitate things Query Browser, one of the open source MySQL GUI Tools, will be used. Creating database objects is made especially easy using the Table Editor, a feature of the Query Browser also common to other GUI Tools. By pointing and clicking you can quickly build a table without knowing anything about data definition language (DDL). Not only will the table editor help you work more quickly, it's also a good way to learn MySQL's implementation of SQL. Any alterations made to a table using the graphical interface are shown as SQL statements, making it easy to learn the appropriate SQL commands. We'll take advantage of this feature to document as we go.

The example spreadsheet that we'll be importing contains information about the accreditations of members of a professional association. It's not complicated so the process should be fairly easy to follow but at the same time it does highlight the major issues you might encounter and provides general guidelines for importing spreadsheets into MySQL.

The steps we'll take are as follows:

  1. Export the spreadsheet to a text file
  2. Import this file wholesale into a temporary table
  3. Create and populate permanent tables
  4. Use the mysqldump utility to export the tables and data
  5. Upload these tables and data to a production server

[edit] The Spreadsheet File

We want to import the data directly into a table that mirrors the structure of the spreadsheet. The sample spreadsheet has the following fields with maximum required lengths as shown:

If you wish to follow along, create a spreadsheet with these fields and enter some sample data as shown in the table below.

Table 30.1. Spreadsheet format

firstname lastname certification expirydate street1 street2 city state zip certnum
John Doe RAC 10-Jan-08 10 Mulberry St New York NY 30263 C-12345
John Doe ARM-1 28-Feb-09 10 Mulberry St New York NY 30263 A-44456
Jane Doe DAC 10-Dec-09 10 Mulberry St New York NY 30263 D-4567
Bob Smith RAC 02-Jan-07 10 Main St Apt 10 Detroit MI 20789 C-6785

The only requirements are:

If you don't have a spreadsheet program at hand, you can create a tab-separated or comma-separated text file to match the structure defined above. Of course, if you do this, you won't need to export the data.

[edit] Converting a Spreadsheet to a Text File

Before attempting to convert any spreadsheet, it is best to review the data for consistency. For example, make sure that every row has the same number of columns and check that all dates are formatted in the same way. This may save major headaches by helping to spot errors early.

How you convert a spreadsheet to a text file may depend upon the OS you are using. If you are working under Windows with an Excel spreadsheet, open the spreadsheet in Excel and choose the Save As menu option under the File menu. From the Save as type list box choose the Text(Tab delimited) option.

Under Unix, Windows, or Mac you can use OpenOffice Calc to open a variety of spreadsheet formats, including an Excel spreadsheet. Export the spreadsheet from the Calc application by choosing the File and Save as menu options. Next choose Text CSV from the Filter list box. This opens a dialog box for further refining your choice. Choose the character set Unicode or a platform-specific format, if appropriate. As a field delimiter choose the {Tab} option and no text delimiter at all. The drop-down list box only offers single or double quotation marks as alternatives; to choose no delimiter simply delete the quotation mark.

If you are using an application that won't let you save the spreadsheet in tab-separated format, a Google spreadsheet for example, then simply save the file as a CSV file. Tab-separated text files are the easiest to import into MySQL but importing a CSV file is almost as simple. Before saving the file, review the contents first and ensure that no commas appear anywhere in the data. A stray comma can cause data corruption or complete failure when importing data in CSV format.

Save the file as data.tsv (or as data.csv if the format is comma-separated), have a look at the exported data in a text editor. Each record should appear on a separate line. Don't be concerned if each line is not a uniform length. Many programs will export the column headings as the first row of the text file. Delete this row and resave the file, making sure that you save it as a text file and don't introduce any formatting.

[edit] Creating a Table with Query Browser

Creating a table to match the fields as described in [excel-mysql.html#spreadsheet-file Section 30.2, "The Spreadsheet File"] is a fairly straightforward matter. For importing the data our principal concern is to get the right information in the right fields without truncating data. By treating all fields as VARCHAR we can keep things simple and only need to worry about the order of the fields and their length.

As promised we'll use the MySQL Query Browser until we're ready to create a database dump. Query Browser is a fairly intuitive tool but for a quick overview find the documentation online at http://dev.mysql.com/doc/.

Start up Query Browser and enter your credentials and the server hostname and port -- we haven't created a database yet so don't worry about the Default Schema text box. When the application opens, you'll find a list of schemata (databases) on the right. The cursor should be active in the text area at the top of the screen. This text area is used for entering queries, which are executed using the Execute button on the right. If a result set is returned, it shows in the main area in the center of the screen.

The first thing to do is create a database. Make sure that the Schemata tab on the right is selected, right click anywhere in this window, and choose the Create Schema option from the pop-up menu. Name the database association. To refresh the databases shown in the Schemata window, right click in this window and choose the refresh menu option. Next open a script window -- we'll use this window as a scratch pad to save copies of the queries we create. Open a script tab by choosing the New Script Tab option from the File menu. After doing this two tabs, one labeled Resultset1 and the other New Script, should be visible on the left below the tool bar.

To create a table, right click the association database in the Schemata panel and choose Create Table from the pop-up menu.

This opens the table editor, in the default view with the Columns and Indices tab active. Enter the name alldata in the text box at the top of the table editor. Refer to the values shown in [excel-mysql.html#spreadsheet-file Section 30.2, "The Spreadsheet File"], enter a name for each column, choose VARCHAR as the data type, and specify a field length. You needn't worry about making any other changes at this point. After all, the alldata table is only temporary.

When you are finished, use the Apply Changes button. This button opens a dialog box showing the SQL code that will execute. Before executing this code, copy it and paste it into the script window. The code should look something like this:

CREATE TABLE `alldata` (
  `lastname` VARCHAR(50) NOT NULL,
  `firstname` VARCHAR(50) NOT NULL,
  `certification` VARCHAR(10) NOT NULL,
  `expirydate` VARCHAR(10) NOT NULL,
  `streetaddress1` VARCHAR(50) NOT NULL,
  `streetaddress2` VARCHAR(50) NOT NULL,
  `city` VARCHAR(50) NOT NULL,
  `state` VARCHAR(2) NOT NULL,
  `zipcode` VARCHAR(10) NOT NULL,
  `certificationnumber` VARCHAR(10) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Right click the association database in the Schemata pane and choose the Refresh Schemata option. The newly created table should appear beneath the association database, ready for imported data.

[edit] Loading the Data into a MySQL Database Table

To import the spreadsheet data we'll use the LOAD DATA INFILE syntax. Security considerations can sometimes make this a frustrating exercise, so as we go, we'll try to anticipate any problems that may arise.

Click on the Resultset1 tab and enter the following statement into the query text box (using a path appropriate to your circumstances):

LOAD DATA INFILE "/home/peter/Documents/spreadsheet/data.tsv"
  INTO TABLE alldata;

Windows pathnames are also specified using forward slashes rather than backslashes. If you do use backslashes, you must double them.

To import a comma separated file on the Windows platform use the following syntax:

LOAD DATA INFILE "C:/Documents and Settings/peter/My Documents/spreadsheet/data.csv"
  INTO TABLE alldata
  FIELDS TERMINATED BY ",";

The default field terminator is a tab character so if you use a different terminator you must specify it as shown in the preceding statement.

There are other possible pitfalls when executing a LOAD DATA INFILE statement. The rules for using a relative path are a bit tricky so always specify the complete path to the file. Also, a data file must be readable by all. This is usually not an issue under Windows; on Unix operating systems, if you need to adjust the file permissions, you can readily do this using the GUI. To make a file world-readable from the command prompt type:

shell> chmod 755 data.tsv

Finally, the user who is executing the LOAD DATA statement must have the FILE privilege. If you need to grant this privilege, log in as root and execute the command:

GRANT FILE ON *.*
  TO user@hostname
  IDENTIFIED BY password;

You can do this from the command line or from within Query Browser.

Note: The FILE privilege is a global privilege and cannot be restricted to a specific database.

So far so good, but the syntax shown to this point only works if the text file is located on the same system as the server. If your MySQL server is remote, you must add the keyword LOCAL to the LOAD DATA INFILE syntax as in the following example:

LOAD DATA LOCAL INFILE "/home/peter/Documents/spreadsheet/data.tsv"
  INTO TABLE alldata;

Using LOCAL is not much different syntactically but servers are sometimes started up with the ability to LOAD DATA LOCAL disabled. If the server supports LOCAL, you can start up the MySQL client with the --local-infile option. Another approach is to copy the text file to the server before executing the LOAD DATA statement.

Note: Further complications can ensue. For files created on a Windows system, you might have to add LINES TERMINATED BY '\r\n' to read the file properly, because Windows programs typically use these two characters as a line terminator. If you need to add this clause, it follows immediately after the table name or, if a FIELD TERMINATED BY clause is present, immediately after this clause.

If you run into problems and require more information about LOAD DATA INFILE refer to the manual http://dev.mysql.com/doc/refman5.0/en/sql-syntax.html.

Before you continue, paste the appropriate version of the LOAD DATA INFILE statement into the script window below the alldata table definition.

After executing this statement and loading the data you can check that it has been copied to the alldata table using Query Browser. To inspect the data, double click the alldata table and find the following statement in the query text box:

SELECT * FROM alldata LIMIT 0,1000
Note: A LIMIT clause may not appear when using Query Browser under Windows.

Click the Execute button and you should be able to view the data in the query window.

You might want to review the integrity of the data again at this point. A visual inspection is fine but you might also want to automate the process with an SQL statement such as the following:

SELECT * FROM alldata PROCEDURE ANALYSE();

(Note the spelling of ANALYSE.)

Among other things, this query shows actual minimum and maximum values for data in the various fields. If any of the maximum field length values equal the field length, then you have probably truncated data. Empty or NULL values in some fields may also indicate problems.

[edit] Creating A Temporary Table of Members

Now that the data has been copied into a MySQL database, we need to split it up into different tables -- we want a proper relational database and not another flat database. The most obvious entity is a member, having the attributes name and address. The following fields from the alldata table belong to this entity exclusively:

lastname VARCHAR(50) NOT NULL
firstname VARCHAR(50) NOT NULL
streetaddress1 VARCHAR(50) NOT NULL
streetaddress2 VARCHAR(50) NOT NULL
city VARCHAR(50) NOT NULL
state VARCHAR(2) NOT NULL
zipcode VARCHAR(10) NOT NULL

Removing any fields that relate to certification gives us the basis for a members table.

We need to transfer data from the alldata table into a members table but, since members can have more than one certification and so appear more than once in the alldata table, we can't just copy all records over to a members table. To make sure that we have unique records we need a way of uniquely identifying each member. We can do this by combining a number of fields together to create a unique value -- a combination of the firstname, lastname, and streetaddress1 columns fits the bill. The combination of these fields could form a primary key, but it would be a very cumbersome one. For this reason, we're also going to add a numeric key value -- an integer AUTO_INCREMENT field. The two new fields are:

unique_value VARCHAR(150)
id INT(11)

Create this table using the table editor in the same way that you created the alldata table. The only new element is an integer, auto increment field. To create this field select INTEGER as the data type and ensure that all three check boxes in the column options frame, Primary Key, Not NULL, and Auto Increment, are checked. Make sure the size of the unique_value column is adequate and add the other columns exactly as you did before.

When you're ready apply your changes and copy the SQL from the dialog box. It should look something like the following:

CREATE TABLE `tempmembers` (
   `unique_value` VARCHAR(150) DEFAULT NULL,
   `id` INT(11)  NOT NULL AUTO_INCREMENT,
   `firstname` VARCHAR(30) NOT NULL DEFAULT '',
   `lastname` VARCHAR(40) NOT NULL DEFAULT '',
   `streetaddress1` VARCHAR(60) NOT NULL DEFAULT '',
   `streetaddress2` VARCHAR(60) NOT NULL DEFAULT '',
   `city` VARCHAR(60) NOT NULL DEFAULT '',
   `state` VARCHAR(10) NOT NULL DEFAULT '',
   `zipcode` VARCHAR(10) NOT NULL DEFAULT '',
   PRIMARY KEY (`id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 

Paste this table definition into the script window and switch to the Resultset1 tab to create a query to populate this table. As described above, we want to concatenate three columns to create a unique value and also add an auto increment column. The remaining columns come directly from the alldata table.

To populate the tempmembers table enter the following SQL into the query text box and execute it:

INSERT INTO tempmembers
  SELECT DISTINCT CONCAT(firstname, lastname, streetaddress1) AS unique_value,
  NULL AS id, firstname, lastname,
  streetaddress1, streetaddress2, city, state, zipcode
  FROM alldata;

Using DISTINCT with the unique_value field should guarantee that we don't have duplicate members and selecting NULL as the id field generates a unique auto increment value for each record. Look at the records in the tempmembers table to confirm that unique id numbers have been generated.

This is fairly close to what a final version of a members table would look like -- removing the unique_value field would be the next step to take but as you'll see shortly, we still need this field.

[edit] Creating a Temporary Member Accreditations Table

The entire spreadsheet that we've imported could be described as a table of members' different accreditations. In the previous section we extracted the member information from the alldata table and created a unique id number for each member. The task now is to replace the duplicated member information with a single unique field. In other words, we're going to create a member accreditations table with a foreign key.

The fields in the alldata table that apply solely to a member accreditations table are readily identified:

`certification` VARCHAR(10) NOT NULL
`expirydate` VARCHAR(10) NOT NULL,
`certificationnumber` VARCHAR(10) NOT NULL

So far we've treated the expirydate field as text. While we're creating a member accreditations table we can convert this field to the DATE data type. The new definition for this field is:

`expirydate` DATE DEFAULT NULL

Again we want to concatenate three columns to create a unique value and also add an integer column for the member id -- so we can relate the member certifications to their matching records in the members table. The two additional columns are as follows:

`unique_value` VARCHAR(150) DEFAULT NULL
`memberid` INT(11) NOT NULL DEFAULT '0',

Right click the association database in the Schemata pane and open the table editor.

You've already added VARCHAR and INTEGER fields so adding a DATE type field should present no problems. Create a table named tempmemberaccreditations and apply your changes. The resulting table should look something like this:

CREATE TABLE `tempmemberaccreditations` (
  `unique_value` VARCHAR(150) DEFAULT NULL,
  `certification` VARCHAR(10) NOT NULL,
  `memberid` INT(11) NOT NULL DEFAULT '0',
  `certificationnumber` VARCHAR(10) NOT NULL,
  `expirydate` DATE DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Don't forget to paste it into the script window before proceeding. If you do forget, retrieving the table structure is a simple matter of executing the SQL statement:

SHOW CREATE TABLE tempmemberaccreditations;

Again we need to populate this table from the alldata table. We are going to select all the records from the alldata table but only selected fields. Click on the Resultset1 tab and enter the following query into the query text box:

INSERT INTO tempmemberaccreditations
  SELECT CONCAT(firstname,lastname,streetaddress1) AS unique_value,
  certification, 0 AS memberid,
  certificationnumber,
  STR_TO_DATE(expirydate, "%d-%b-%y")
  FROM alldata;

Converting a string value to a date is done using the STR_TO_DATE function. This function takes two string arguments; the first is a string expression of the date and the second specifies the date format. In the alldata table dates are in the form '12-Dec-07'. The format specifier tells MySQL exactly how to interpret the string representation of the date. In this case the specifier, "%d-%b-%y", means the day of the month comes first and is expressed as two digits -- it will have a leading zero even if the value is less than 10, the month is expressed as an abbreviated name, and the year numerically with two digits. All values are separated by a "-".

Note: The complete list of specifiers is given in the manual immediately following discussion of the DATE_FORMAT function.

Review the data after executing the INSERT statement. You'll see that dates are now expressed in the default MySQL format, the year has four digits followed by a two digit month, and a two digit day.

At this point, reviewing the data to ensure consistency is a good idea. Any dates that were improperly formatted in the original spreadsheet will not convert to the DATE data type.

When reviewing the data you'll also see that the memberid field is set to '0' for all records. Let's update this field using the values in the tempmembers table:

UPDATE tempmemberaccreditations t2
  INNER JOIN tempmembers t ON  t.unique_value=t2.unique_value
  SET t2.memberid =  t.id;

That's the last time we'll need the unique_value field for either of our transitional tables. We can now relate these two tables on the numeric id field.

[edit] The Final Tables

With both tables populated with data it's time to get rid of the unique_value field and while doing so we should also change the name of our tables since they are no longer temporary or transitional tables.

Select the tempmembers table in the Schemata pane and open the table editor. Rename the table to members, by changing the table name in the text box in the top left of the table editor. Select the unique_value field and press the Delete key to remove it. Choose Apply Changes to view a dialog box with the following content:

ALTER TABLE `tempmembers`
  RENAME TO `members`,
  DROP COLUMN `unique_value`;

Making similar changes to the tempmemberaccreditations table will result in the following DDL statement:

ALTER TABLE `tempmemberaccreditations`
  RENAME TO `memberaccreditations`
  DROP COLUMN `unique_value`;

Copy the SQL version of these table alterations to the script window.

Adding indexes to tables is also easily accomplished using the Query Browser. Since we expect searches on the lastname and the city fields these two columns are ideal candidates for indexing. Again this can be done using the table editor. Open the table editor and click the Indices tab. Click the + button on the bottom left and a new index called new_index appears in the list of indexes. Change the name to lastname_idx and drag and drop the lastname column to the Columns text area on the right.

Create an index on the city column in the same way. When you apply your changes you should see something similar to the following:

ALTER TABLE `members`
  ADD KEY `lastname_idx` (`lastname`),
  ADD KEY `city_idx` (`city`);

The memberaccreditations table still lacks a primary key. To remedy this, open the table editor again so that we can add a primary key. To do this click the + button on the lower left and ensure that PRIMARY is selected in the Kind drop-down list box. Create a primary key composed of two columns by dragging the memberid column and the certification column to the Columns list. When applying your changes you should see:

ALTER TABLE `memberaccreditations`
  ADD PRIMARY KEY (`certification`, `memberid`);

After altering database objects, it's always an idea to refresh the view in the Schemata pane. Do this by right clicking the association database and choosing the Refresh option (Under Unix this option is called Refresh Schemata.)

Looking at the data there is yet one more change we could apply. The certification field may indicate another database entity. Let's create a table of accreditation acronyms with their corresponding descriptions.

One of the simplest ways to create a table and populate it using MySQL is to issue a CREATE TABLE statement in conjunction with a SELECT statement. For instance we could create our final version of the members table in the following way:

CREATE TABLE  accreditations
   SELECT DISTINCT certification AS acronym, '' AS description
   FROM alldata;
 

At this point we don't have the information necessary to add a description so we populate this field with an empty string.

Creating and populating a table in this way is a quick and easy way to create a populated table. The downside to creating a table in this way is that the resulting table has no primary key or indexes. I'll leave it to you to add an index to this table.

At this point we've created all the necessary tables and migrated the data to those tables. We just need to check the integrity of the data before copying it to a production server.

[edit] Confirming Data Integrity

It's always wise to check the state of your transformed data. There's no substitute for visual inspection but there are a variety of ways to check your data using SQL.

For example, there should be no orphaned member records. Since we've migrated from a flat-table database that contained all the original data, finding an id in the members table with no corresponding record in the member accreditations table would indicate that something was wrong. The following SQL statement will return all records in the members table that don't have matching records in the member accreditations table:

SELECT `t`.`id` ,`t`.`firstname`,`t`.`lastname`
  FROM `members` `t`
  LEFT JOIN `memberaccreditations` `tma`
  ON `t`.`id` = `tma`.`memberid`
  WHERE ISNULL(`tma`.`memberid`);

If the above SELECT statement returns an empty set, there are no orphaned member records.

An easy way to reuse this SQL statement is to save it as a view. To do this using Query Browser, make sure the association database is active, then right click on any one of the tables in the Schemata window and choose the Create View option. Clicking Okay after entering a view name opens a new tab displaying the basic syntax for creating a view. Paste the preceding SQL statement into the AS clause and execute the query. After refreshing the schemata the new view should show up. You can view the record set associated with this view in exactly the same way that you would view the record set associated with a table.

To check that there are no orphaned records in the member accreditations table execute the following query:

SELECT `tma`.`certification`,
  `tma`.`memberid`,`tma`.`certificationnumber`,
  `tma`.`expirydate`
  FROM (`memberaccreditations` `tma`
  LEFT JOIN `members` `t` ON ((`tma`.`memberid` = `t`.`id`)))
  WHERE ISNULL(`t`.`id`);

Again, to save this SQL statement, convert it to a view using the procedure described above.

There are also various other ways of querying your records to verify the data. For example, if all certification numbers in the member accreditations table are meant to be unique, executing the following query would determine if there are duplicates:

SELECT COUNT(t.`certificationnumber`), t.`certificationnumber`
  FROM memberaccreditations t
  GROUP BY (t.`certificationnumber`)
  HAVING COUNT(t.`certificationnumber`) > 1;

Checking the number of records in the memberaccreditations table provides further assurance of the integrity of your data. The number should exactly match the number of records in the alldata table.

If you notice discrepancies in the data and wish to update records you can do this from within Query Browser. Click the Start Editing button and then select the record you wish to change and place the cursor in the column you wish to change. When you are finished editing click the Apply Changes button.

Note: If a record set is created from a single table having a primary key, it is editable. A disabled Start Editing button indicates that the record set is not editable.

Once you're satisfied with the integrity of the data, drop the alldata table. This is easily done by right clicking the table and choosing the Drop option. Before exiting Query Browser make sure that you save the script file of all the queries.

[edit] The Production Database

Once you are convinced of the validity of your data, you can move the tables to your production server. We'll do that by first using the mysqldump utility. To export only the final versions of the tables, go to the command line and type:

shell> mysqldump -u username -p --databases association > newdb.sql
Note: You can open a MySQL console window from within Query Browser. Find this option under the Tools menu.

The mysqldump utility takes many of the same switches as the MySQL client; as you can see, you specify your user name and password in the same way. You also need to specify the database name you wish to dump. In this case, the output is redirected to a script file named newdb.sql. If you do not wish to create a database and only want to dump the tables in the association database, execute the preceding command without the --databases option. For more information about the many options available with mysqldump see http://dev.mysql.com/doc/5.0/en/mysqldump.html.

Have a look at the contents of the script file so that you understand what it does. Any existing tables with the specified table names will be dropped and recreated and then the data will be inserted. If you are overwriting existing data, you may want to back up your data before running the script file.

How you execute the dump script file depends upon how you access your production MySQL server. If you have direct access to the server or access through ssh, transfer the script file to the machine hosting the server, and then issue the command:

shell> mysql -u username -p  < newdb.sql
Note: If you saved only the database tables, you must specify a database when issuing the preceding command.

If you have remote access to your production server simply add the -h hostname option to the preceding command. You may also upload your script using an application such as phpMyAdmin. Finally, you can open and execute the script file from within Query Browser -- but more about this in the next section.

[edit] Updating a MySQL Database from a Spreadsheet

Migrating spreadsheet data to a MySQL database can be a relatively simple task especially when using Query Browser. However, migrating users to that database can be much more difficult. You may find that you continue to receive database updates in the form of complete but modified spreadsheets.

It may seem counterintuitive, but such updates can be handled most easily by recreating the entire database again. If we script this process then updates can be done in a matter of seconds. All we need are a few modifications to the script file that we saved as we worked.

The only additions to this script are DROP TABLE statements -- making it much easier to reuse the database that's already there. This script can be run from the command line as described in the previous section or you can open it within Query Browser.

To open a script file from within Query Browser choose the Open Script option under the File menu. Find the script file and select it. A script file tab will open showing the contents of the file. Syntax highlighting is one of the advantages of executing a script from within Query Browser -- errors are much more easily spotted. Any errors that occur during execution are displayed in a pop-up dialog, specifying the nature of the error and also the line number. You can also set break points and step through the code one line at a time if you wish.

Using Query Browser made it easy to document our actions in migrating a spreadsheet to MySQL. This documentation is easily turned into a script file so that we can recreate the process. It can also serve as a reference for techniques to use in future migrations. Find a copy of the script file in the next section.

[edit] The Migration Script

#use database
 USE association;
 
 #First make copy of Excel data
 #treat all fields as text
 DROP TABLE IF EXISTS `alldata`;
 
 CREATE TABLE `alldata` (
   `lastname` VARCHAR(50) NOT NULL,
   `firstname` VARCHAR(50) NOT NULL,
   `certification` VARCHAR(10) NOT NULL,
   `expirydate` VARCHAR(10) NOT NULL,
   `streetaddress1` VARCHAR(50) NOT NULL,
   `streetaddress2` VARCHAR(50) NOT NULL,
   `city` VARCHAR(50) NOT NULL,
   `state` VARCHAR(2) NOT NULL,
   `zipcode` VARCHAR(10) NOT NULL,
   `certificationnumber` VARCHAR(10) NOT NULL
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
 #get data from tab-separated file created from spreadsheet
 #Unfortunately, variable below won't work with LOAD DATA
 #SET @filename = "/home/peter/Documents/spreadsheet/data.tsv";
 #so hard code
 LOAD DATA INFILE "/home/peter/Documents/spreadsheet/data.tsv"
   INTO TABLE alldata;
 #Our server and client are on the same machine
 #Don't need "LOCAL" if file is on the server (local means local to the client)
 #will need the FILE privilege though
 #but if it's there you can execute this script from somewhere else on your network
 #Syntax with comma separated fields -- not as safe as tabs
 #LOAD DATA INFILE "C:/Documents and Settings/peter/My Documents/spreadsheet/data.csv"
 #  INTO TABLE alldata
 #  FIELDS TERMINATED BY ",";
 
 #create temporary tables
 #Association members information
 DROP TABLE IF EXISTS `tempmembers`;
 
 CREATE TABLE `tempmembers` (
   `unique_value` VARCHAR(150) DEFAULT NULL,
   `id` INT(11)  NOT NULL AUTO_INCREMENT,
   `firstname` VARCHAR(30) NOT NULL DEFAULT '',
   `lastname` VARCHAR(40) NOT NULL DEFAULT '',
   `streetaddress1` VARCHAR(60) NOT NULL DEFAULT '',
   `streetaddress2` VARCHAR(60) NOT NULL DEFAULT '',
   `city` VARCHAR(60) NOT NULL DEFAULT '',
   `state` VARCHAR(10) NOT NULL DEFAULT '',
   `zipcode` VARCHAR(10) NOT NULL DEFAULT '',
   PRIMARY KEY (`id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
 INSERT INTO tempmembers
   SELECT DISTINCT CONCAT(firstname,lastname,streetaddress1) AS unique_value,
   NULL AS id, firstname, lastname,
   streetaddress1, streetaddress2, city, state, zipcode
   FROM alldata;
 
 #Member accreditations
 DROP TABLE IF EXISTS `tempmemberaccreditations`;
 CREATE TABLE `tempmemberaccreditations` (
   `unique_value` VARCHAR(150) DEFAULT NULL,
   `certification` VARCHAR(10) NOT NULL,
   `memberid` INT(11) NOT NULL DEFAULT '0',
   `certificationnumber` VARCHAR(10) NOT NULL,
   `expirydate` DATE DEFAULT NULL
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
 INSERT INTO tempmemberaccreditations
   SELECT CONCAT(firstname,lastname,streetaddress1) AS unique_value,
   certification, 0 AS memberid,
   certificationnumber,
   STR_TO_DATE(expirydate, "%d-%b-%y")
   FROM alldata;
 #Above format for 31-Dec-07
 #for 12/1/2007 use "%c/%e/%Y"
 
 #now relate the two tables and insert ids into the tempmemberaccreditations
 
 UPDATE tempmemberaccreditations t2
   INNER JOIN tempmembers t ON  t.unique_value=t2.unique_value
   SET t2.memberid =  t.id;
 
 #create final version of tables
 DROP TABLE IF EXISTS `members`;
 ALTER TABLE `tempmembers`
   RENAME TO `members`,
   DROP COLUMN `unique_value`;
 
 DROP TABLE IF EXISTS `memberaccreditations`;
 ALTER TABLE `tempmemberaccreditations`
   RENAME TO `memberaccreditations`,
   DROP COLUMN `unique_value`;
 
 #add indices
 ALTER TABLE `members`
   ADD KEY `lastname_idx` (`lastname`),
   ADD KEY `city_idx` (`city`);
 
 ALTER TABLE `memberaccreditations`
   ADD PRIMARY KEY (`certification`, `memberid`);
 
 #create accreditations table
 DROP TABLE IF EXISTS accreditations;
 CREATE TABLE  accreditations
   SELECT DISTINCT certification AS acronym, '' AS description
   FROM alldata;
 
 ALTER TABLE `accreditations`
   ADD PRIMARY KEY (`acronym`);
 #now add views
 DROP VIEW IF EXISTS vwOrphanedMembers;
 CREATE VIEW `vwOrphanedMembers` AS
   SELECT `t`.`id` ,`t`.`firstname`,`t`.`lastname`
   FROM `members` `t`
   LEFT JOIN `memberaccreditations` `tma`
   ON `t`.`id` = `tma`.`memberid`
   WHERE ISNULL(`tma`.`memberid`);
 
 DROP VIEW IF EXISTS vwOrphanedAccreditations;
 CREATE VIEW `vwOrphanedAccreditations` AS
   SELECT `tma`.`certification`,
   `tma`.`memberid`,`tma`.`certificationnumber`,
   `tma`.`expirydate`
   FROM (`memberaccreditations` `tma`
   LEFT JOIN `members` `t` ON ((`tma`.`memberid` = `t`.`id`)))
   WHERE ISNULL(`t`.`id`);
 #remove spreadsheet-based table
 DROP TABLE IF EXISTS `alldata`;
 

[edit] Migrating an Access Database to MySQL

[edit] Using PHP Data Objects (PDO) With MySQL

[edit] Using mysqlnd

[edit] Ruby and MySQL

[edit] MyAdmin

Retrieved from "http://forge.mysql.com/wiki/MySQL_User_Guide_Part_VI_Tutorials"

This page has been accessed 3,220 times. This page was last modified 15:30, 29 February 2008.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...