Logo

Home
Seabrooke House
Shop
 
 

Create your own professional web site or have our professionals do it for you...
Author: Webpublisher
• Tuesday, March 16th, 2010

One of the crucial steps in building an application that handles a database, is without doubt the design of the database. If the tables are not properly defined, we can have a lot of headaches when running queries to the database to try to get some information. 

Whether our database has only 20 entries, or some few thousands, it is important to make sure our database is properly designed to have efficiency and usability over time.

In this article, we mention some basic principles of database design and handled some rules to follow when creating databases. Depending on the requirements of the database, the design can be somewhat complex, but with a few simple rules we have in the head will be much easier to create a perfect database for our next project.

MySQL build large applications is easy with tools such as Apache, Perl, PHP, and Python. Ensure that are fast, however, requires more than insight. MySQL has a well deserved reputation as a database server very quickly which is also very easy to configure and use, besides that in recent years its popularity has grown significantly due to that used on countless Web sites that require make use of a database. However, few users know anything more than creating a database and write some searches against it.

After reading this article must be able to understand some techniques that will help you design MySQL databases to build better applications. Suppose you have a basic knowledge of SQL language, and MySQL, but we will not assume that they have much experience with either one.
Storing the information necessary

It seems common sense, but many people often take the approach of “kitchen sink” for the design of databases. We often think of anything we wanted that were stored in a database and design the database to store data. We must be realistic about our needs and decide what information is really necessary. Often we can generate some data on the fly without having to store them on a table in a database. In these cases also makes sense to do this from the viewpoint of application development.

For example, a product table for an online catalog may contain names, descriptions, sizes, weights and prices of various products. Besides the price, you may want to keep taxes and shipping costs associated with each product. But really there is no need to do this. First, both taxes and shipping costs can be calculated on the fly (either by our application, or MySQL). Second, if we change taxes or shipping costs, would have to write the necessary searches to update taxes and shipping costs in every product registration.

Sometimes we have to add fields to tables in a database once they have been created is too difficult, so we are impelled to define as many columns as possible. Well, this is simply a misconception, because in MySQL can use the ALTER TABLE command to change the definition of a table at any time to suit our changing needs.

For example, if at some point we realize that we need to add a column to our table popular products (we may want our customers to qualify the products in our catalog), we could do the following:

ALTER TABLE products ADD popularity INTEGER;

Ask only what is necessary and be explicit

Just to say “store only what is necessary”, this may seem a little more common sense, however, this is usually not considered very often. Why?. Because when an application is in development requirements often change, so that many of the searches end up looking like this:

SELECT * FROM algunaTabla;

Get all the columns in a table is simply the best thing we can do when we’re not sure what areas need. However, as the tables grow and change, this can become a performance issue. In the long run is much better late extra time after our initial development and decide exactly what we need in our searches. In particular, it is much better to specify the columns explicitly:

SELECT name, price, description FROM products;

This relates to a point that has more to do with keeping the code with performance. Most programming languages (Perl, Python, PHP, Java, etc) allow us access to the results of a query by the field names and their numerical position. For the above example, we can access the field 0, or field name and get the same results.

In the long run is better to use the column names to their numeric positions. Why? Because the relative positions of columns in a table or a query result can vary. For example, a table may vary as a result of an ALTER TABLE, or change in a query as a result of someone rewriting the search and forget to update the application logic properly.

Of course, we still need to be careful when changing the names of the columns! But if you use names instead of numerical positions, we can use the search capability and replacement of our publisher to find the code we have to change if you change the name of a column.
Normalizing table structure

If you’ve never heard of the “normalization of data,” we should not fear. While standardization may seem a complex topic, we can benefit greatly by understanding the most basic concepts of standardization.

This seems reasonable. However the problem is that the number of tracks that have a CD is quite variable. This means that this method would have to have a really big spreadsheet to hold all the data, which in the worst cases could be up to 20 tracks. This definitely is not good.

One goal of a standardized table structure is to minimize the number of “empty cells”. In the case of the CD table before us, would have a lot of CDs these cells if we allowed 20 runs or more. In the event that the lists of fields can expand, “right” as in this example of the CDs, gives us a clue that we need to divide the data into two or more tables that we can then get together to obtain the data we need.

Many people new to the systems of relational databases do not really know what “relational” in RDBMS (Relational Database Management System). In simple terms, similar groups of information are stored in different tables that can then be “coupled” (related) based on data they have in common. Unfortunately this sounds quite academic and vague, however, in our CD database we can illustrate a specific situation in which we will see how to normalize the data.

The realization that each list of CDs is a fixed set of attributes (title, artist, year, genre) and a variable set of attributes (the number of tracks) gives us an idea of how to divide the data into multiple tables then we relate to each other. We can create a table containing a list of all albums and their attributes fixed and the other containing a list of all the tracks of these albums. Thus, instead of thinking horizontally (as with the spreadsheet), and vertically think we left a table structure like the one shown below.

CREATE TABLE album (
id_album INTEGER NOT NULL PRIMARY KEY,
title VARCHAR (80) NOT NULL);

CREATE TABLE track (
id_album INTEGER NOT NULL,
number INTEGER NOT NULL,
title VARCHAR (80) NOT NULL);

The album identifier (id_album) is what links the various tracks of a given album. The field in the table id_album track id_album coincides with the field in the album chart, so that for a list of all tracks on a given album, we could perform a query like this:

Pista.numero SELECT, FROM pista.nombre album, track
WHERE album.titulo = “The title of the album”
AND album.id_album = pista.id_album

This structure is both flexible and efficient. The flexibility lies in the fact that we can add data to the system later without having to rewrite what we already have. For example, if we add the information of the artists on each album, all you have to do is create a table that is related to artist’s album chart in the same way that the track table. Therefore, we need not modify the structure of our existing tables, just add what you need.

Efficiency refers to the fact that we have no duplication of data, and we have no large amounts of “empty cells”. In this way MySQL does not have to store more data than necessary, or spend resources to check the empty areas in our charts.

The main objective of database design is to generate tables that model the records, which will keep our information. It is important that this information is stored without redundancy so that you can have a fast and efficient retrieval of data. Through standardization try to avoid certain shortcomings that will lead to poor design and leading to a less efficient processing of data.

We could say that these are the main objectives of standardization:

* Controlling the redundancy of information.
* To avoid data loss.
* Ability to represent all information.
* Maintain data consistency.

If you’re new to the environment of relational databases might think that with the normalization of our data have a strange appearance, however, it allows MySQL to be very efficient when storing and retrieving data from tables, plus which gives us the flexibility to grow and scale our applications without the need to restructure a database each time.
Select the appropriate data type

Having identified all the tables and columns that the database needs, we must determine the data type of each field. There are three main categories that can be applied to virtually any database application:

* Text
* Numbers
* Date and Time

Each of these has its own variations, so choosing the correct data type not only influences the type of information that can be stored in each field, but affects the overall performance of the database.

Here are some tips that will help us choose a data type suitable for our tables:

* Identify whether a column must be of type text, numeric, or date.

This is often a step too simple. Numerical values eminently zip codes or dollar amounts should be treated as text fields if we decide to include your punctuation, but will get better results if you store them as numbers and solve the issue of format otherwise.
* Choose the most appropriate subtype for each column.

The fixed-length fields (such as CHAR) are generally faster than variable length (such as VARCHAR), but take up more disk space.

The size of each field should be restricted to a minimum in terms of what might be the largest entry. For example, if the value in an integer column is less than a thousand, it is best to set this column as a three-digit SMALLINT unsigned (allowing exactly 999 different values).
* Set the maximum length for text and numeric columns, and other attributes.

Perhaps we have different preferences, but the most important factor is always set to maximum information in each field instead of always using generic types TEXT and INT (and inefficient).

Use appropriate indicators

The indices are a special system that use databases to improve their overall performance. When defining indexes on the columns of a table, tells MySQL to pay special attention to those columns.

MySQL allows you to define up to 32 indexes per table and each index can incorporate up to 16 columns. Although a multiple-column index can be useful not obvious at first glance, the fact is that it is very useful in making frequent searches on the same set of columns.

Since the indexes make queries run faster, we can be encouraged to index all the columns of our tables. However, what we need to know is that using a price index. Each time you do an INSERT, UPDATE, REPLACE, or DELETE on a table, MySQL has to update any index on the table to reflect changes in the data.

So, how we decided to use indexes or not?. The answer is “it depends”. So simple, it depends what kind of queries run and how often we do, but it really depends on many other things.

The reason for having an index on a column is to allow MySQL to run the search as quickly as possible (and avoid full table scan). We think that an index contains an entry for each unique value in the column. In the index, MySQL must have any value doubled. These duplicate values decreases the efficiency and usefulness of the index.

So before you index a column, we must consider what percentage of table entries are duplicates. If the percentage is too high, probably will not see any improvement with the use of an index.

Another thing to consider is how often the rates will be used. MySQL can use an index for a particular column if this column appears in the WHERE clause in a query. If you rarely use a column in a WHERE clause certainly does not have much sense to index that column. Thus, more efficient probably suffer the complete scan of the table the rare occasions that this column is used in a query, the index to be updated every time data changes in the table.

When in doubt, we have no alternative but to try. We can always run some tests on the data from our tables with and without indexes to see how we get the results faster. The only thing to consider is that the tests are as realistic as possible.
Use REPLACE queries

There are times when you want to insert a record unless it is already in the table. If the record already exists, what we would do is an update of the data. Instead of writing code that complies with this logic, and having to run several queries, it is best to use MySQL REPLACE statement.
Using temporary tables

When working with very large tables, it often happens that occasionally need to run some queries on a small subset of a large amount of data. Rather than run these queries on the entire table and make MySQL more and find the few records we need, can be much faster to select those records into a temporary table and then run our consultations on this table.

Create a temporary table is as simple as adding the word TEMPORARY in CREATE TABLE typical sentence:

CREATE TEMPORARY TABLE tabla_temp
(
Datatype field1,
field2 datatype,

);

A temporary table exist as long as the connection to MySQL. When the connection is broken automatically removes MySQL table and free the space it used. We can of course delete this table while connected to MySQL. If a table named tabla_temp already exists in your database when creating a temporary table with the same name, the temporary table is not hidden from the temporary table.

MySQL also allows you to specify a temporary table is created in memory if the table is declared of type HEAP:

CREATE TEMPORARY TABLE tabla_temp
(
Datatype field1,
field2 datatype,

) TYPE = HEAP;

Because HEAP type tables are stored in memory, consultations on these tables are implemented much faster than on disk temporary tables. However HEAP tables are slightly different from a normal table and have some limitations.

As with previous suggestions, the only thing left is to test whether temporary tables with our queries run faster than using the table that contains a large amount of data. If the data are well indexed temporary tables can not be of much use to us.
Use a recent version of MySQL

The recommendation is simple and concrete, provided it is in our hands, we must use the latest version of MySQL that is available. In addition to the new versions often include many improvements are becoming more stable and faster. Thus, while we take advantage of new features in MySQL, we will see significant increases in the efficiency of our database server.
Concluding remarks

The last step of designing the database is to take certain naming conventions. While MySQL is very flexible about how to assign names to the databases, tables and columns, here are some rules you should observe:

* Use alphanumeric characters.
* Limit the names to fewer than 64 characters (it is a restriction of MySQL).
* Use the underscore (_) to separate words.
* Use lowercase words (this is more a personal preference than a rule).
* The table names should be pluralized, and the names of the columns in the singular (it is a personal preference).
* Use the letters ID in the primary key columns and foreign.
* In a table, placing the primary key followed by the foreign keys.
* The field names should be descriptive of its contents.
* The field names must be unique across tables, except for the keys.

The above points are many personal preferences rather than rules that we should meet, and consequently many of them can be overlooked, however, the most important is that the nomenclature used in our databases is consistent and consistently in order to minimize the chance of mistakes when creating a database application.

Category: Design Facts | Tags: , , , , , , ,


Related Articles:

  Apache Security
  Customize the 404 page
  Web pages without any difference in capital letters in Apache / Linux
  Simulate domain on localhost with Apache
  Apache Rewrite Manual
You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.
Leave a Reply

 

.............................................
Toll Free:
+88029336307
Support:
info@maxworkpublishing.com
Sales:
info@maxworkpublishing.com

 

 
All Rights Reserved Publish A Web terms & condition | Privacy policy