A computer application is little more than a bundle of source code that manipulates a set of data. That data is key to the operation and functionality of the application, and it can come from a variety of places: user input, in-memory defaults, or databases stored offline. Those databases and their relationship to Java are the subject of this article.
Databases come in all different shapes and sizes. They can be flat files of ASCII data (like Q&A) or complex binary tree structures (Oracle or Sybase). In any form, a database is a datastore, or a place that holds data. The type of data that is contained in the store is irrelevant.
If a database is simply a collection of data, then what keeps track of changes to this data? That is the job of the Database Management System, or DBMS. Some DBMSs are relational; these are called RDBMSs. The relational part refers to the fact that separate collections of data within the reach of the RDBMS can be looked at together. The RDBMS is also responsible for ensuring the integrity of the database. When things get out of whack, the RDBMS keeps the data in line.
Very few DBMSs are not RDBMSs these days. We will refer to any database management system, be it DBMS or RDBMS, as an RDBMS for the remainder of this article and book.
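To make the "relational" idea above a little more concrete, here is a minimal sketch in plain Java of two separate collections of data being looked at together through a shared key. The class name `RelationalSketch`, the method `join`, and the sample names are all invented for illustration; a real RDBMS would do this with a query (a join in SQL) rather than with in-memory maps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: two in-memory "collections" related by a department id.
class RelationalSketch {

    // Combines employee -> department id with department id -> department name.
    // Employees whose department id has no match are skipped.
    static List<String> join(Map<String, Integer> employeeDept,
                             Map<Integer, String> deptNames) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : employeeDept.entrySet()) {
            String dept = deptNames.get(entry.getValue());
            if (dept != null) {
                result.add(entry.getKey() + " works in " + dept);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data; LinkedHashMap keeps insertion order predictable.
        Map<String, Integer> employeeDept = new LinkedHashMap<>();
        employeeDept.put("Ada", 10);
        employeeDept.put("Grace", 20);

        Map<Integer, String> deptNames = new LinkedHashMap<>();
        deptNames.put(10, "Engineering");
        deptNames.put(20, "Research");

        for (String line : join(employeeDept, deptNames)) {
            System.out.println(line);
        }
    }
}
```

Neither collection knows about the other; the shared department id is what relates them.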
There are so many different types of RDBMSs available today that it would probably take two full books to give a summary of each one. This overview is a quick start guide for those of you who are not familiar with some newer concepts in data storage.
In the days of yore, programming database applications was pretty simple. There were mainframe databases, and there were very few microcomputer databases available. The ones that were available cost an arm and a leg. The cheap ones were, well, you got what you paid for. But you always had database files, records, and fields.
The database terms of yesteryear, however, have been replaced by new ones. Some of the bigger database companies like Oracle and Sybase have redefined database terminology. The main thrust of this redefining is most likely a response to the larger customer base that is not "programming-literate."
A programmer can deal with files, records, and fields. But more and more non-technical people are creating database applications and queries these days. Their formal training is through the use of applications. As you'll see, some of the new terms are commonly found in spreadsheet and word processing programs.
The following is a list of current database terms that will be used in the rest of this article and throughout the book:
- Client/Server. Client/server is more of an architecture than a tangible entity. The client is a computer system that requests the services provided by an entirely different computer system. On a smaller scale, the client and server may be separate processes running on the same computer system. The distinction is that there is a service provider (the server) and a consumer of that service (the client). For your purposes here, the server is the RDBMS, and the client is your application that requests data from the server.
- Database or Instance. The database or instance is the entity, or collection of data, that is created and stored for retrieval and modification. Depending on the RDBMS, several of these entities can exist on a single machine. For instance, multiple Oracle instances can exist on a UNIX server. Each has a distinct area for data, and unless configured to do so, they have no knowledge of each other. A database or instance is composed of schemas.
- Schema. A schema is a collection of database objects that belongs to a single user of the database. Databases have many users.
- Table. A table is a database object that contains a single set of data. Like things are stored together in a table. For instance, a company would typically have an employee table, which stores all kinds of information about each employee. A table contains rows of data.
- Row. A row of data is a single record in a table. A row is divided into columns.
- Column. A column is the smallest unit of data in a table. It is a part of the row. When data is displayed in a spreadsheet-like fashion, a column would be the up/down slice of data.
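The table, row, and column terms above can be sketched in plain Java. This is only a hypothetical in-memory model to show how the terms relate; the class `TableSketch`, its methods, and the sample employee data are invented here and say nothing about how an actual RDBMS stores its data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a table; not how a real RDBMS stores data.
class TableSketch {
    private final String name;
    private final List<String> columns;
    private final List<Map<String, Object>> rows = new ArrayList<>();

    TableSketch(String name, List<String> columns) {
        this.name = name;
        this.columns = new ArrayList<>(columns);
    }

    String name() {
        return name;
    }

    // Adds one row (a single record); values are matched to columns by position.
    void insert(Object... values) {
        if (values.length != columns.size()) {
            throw new IllegalArgumentException("expected " + columns.size() + " values");
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            row.put(columns.get(i), values[i]);
        }
        rows.add(row);
    }

    int rowCount() {
        return rows.size();
    }

    // Returns a single row, keyed by column name.
    Map<String, Object> row(int index) {
        return rows.get(index);
    }

    // Returns the "up/down slice": the value of one column from every row.
    List<Object> column(String columnName) {
        if (!columns.contains(columnName)) {
            throw new IllegalArgumentException("no such column: " + columnName);
        }
        List<Object> slice = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            slice.add(row.get(columnName));
        }
        return slice;
    }

    // Builds a small employee table with made-up sample data.
    static TableSketch sampleEmployees() {
        List<String> cols = new ArrayList<>();
        cols.add("id");
        cols.add("name");
        TableSketch employees = new TableSketch("employee", cols);
        employees.insert(1, "Ada");
        employees.insert(2, "Grace");
        return employees;
    }

    public static void main(String[] args) {
        TableSketch employees = sampleEmployees();
        System.out.println(employees.name() + " has " + employees.rowCount() + " rows");
        System.out.println("name column: " + employees.column("name"));
    }
}
```

Reading across one map gives a row; reading the same key from every map gives a column.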
Databases can exist in various places. Larger databases require the horsepower of a multiple-CPU server. Smaller databases can get away with only a microcomputer serving data. But where the data is stored is important to the application programmer. There are only two options for database location: local and remote.
Local and Remote
A local database is one that resides on the machine on which client applications run. Local databases offer the fastest response time because there is no network traffic between the client (your application) and the server (the RDBMS engine). Some examples of local databases are Paradox from Borland, Access from Microsoft, and Personal Oracle from Oracle.
A remote database, on the other hand, is one that resides on a machine that the client software does not run on. This is an important distinction for two reasons:
- Response time will be slower. Even the fastest network connections will not get you the same response time as a local database. Bear in mind, the difference might be seconds or less.
- An additional software layer is needed to communicate with the database.
The first item is the general case, and it is only relevant for performance-critical applications. In some cases, a well-tuned remote RDBMS server can outperform a poorly tuned local one.
The second item, however, might cause grief and headaches that weren't expected. With some database server and client products, a second software layer is necessary to transparently interact with the remote database. This software might be an optional software package that is not included with the server software. I'll get to this layer in the next topic, "Database Access."
There is one more topic I'd like to touch upon before I get into accessing databases: the client/server concept of multi-tiering. Unfortunately, I have heard about ten different explanations of this concept, and no two of them are the same. The following is my take on the single-, two-, and three-tiered architectures.
In a single-tiered system, the application and the data reside together logically. These are not usually database programs. An example would be a calculator program: the logic and its data reside together.
In a two-tiered system, the application resides in a different logical location than the data. These are usually database applications. Most client/server applications fit into this category. Figure 6.3 shows a model of a two-tier application.
In a three-tiered system, the client application resides in a different logical location than both the logic of the application and the data.
To put it another way, the client software makes a call to a remote service. That remote service is responsible for interacting with the data and responding to the client. The client has no knowledge of how and where the data is stored. All it knows about is the remote service. Conversely, the remote service has no knowledge of the clients that will be calling it. It only knows about the data.
This partitioning of logic allows for better data control and reuse of existing code. Three-tier architecture is becoming more widespread because more and more tools are being created that handle the tiering automatically.
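The three-tier partitioning described above can be sketched in Java as three small pieces. This is a simplified illustration under stated assumptions: the names `EmployeeStore`, `InMemoryEmployeeStore`, `EmployeeService`, and `ThreeTierClient` are invented, the data is made up, and ordinary method calls stand in for what would be a network call to the remote service in a real system.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Data tier: the only code that knows how and where the data is stored.
interface EmployeeStore {
    Optional<String> findName(int id);
}

// Hypothetical stand-in for an RDBMS; holds made-up sample data in memory.
class InMemoryEmployeeStore implements EmployeeStore {
    private final Map<Integer, String> names = new HashMap<>();

    InMemoryEmployeeStore() {
        names.put(1, "Ada");
    }

    @Override
    public Optional<String> findName(int id) {
        return Optional.ofNullable(names.get(id));
    }
}

// Middle tier: the "remote service". It knows about the data tier,
// but nothing about the clients that will call it.
class EmployeeService {
    private final EmployeeStore store;

    EmployeeService(EmployeeStore store) {
        this.store = store;
    }

    String describe(int id) {
        return store.findName(id)
                .map(name -> "Employee " + id + ": " + name)
                .orElse("No employee with id " + id);
    }
}

// Client tier: knows only the service, not how or where the data is stored.
class ThreeTierClient {
    private final EmployeeService service;

    ThreeTierClient(EmployeeService service) {
        this.service = service;
    }

    String show(int id) {
        return service.describe(id);
    }

    public static void main(String[] args) {
        ThreeTierClient client =
                new ThreeTierClient(new EmployeeService(new InMemoryEmployeeStore()));
        System.out.println(client.show(1));
        System.out.println(client.show(2));
    }
}
```

Because the client depends only on `EmployeeService`, the store could be swapped for a different implementation without touching the client code, which is the data control and reuse benefit mentioned above.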