Friday, January 29, 2010

MySQLI_Real_Escape_String()---A Useful Function

If you are collecting information through an html form or providing user access to a MySQL database, the mysqli_real_escape_string() function can be a very useful tool to employ. This function can be used to prevent database attacks from malicious users by "escaping" special characters such as "\r,"\n",!\,etc.

Situations where such special characters cause problems are usernames and passwords. The code below shows an example of what could happen.

// We didn't check username and password.
// Could be anything the user wanted! Example:
$_POST['user'] = 'jerry';
$_POST['pwd'] = "' OR ''='";

The SQL sent would be:
SELECT * FROM users
WHERE user='jerry' AND password='' OR ''=''
This means that anyone could log in without a valid password!

To prevent this, you "escape" special characters with the mysqli_real_escape_string() function, which takes two arguments, the database connection variable and the string variable to be escaped. The function returns a string on success or FALSE for failure. Depicted below is the syntax:

$string=mysqli_real_escape_string($dbc,$escapeString)--where $dbc is the database link connection and $escapeString is the string to be escaped.

Using the Post method, the username and password would be "escaped" like this:

$user=mysqli_real_escape_string($dbc,$_POST['user']);
$pwd=mysqli_real_escap_string($dbc,$_POST['pwd']);

Tuesday, January 26, 2010

SQL Grouping Sets

This post in a continuation of a series discussing OLAP SQL. OLAP stands for Online Analytical Processing, often used in data warehousing transactions. In Oracle, Grouping Sets are an extension of the Group By clause that enables specifying multiple groups of data.

With Grouping Sets, you just specify the desired groups and the Oracle database computes all groupings specified in the Grouping Sets clause and combines the results in a Union All operation. The Union All operation allows duplicate rows in the result set. The syntax is depicted below:

SELECT col1,col2,col3,count(*);
FROM table1
GROUP BY GROUPING SETS(col1,col2,col3);

Without the Grouping Sets syntax, you would have to run separate queries and then do a Union operation. This can be inefficient if several groupings are required and can make your SQL code more complex. With Grouping Sets, this is can be done in one step and is more efficient as the underlying table is accessed only once.

Wednesday, January 20, 2010

Oracle SQL Decode Function

The SQL Decode function offers a way to to write expressions. In addition, sometimes you want to write complex programming logic. The problem is that SQL does not provide this logic, as SQL is a command based language. This is where vendor specific functions such as the Oracle Decode come into play.

Functions in SQL create links between the SQL programming language and the database programming environment, allowing for multi-line constructs. The Decode function is one such example. This function acts like an If..Else...End..If construct with some variation. The syntax for the Decode function is:
DECODE(col/expr, search1, result1[, search2, result2 ]...[, default ])
Decode compares each column/expression for the search value one by one. If the column/expression matches the search value, then the result1 is returned. If no match is found, then Oracle returns default. If default is omitted, then Oracle returns null.

The example below decodes the value region_id. If region_id is 1, then the function returns 'Northeast'; if region_id is 2, then it returns 'South'; and so forth. If region_id is not 1, 2, 3, or 4, then the function returns 'Other'.
SELECT employees_id,last_name,first_name,
DECODE (region_id, 1, 'Northeast',
                                  2, 'South',
                                  3, 'Midwest',
                                  4, 'West',
                                   'Other')
as "Region" FROM employees;
You can test for many more search values if needed. The maximum number of components in the DECODE function, including expr, searches, results, and default, is 255.

Tuesday, January 19, 2010

SQL GROUPING SETS

This post is a continuation of a series on OLAP SQL.  OLAP refers to Online Anayltical Processing used in data warehousing transactions.  The Grouping Sets clause in SQL allows you to see the results of a query in different groups.

For example, suppose you wanted to see a report of sales by sales reps but also wanted to see sales grouped by week and territory.  You could write two queries to obtain two reports.  Or you could write one query that used the Union All operator on the two groupings.

This is where the Grouping sets clause in SQL comes in handy.  Grouping Sets allows you to define multiple groups in one data set.  The advantage is that the underlying table is accessed only once which speeds up query performance on large data sets.  The syntex for Grouping Sets is:
SELECT col1,col2,count(*)
FROM table1
WHERE.....
GROUP BY GROUPING SETS((col1,col2),(col3,col4));
Here is an example using the hr schema in Oracle:
SELECT department_id,job_id,count(*)
FROM employees
GROUP BY GROUPING SETS(department_id,job_id);
The result set would like this:
DEPARTMENT_ID
JOB_ID
COUNT(*)

IT_PROG
5

AC_MGR
1

AC_ACCOUNT
1

ST_CLERK
20

HR_REP
1
100

6
30

6
90

3
20

2
70

1
110

2
As you can see, the Grouping By Sets function offers efficient processing of large data sets than querying two  tables separately.

Monday, January 18, 2010

jQuery oEmbed Content

At times, it may be appropriate to embed video content directly in your website.  The oEmbed API enables you to dispaly embeded content such as Vimeo videos, Flickr photos, and YouTube movies.  OEmbed is a format for allowing embedded comtent from third-party sites.  Facebook is a poplar user of the oEmbed format where you can copy and past a link and the link is converted into an embedded movie.  To use oEmbed, you have to first download the jQuery oEmbed plugin . Then you enter the following files in script tags in the head section of your HTML.
src="jquery.min.js" type="text/javascript"
src="jquery.oembed.js" type="text/javascript"
Next, you add a div and and input tag to hold the embedded content in the link:
input id="olink" type="text"
div id="oembed"
Now add some functionality with Javascript. This should be placed in javascript tags right after the body tag in your HTML:
$(document).ready(function() {
$("#olink").blur(function (){
olink = $("#olink").val();
$("#oembed").oembed(olink);
})
});
Next copy and paste a link into the text box and when you move your cursor away from the box, the link will convert to embedded content into your web page. A nice technique to have in your web design toolkit. Visit this link jQuery documentation for the full documentation of the oEmbed plugin.

Saturday, January 16, 2010

OLAP IN SQL

OLAP stands for Online Analytical Processing, a new feature in SQL that is being implemented in database products by various vendors. OLAP SQL is typically used in data warehouses. OLAP differs from OLTP which is Online Transaction Processing used in transaction systems. The key differences are:
OLTP:
OLAP:
Data changes often
Data is static
Current data stored
Stores historic data
Few indexes
Many indexes to speed up complex queries
Fast and optimized
Accurate and complete
Highly normalized
Highly denormalized
There are several OLAP SQL functions. This post will discuss the RANK function, aggregate and analytical. The RANK function calculates the rank of a value in a group of values and returns a number. An example is shown below:
SELECT RANK(45500, .05) WITHIN GROUP
(ORDER BY salary, commission_pct) "Rank"
FROM employees;

Rank
----------
55
As an analytic function, RANK computes the rank of each row returned from a query with respect to the other rows returned by the query, based on the values of the value_expression in the order_by_clause. Here is an example of an analytical RANK:

SELECT department_id, last_name, salary, commission_pct,
RANK() OVER (PARTITION BY department_id
ORDER BY salary DESC, commission_pct) "Rank"
FROM employees WHERE department_id = 80;

DEPARTMENT_ID LAST_NAMESALARY COMMISSION_PCTRank
80
Brown
14000
.4
1
80
Lewis
13500
.3
2
80
Smith
12000
.3
3
80
Carlson
11500
.25
4
80
Chambers
11000
.3
5
80
Adams
11000
.3
5
There is another version of the RANK function, called DENSE RANK which computes the rank of a row in an ordered group of rows and returns the rank as a NUMBER. The ranks are consecutive integers beginning with 1. Rank values are not skipped in the event of ties. Rows with equal values for the ranking criteria receive the same rank. This function is useful for top-N and bottom-N reporting.

The following example computes the ranking of a hypothetical employee with the salary $45,500 and a commission of 5%.
SELECT DENSE_RANK(45500, .05) WITHIN GROUP
(ORDER BY salary DESC, commission_pct) "Dense Rank"
FROM employees;

Dense Rank
-------------------
3
There it is the OLAP RANK SQL. I'll discuss other OLAP SQL functions in future posts.

Thursday, January 14, 2010

Hierarchical SQL

In a previous post, I showed an example of a self-join query. In this post, I will show examples of using Hierarchical SQL is to display self-join data in a more intuitive way. Hierarchical SQL is an extension of Oracle PL/SQL. This extension to the standard SQL provides a way to loop through all the branches and levels of multi-level self-join data. Consider the following SQL query:
SELECT last_name,||' '||'reports'||' '|| to'||' '||
nvl(PRIOR last_name, 'no one') "Reporting Relationship"
FROM employees
START WITH employee_id=100
CONNECT BY PRIOR employee_id=manager_id;
Here, the hierarchical query clause allows you to select rows in a hierarchical order. The nvl function returns the value of 'no one' if the last_name is null. The Prior keyword is used to link the current record to the "prior" record, or next higher in the hierarchy. The Connect By key words are used to establish a link between the primary and foreign key. The execution of this query would return the result set as seen below:
Reporting Relationship
--------------------------------------------------------------
King reports to no one
Kochhar reports to King
Greenberg reports to Kochhar
Faviet reports to Greenberg
Chen reports to Greenberg
Sciarra reports to Greenberg
Urman reports to Greenberg
Popp reports to Greenberg
Using hierarchical SQL, you could also create an organizational chart from this query statement:
SELECT lpad(last_name,Length(last_name)+(LEVEL*4)-4,'-') as "Organization Chart"
FROM employees
START WITH employee_id=100
CONNECT BY PRIOR employee_id=manager_id;
The lpad function returns a character string padded on the left side;the Length function returns the length of the string. The Level keyword indicates the level of the hierarchy(number of nested levels). Here we are creating an expression that creates more hyphens the deeper the nesting level. This query would produce the following result:
Organization Chart
King
----Kochhar
--------Greenberg
------------Faviet
------------Chen
------------Sciarra
------------Urman
------------Popp
--------Whalen
--------Mavris
--------Baer
--------Higgins
------------Gietz
----De Haan
--------Hunold
------------Ernst
------------Austin
------------Pataballa
------------Lorentz
As shown in these examples, hierarchical SQL is a nice tool to have in your database query toolkit.

Monday, January 11, 2010

Multi-Table Inserts

Oracle has many useful extensions to the SQL programming language.  One of them is the multi-table insert statement used primarily in data warehousing.  There are 2 forms of the multi-table insert, unconditional and conditional.  The unconditional insert allows you to insert data into multiple tables using one DML statement. An example is depicted below:
INSERT [ALL]
                 INTO clause1
                 INTO clause2
SELECT FROM column1,column2,...
FROM source_table;
Here, the multi-table insert is unconditional and derived from the row-set returned from the evaluation of a subquery of one or more tables. A conditional multi-table insert uses conditions to insert data into multiple tables. If the multiple conditions are true, each insert is processed.  An example of the syntax is shown below:
INSERT [ALL][FIRST]
          WHEN condition1 THEN
                            INSERT INTO clause
          WHEN ELSE
                            INSERT INTO clause
SELECT column1,column2,...
FROM source_table;
If you specify ALL(the default), the database evaluates each WHEN clause in the order in which it appears in the statement. For each WHEN clause that evaluates to true, the database executes the INSERT INTO clause.

If you specify FIRST, the database executes the INSERT INTO for first WHEN clause that evaluates to true, and skips the subsequent WHEN clauses for the given row.  If no WHEN clauses evaluate to true, the database executes the INSERT INTO clause associated with the ELSE clause for a given row.

Per the Oracle extension, multi-table inserts can only performed on tables, not views.  In addition, you cannot insert into a remote table.  Nonetheless, multi-table inserts can be a valuable tool in data warehousing transactions.   

Sunday, January 10, 2010

CSS3

The CSS Working Group of WC3 has introduced several new functions and features to Cascading Style Sheets that you can use in web design. You are looking at one below, rounded corners which is implemented using the border-radius property. This feature allows you to create rounded corners on box items. The code for the effect shown below is: -moz-border-radius: 5px;-webkit-border-radius: 5px; padding: 5px;. The box is coded with the following properties: style="background-color:#e8e8e8;border-style:solid;border-width: 1px;

This is a box with rounded corners

Another new feature introduced in CSS is box-shadow. The code for box-shadow is
-webkit-box-shadow:5px -5px 10px #000;
-moz-box-shadow:5px -5px 10px #000000;

This is a box with rounded corners + box shadow
Here the shadow is above the box. If you want to position the shadow below the box, then you place the box-shadow code after the border-radius code in your CSS script. Both Mozilla/Firefox and Safari3 have implemented border-radius and box-shadow.

Pretty nice...

Should Users Be Involved in Database Design?

Database design usually begins with user views of existing data, and in the relational model, is about providing each User their own unique view of the same data. As such, the Developer needs to verify the model meets User requirements, and an effective way to achieve this is by seeking User input early in the design process. An analogy would be the Architect who is designing a kitchen remodel, interviewing the homeowner about their current and future use of the space. The Architect needs to be able to design a plan to fit the homeowner requirements.

The benefit to Users in requirements definition is a database product that enables them more productive in their jobs. For the Developer, the benefit of User involvement is the production of a data model that contributes to the most cost effective solution for the organization.

For Users who are locally available, the benefit to the developer is the ability to conduct interviews and surveys, review documents and observe operations. However, in the case of Users who are not local, with the exception of surveys, these methods may add travel expenses and other costs which may deter an Organization from proceeding with the database project in a timely manner.

Through interviews and observation, the Database Developer may be able to obtain additional information about User requirements from side topics during the course of an interview, or actions or methods employed by Users in normal processing. In addition, through observation, the Developer may be able to garner important information about current processes that management did not know was actually being performed.

The costs are that interviews and observation methodologies are time and labor intensive activities for both the Developer and the Organization. In addition, interviews may generate a “garbage-in-garbage-out” result, if the Interviewer is not skilled or if interview questions are poorly crafted. In addition, when using an observation methodology, the Users may improve their behavior or processes because they realize they are being observed. As a result, no significant improvements in design may be readily available to the Database Developer.

With non-local Users, the Developer can utilize techniques such as surveys, questionnaires, and document review. Surveys and questionnaires offer the Developer an opportunity to cover a vast amount of information in a short amount of time. In addition, if crafted properly, surveys and questionnaires can present information in a standard format as opposed to unstructured, unfocused interview sessions. In other words, the Developer may be able to obtain more accurate results if each User is presented with the same questions.

However, the cost of using surveys and questionnaires for non-local Users is that historical response rates are typically low, unless closely followed by Department Managers or Supervisors. In addition, preparing “unbiased” surveys and questionnaires can be hard to develop and expensive to produce.

Document review is another viable option for the Developer in both local and non-local scenarios. Document review typically takes less time than other methodologies and can perhaps document processes more accurately than obtained by interviews or observation. However, the documentation in place may not reflect “reality” in an operational environment and can fall out of date thereby becoming irrelevant to the Developer.

Thus there no single solution to User involvement in the requirements definition phase of database development as there are varied costs and benefits associated with the various approaches.

Tuesday, January 5, 2010

Preventing MySQL Injection Attacks

Database driven web sites can be vunerable to SQL injection attacks by malicious users. As the name implies, this is where bad code is inserted into a site's SQL queries via form input. One aim of such an attack is to create a syntactically invalid query, thereby revealing something about the database in the resulting error message. Another aim would be to alter, destroy, or expose the exposed data.

Nonetheless, SQL injection attacks can be prevented. First start by validating all data to be used in queries and perform typecasting if possible.  Then you could use MySQL prepared statements. With a prepared query, the SQL syntax is sent to the MySQL parser first, then, the specific values are sent separately. MySQL assembles the query using those values and then executes it.

Prepared statements can be created from any INSERT, UPDATE, DELETE, or SELECT query. The first step is to define the query by marking placeholders with question marks. Depicted below is a simple example using queries written in PHP and information collected from a web form:
$q= "SELECT first_name, last_name, email FROM users WHERE user_id=$id;
As a prepared statement, the query becomes:
$q="SELECT first_name,last_name,email FROM users WHERE user_id=?";
Next, you prepare the statement in MySQL by assigning the results to a PHP variable:
$stmt=mysqli_prepare($db_connection, $q);
At this point, MySQL will parse the query, but will not execute it. Next, you bind the PHP variable to the query's placeholder as shown below:
mysqli_stmt_bind_param($stmt, 'i', $id);
The i part of the command indicates the type of variable should be expected, using the characters listed in the table below:
d

Decimal
i

Integer
b

Blob
s

All Other Types
Prepared statements will always be more secure than running queries the normal way. However, they may also be faster as the query itself is sent to MySQL only once, but values can be sent separately
Get your own Widget