Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
SQL Injection Pocket Reference (Google Doc) (docs.google.com)
112 points by rudenoise on July 29, 2011 | hide | past | favorite | 13 comments


Noticed a small mistake in the MySQL section:

    Error Based:
        AND (SELECT * FROM SOME_EXISTING_TABLE) = 1
Unless SOME_EXISTING_TABLE has no more than one row, that query will fail. See http://dev.mysql.com/doc/refman/5.0/en/comparisons-using-sub...:

"For a comparison of the subquery to a scalar, the subquery must return a scalar."

It's easy to fix: just add LIMIT 1 to the end of the sub-select.

---

This document is very useful. I've been doing web app security for a while and I haven't run across some of these techniques before (eg: PROCEDURE ANALYSE()). I didn't realize that PDO_MYSQL allowed for multiple queries by default either (although it makes sense, since historically other RDBMS have supported it).


There's actually another issue that I missed before. SELECT * returns all of the columns in the table: you can't compare that to a scalar.

Tested on MySQL 5.5 on OS X:

    mysql> use test
    Database changed

    mysql> show tables;
    Empty set (0.00 sec)

    mysql> create table test (id int(10) unsigned not null auto_increment, name varchar(10), primary key(id));
    Query OK, 0 rows affected (0.13 sec)

    mysql> insert into test (name) values ("Bob");
    Query OK, 1 row affected (0.02 sec)

    mysql> select * from test;
    +----+------+
    | id | name |
    +----+------+
    |  1 | Bob  |
    +----+------+
    1 row in set (0.00 sec)

    mysql> select * from test WHERE (select * FROM test) = 1;
    ERROR 1241 (21000): Operand should contain 2 column(s)

    mysql> select * from test WHERE (select 1 FROM test) = 1;
    +----+------+
    | id | name |
    +----+------+
    |  1 | Bob  |
    +----+------+
    1 row in set (0.00 sec)

    mysql> insert into test (name) values ("Frank");
    Query OK, 1 row affected (0.00 sec)

    mysql> select * from test WHERE (select 1 FROM test) = 1;
    ERROR 1242 (21000): Subquery returns more than 1 row


Whoops. I totally misunderstood what that part of the guide was trying to accomplish. In case anyone makes the same mistake I did, the suggestion is to get the column count by reading the error message. I'm used to not being able to read the error.


HELP! I don't 'get it'!

Looking at that list of SQL injection attack techniques, I don't see the threat.

Or: Yes, I'm building a Web site, and the server will be running SQL Server.

Some of my Web pages have users enter data in text boxes. So, they could enter a SQL command in a text box.

Then when the Web page is returned to my Web server as a 'post back', my software reads the data in the text boxes.

I have the page built, have code for extracting the data from the text boxes and putting it in, say, string variables in Visual Basic .NET, and now am writing the code for looking at the data in the strings from the text boxes.

So far, I see no threat.

So, I intend to look at the data in the strings and see if it looks anything like a SQL command. Anything that looks like a SQL command will get rejected as bad data and not get near my SQL Server database. Maybe I will write the bad data to my log file.

So, I'm checking the input data from the users. I am assuming that the input data could be anything at all until my code establishes otherwise.

Doesn't everyone do such checking?

With such checking, where is the threat of 'SQL injection'?

If SQL injection is a threat, then why? Just from people not checking input data from users? Some other reason?


First, just checking the data is often not enough and can be hard to do right. The better option normally is to paramaterize the query which makes it very difficultto prevent it.

And no, not everyone paramaterizes or validates their data at all.

I wrote about some of the defensive techniques at http://www.simple-talk.com/sql/learn-sql-server/sql-injectio... a couple of months ago.


Exactly, and what does 'checking' even mean in this case. If you're not parameterizing your SQL queries and relying on your home baked string/regex/etc. checks then you've already lost.

I am honestly baffled by developers who think they can sanitize input enough to happily send it to SQL without parameters. Can anyone explain why this would ever be a good idea?

On a side note I also don't understand why anyone would give an app permission to do anything other than read/update data. Honestly if your app can drop tables then isn't it a bit broken from a security standpoint anyway?


Normally i hate "This" posts, but HN does not display my upvote, so i'm writing this comment to reiterate how important this is.

This is the right way to do it. Paramaterize your queries, do not ever use string concatenation to generate SQL queries. If you try to sanitize the inputs yourself you will fail at some point.


Let me be more clear here by what is meant by 'parameterizing' a SQL command.

So, looking at some of my old code I see in Visual Basic .NET, I start by 'declaring' the types of, and allocating storage for, three identifiers:

    Dim my_SQL_command_string As String

    Dim my_SQL_INSERT_command As New SQLCommand

    Dim new_instance_values_sum As Long
Then I build the 'command text' of a SQL command string:

    my_SQL_command_string = _
      " INSERT Instance_Value_Averages " & _
      " ( Instance_ID, Instance_Values_Sum, N_Instance_Values ) " & _
      " VALUES " & _
      " ( @Instance_ID, @Instance_Values_Sum, @N_Instance_Values ) "
In this string are three 'parameters':

      @Instance_ID
      @Instance_Values_Sum
      @N_Instance_Values
Then I assign my command string, with its three parameters, to the property CommandText of my SQL command:

    my_SQL_INSERT_command.CommandText = my_SQL_command_string
My SQL command has a property Parameters which is an instance of a collection class with a method Add. So, I use the method Add to add a value for parameter

    @Instance_Values_Sum
to the collection:

    my_SQL_INSERT_command.Parameters.Add( "@Instance_Values_Sum", _
      SqlDbType.bigint, 8 ).Value = new_instance_values_sum
I continue with such uses of the method Add and assign values to the other two parameters

      @Instance_ID
      @N_Instance_Values
Eventually I execute the SQL command via:

    my_SQL_INSERT_command.ExecuteNonQuery
For the value I give to parameter

      @Instance_Values_Sum
it is 8 bytes of binary and, thus, could have anything that could be in 8 bytes. So, presumably my SQL query is fully safe from any data at all and also from 'SQL injection'.

Is this what is meant by 'parameterizing' a SQL query?


Nice article. Thanks.

That article is at about the level I need!

Yes, so far I'm trying to stay with the single byte character set. If my business 'goes international', then to convert my code to handle all the world's languages I'll have plenty of money and get consulting from Microsoft and/or hire people!

I never spent much time writing SQL queries, have been away from SQL writing ASP.NET and other work, so am rusty on SQL and ADO.NET. But I just checked some of my old ADO.NET code and see that at times I did 'parameterize' some queries instead of just doing a string 'concatenation' to get the SQL command 'text'.

So, if parameterizing permits sending SQL Server just any garbage data safely, and it SHOULD, then I should be fairly safe just by parameterizing, which is easy enough to do.

Your article also mentioned threats that could come from inside a stored procedure: So far all my ADO.NET SQL work is so simple I've not been tempted to write stored procedures! So I don't have any stored procedures to be threats!

My usage of SQL Server is intended to be dirt simple, almost simple enough for just some key-value store. But I writing for SQL Server anyway expecting that at some point I will be glad I did, e.g., for some off-line analyses.

Thanks for your article and your post here.


> I intend to look at the data in the strings and see if it looks anything like a SQL command

How will you do this? The only certain way I can think of is having an extra copy of the database you're using with a little "dummy" data in it, and testing user input against that to see if it changes data it shouldn't. But that would be computationally expensive.

If you're checking with a regex or something, there's 1,001 ways to get around that.

There are guidelines for SQL injection prevention in .NET[1], but what you describe ain't them.

[1]http://msdn.microsoft.com/en-us/library/ff648339.aspx


Thanks for the reference

     http://msdn.microsoft.com/en-us/library/ff648339.aspx
I just got a copy. I already had

     http://msdn.microsoft.com/en-us/library/bb355989(printer).aspx
which is similar.

I do intend to look at more that .NET may have to check input.

One text box is supposed to have a whole number from 0 to 100. So, check that the string can convert to a number. If not, then the input is bad. Else, check that the number is an integer and between 0 and 100.

Another text box is supposed to have a URL. .NET may have a routine to check for a string being a legal URL.

Another text box is supposed to have just a text comment, and maybe this input would have the greatest risk. Here I am guessing to remove punctuation and leave just A-Z, a-z, and 0-9.

Then I will check my ADO.NET call where I insert a row in a table where that string is the contents of one of the columns in that row and be sure that the syntax of my ADO.NET usage is clear that even if the string is, say, "SELECT ALL" or some such, SQL won't see the user's data as part of my SQL command.


What you're not considering is that what may constitute a valid URL may also be able to be executed as an SQL command.


I've worked at a datacenter, partially as a disaster recovery guy for customers. 99% of the time, its no/improper input sanitation.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: