mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-12-01 20:12:02 +00:00
1081 lines
40 KiB
Plaintext
1081 lines
40 KiB
Plaintext
POCO Data User Guide
|
|
POCO Data Library
|
|
|
|
!!!First Steps
|
|
|
|
POCO Data is POCO's database abstraction layer which allows users to
|
|
easily send/retrieve data to/from various databases. Currently supported
|
|
database connectors are SQLite, MySQL and ODBC. Framework is opened
|
|
for extension, so additional native connectors (Oracle, PostgreSQL, ...)
|
|
can be added. The intent behind the Poco::Data framework is to produce the
|
|
integration between C++ and relational databses in a simple and natural way.
|
|
|
|
The following complete example shows how to use POCO Data:
|
|
|
|
#include "Poco/Data/Session.h"
|
|
#include "Poco/Data/SQLite/Connector.h"
|
|
#include <vector>
|
|
#include <iostream>
|
|
|
|
using namespace Poco::Data::Keywords;
|
|
using Poco::Data::Session;
|
|
using Poco::Data::Statement;
|
|
|
|
struct Person
|
|
{
|
|
std::string name;
|
|
std::string address;
|
|
int age;
|
|
};
|
|
|
|
int main(int argc, char** argv)
|
|
{
|
|
// register SQLite connector
|
|
Poco::Data::SQLite::Connector::registerConnector();
|
|
|
|
// create a session
|
|
Session session("SQLite", "sample.db");
|
|
|
|
// drop sample table, if it exists
|
|
session << "DROP TABLE IF EXISTS Person", now;
|
|
|
|
// (re)create table
|
|
session << "CREATE TABLE Person (Name VARCHAR(30), Address VARCHAR, Age INTEGER(3))", now;
|
|
|
|
// insert some rows
|
|
Person person =
|
|
{
|
|
"Bart Simpson",
|
|
"Springfield",
|
|
12
|
|
};
|
|
|
|
Statement insert(session);
|
|
insert << "INSERT INTO Person VALUES(?, ?, ?)",
|
|
use(person.name),
|
|
use(person.address),
|
|
use(person.age);
|
|
|
|
insert.execute();
|
|
|
|
person.name = "Lisa Simpson";
|
|
person.address = "Springfield";
|
|
person.age = 10;
|
|
|
|
insert.execute();
|
|
|
|
// a simple query
|
|
Statement select(session);
|
|
select << "SELECT Name, Address, Age FROM Person",
|
|
into(person.name),
|
|
into(person.address),
|
|
into(person.age),
|
|
range(0, 1); // iterate over result set one row at a time
|
|
|
|
while (!select.done())
|
|
{
|
|
select.execute();
|
|
std::cout << person.name << " " << person.address << " " << person.age << std::endl;
|
|
}
|
|
|
|
return 0;
|
|
}
|
|
----
|
|
|
|
The above example is pretty much self explanatory.
|
|
|
|
The <[using namespace Poco::Data ]> is for convenience only but highly
|
|
recommended for good readable code. While <[ses << "SELECT COUNT(*)
|
|
FROM PERSON", Poco::Data::Keywords::into(count), Poco::Data::Keywords::now;]>
|
|
is valid, the aesthetic aspect of the code is improved by eliminating the need
|
|
for full namespace qualification; this document uses convention introduced in
|
|
the example above.
|
|
|
|
The remainder of this tutorial is split up into the following parts:
|
|
|
|
* Sessions
|
|
* Inserting and Retrieving Data
|
|
* Statements
|
|
* STL Containers
|
|
* Tuples
|
|
* Limits, Ranges and Steps
|
|
* <[RecordSets]>, Iterators and Rows
|
|
* Complex data types: how to map C++ objects to a database table
|
|
* Conclusion
|
|
|
|
|
|
!!!Creating Sessions
|
|
|
|
Sessions are created via the Session constructor:
|
|
|
|
Session session("SQLite", "sample.db");
|
|
----
|
|
|
|
The first parameter contains the type of the Session one wants to create.
|
|
Currently, supported backends are "SQLite", "ODBC" and "MySQL". The second
|
|
parameter contains the connection string.
|
|
|
|
In the case of SQLite, the path of the database file is sufficient as connection string.
|
|
|
|
For ODBC, the connection string may be a simple "DSN=MyDSNName" when a DSN is configured or
|
|
a complete ODBC driver-specific connection string defining all the necessary connection parameters
|
|
(for details, consult your ODBC driver documentation).
|
|
|
|
For MySQL, the connection string is a semicolon-delimited list of name-value pairs
|
|
specifying various parameters like host, port, user, password, database, compression and
|
|
automatic reconnect. Example: <["host=localhost;port=3306;db=mydb;user=alice;password=s3cr3t;compress=true;auto-reconnect=true"]>
|
|
|
|
|
|
!!!Inserting and Retrieving Data
|
|
|
|
!!Single Data Sets
|
|
|
|
Inserting data works by <[using]> the content of other variables.
|
|
Assume we have a table that stores only forenames:
|
|
|
|
ForeName (Name VARCHAR(30))
|
|
----
|
|
|
|
If we want to insert one single forename we could simply write:
|
|
|
|
std::string aName("Peter");
|
|
session << "INSERT INTO FORENAME VALUES(" << aName << ")", now;
|
|
----
|
|
|
|
However, a better solution is to use <*placeholders*> and connect each
|
|
placeholder via a <[use]> expression with a variable that will provide
|
|
the value during execution. Placeholders, depending on your database are
|
|
recognized by having either a colon(<[:]>) in front of the name or
|
|
simply by a question mark (<[?]>) as a placeholder. While having the
|
|
placeholders marked with a colon followed by a human-readable name is
|
|
very convenient due to readability, not all SQL dialects support this and
|
|
universally accepted standard placeholder is (<[?]>). Consult your database
|
|
SQL documentation to determine the valid placeholder syntax.
|
|
|
|
Rewriting the above code now simply gives:
|
|
|
|
std::string aName("Peter");
|
|
ses << "INSERT INTO FORENAME VALUES(?)", use(aName), now;
|
|
----
|
|
|
|
In this example the <[use]> expression matches the placeholder with the
|
|
<[Peter]> value. Note that apart from the nicer syntax, the real benefits of
|
|
placeholders -- which are performance and protection against SQL injection
|
|
attacks -- don't show here. Check the <[Statements]> section to find out more.
|
|
|
|
Retrieving data from the Database works similar. The <[into]>
|
|
expression matches the returned database values to C++ objects, it also
|
|
allows to provide a default value in case null data is returned from the
|
|
database:
|
|
|
|
std::string aName;
|
|
ses << "SELECT NAME FROM FORENAME", into(aName), now;
|
|
ses << "SELECT NAME FROM FORENAME", into(aName, 0, "default"), now;
|
|
|
|
You'll note the integer zero argument in the second into() call. The reason for
|
|
that is that Poco::Data supports multiple result sets for those databases/drivers
|
|
that have such capbility and we have to indicate the resultset we are referring to.
|
|
Attempting to create sufficient overloads of <[into()]> creates more trouble than
|
|
what it's worth and null values can effectively be dealt with through use of either
|
|
Poco::Nullable wrapper (see Handling Null Entries later in this document) or
|
|
Poco::Dynamic::Var, which will be set as empty for null values when used as query
|
|
output target.
|
|
----
|
|
|
|
It is also possible to combine into and use expressions:
|
|
|
|
std::string aName;
|
|
std::string match("Peter")
|
|
ses << "SELECT NAME FROM FORENAME WHERE NAME=?", into(aName), use(match), now;
|
|
poco_assert (aName == match);
|
|
----
|
|
|
|
Typically, tables will not be so trivial, i.e. they will have more than
|
|
one column which allows for more than one into/use.
|
|
|
|
Lets assume we have a Person table that contains an age, a first and a last name:
|
|
|
|
std::string firstName("Peter";
|
|
std::string lastName("Junior");
|
|
int age = 0;
|
|
ses << INSERT INTO PERSON VALUES (?, ?, ?)", use(firstName), use(lastName), use(age), now;
|
|
ses << "SELECT (firstname, lastname, age) FROM Person", into(firstName), into(lastName), into(age), now;
|
|
----
|
|
|
|
Most important here is the <!order!> of the into and use expressions.
|
|
The first placeholder is matched by the first <[use]>, the 2nd by the
|
|
2nd <[use]> etc.
|
|
|
|
The same is true for the <[into]> statement. We select <[firstname]> as
|
|
the first column of the result set, thus <[into(firstName)]> must be the
|
|
first into clause.
|
|
|
|
|
|
!! Handling NULL entries
|
|
A common case with databases are optional data fields that can contain NULL.
|
|
To accomodate for NULL, use the Poco::Nullable template:
|
|
|
|
std::string firstName("Peter";
|
|
Poco::Nullable<std::string> lastName("Junior");
|
|
Poco::Nullable<int> age = 0;
|
|
ses << INSERT INTO PERSON VALUES (?, ?, ?)", use(firstName), use(lastName), use(age), now;
|
|
ses << "SELECT (firstname, lastname, age) FROM Person", into(firstName), into(lastName), into(age), now;
|
|
// now you can check if age was null:
|
|
if (!lastName.isNull()) { ... }
|
|
----
|
|
|
|
The above used Poco::Nullable is a lightweight template class, wrapping any type
|
|
for the purpose of allowing it to have null value.
|
|
|
|
If the returned value was null, age.isNull() will return true. Whether empty
|
|
string is null or not is more of a philosophical question (a topic for discussion
|
|
in some other time and place); for the purpose of this document, suffice it to say
|
|
that different databases handle it differently and Poco::Data provides a way to
|
|
tweak it to user's needs through folowing <[Session]> features:
|
|
|
|
*emptyStringIsNull
|
|
*forceEmptyString
|
|
|
|
So, if your database does not treat empty strings as null but you want Poco::Data
|
|
to emulate such behavior, modify the session like this:
|
|
|
|
ses.setFeature("emptyStringIsNull", true);
|
|
|
|
On the other side, if your database treats empty strings as nulls but you do not
|
|
want it to, you'll alter the session feature:
|
|
|
|
ses.setFeature("forceEmptyString", true);
|
|
|
|
Obviously, the above features are mutually exclusive; an attempt to se them both
|
|
to true will result in an exception being thrown by the Data framework.
|
|
|
|
!! Multiple Data Sets
|
|
|
|
Batches of statements are supported. They return multiple sets of data,
|
|
so into() call needs and additional parameter to determine which data
|
|
set it belongs to:
|
|
|
|
typedef Tuple<std::string, std::string, std::string, int> Person;
|
|
std::vector<Person> people;
|
|
Person pHomer, pLisa;
|
|
int aHomer(42), aLisa(10), aBart(0);
|
|
|
|
session << "SELECT * FROM Person WHERE Age = ?; "
|
|
"SELECT Age FROM Person WHERE FirstName = 'Bart'; "
|
|
"SELECT * FROM Person WHERE Age = ?",
|
|
into(pHomer, 0), use(aHomer),
|
|
into(aBart, 1),
|
|
into(pLisa, 2), use(aLisa),
|
|
now;
|
|
----
|
|
|
|
! Note
|
|
|
|
Batches of statements can be used, provided, of course, that the
|
|
target driver and database engine properly support them. Additionally,
|
|
the exact SQL syntax may vary for different databases. Stored procedures
|
|
(see below) returning multiple data sets are handled in the same way.
|
|
|
|
|
|
!! Now
|
|
|
|
And now, finally, a word about the <[now]> keyword. The simple description is:
|
|
it is a manipulator. As it's name implies, it forces the immediate
|
|
execution of the statement. If <[now]> is not present, the statement
|
|
must be executed separately in order for anything interesting to happen.
|
|
|
|
More on statements and manipulators in the chapters that follow.
|
|
|
|
|
|
|
|
!! Stored Procedures And Functions Support
|
|
|
|
Most of the modern database systems support stored procedures and/or
|
|
functions. Does Poco::Data provide any support there? You bet.
|
|
While the specifics on what exactly is possible (e.g. the data types
|
|
passed in and out, automatic or manual data binding, binding direction,
|
|
etc.) is ultimately database dependent, POCO Data does it's
|
|
best to provide reasonable access to such functionality through <[in]>,
|
|
<[out]> and <[io]> binding functions. As their names imply, these
|
|
functions are performing parameters binding tho pass in or receive from
|
|
the stored procedures, or both. The code is worth thousand words, so
|
|
here's an Oracle ODBC example:
|
|
|
|
session << "CREATE OR REPLACE "
|
|
"FUNCTION storedFunction(param1 IN OUT NUMBER, param2 IN OUT NUMBER) RETURN NUMBER IS "
|
|
" temp NUMBER := param1; "
|
|
" BEGIN param1 := param2; param2 := temp; RETURN(param1+param2); "
|
|
" END storedFunction;" , now;
|
|
|
|
int i = 1, j = 2, result = 0;
|
|
session << "{? = call storedFunction(?, ?)}", out(result), io(i), io(j), now; // i = 2, j = 1, result = 3
|
|
----
|
|
|
|
|
|
Stored procedures are allowed to return data sets (a.k.a. cursors):
|
|
|
|
typedef Tuple<std::string, std::string, std::string, int> Person;
|
|
std::vector<Person> people;
|
|
int age = 13;
|
|
session << "CREATE OR REPLACE "
|
|
"FUNCTION storedCursorFunction(ageLimit IN NUMBER) RETURN SYS_REFCURSOR IS "
|
|
" ret SYS_REFCURSOR; "
|
|
"BEGIN "
|
|
" OPEN ret FOR "
|
|
" SELECT * FROM Person WHERE Age < ageLimit; "
|
|
" RETURN ret; "
|
|
"END storedCursorFunction;" , now;
|
|
|
|
session << "{call storedCursorFunction(?)}", in(age), into(people), now;
|
|
----
|
|
|
|
The code shown above works with Oracle databases.
|
|
|
|
|
|
!! A Word of Warning
|
|
|
|
As you may have noticed, in the above example, C++ code works very
|
|
closely with SQL statements. And, as you know, your C++ compiler has no
|
|
clue what SQL is (other than a string of characters). So it is <*your
|
|
responsibility*> to make sure your SQL statements have the proper
|
|
structure that corresponds to the number and type of the supplied
|
|
functions.
|
|
|
|
|
|
!!!Statements
|
|
|
|
We often mentioned the term <*Statement*> in the previous section, but
|
|
with the exception of the initial example, we have only worked with
|
|
database session objects so far. Or at least, that's what we made you
|
|
believe.
|
|
|
|
In reality, you have already worked with Statements. Lets take a look at
|
|
the method signature of the << operator at Session:
|
|
|
|
template <typename T>
|
|
Statement Session::operator << (const T& t);
|
|
----
|
|
|
|
Simply ignore the template stuff in front, you won't need it. The only
|
|
thing that counts here is that the operator << creates a <[Statement]>
|
|
internally and returns it.
|
|
|
|
What happened in the previous examples is that the returned Statement
|
|
was never assigned to a variable but simply passed on to the <[now]>
|
|
part which executed the statement. Afterwards the statement was
|
|
destroyed.
|
|
|
|
Let's take one of the previous examples and change it so that we assign the statement:
|
|
|
|
std::string aName("Peter");
|
|
Statement stmt = ( ses << "INSERT INTO FORENAME VALUES(?)", use(aName) );
|
|
----
|
|
|
|
Note that the brackets around the right part of the assignment are
|
|
mandatory, otherwise the compiler will complain.
|
|
|
|
What did we achieve by assigning the statement to a variable? Two
|
|
things: Control when to <[execute]> and the possibility to create a RecordSet
|
|
(described in its own chapter below).
|
|
|
|
Here's how we control when to actually execute the statement:
|
|
|
|
std::string aName("Peter");
|
|
Statement stmt = ( ses << "INSERT INTO FORENAME VALUES(?)", use(aName) );
|
|
stmt.execute();
|
|
poco_assert (stmt.done());
|
|
----
|
|
|
|
By calling <[execute]> we asserted that our query was executed and that
|
|
the value was inserted. The check to <[stmt.done()]> simply guarantees that the
|
|
statement was fully completed.
|
|
|
|
|
|
|
|
!!Prepared Statements
|
|
|
|
A prepared statement is created by omitting the "now" clause.
|
|
|
|
Statement stmt = ( ses << "INSERT INTO FORENAME VALUES(?)", use(aName) );
|
|
----
|
|
|
|
The advantage of a prepared statement is performance. Assume the following loop:
|
|
|
|
std::string aName;
|
|
Statement stmt = ( ses << "INSERT INTO FORENAME VALUES(?)", use(aName) );
|
|
for (int i = 0; i < 100; ++i)
|
|
{
|
|
aName.append("x");
|
|
stmt.execute();
|
|
}
|
|
----
|
|
|
|
Instead of creating and parsing the Statement 100 times, we only do this
|
|
once and then use the placeholder in combination with the <[use]> clause
|
|
to insert 100 different values into the database.
|
|
|
|
Still, this isn't the best way to insert a collection of values into a
|
|
database. Poco::Data is STL-aware and will cooperate with STL containers
|
|
to extract multiple rows from the database. More on that in the chapter
|
|
titled "STL Containers".
|
|
|
|
!!Asynchronous Execution
|
|
|
|
So far, the statements were executing synchronously. In other words,
|
|
regardless of whether the <[execute()]> method was invoked indirectly
|
|
through <[now]> manipulator or through direct method call, it did not
|
|
return control to the caller until the requested execution was
|
|
completed. This behavior can be changed, so that <[execute()]> returns
|
|
immediately, while, in fact, it keeps on running in a separate thread.
|
|
This paragraph explains how this behavior can be achieved as well as
|
|
warns about the dangers associated with asynchronous execution.
|
|
|
|
Asynchronous execution can be invoked on any statement, through the
|
|
direct call to executeAsync() method. This method returns a <[const]>
|
|
reference to <[Statement::Result]>. This reference can be used at a
|
|
later time to ensure completion of the background execution and, for
|
|
those statements that return rows, find out how many rows were
|
|
retrieved.
|
|
|
|
Here's the code:
|
|
|
|
Statement stmt = (ses << "SELECT (firstname, lastname, age) FROM Person", into(firstName), into(lastName), into(age));
|
|
Statement::Result result = stmt.executeAsync();
|
|
// ... do something else
|
|
Statement::ResultType rows = result.wait();
|
|
----
|
|
|
|
The above code did not do anything "under the hood" to change the
|
|
statement's nature. If we call <[execute()]> afterwards, it will execute
|
|
synchronously as usual. There is, however, a way (or two) to turn the
|
|
statement into asynchronous mode permanently.
|
|
|
|
First, there is an explicit <[setAync()]> call:
|
|
|
|
Statement stmt = (ses << "SELECT (age) FROM Person", into(age));
|
|
stmt.setAsync(true); // make stmt asynchronous
|
|
stmt.execute(); // executes asynchronously
|
|
// ... do something else
|
|
Statement::ResultType rows = stmt.wait(); // synchronize and retrieve the number of rows
|
|
----
|
|
|
|
And, then, there is also the <[async]> manipulator that has the same effect as the <[setAync(true)]> code above:
|
|
|
|
Statement stmt = (ses << "SELECT (age) FROM Person", into(age), async); // asynchronous statement
|
|
stmt.execute(); // executes asynchronously
|
|
// ... do something else
|
|
Statement::ResultType rows = stmt.wait();
|
|
----
|
|
|
|
|
|
!Note
|
|
|
|
In the first example, we have received <[Result]> from the statement,
|
|
while in the second two, we did not assign the return value from
|
|
<[execute()]>. The <[Result]> returned from <[executeAsync()]> is also
|
|
known as <[future]> -- a variable holding a result that will be known at
|
|
some point in future. The reason for not keeping the <[execute()]>
|
|
return value is because, for asynchronous statements, <[execute()]>
|
|
always returns zero. This makes sense, because it does not know the
|
|
number of returned rows (remember, asynchronous <[execute()]> call
|
|
returns <[immediately]> and does not wait for the completion of the
|
|
execution).
|
|
|
|
!A Word of Warning
|
|
|
|
With power comes responsibility. When executing asynchronously, make
|
|
sure to <[synchronize]> accordingly. When you fail to synchronize
|
|
explicitly, you may encounter all kinds of funny things happening.
|
|
Statement does internally try to protect you from harm, so the following
|
|
code will <*usually*> throw <[InvalidAccessException]>:
|
|
|
|
Statement stmt = (ses << "SELECT (age) FROM Person", into(age), async); // asynchronous statement
|
|
Statement::Result result = stmt.execute(); // executes asynchronously
|
|
stmt.execute(); // throws InvalidAccessException
|
|
----
|
|
|
|
We say "usually", because it may not happen every time, depending
|
|
whether the first <[execute()]> call completed in the background prior
|
|
to calling the second one. Therefore, to avoid unpleasant surprises, it
|
|
is highly recommended to <*always*> call <[wait()]> on either the
|
|
statement itself or the result (value returned from <[executeAsync()]>)
|
|
prior to engaging into a next attempt to execute.
|
|
|
|
|
|
!!Things NOT To Do
|
|
|
|
The <[use]> keyword expects as input a <[reference]> parameter, which is bound
|
|
later during execution. Thus, one should never pass temporaries to <[use()]>:
|
|
|
|
Statement stmt = (ses << "INSERT INTO PERSON VALUES (?, ?, ?)", use(getForename()), use(getSurname()), use(getAge())); //!!!
|
|
// do something else
|
|
stmt.execute(); // oops!
|
|
----
|
|
|
|
It is possible to use <[bind()]> instead of <[use()]>. The <[bind()]> call will always create a
|
|
copy of the supplied argument. Also, it is possible to execute a statement returning
|
|
data without supplying the storage and have the statement itself store the returned
|
|
data for later retrieval through <[RecordSet]>. For details, see <[RecordSet]> chapter.
|
|
|
|
|
|
!!Things TO Do
|
|
|
|
Constants, as well as naked variables (of POD and std::string
|
|
types) are permitted in the comma-separated list passed to statement.
|
|
The following example is valid:
|
|
|
|
std::string fname = "Bart";
|
|
std::string lname = "Simpson";
|
|
int age = 42;
|
|
Statement stmt = (ses << "INSERT INTO %s VALUES (?, ?, %d)", "PERSON", use(fname), use(lname), 12);
|
|
stmt.execute();
|
|
----
|
|
|
|
Placeholders for values are very similar (but not identical) to standard
|
|
printf family of functions. For details refer to <[Poco::format()]>
|
|
documentation. Note: If you are alarmed by mention of <[printf()]>, a
|
|
well-known source of many security problems in C and C++ code, do not
|
|
worry. Poco::format() family of functions is <[safe]> (and, admittedly,
|
|
slower than printf).
|
|
|
|
For cases where this type of formatting is used with queries containing
|
|
the percent sign, use double percent ("%%"):
|
|
|
|
Statement stmt = (ses << "SELECT * FROM %s WHERE Name LIKE 'Simp%%'", "Person");
|
|
stmt.execute();
|
|
----
|
|
|
|
yields the following SQL statement string:
|
|
|
|
SELECT * FROM Person WHERE Name LIKE 'Simp%'
|
|
----
|
|
|
|
!!!STL Containers
|
|
|
|
To handle many values at once, which is a very common scenario in database access, STL containers are used.
|
|
|
|
The framework supports the following container types out-of-the-box:
|
|
|
|
* deque: no requirements
|
|
* vector: no requirements
|
|
* list: no requirements
|
|
* set: the < operator must be supported by the contained datatype. Note that duplicate key/value pairs are ignored.
|
|
* multiset: the < operator must be supported by the contained datatype
|
|
* map: the () operator must be supported by the contained datatype and return the key of the object. Note that duplicate key/value pairs are ignored.
|
|
* multimap: the () operator must be supported by the contained datatype and return the key of the object
|
|
|
|
A "one-at-atime" bulk insert example via vector would be:
|
|
|
|
std::string aName;
|
|
std::vector<std::string> data;
|
|
for (int i = 0; i < 100; ++i)
|
|
{
|
|
aName.append("x");
|
|
data.push_back(aName);
|
|
}
|
|
ses << "INSERT INTO FORENAME VALUES(?)", use(data), now;
|
|
----
|
|
|
|
The same example would work with list, deque, set or multiset but not with map and multimap (std::string has no () operator).
|
|
|
|
Note that <[use]> requires a <*non-empty*> container!
|
|
|
|
Now reconsider the following example:
|
|
|
|
std::string aName;
|
|
ses << "SELECT NAME FROM FORENAME", into(aName), now;
|
|
----
|
|
|
|
Previously, it worked because the table contained only one single entry
|
|
but now the database table contains at least 100 strings, yet we only
|
|
offer storage space for one single result.
|
|
|
|
Thus, the above code will fail and throw an exception.
|
|
|
|
One possible way to handle this is:
|
|
|
|
std::vector<std::string> names;
|
|
ses << "SELECT NAME FROM FORENAME", into(names), now;
|
|
----
|
|
|
|
And again, instead of vector, one could use deque, list, set or multiset.
|
|
|
|
!!Things NOT To Do
|
|
|
|
C++ containers in conjunction with stored procedures input parameters
|
|
(i.e <[in]> and <[io]> functions) are not supported. Furthermore, there
|
|
is one particular container which, due to its peculiar nature, <!can
|
|
not!> be used in conjunction with <[out]> and <[io]> under any
|
|
circumstances: <[std::vector<bool>]> . The details are beyond the scope
|
|
of this manual. For those interested to learn more about it, there is an
|
|
excellent explanation in S. Meyers book "Efective STL", Item 18 or Gotw
|
|
#50, [[http://www.gotw.ca/gotw/050.htm When Is a Container Not a
|
|
Container]] paragraph.
|
|
|
|
|
|
!!!Tuples
|
|
|
|
Complex user-defined data types are supported through type handlers as
|
|
described in one of the chapters below. However, in addition to STL
|
|
containers, which are supported through binding/extraction there is
|
|
another complex data type supported by POCO Data
|
|
"out-of-the-box". The type is Poco::Tuple. The detailed
|
|
description is beyond the scope of this manual, but suffice it to say
|
|
here that this data structure allows for convenient and type-safe mix of
|
|
different data types resulting in a perfect C++ match for the table row.
|
|
Here's the code to clarify the point:
|
|
|
|
typedef Poco::Tuple<std::string, std::string, int> Person;
|
|
Person person("Bart Simpson", "Springfield", 12)
|
|
session << "INSERT INTO Person VALUES(?, ?, ?)", use(person), now;
|
|
----
|
|
|
|
Automagically, POCO Data internally takes care of the data
|
|
binding intricacies for you. Of course, as before, it is programmer's
|
|
responsibility to make sure the Tuple data types correspond to the table
|
|
column data types.
|
|
|
|
I can already see the reader wondering if it's possible to put tuples in
|
|
a container and kill more than one bird with one stone. As usual,
|
|
POCO Data will not disappoint you:
|
|
|
|
typedef Poco::Tuple<std::string, std::string, int> Person;
|
|
typedef std::vector<Person> People;
|
|
People people;
|
|
people.push_back(Person("Bart Simpson", "Springfield", 12));
|
|
people.push_back(Person("Lisa Simpson", "Springfield", 10));
|
|
session << "INSERT INTO Person VALUES(?, ?, ?)", use(people), now;
|
|
----
|
|
|
|
|
|
And thats it! There are multiple columns and multiple rows contained in
|
|
a single variable and inserted in one shot. Needless to say, the reverse
|
|
works as well:
|
|
|
|
session << "SELECT Name, Address, Age FROM Person", into(people), now;
|
|
----
|
|
|
|
|
|
!!!Limits and Ranges
|
|
|
|
!!Limit
|
|
|
|
Working with collections might be convenient to bulk process data but
|
|
there is also the risk that large operations will block your application
|
|
for a very long time. In addition, you might want to have better
|
|
fine-grained control over your query, e.g. you only want to extract a
|
|
subset of data until a condition is met.
|
|
|
|
To alleviate that problem, one can use the <[limit]> keyword.
|
|
|
|
Let's assume we are retrieving thousands of rows from a database to
|
|
render the data to a GUI. To allow the user to stop fetching data any
|
|
time (and to avoid having the user frantically click inside the GUI
|
|
because it doesn't show anything for seconds), we have to partition this
|
|
process:
|
|
|
|
std::vector<std::string> names;
|
|
ses << "SELECT NAME FROM FORENAME", into(names), limit(50), now;
|
|
----
|
|
|
|
The above example will retrieve up to 50 rows from the database (note
|
|
that returning nothing is valid!) and <*append*> it to the names
|
|
collection, i.e. the collection is not cleared!
|
|
|
|
If one wants to make sure that <*exactly*> 50 rows are returned one must
|
|
set the second limit parameter (which per default is set to <[false]>) to
|
|
<[true]>:
|
|
|
|
std::vector<std::string> names;
|
|
ses << "SELECT NAME FROM FORENAME", into(names), limit(50, true), now;
|
|
----
|
|
|
|
Iterating over a complete result collection is done via the Statement
|
|
object until <[statement.done()]> returns true.
|
|
|
|
For the next example, we assume that our system knows about 101 forenames:
|
|
|
|
std::vector<std::string> names;
|
|
Statement stmt = (ses << "SELECT NAME FROM FORENAME", into(names), limit(50));
|
|
stmt.execute(); //names.size() == 50
|
|
poco_assert (!stmt.done());
|
|
stmt.execute(); //names.size() == 100
|
|
poco_assert (!stmt.done());
|
|
stmt.execute(); //names.size() == 101
|
|
poco_assert (stmt.done());
|
|
----
|
|
|
|
We previously stated that if no data is returned this is valid too. Thus, executing the following statement on an
|
|
empty database table will work:
|
|
|
|
std::string aName;
|
|
ses << "SELECT NAME FROM FORENAME", into(aName), now;
|
|
----
|
|
|
|
To guarantee that at least one valid result row is returned use the <[lowerLimit]> clause:
|
|
|
|
|
|
std::string aName;
|
|
ses << "SELECT NAME FROM FORENAME", into(aName), lowerLimit(1), now;
|
|
----
|
|
|
|
If the table is now empty, an exception will be thrown. If the query
|
|
succeeds, aName is guaranteed to be initialized.
|
|
Note that <[limit]> is only the short name for <[upperLimit]>. To
|
|
iterate over a result set step-by-step, e.g. one wants to avoid using a
|
|
collection class, one would write
|
|
|
|
std::string aName;
|
|
Statement stmt = (ses << "SELECT NAME FROM FORENAME", into(aName), lowerLimit(1), upperLimit(1));
|
|
while (!stmt.done()) stmt.execute();
|
|
----
|
|
|
|
|
|
!!Range
|
|
|
|
For the lazy folks, there is the <[range]> command:
|
|
|
|
std::string aName;
|
|
Statement stmt = (ses << "SELECT NAME FROM FORENAME", into(aName), range(1,1));
|
|
while (!stmt.done()) stmt.execute();
|
|
----
|
|
|
|
The third parameter to range is an optional boolean value which
|
|
specifies if the upper limit is a hard limit, ie. if the amount of rows
|
|
returned by the query must match exactly. Per default exact matching is
|
|
off.
|
|
|
|
|
|
!!!Bulk
|
|
|
|
The <[bulk]> keyword allows to boost performance for the connectors that
|
|
support column-wise operation and arrays of values and/or parameters
|
|
(e.g. ODBC).
|
|
Here's how to signal bulk insertion to the statement:
|
|
|
|
std::vector<int> ints(100, 1);
|
|
session << "INSERT INTO Test VALUES (?)", use(ints, bulk), now;
|
|
----
|
|
|
|
The above code will execute a "one-shot" insertion into the target table.
|
|
|
|
|
|
Selection in bulk mode looks like this:
|
|
|
|
std::vector<int> ints;
|
|
session << "SELECT * FROM Test", into(ints, bulk(100)), now;
|
|
----
|
|
|
|
Note that, when fetching data in bulk quantities, we must provide the
|
|
size of data set we want to fetch, either explicitly as in the code
|
|
above or implicitly, through size of the supplied container as in
|
|
following example:
|
|
|
|
std::vector<int> ints(100, 1);
|
|
session << "SELECT * FROM Test", into(ints, bulk), now;
|
|
----
|
|
|
|
For statements that generate their ow internal extraction storage (see
|
|
RecordSet chapter below), bulk execution can be specified as follows:
|
|
|
|
session << "SELECT * FROM Test", bulk(100), now;
|
|
----
|
|
|
|
|
|
!!Usage Notes
|
|
|
|
When using bulk mode, execution limit is set internally. Mixing of
|
|
<[bulk]> and <[limit]> keywords, although redundant, is allowed as long
|
|
as they do not conflict in the value they specify.
|
|
|
|
Bulk operations are only supported for following STL containers:
|
|
|
|
* std::deque
|
|
* std::list
|
|
* std::vector, including std::vector<bool>, which is properly handled internally
|
|
|
|
For best results with <[use()]>, when passing POD types, it is
|
|
recommended to use std::vector as it is passed directly as supplied by
|
|
the user. For all the other scenarios (other containers as well as
|
|
non-POD types), framework will create temporary storage.
|
|
|
|
Data types supported are:
|
|
|
|
* All POD types
|
|
* std::string
|
|
* Poco::Data::LOB (with BLOB and CLOB specializations)
|
|
* Poco::DateTime
|
|
* Poco::Data::Date
|
|
* Poco::Data::Time
|
|
* Poco::Dynamic::Var
|
|
|
|
!!Important Considerations
|
|
|
|
Not all the connectors support <[bulk]> and some support it only to an
|
|
extent, depending on the target system. Also, not all value types
|
|
perform equally when used for bulk operations. To determine the optimal
|
|
use in a given scenario, knowledge of the target system as well as some
|
|
degree of experimentation is needed because different connectors and
|
|
target systems shall differ in performance gains. In some scenarios, the
|
|
gain is significant. For example, Oracle ODBC driver performs roughly
|
|
400-500 times faster when bulk-inserting a std::vector of 10,000
|
|
integers. However, when variable-sized entities, such as strings and
|
|
BLOBs are brought into the picture, performance decreases drastically.
|
|
So, all said, it is left to the end-user to make the best of this
|
|
feature.
|
|
|
|
!!! RecordSets, Iterators and Rows
|
|
|
|
In all the examples so far the programmer had to supply the storage for
|
|
data to be inserted or retrieved from a database.
|
|
|
|
It is usually desirable to avoid that and let the framework take care of
|
|
it, something like this:
|
|
|
|
session << "SELECT * FROM Person", now; // note the absence of target storage
|
|
----
|
|
|
|
No worries -- that's what the RecordSet class does:
|
|
|
|
Statement select(session); // we need a Statement for later RecordSet creation
|
|
select << "SELECT * FROM Person", now;
|
|
|
|
// create a RecordSet
|
|
RecordSet rs(select);
|
|
std::size_t cols = rs.columnCount();
|
|
|
|
// print all column names
|
|
for (std::size_t col = 0; col < cols; ++col)
|
|
std::cout << rs.columnName(col) << std::endl;
|
|
|
|
// iterate over all rows and columns
|
|
for (RecordSet::Iterator it = rs.begin(); it != rs.end(); ++it)
|
|
std::cout << *it << " ";
|
|
----
|
|
|
|
As you may see above, <[RecordSet]> class comes with a full-blown C++
|
|
compatible iterator that allows the above loop to be turned into a
|
|
one-liner:
|
|
|
|
std::copy(rs.begin(), rs.end(), std::ostream_iterator<Row>(std::cout));
|
|
----
|
|
|
|
RecordSet has the stream operator defined, so this shortcut to the above functionality will work, too:
|
|
|
|
std::cout << rs;
|
|
----
|
|
|
|
The default formatter supplied with RecordSet is quite rudimentary, but
|
|
user can implement custom formatters, by inheriting from RowFormatter
|
|
and providing definitions of formatNames() and formatValues() virtual
|
|
functions. See the RowFormatter sample for details on how to accomplish this.
|
|
|
|
You'll notice the Row class in the above snippet. The
|
|
<[RecordSet::Iterator]> is actually a Poco::Data::RowIterator. It means that
|
|
dereferencing it returns a Poco::Data::Row object. Here's a brief example to get an
|
|
idea of what the Poco::Data::Row class does:
|
|
|
|
Row row;
|
|
row.append("Field0", 0);
|
|
row.append("Field1", 1);
|
|
row.append("Field2", 2);
|
|
----
|
|
|
|
The above code creates a row with three fields, "Field0", "Field1" and
|
|
"Field2", having values 0, 1 and 2, respectively. Rows are sortable,
|
|
which makes them suitable to be contained by standard sorted containers,
|
|
such as std::map or std::set. By default, the first field of the row is
|
|
used for sorting purposes. However, the sort criteria can be modified at
|
|
runtime. For example, an additional field may be added to sort fields
|
|
(think "... ORDER BY Name ASC, Age DESC"):
|
|
|
|
row.addSortField("Field1"); // now Field0 and Field1 are used for sorting
|
|
row.replaceSortField("Field0", "Field2");// now Field1 and Field2 are used for sorting
|
|
----
|
|
|
|
Finally, if you have a need for different RecordSet internal storage
|
|
type than default (std::deque) provided by framework, there is a
|
|
manipulator for that purpose:
|
|
|
|
select << "SELECT * FROM Person", list, now; // use std::list as internal storage container
|
|
----
|
|
|
|
This can be very useful if you plan to manipulate the data after
|
|
retrieving it from database. For example, std::list performs much better
|
|
than std::vector for insert/delete operations and specifying it up-front
|
|
as internal storage saves you the copying effort later. For large
|
|
datasets, performance savings are significant.
|
|
|
|
Valid storage type manipulators are:
|
|
|
|
*deque (default)
|
|
*vector
|
|
*list
|
|
|
|
So, if neither data storage, nor storage type are explicitly specified,
|
|
the data will internally be kept in standard deques. This can be changed
|
|
through use of storage type manipulators.
|
|
|
|
|
|
!!!Complex Data Types
|
|
|
|
All the previous examples were contented to work with only the most
|
|
basic data types: integer, string, ... a situation, unlikely to occur in real-world scenarios.
|
|
|
|
Assume you have a class Person:
|
|
|
|
class Person
|
|
{
|
|
public:
|
|
// default constructor+destr.
|
|
// getter and setter methods for all members
|
|
// ...
|
|
|
|
bool operator <(const Person& p) const
|
|
/// we need this for set and multiset support
|
|
{
|
|
return _socialSecNr < p._socialSecNr;
|
|
}
|
|
|
|
Poco::UInt64 operator()() const
|
|
/// we need this operator to return the key for the map and multimap
|
|
{
|
|
return _socialSecNr;
|
|
}
|
|
|
|
private:
|
|
std::string _firstName;
|
|
std::string _lastName;
|
|
Poco::UInt64 _socialSecNr;
|
|
};
|
|
----
|
|
|
|
Ideally, one would like to use a Person as simple as one used a string.
|
|
All that is needed is a template specialization of the <[TypeHandler]>
|
|
template. Note that template specializations must be declared in the
|
|
<*same namespace*> as the original template, i.e. <[Poco::Data]>.
|
|
The template specialization must implement the following methods:
|
|
|
|
namespace Poco {
|
|
namespace Data {
|
|
|
|
template <>
|
|
class TypeHandler<class Person>
|
|
{
|
|
public:
|
|
static void bind(std::size_t pos, const Person& obj, AbstractBinder::Ptr pBinder, AbstractBinder::Direction dir)
|
|
{
|
|
poco_assert_dbg (!pBinder.isNull());
|
|
// the table is defined as Person (FirstName VARCHAR(30), lastName VARCHAR, SocialSecNr INTEGER(3))
|
|
// Note that we advance pos by the number of columns the datatype uses! For string/int this is one.
|
|
TypeHandler<std::string>::bind(pos++, obj.getFirstName(), pBinder, dir);
|
|
TypeHandler<std::string>::bind(pos++, obj.getLastName(), pBinder, dir);
|
|
TypeHandler<Poco::UInt64>::bind(pos++, obj.getSocialSecNr(), pBinder, dir);
|
|
}
|
|
|
|
static std::size_t size()
|
|
{
|
|
return 3; // we handle three columns of the Table!
|
|
}
|
|
|
|
static void prepare(std::size_t pos, const Person& obj, AbstractPreparator::Ptr pPrepare)
|
|
{
|
|
poco_assert_dbg (!pPrepare.isNull());
|
|
// the table is defined as Person (FirstName VARCHAR(30), lastName VARCHAR, SocialSecNr INTEGER(3))
|
|
// Note that we advance pos by the number of columns the datatype uses! For string/int this is one.
|
|
TypeHandler<std::string>::prepare(pos++, obj.getFirstName(), pPrepare);
|
|
TypeHandler<std::string>::prepare(pos++, obj.getLastName(), pPrepare);
|
|
TypeHandler<Poco::UInt64>::prepare(pos++, obj.getSocialSecNr(), pPrepare);
|
|
}
|
|
|
|
static void extract(std::size_t pos, Person& obj, const Person& defVal, AbstractExtractor::Ptr pExt)
|
|
/// obj will contain the result, defVal contains values we should use when one column is NULL
|
|
{
|
|
poco_assert_dbg (!pExt.isNull());
|
|
std::string firstName;
|
|
std::string lastName;
|
|
Poco::UInt64 socialSecNr = 0;
|
|
TypeHandler<std::string>::extract(pos++, firstName, defVal.getFirstName(), pExt);
|
|
TypeHandler<std::string>::extract(pos++, lastName, defVal.getLastName(), pExt);
|
|
TypeHandler<Poco::UInt64>::extract(pos++, socialSecNr, defVal.getSocialSecNr(), pExt);
|
|
obj.setFirstName(firstName);
|
|
obj.setLastName(lastName);
|
|
obj.setSocialSecNr(socialSecNr);
|
|
}
|
|
|
|
private:
|
|
TypeHandler();
|
|
~TypeHandler();
|
|
TypeHandler(const TypeHandler&);
|
|
TypeHandler& operator=(const TypeHandler&);
|
|
};
|
|
|
|
} } // namespace Poco::Data
|
|
----
|
|
|
|
And that's all you have to do. Working with Person is now as simple as
|
|
working with a string:
|
|
|
|
std::map<Poco::UInt64, Person> people;
|
|
ses << "SELECT * FROM Person", into(people), now;
|
|
----
|
|
|
|
|
|
!!!Session Pooling
|
|
Creating a connection to a database is often a time consuming
|
|
operation. Therefore it makes sense to save a session object for
|
|
later reuse once it is no longer needed.
|
|
|
|
A Poco::Data::SessionPool manages a collection of sessions.
|
|
When a session is requested, the SessionPool first
|
|
looks in its set of already initialized sessions for an
|
|
available object. If one is found, it is returned to the
|
|
client and marked as "in-use". If no session is available,
|
|
the SessionPool attempts to create a new one for the client.
|
|
To avoid excessive creation of sessions, a limit
|
|
can be set on the maximum number of objects.
|
|
|
|
The following code fragment shows how to use the SessionPool:
|
|
|
|
SessionPool pool("ODBC", "...");
|
|
// ...
|
|
Session sess(pool.get());
|
|
----
|
|
|
|
Pooled sessions are automatically returned to the pool when the
|
|
Session variable holding them is destroyed.
|
|
|
|
One session pool, of course, holds sessions for one database
|
|
connection. For sessions to multiple databases, there is
|
|
SessionPoolContainer:
|
|
|
|
SessionPoolContainer spc;
|
|
AutoPtr<SessionPool> pPool1 = new SessionPool("ODBC", "DSN1");
|
|
AutoPtr<SessionPool> pPool2 = new SessionPool("ODBC", "DSN2");
|
|
spc.add(pPool1);
|
|
spc.add(pPool2);
|
|
----
|
|
|
|
!!!Conclusion
|
|
|
|
This document provides an overview of the most important features
|
|
offered by the POCO Data framework. The framework also supports LOB
|
|
(specialized to BLOB and CLOB) type as well as Poco::DateTime binding.
|
|
The usage of these data types is no different than any C++ type, so we
|
|
did not go into details here.
|
|
|
|
The great deal of <[RecordSet]> and <[Row]> runtime "magic" is achieved
|
|
through employment of Poco::Dynamic::Var, which is the POCO
|
|
equivalent of dynamic language data type. Obviously, due to its nature,
|
|
there is a run time performance penalty associated with Poco::Dynamic::Var,
|
|
but the internal details are beyond the scope of this document.
|
|
|
|
POCO Data tries to provide a broad spectrum of functionality,
|
|
with configurable efficiency/convenience ratio, providing a solid
|
|
foundation for quick development of database applications. We hope that,
|
|
by reading this manual and experimenting with code along the way, you
|
|
were able to get a solid understanding of the framework. We look forward
|
|
to hearing from you about POCO Data as well as this manual. We
|
|
also hope that you find both to be helpful aid in design of elegant and
|
|
efficient standard C++ database access software.
|