
Understanding ATG Data Anywhere Architecture

Efficient, transactional data access without writing code using Dynamo Repositories

April 2002
ATG White Paper
Pat Durante, Senior Practice Manager, ATG Education Services

UNDERSTANDING ATG DATA ANYWHERE ARCHITECTURE

Contents

1 Executive Summary
2 The Challenge of the Data Access Problem
    Hasn't the Data Access Problem Been Solved?
    Why Should You Care about the ATG Data Anywhere Architecture?
3 ATG Data Anywhere Architecture
    Data Source Independence
    Understanding the ATG Data Anywhere Architecture
    Repository Basics
    Using Repository Data
    The Repository API
    RepositoryFormHandler
    Dynamo Servlet Beans and the Repository Query Language (RQL)
4 Less Java Code: Faster Time-to-Market, Less Maintenance
    Using the Visitor Profile (Out-of-the-Box)
    Extending the Definition of a Repository Item
    Using a Simple Auxiliary Table (to model a one-to-one relationship)
    Using a "Multi" Table (to model a one-to-many relationship)
    Switching to an Alternative Relational Database Management System
    Converting from one type of database to another
5 A Unified View of Customer Interactions
6 Maximum Performance Through Intelligent Caching
    Case 1: Single Dynamo Server
    Case 2: Read frequently, modify rarely or never
    Case 3: Modifications made by one Dynamo server at a time
    Case 4: Modification by multiple Dynamo servers
    Case 5: Modification by a non-Dynamo Application
    Disabling Caching
    Invalidating the Cache
    Controlling the Cache Sizes
7 Simplified Transactional Control
    Overview of Transactional Integrity
    The J2EE Approach to Transactional Integrity
    ATG Data Anywhere Support for Transactional Integrity
    Advantages of the ATG Data Anywhere Approach
    Example Page
    Default Transactional Behavior
    Recommendations
8 Strong Built-in Search Capabilities
9 Fine-grained Access Control
    Case 1: Controlling access to all items of the same type
    Case 2: Controlling access to specific items
    Case 3: Controlling access to specific properties
    Case 4: Limiting Query Results
    Creating a Secured Repository
10 Conclusions
Appendix: Other Sources of Information
    Documentation
    Education


Executive Summary
Providing good online service requires access to a great deal of data. At most companies, this data is spread among different data stores and held in different formats across the enterprise. To present a single face to their customers, firms need to use the data in all of those silos. Companies also benefit from assembling a complete picture of each customer and using it to drive their marketing and sales efforts more effectively.

Accessing data for online use is difficult. Data has to be cached efficiently to prevent bottlenecks. Software has to provide transactional integrity so that accounts are accurate. It has to provide rich tools for important functions like searching. And data needs to be secured to prevent unauthorized access. Most importantly, accessing data has to be easy. Because creating and maintaining data access takes so much work, some developers end up spending the majority of their time simply trying to integrate data sources.

The ATG Data Anywhere Architecture, featuring Dynamo Repositories, provides a world in which a simple XML file is all you need to integrate a new data source for online use. This environment provides a wealth of caching choices, ensures transactional integrity, and offers the rich tools needed to rapidly manipulate, search, and secure data. It also provides a world where data stored in file systems, relational databases, and LDAP directories is accessed through the same set of interfaces. This world is accessible to all applications built using ATG products.

What does this mean? Faster time to market, better maintainability, and more extensibility combine to decrease the total cost of ownership of web applications. With ATG Data Anywhere Architecture, developers can focus on implementing business logic rather than spending time writing "wrapper classes" for each persistent data type.

ATG Data Anywhere Architecture offers several advantages over standard data access methods such as Java Data Objects (JDO), Enterprise JavaBeans (EJB), and Java Database Connectivity (JDBC). Among the differences:


Data source independence - ATG Data Anywhere Architecture provides access to relational database management systems, LDAP directories, and file systems using the same interfaces. This insulates application developers from changes to both the schema and the storage mechanism. Data can even move from a relational database to an LDAP directory without requiring re-coding. Java Data Objects support data source independence, but it is up to vendors to provide an LDAP implementation.

Fewer lines of Java code - Less code leads to faster time-to-market and reduced maintenance cost. Persistent data types created using ATG Data Anywhere are described in an XML file. No Java code is required.

Unified view of all customer interactions - A unified view of customer data (gathered using web applications, call center applications, and ERP systems) can be provided without copying data into a central data source. This unified view of customer data leads to a coherent and consistent customer experience.






[Figure: Dynamo presents a Unified Interface over the SQL, LDAP, XML, and HTML interfaces to the systems that make up Physical Data Storage.]

Figure 1 The unified view of data access provided by the ATG Data Anywhere Architecture



Maximum performance - Our intelligent caching of data objects ensures excellent performance and timely, accurate results. The JDO and EJB standards rely on a vendor implementation of caching, which may or may not be available.

Simplified transactional control - The key to overall system performance is minimizing the impact of transactions while maintaining the integrity of your data. In addition to full Java Transaction API (JTA) support, ATG Data Anywhere allows both page developers and software engineers to control the scope of transactions using the same transactional modes (required, supports, never, etc.) used by EJB deployment engineers.

Powerful built-in search capabilities - Quality search tools lead to increased visitor satisfaction and efficiency (which often leads to increased or sustained revenue). Customers can't buy what they can't find.

Fine-grained access control - Control who has access to which data at the level of the data type, the data object, or even the individual property, using Access Control Lists (ACLs).

Integration with ATG product suites - Our award-winning personalization, scenarios, commerce, and portal applications all make use of Repositories for data access. A development team is free to use EJBs alongside ATG technology, but the easiest way to leverage investment in ATG technology is to follow the example set by our solution sets, which satisfy all of their data access needs using Repositories.









Technical leads and architects face difficult choices when deciding on the data access mechanism for a new application. Some think JDO or J2EE/EJB may be the right choice since they both offer portability across application server vendors. However, in addition to all of the advantages above, the ATG Data Anywhere Architecture is also portable across application servers. With support for Dynamo Application Server, BEA WebLogic, and IBM WebSphere, applications built using ATG Data Anywhere Architecture can be deployed on the majority of the application server market.

The bottom line: ATG Data Anywhere Architecture is the most powerful, most flexible, easiest to use data access method available. It saves developers time and frustration. It helps customers have a better experience. It saves organizations money. Can there be any other choice for your next project?


The Challenge of the Data Access Problem


In the first generation of sites for the World Wide Web, most companies developed simple sites with largely static content describing their goods and services. Known as "brochureware," these early sites proved to be a cost-effective tool for displaying information and a popular way for clients to do basic research on companies.

As the Internet became more popular, firms recognized that the Web could become a complete channel, able to serve a large section of their client base for many of their needs. While providing the access that clients increasingly sought, web sites also offered firms the ability to significantly decrease the cost of serving customers by making a self-service option available around the clock. Firms like Amazon and Fidelity recognized that providing excellent service at low cost could create a significant competitive advantage over their rivals.

However, firms quickly realized that to change customer behavior in large numbers, web sites had to offer a level of service similar to or better than that of traditional channels. Many firms whose business strategies were predicated on lower cost, without a superior customer service experience, dropped from the market in record numbers.

In order to provide good service via the Internet, firms have had to offer intelligent web sites. These next-generation web sites are able to understand a client's account, just as a real customer service representative does. To fulfill the vision of excellent self-service, sites had to develop from simple brochures into rich, transactional environments able to satisfy customer needs as completely as possible. The more data sources that are available, the better the chance that the customer's request can be answered.

Figure 2 EMC's Powerlink Required Approximately 20 Integrations


EMC's Powerlink system (see Figure 2) is a great example of the kinds of integrations necessary for a modern, world-class web site. (A detailed case study on the EMC Powerlink project is available on atg.com.) A typical enterprise web site will require integrating data from 15 to 50 systems. A more complex site can require far more integrations.

Adding to the challenge of numerous and varied data sources is the problem of transforming data gathered from these external systems into an object-oriented framework. Information is organized differently in these external systems. Relational databases store information in tables; file systems and LDAP directories store data hierarchically.


Hasn't the Data Access Problem Been Solved?


Some data access problems have, in fact, been solved. Java Database Connectivity (JDBC) enables web applications to interact with relational database management systems in a vendor-independent way. Your applications are finally loosely coupled with your database vendor. Unfortunately, JDBC does not insulate your application from the database schema, nor does it map well into the object-oriented space. JDBC is a fairly low-level technology. Application developers execute SQL statements to interact with the data source, and if a query returns a result set, the developer must write code to transform the results into objects.

Enterprise JavaBean (EJB) technology enables web applications to interact with relational databases in a schema-independent way. The mapping between EJB properties and database columns is provided in an XML deployment descriptor file. Also, the connection logic and the transactional control are handled outside of application code, freeing developers to focus on application logic. On the downside, the need for absolute portability across application servers has led to complexity in both the coding and the configuration required to get an Enterprise JavaBean up and running. Developers have to write at least two interfaces and one Java class for each EJB, as well as provide a wide assortment of intricate configuration details. Modern development tools such as JBuilder (by Borland) have made the process of creating and configuring EJBs easier than ever, but developers still face a steep learning curve and tedious development and configuration tasks when working with EJB technology.

Java Data Objects (JDO) is the latest standard data access approach approved through the Java Community Process. JDO offers a more transparent data access mechanism: it allows developers to persist any Java class without source code modification. On the downside, developers still need to write a Java class for each persistent type (and modify that Java class to add or remove properties). Also, JDO relies on vendor implementations for caching of data objects and for LDAP access.

JDBC, Enterprise JavaBeans, and JDO each solve only some of the data access challenges described above. If your web application needs to access data in a file system or in an LDAP directory, you'll be forced to use yet another technology.

Why Should You Care about the ATG Data Anywhere Architecture?
As the chart below shows, ATG Data Anywhere Architecture can do anything the other approaches can do, and much more. The ATG Data Anywhere Architecture was designed to meet the demanding requirements of web applications. Our technology enables web applications to access data in a data source and schema independent way, without writing code to transform or store data in an object. Data Anywhere Architecture is a higher-level abstraction that leads to faster time-to-market and higher reliability.

Think about the possibilities: if the integration with individual data sources is simpler, the number of integrations your team can complete in the same amount of time will increase. The more successful integrations your team builds, the more intelligent your customer interactions can be. In an economy where customer retention is key, the web experience can make or break the success of your business.

As an instructor who has taught and used both J2EE/EJB technology and ATG Data Anywhere Architecture since they were first implemented, I have seen the differences firsthand. To teach a developer the J2EE/EJB approach (create a JSP that accesses either a JavaBean or a Servlet that in turn accesses a container-managed entity bean) takes 4 days. In contrast, I teach developers to use the ATG Data Anywhere Architecture in 2 days, including much of the additional functionality provided. This is my personal measure of the elegance of the ATG Data Anywhere Architecture.


Chart columns: Challenge | Description | ATG Data Anywhere Architecture | JDO | Enterprise JavaBeans (EJB) | Java Database Connectivity (JDBC)

Data Source Independence - Application logic does not change based upon the type of data source (e.g., relational database, XML file, LDAP directory, etc.). (Chart entry: With Connectors or BMP)

Schema Independence - Application logic is completely independent from the schema (e.g., table names, column names, table relationships, etc.), so that if the schema needs to change (e.g., a column is added or removed), the application doesn't need to be changed.

Object Relational Mapping - Applications interact with objects, not relationally or hierarchically organized data.

No Java Classes - Developers do not need to write, compile, and test Java classes (or interfaces) for each persistent data type that they want to use in their application, reducing development time and errors.

Portability Across Application Servers - Applications that make use of the data access solution are portable to other application servers.

Intelligent Caching - The data access mechanism provides a caching mechanism so that frequently accessed information is available in memory, improving application reliability and scalability. (Chart entries: Vendor Specific, Vendor Specific)

Simplified Transactional Control - The data access mechanism ensures the integrity of the data, and the transactional scope can be controlled programmatically (using modes) or via provided dynamic page tags.

Searching - The ability to search across data sources and types.

Access Control - The ability to control access to data objects and to properties within those data objects.


ATG Data Anywhere Architecture

Data Source Independence


Figure 3 below provides a high-level overview of the ATG Data Anywhere Architecture.

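[Figure: Application Logic works with Repository Items (Java objects) obtained from a Repository. Each Repository reaches its data store through a connector: a SQL connector to an RDBMS, an LDAP connector to an LDAP directory, and a file system connector.]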

Figure 3 The ATG Data Anywhere Architecture



With ATG Data Anywhere, the application logic created by developers uses the same approach to interact with data regardless of the source of that data. One of the most powerful aspects of this architecture is that the source of the data is hidden behind the Dynamo Repository abstraction. It would be easy to change from a relational data source to an LDAP directory since none of the application logic would need to change. Once data is retrieved from a data source it is transformed into an object-oriented representation. Manipulation of the data can then be done using simple getPropertyValue and setPropertyValue methods.




Understanding the ATG Data Anywhere Architecture


A Repository is a data access layer that defines a generic representation of a data store. Application developers access data through this generic representation, using only interfaces such as Repository and RepositoryItem. Repositories access the underlying data storage device through a connector, which translates the request into whatever calls are needed to access that particular data store. Connectors for relational databases, LDAP directories, and file systems are provided out-of-the-box. Connectors use an open, published interface, so additional custom connectors can be added if necessary.

Developers use Repositories to create, query, modify, and remove Repository Items. A Repository Item is like a JavaBean, but its properties are determined dynamically at runtime. From the developer's perspective, the available properties in a particular repository item depend on the type of item they are working with. One item might represent the user profile (name, address, phone number), while another may represent the meta-data associated with a news article (author, keywords, synopsis).

The purpose of the Repository interface system is to provide a unified perspective for data access. For example, developers can use targeting rules with the same syntax to find people or content. Applications that use only the Repository interfaces to access data can interface to any number of back-end data stores solely through configuration. Developers do not need to write a single interface or Java class to add a new persistent data type to an application.
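The connector idea described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the ATG API: SketchConnector, TableConnector, DirectoryConnector, SketchItem, SketchRepository, and ConnectorSketch are hypothetical names, and both "data stores" are in-memory maps. What it shows is that the application method ageOf is written once and works unchanged whichever connector sits behind the repository.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical connector: turns a generic lookup into a store-specific one.
interface SketchConnector {
    Map<String, Object> load(String itemType, String id);
}

// Stand-in for a relational store: rows are keyed by "table:id".
class TableConnector implements SketchConnector {
    private final Map<String, Map<String, Object>> rows =
        new HashMap<String, Map<String, Object>>();

    void addRow(String table, String id, Map<String, Object> columns) {
        rows.put(table + ":" + id, columns);
    }

    public Map<String, Object> load(String itemType, String id) {
        return rows.get(itemType + ":" + id);
    }
}

// Stand-in for a directory: entries are keyed by a hierarchical name.
class DirectoryConnector implements SketchConnector {
    private final Map<String, Map<String, Object>> entries =
        new HashMap<String, Map<String, Object>>();

    void addEntry(String name, Map<String, Object> attributes) {
        entries.put(name, attributes);
    }

    public Map<String, Object> load(String itemType, String id) {
        return entries.get("uid=" + id + ",ou=" + itemType);
    }
}

// Item whose property names are known only at runtime.
class SketchItem {
    private final Map<String, Object> values;

    SketchItem(Map<String, Object> values) {
        this.values = values;
    }

    Object getPropertyValue(String name) {
        return values.get(name);
    }
}

// Repository facade: callers never see which connector is behind it.
class SketchRepository {
    private final SketchConnector connector;

    SketchRepository(SketchConnector connector) {
        this.connector = connector;
    }

    SketchItem getItem(String id, String itemType) {
        Map<String, Object> data = connector.load(itemType, id);
        return data == null ? null : new SketchItem(data);
    }
}

class ConnectorSketch {
    // Application logic: identical for every kind of data store.
    static Object ageOf(SketchRepository repository, String id) {
        SketchItem user = repository.getItem(id, "user");
        return user == null ? null : user.getPropertyValue("age");
    }

    static Map<String, Object> sampleUser() {
        Map<String, Object> values = new HashMap<String, Object>();
        values.put("age", Integer.valueOf(42));
        return values;
    }

    static SketchRepository relationalExample() {
        TableConnector connector = new TableConnector();
        connector.addRow("user", "9", sampleUser());
        return new SketchRepository(connector);
    }

    static SketchRepository directoryExample() {
        DirectoryConnector connector = new DirectoryConnector();
        connector.addEntry("uid=9,ou=user", sampleUser());
        return new SketchRepository(connector);
    }

    public static void main(String[] args) {
        System.out.println(ageOf(relationalExample(), "9"));
        System.out.println(ageOf(directoryExample(), "9"));
    }
}
```

Swapping the connector is the only change between the two example repositories; in the real product that choice would be made through configuration rather than in code.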

ATG also provides a unified view of your application's data through the ATG Control Center, a graphical user interface that uses the Repository interfaces to allow users to create, query, update, and remove repository items. Figure 4 below shows the interface to user repository items; this UI looks the same regardless of the data source used to store the user data.

Figure 4 Using the ATG Control Center to Access the User Repository


Repository Basics
Figure 5 below shows an example of a repository that stores customer information.

[Figure: the Visitor Profile Repository contains two item-descriptors, user and address, each holding its own repository items. Each item-descriptor defines the structure of one type of mapped object.]

Figure 5 Sample Repository


Inside each repository, there can be several types of items (which are called "item-descriptors") and for each type there can be several repository items. The definition of each type of item is described in a repository definition file using XML. In this example, the Visitor Profile Repository defines two types of items (user and address).


Developers can model relationships between types of items as shown in figure 6.

[Figure: the Visitor Profile Repository with its user and address item-descriptors. A property in one repository item can be linked to another type of repository item, which allows developers to map relationships (one-to-one, one-to-many, etc.).]

Figure 6 Relationships between Repository Items


Dynamo Repositories use the Java Collections Framework to model complex relationships between items using familiar object-oriented concepts. You can store the "list" of addresses as a Set, List, Map, or array (whatever makes sense for your application's needs).

But the boundary does not fall at the Repository's wall. Developers can create links between items in different repositories (see Figure 7 below). This allows you to create repository items that are composed of properties retrieved from more than one data source. Keep in mind, though, that the properties in the adjunct repositories will not be queryable. Applications that need to query against properties from multiple data sources can still make use of Repositories, but the developers will need to query each repository separately.

In the example shown below, the majority of the information about a particular visitor is stored in a relational database. In many web applications, an LDAP directory is used to store information about the organizational structure of a company and/or userid/password combinations for authentication. Dynamo Repositories allow you to create an item that has access to both relational data and LDAP data from the same object.
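As a rough illustration of the two ideas above (a property that holds a collection of other items, and an item that also exposes properties from a second source), here is a minimal plain-Java sketch. LinkedItem and LinkSketch are hypothetical names, the data is invented, and nothing here is the ATG API; real repositories would express these links in their XML definitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical item: a bag of named properties plus an optional linked item.
class LinkedItem {
    private final Map<String, Object> local = new HashMap<String, Object>();
    private LinkedItem adjunct; // item from a second data source, may be null

    LinkedItem set(String name, Object value) {
        local.put(name, value);
        return this;
    }

    LinkedItem linkTo(LinkedItem other) {
        this.adjunct = other;
        return this;
    }

    // Local properties win; anything else is looked up in the linked item.
    Object getPropertyValue(String name) {
        if (local.containsKey(name)) {
            return local.get(name);
        }
        return adjunct == null ? null : adjunct.getPropertyValue(name);
    }
}

class LinkSketch {
    // A user whose "addresses" property is a List of address items, and
    // whose "organization" comes from a separate (directory-like) item.
    static LinkedItem sampleUser() {
        LinkedItem home = new LinkedItem().set("city", "Boston");
        LinkedItem work = new LinkedItem().set("city", "Cambridge");
        LinkedItem directoryEntry = new LinkedItem().set("organization", "Sales");
        List<LinkedItem> addresses =
            new ArrayList<LinkedItem>(Arrays.asList(home, work));
        return new LinkedItem()
            .set("firstName", "Joe")
            .set("addresses", addresses)
            .linkTo(directoryEntry);
    }

    // Walks the one-to-many relationship using ordinary collection code.
    @SuppressWarnings("unchecked")
    static List<String> cities(LinkedItem user) {
        List<String> result = new ArrayList<String>();
        List<LinkedItem> addresses =
            (List<LinkedItem>) user.getPropertyValue("addresses");
        for (LinkedItem address : addresses) {
            result.add((String) address.getPropertyValue("city"));
        }
        return result;
    }

    public static void main(String[] args) {
        LinkedItem user = sampleUser();
        System.out.println(cities(user));
        System.out.println(user.getPropertyValue("organization"));
    }
}
```

The sketch only covers property lookup. It does not attempt querying, which matches the caveat above that properties from the adjunct source are not queryable through the primary repository.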


[Figure: the Visitor Profile Repository (user item-descriptor, backed by an RDBMS) linked to the Adjunct Profile Repository (user and organization item-descriptors, backed by an LDAP directory). An item in one repository can link to properties in another.]

Figure 7 Linking Between Repositories

Using Repository Data


Dynamo provides many powerful ways to make use of repository data in your application:


Programmatically via the Repository API
Through the use of RepositoryFormHandler
On a dynamic page (through Dynamo Servlet Beans and the Repository Query Language (RQL))





The Repository API


The Repository API allows you to programmatically create, retrieve, update, or delete items. The power of the Repository API is that developers use the same API regardless of data source. An item that contains data from an LDAP directory is manipulated the same way as an item that contains relational data. Here's a code example that shows how a developer can retrieve the age property of a user item (assuming that the id of the user is known; in this case, '9'):

    import atg.repository.*;

    // Look up the "user" item whose repository id is "9" and read one property.
    Repository repository = getRepository();
    RepositoryItem user = repository.getItem("9", "user");
    Integer age = (Integer) user.getPropertyValue("age");


The following code snippet shows how a developer can change the age property of a user item:

    try {
        // Cast to the mutable interfaces to obtain a writable copy of the item.
        MutableRepository mutableRepository = (MutableRepository) getRepository();
        MutableRepositoryItem mutableUser =
            mutableRepository.getItemForUpdate("9", "user");
        mutableUser.setPropertyValue("age", new Integer(43));
        // Write the change back to the underlying data store.
        mutableRepository.updateItem(mutableUser);
    }
    catch (RepositoryException exc) ...

Notice that the code created by the application developer uses only the Repository API. The code has no knowledge of the type of data source nor does the code have any knowledge of the schema. There is much more in the Repository API that you will want to explore (such as the ability to query the repository, control transactional boundaries, and control the validity of the repository items that are cached to improve performance), but this should give you a taste of what is involved.

RepositoryFormHandler
As you know, ATG provides a robust form handling framework that can be used by developers whenever a web form needs to be created. ATG provides a specific form handler that can be used to manipulate repository data as well. The RepositoryFormHandler can be used out-of-the-box to create, update, or delete repository items. And like any Java class, it can be extended if you need specialized behavior. Before you can use the RepositoryFormHandler, you'll need to configure a component based on this class. You will most likely want to configure the repository it will be interacting with as well as the type of repository item. Here's a property file (this is the configuration syntax used by Dynamo's Nucleus component framework) that shows an example configuration for this type of form handler:

    # /RepositoryFormHandler
    #Thu Sep 06 08:41:24 EDT 2001
    $class=atg.repository.servlet.RepositoryFormHandler
    $scope=request
    itemDescriptorName=topic
    repository=/MyApplication/TopicRepository
    requireIdOnCreate=false


Once you've configured a form handler as shown above, a page designer can make use of it. Here's an example page that allows a visitor to add a new topic to the TopicRepository.

    <H1>Add a New Topic</H1>
    <dsp:form action="addTopic.jsp" method="POST">

      <!-- Default form error handling support -->
      <dsp:droplet name="/atg/dynamo/droplet/ErrorMessageForEach">
        <dsp:oparam name="output">
          <B><dsp:valueof param="message"/></B><BR>
        </dsp:oparam>
        <dsp:oparam name="outputStart">
          <LI>
        </dsp:oparam>
        <dsp:oparam name="outputEnd">
          </LI>
        </dsp:oparam>
      </dsp:droplet>

      Enter the Topic Name:<BR>
      <dsp:input bean="/RepositoryFormHandler.value.topicName"
                 name="topicName" size="24" type="TEXT"
                 required="<%=true%>"/><BR>
      <dsp:input bean="/RepositoryFormHandler.value.topicBody"
                 name="topicBody" type="TEXT"/><BR>
      <dsp:input bean="/RepositoryFormHandler.createSuccessURL"
                 type="HIDDEN" value="/Discussion/alltopics.jsp"/>
      <dsp:input bean="/RepositoryFormHandler.create"
                 type="Submit" value="Add Forum"/>
    </dsp:form>

A few notes about this example:




This form handler makes it extremely easy to tie a form element to a specific item property. Note that the syntax used is /FormHandlerComponentName.value.propertyName.

This form handler provides several "handlers" to enable a page developer to perform the various operations (create, update, delete). This example uses the create handler.




Dynamo Servlet Beans and the Repository Query Language (RQL)


ATG provides several Servlet Beans (also known as droplets) to allow page developers to retrieve and display repository data on dynamic pages (JSP or DSP). In the simplest case (when the page developer knows the unique id of the repository item), the ItemLookupDroplet can be used. The following code example shows this droplet in action (using JSP in this case):

    <%@ taglib uri="/dspTaglib" prefix="dsp"%>
    <dsp:page>
      <dsp:importbean bean="/MyApplication/TopicRepository"/>
      <dsp:importbean bean="/atg/dynamo/droplet/ItemLookupDroplet"/>
      <dsp:setvalue bean="ItemLookupDroplet.useParams" value="true"/>

      <dsp:droplet name="ItemLookupDroplet">
        <dsp:param name="id" value="1"/>
        <dsp:param name="repository" bean="TopicRepository"/>
        <dsp:param name="itemDescriptor" value="topic"/>
        <dsp:oparam name="output">
          Name: <dsp:valueof param="element.topicName"/><br>
          Body: <dsp:valueof param="element.topicBody"/><br>
        </dsp:oparam>
      </dsp:droplet>
    </dsp:page>

A couple of notes about this example:



We provided three inputs: the unique id of the item (1), the name of the repository that contains the item (TopicRepository), and the type of item (topic).

The output of the droplet is a RepositoryItem called element. We can retrieve the properties of that item using simple dot notation (element.topicName, for example).



In many cases, page developers will not know the unique id of the item (or items) they want to display on the page. In fact, what page developers often need to do is query the repository for a set of items that match some criteria. You might assume that page developers will use an industry standard such as SQL to perform this query. The problem with SQL is that it is designed to work only with relational databases. Since a repository may have a relational database, an LDAP directory, or a file system behind it, SQL is not an appropriate query language.

ATG provides a SQL-like query language for repositories called the Repository Query Language (RQL). ATG also provides droplets that page developers can use to execute RQL queries and loop over the results. The code example below shows how a JSP developer can use the RQLQueryForEach droplet to display a list of all the topics in the TopicRepository that have at least 1 reply associated with them.


    <%@ taglib uri="/dspTaglib" prefix="dsp"%>
    <dsp:page>
      <dsp:droplet name="/atg/dynamo/droplet/RQLQueryForEach">
        <dsp:param name="queryRQL" value="numReplies >= 1"/>
        <dsp:param name="repository" value="/MyApplication/TopicRepository"/>
        <dsp:param name="itemDescriptor" value="topic"/>
        <dsp:oparam name="output">
          Name: <dsp:valueof param="element.topicName"/><br>
          Body: <dsp:valueof param="element.topicBody"/><br>
        </dsp:oparam>
      </dsp:droplet>
    </dsp:page>

The primary difference between the ItemLookupDroplet and the RQLQueryForEach droplet is that RQLQueryForEach requires an RQL statement as an input rather than an id. Also, the output oparam will be rendered once for each item that the RQL query returns.
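The "filter with a store-neutral query, then render once per match" behavior can be pictured with a small plain-Java sketch. It is not an RQL implementation and does not use the ATG droplet classes: QuerySketch is a hypothetical name, the topics are invented in-memory maps, and the mini-language accepts only ALL or a single comparison of the form property, operator, integer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QuerySketch {
    // Accepts "ALL" or "<property> <op> <integer>" where op is one of
    // >=, <=, >, <, =. Anything else is rejected.
    static boolean matches(String query, Map<String, Object> item) {
        String q = query.trim();
        if (q.equals("ALL")) {
            return true;
        }
        String[] parts = q.split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("unsupported query: " + query);
        }
        Object raw = item.get(parts[0]);
        if (!(raw instanceof Integer)) {
            return false; // missing or non-numeric property never matches
        }
        int actual = ((Integer) raw).intValue();
        int expected = Integer.parseInt(parts[2]);
        String op = parts[1];
        if (op.equals(">=")) return actual >= expected;
        if (op.equals("<=")) return actual <= expected;
        if (op.equals(">")) return actual > expected;
        if (op.equals("<")) return actual < expected;
        if (op.equals("=")) return actual == expected;
        throw new IllegalArgumentException("unsupported operator: " + op);
    }

    // Produces one rendered line per matching item, in input order,
    // much as an output block is rendered once per query result.
    static List<String> forEach(String query, List<Map<String, Object>> items) {
        List<String> rendered = new ArrayList<String>();
        for (Map<String, Object> item : items) {
            if (matches(query, item)) {
                rendered.add("Name: " + item.get("topicName"));
            }
        }
        return rendered;
    }

    static Map<String, Object> topic(String name, int replies) {
        Map<String, Object> item = new HashMap<String, Object>();
        item.put("topicName", name);
        item.put("numReplies", Integer.valueOf(replies));
        return item;
    }

    static List<Map<String, Object>> sampleTopics() {
        List<Map<String, Object>> topics = new ArrayList<Map<String, Object>>();
        topics.add(topic("Caching", 2));
        topics.add(topic("Search", 0));
        topics.add(topic("Transactions", 1));
        return topics;
    }

    public static void main(String[] args) {
        for (String line : forEach("numReplies >= 1", sampleTopics())) {
            System.out.println(line);
        }
    }
}
```

Because the predicate is evaluated against property names rather than table or column names, the same query text would apply whatever store supplied the items, which is the point the paper makes about RQL.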


Less Java Code Leads to Faster Time-to-Market and Less Maintenance
Developers who use the ATG Data Anywhere Architecture do not need to write, compile or test Java classes or interfaces for each persistent data type that they want to use in their application. A new persistent data type can be created by simply creating an XML file which defines a mapping between a repository item and the underlying data structure as shown in example 1 below.

Example 1: XML Required to Define a Persistent Type using Dynamo Repositories

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE gsa-template PUBLIC
      "-//Art Technology Group, Inc.//DTD Dynamo Security//EN"
      "http://www.atg.com/dtds/gsa/gsa_1.0.dtd">

    <gsa-template>
      <header>
        <name>Account Repository</name>
        <author>Pat Durante</author>
      </header>

      <item-descriptor name="account" default="true">
        <table name="Account" type="primary" id-column-name="accountId">
          <property name="accountId" column-name="account_id" data-type="string"/>
          <property name="type" data-type="string"/>
          <property name="balance" data-type="double"/>
          <property name="customerId" data-type="string"/>
        </table>
      </item-descriptor>
    </gsa-template>

In this example, we are creating a new type of repository item that represents a bank account. The account item contains four properties (accountId, type, balance, and customerId) that are mapped into the columns of the Account database table.
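To make the idea of a declarative mapping concrete, the sketch below turns a property-to-column map like the one in Example 1 into a parameterized SELECT statement. It is a guess at the general technique, not a description of how the SQL repository generates its queries: MappingSketch and its methods are hypothetical, and the sketch assumes that a property declared without a column-name maps to a column with the same name as the property.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MappingSketch {
    // Property name -> column name, in declaration order. Properties with
    // no explicit column-name are assumed to use the property name.
    static Map<String, String> accountMapping() {
        Map<String, String> mapping = new LinkedHashMap<String, String>();
        mapping.put("accountId", "account_id");
        mapping.put("type", "type");
        mapping.put("balance", "balance");
        mapping.put("customerId", "customerId");
        return mapping;
    }

    // Builds "SELECT <columns> FROM <table> WHERE <idColumn> = ?".
    static String selectById(String table, String idColumn,
                             Map<String, String> propertyToColumn) {
        StringBuilder sql = new StringBuilder("SELECT ");
        boolean first = true;
        for (String column : propertyToColumn.values()) {
            if (!first) {
                sql.append(", ");
            }
            sql.append(column);
            first = false;
        }
        sql.append(" FROM ").append(table);
        sql.append(" WHERE ").append(idColumn).append(" = ?");
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(selectById("Account", "account_id", accountMapping()));
    }
}
```

With a mapping like this held as data, adding or renaming a column means editing the mapping rather than recompiling application classes, which is the benefit the section attributes to the XML definition file.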


With this item description in place, we can easily display all of the accounts on a dynamic web page, as shown in the example below:

    <%@ taglib uri="/dspTaglib" prefix="dsp"%>
    <dsp:page>
      <dsp:droplet name="/atg/dynamo/droplet/RQLQueryForEach">
        <dsp:param name="queryRQL" value="ALL"/>
        <dsp:param name="repository" value="/MyApplication/AccountRepository"/>
        <dsp:param name="itemDescriptor" value="account"/>
        <dsp:oparam name="output">
          Account Id: <dsp:valueof param="element.accountId"/><br>
          Balance: <dsp:valueof param="element.balance"/><br>
        </dsp:oparam>
      </dsp:droplet>
    </dsp:page>

The J2EE mechanism for representing a new persistent data type is to define a new Enterprise JavaBean (specifically an EntityBean). Creating an EJB requires writing a fair amount of code (as shown in example 2 below). And deploying an EJB requires a significant amount of configuration work (XML) as well.

Example 2: The Code Required for a Container Managed Entity Bean (EJB) (Account.java, AccountHome.java, and AccountBean.java)

Account.java:

package atg.atm.account;

import java.rmi.RemoteException;
import javax.ejb.*;

public interface Account extends EJBObject {
  public void setBalance(double pBalance) throws RemoteException;
  public double getBalance() throws RemoteException;
  public String getType() throws RemoteException;
  public String getCustomerId() throws RemoteException;
}


AccountHome.java:

package atg.atm.account;

import javax.ejb.*;
import java.rmi.RemoteException;
import java.util.*;

public interface AccountHome extends EJBHome {
  public Account create(String accountId, String customerID,
                        double initialBalance, String type)
    throws CreateException, RemoteException;
  public Account findByPrimaryKey(String primaryKey)
    throws FinderException, RemoteException;
  public Enumeration findAccountsForACustomer(String custId)
    throws FinderException, RemoteException;
}

AccountBean.java:

package atg.atm.account;

import java.io.Serializable;
import java.rmi.RemoteException;
import java.rmi.Remote;
import javax.ejb.*;
import java.util.*;

public class AccountBean implements EntityBean {
  private transient EntityContext ctx;

  // Container-managed fields
  public String accountId;
  public String customerId;
  public double balance;
  public String type;

  // Lifecycle callbacks invoked by the EJB container
  public void ejbActivate() throws RemoteException { ... }
  public void ejbPassivate() throws RemoteException { ... }
  public void setEntityContext(EntityContext ctx) throws RemoteException {
    this.ctx = ctx;
  }
  public void unsetEntityContext() throws RemoteException {
    this.ctx = null;
  }
  public void ejbLoad() throws RemoteException { }
  public void ejbStore() throws RemoteException { }
  public void ejbRemove() throws RemoteException { }

  public String ejbCreate(String accountId, String customerId,
                          double initialBalance, String type) {
    this.accountId = accountId;
    this.customerId = customerId;
    this.balance = initialBalance;
    this.type = type;
    return null;
  }
  public void ejbPostCreate(String accountId, String customerId,
                            double initialBalance, String type) { }

  // Business methods exposed through the Account interface
  public void setBalance(double pBalance) { balance = pBalance; }
  public double getBalance() { return balance; }
  public String getType() { return type; }
  public String getCustomerId() { return customerId; }
}

To be fair, there are tools in the marketplace that can generate most of this "boilerplate" code for a new EJB, but the code still needs to be maintained and extended as the system grows. Also, even after this new EJB has been developed and deployed, the data is still not available to a dynamic page designer. According to the Sun Blueprint methodology, a JSP should not access an EJB directly. This means that the developer has to write either a JavaBean or a Servlet that interacts with the EJB. Only then can a JSP be created that includes dynamic information from a data source.

The JDO approach requires just a standard Java class. It provides transparent data access for all Java classes. Developers can use existing classes or write new classes that need persistence. Example 3 below shows the code needed for our account type.


Example 3: The Java Class Required By JDO

public class Account {
  private String accountId;
  private String customerId;
  private double balance;
  private String type;

  public Account(String accountId, String customerId,
                 double initialBalance, String type) {
    this.accountId = accountId;
    this.customerId = customerId;
    this.balance = initialBalance;
    this.type = type;
  }

  public void setBalance(double pBalance) { balance = pBalance; }
  public double getBalance() { return balance; }
  public String getType() { return type; }
  public String getCustomerId() { return customerId; }
}


Using the Visitor Profile (Out-of-the-Box)


By default, the visitor profiling data that is used by the ATG e-Business Platform is stored in the Solid relational database management system that ships with the product suite. The basic information about a visitor is stored in a table called dps_user (although several auxiliary tables are used to store additional information about visitors). Figure 8 below provides a conceptual view of the out-of-the-box architecture.

Figure 8 Out of the Box Profile Architecture

[Diagram labels: Repository Item "user" (Java object with properties id, firstName, login); SQL Repository Connector (handles all data access, caching, and transactions); Repository Definition File (XML; maps Java object properties to database columns and tables); Data Source Configuration Files (.properties; control which database the information is coming from); Solid RDBMS containing the dps_user table (columns id, first_name, login).]


Note that the data source configuration files contain the only dependency on the Solid RDBMS.


Extending the Definition of a Repository Item


Using a Simple Auxiliary Table (To model a one-to-one relationship)
Let's say we want to extend the profile definition to include a subscription id for each visitor at the site. This too is extremely easy to do. The steps are as follows:


Create the additional table to store the new data (create a one-to-one relationship between the existing dps_user table and your new table). For example:

CREATE TABLE elrn_user (
  id              VARCHAR(40) not null,
  subscription_id VARCHAR(32) null,
  constraint elrn_user_p primary key ( id ),
  constraint elrn_user_f foreign key ( id ) references dps_user(id)
)



Create a new userprofile.xml file (somewhere in your CONFIGPATH) to extend the definition of the out-of-the-box user item descriptor. For example:

<gsa-template>
  <item-descriptor name="user">
    <table name="elrn_user" type="auxiliary" id-column-name="id">
      <property name="subscription_id" column-name="subscription_id"
                data-type="string" category="eLearning"
                display-name="Subscription Id"/>
    </table>
  </item-descriptor>
</gsa-template>



Restart the server. That's it! No code changes required. The new property will show up in the ATG Control Center and you'll be able to retrieve and/or modify the value of this new property from your dynamic web pages!

<dsp:page>
  <dsp:importbean bean="/atg/userprofiling/Profile"/>
  Welcome back, <dsp:valueof bean="Profile.firstName"/>!
  Your subscription id is: <dsp:valueof bean="Profile.subscription_id"/>.
</dsp:page>


Adding a Property to an EJB

Adding a single property to an EJB is considerably more involved than adding a property to a Repository Item type. Let's say we would like to add a single boolean property to our Account EJB presented above (to keep track of whether or not the account includes overdraft protection).

Modify the schema of the account table to include a new column. (Alternatively, you can create a new table and use a vendor-specific mapping to build a relationship between tables.)

Modify the Account interface code to allow other programmers to gain access to the new property:

public boolean getOverdraftProtection() throws RemoteException;





Modify the AccountHome interface to allow the overdraft property to be initialized upon account creation:

public Account create(String accountId, String customerID,
                      double initialBalance, String type, boolean overdraft)
  throws CreateException, RemoteException;



Modify the AccountBean class to hold the new value and to accommodate the additional create parameter:

public boolean overdraft; // new field backing the overdraft property

public String ejbCreate(String accountId, String customerId,
                        double initialBalance, String type, boolean overdraft) {
  this.accountId = accountId;
  this.customerId = customerId;
  this.balance = initialBalance;
  this.type = type;
  this.overdraft = overdraft;
  return null;
}
public void ejbPostCreate(String accountId, String customerId,
                          double initialBalance, String type, boolean overdraft) { }



Add a get method for the new property to the AccountBean class (its name must match the method declared in the Account interface):

public boolean getOverdraftProtection() { return overdraft; }



Re-deploy the J2EE application, making sure to map the new bean property to the appropriate database column.

Modify the JavaBean or the Servlet code that interacts with your EJB (since JSPs should not access an EJB directly). At a minimum, you will need to add a method that can check to see if the account has overdraft protection.

You are now ready to access your new property from a JSP.






Using a "Multi" Table (to model a one-to-many relationship)


Let's say we want to extend the profile definition to include a list of each visitor's favorite subjects. Modeling a one-to-many relationship is a little more involved, but still generally straightforward. The steps are as follows:


Create the additional table to store the new data (create a one-to-many relationship between the existing dps_user table and your new table). For example:

CREATE TABLE elrn_subjects (
  id      VARCHAR(32) not null,
  subject VARCHAR(32) not null,
  constraint elrn_subjects_p primary key ( id, subject ),
  constraint elrn_subjects_f foreign key ( id ) references dps_user(id)
)



Create a new userprofile.xml file (somewhere in your CONFIGPATH) to extend the definition of the out-of-the-box user item descriptor. Note that you can use a Set, List, array, or Map as the data-type of a multi-value property. In this example, we will use a Set (since the order of the visitor's favorite subjects is not important and we want each subject to be included only once). For example:

<gsa-template>
  <item-descriptor name="user">
    <table name="elrn_subjects" type="multi" id-column-name="id">
      <property name="favoriteSubjects" column-name="subject"
                data-type="set" component-data-type="string"/>
    </table>
  </item-descriptor>
</gsa-template>



Restart the server. That's it! No code changes required. The new property will show up in the ATG Control Center and you'll be able to retrieve and/or modify the values assigned to this new property from your dynamic web pages!

<dsp:page>
  <dsp:importbean bean="/atg/userprofiling/Profile"/>
  <dsp:importbean bean="/atg/dynamo/droplet/ForEach"/>
  Welcome back, <dsp:valueof bean="Profile.firstName"/>!
  Your favorite subjects are:
  <ul>
  <dsp:droplet name="/atg/dynamo/droplet/ForEach">
    <dsp:param bean="Profile.favoriteSubjects" name="array"/>
    <dsp:oparam name="output">
      <li><dsp:valueof param="element"/></li>
    </dsp:oparam>
  </dsp:droplet>
  </ul>
</dsp:page>


Switching to an Alternative Relational Database Management System


At times, companies need to change from one database system to another. For example, during the prototyping phase of a new project, many web architects choose to use the free Solid database included with ATG Dynamo. Eventually, the application will need to switch from the Solid RDBMS to a production-grade RDBMS (such as Oracle). The ATG Data Anywhere Architecture makes this switch easy. All you have to do is create the appropriate tables in the RDBMS of your choice and edit a few properties files to change the connection information.

We recently built an application that makes use of Microsoft SQL Server instead of Solid. Here's what we needed to do to get the ATG Dynamo e-Business Platform running against SQL Server:


Create the necessary tables and indices in a SQL Server database using the provided SQL (for each ATG product there is a set of SQL files containing DDL that can be used to create the appropriate tables and indices). For example, under \ATG\Dynamo5.6\DAS\sql\install\mssql there is a file called das_ddl.sql which you can use for this purpose.

Create a new data source (or replace the existing configuration). We chose to replace the existing configuration as shown below:



# /atg/dynamo/service/jdbc/MyDataSource
#Wed Nov 14 15:07:38 EST 2001
$class=atg.service.jdbc.MonitoredDataSource
$description=JTA Participating eLearning Datasource
$scope=global
dataSource=/atg/dynamo/service/jdbc/MyXADataSource
logListeners=/atg/dynamo/service/logging/LogQueue,/atg/dynamo/service/logging/ScreenLog
max=5
min=5
transactionManager=/atg/dynamo/transaction/TransactionManager

# /atg/dynamo/service/jdbc/MyXADataSource
#Wed Nov 14 15:05:06 EST 2001
$class=atg.service.jdbc.MyXADataSource
$scope=global
URL=jdbc\:inetdae7\:hostname.atg.com\:1433
dataSourceJNDIName=
database=eLearningBeta
driver=com.inet.tds.TdsDriver
logListeners=/atg/dynamo/service/logging/LogQueue,/atg/dynamo/service/logging/ScreenLog
password=thepassword
server=hostname.atg.com\:1433
user=theuserid

IMPORTANT: Note that the change in data source required no changes to the application code, nor did it involve changing the repository configuration. Changes were isolated to the data source configuration files.


Converting from one type of database to another


If you want to switch over to using an LDAP directory (such as iPlanet Directory Server), you can do that easily as well. This paper does not provide details on how to accomplish this. To learn more, please read the following sections in the ATG Personalization Programming Guide:

Setting Up an LDAP Profile Repository
Linking SQL and LDAP Repositories




 A Unified View of Customer Interactions


The ATG Data Anywhere Architecture excels at providing a unified view of customer interactions. A unified view of customer data leads to a coherent and consistent customer experience. For example, when a service call is logged using a call center application, your web application is aware of the service call and its status.

One of the biggest challenges faced by a web application is getting access to information about a customer gathered outside of a web context. Many applications within the enterprise record information about customer interactions. Call center applications and enterprise resource planning (ERP) systems are good examples of the kinds of systems used to service customers in the enterprise. The flexibility of ATG Data Anywhere allows you to "hook into" the important data gathered by call center, ERP, and other enterprise applications without having to copy it all into a central repository.

Figure 9 below shows an example of an enterprise that gathers data about customer interactions using three disparate systems (a web application, a call center application, and an ERP system).

Figure 9 Enterprise data is often managed by disparate systems

[Diagram: three separate data stores. Web Application Data: table Usr_tbl (id, first_name, login) in an MS SQL RDBMS. Call Center Application Data: table Service_request (id, description, status) in an Oracle RDBMS. ERP System Data: table Order_history (id, order_num, date) in a Sybase RDBMS.]


With ATG Data Anywhere, you can access all of this customer-focused data without relocating it. Figure 10 below shows one way this can be accomplished.

Figure 10 A Unified View of Customer Data using ATG Data Anywhere

[Diagram: a single Repository Item "customer" (id, firstName, login, orders, calls) is assembled through Repository Definition Files (XML), Repository Connectors, and Data Source Configuration Files (.properties) from three sources: Web Application Data (Usr_tbl: id, first_name, login) in an MS SQL RDBMS; Call Center Application Data (Service_request: id, description, status) in an Oracle RDBMS; and ERP System Data (Order_history: id, order_num, date) in a Sybase RDBMS.]


 Maximum Performance Through Intelligent Caching


Thoughtful design of database access is a key to achieving acceptable performance in web applications. You want to minimize how often your application needs to access the database, while still maintaining data integrity. An intelligent caching strategy is central to achieving these goals.

Dynamo SQL Repositories cache data objects intelligently to deliver both high performance and timely, accurate results. The caching model is flexible and gives deployment experts fine-grained control. When an item is first retrieved from the database, it is stored in memory (in a cache). Subsequent queries for the same item do not necessarily need to access the database: as long as the cache is still "valid", the data currently stored in memory can be used.

ATG Data Anywhere was designed to work in the harsh web environment, while other systems make caching assumptions that are more appropriate to a low-scale intranet. In most cases, only one server accesses a given data object (such as the user profile) at a time. Our locked-mode caching (see Case 3 below) offers both high performance and data integrity. Locked caching introduces a little overhead to data access, since locks must be checked, set, and removed during I/O. This checking ensures that data will never be stale, which preserves data integrity. In the worst case, if the very same item is being read and written all of the time, the performance effect is similar to omitting caching altogether. In the normal case, however, when different data elements are being read and written by different systems, locked caching offers performance similar to simple caching. Bottom line: high-performance caching with data integrity assured, the best of both worlds. See the white paper called "Caching Data for Scalability Without Losing Data Integrity" on atg.com for more information about this topic.

Perhaps the most important thing to understand about Repository caching is that it is highly configurable to meet the needs of your application. Developers can choose the appropriate caching strategy in the repository definition file (XML). Dynamo SQL Repositories define four caching modes that can be used by developers as appropriate:


Simple caching (which is the default)
Locked caching
Distributed caching
Disabled (no caching)







Case 1: Single Dynamo Server


During development (and in some rare cases in production), an application may be deployed entirely on one Dynamo server. The number of Dynamo servers that you'll be running on your production site will vary depending on the size and complexity of your application and the number of simultaneous visitors the site needs to support. Simple caching is recommended for single-Dynamo-server environments. With simple cache mode, cached items are stored in each individual Dynamo server's memory. No attempt is made to keep cached items in sync between Dynamo servers. If one Dynamo server modifies an item, other servers will have inaccurate data in cache until the cache is manually or automatically flushed (see "Invalidating the Cache" below).


Note that simple caching is the default. Sites running multiple Dynamo servers will want to change the cache mode (to locked or distributed) on important item types (whose data changes periodically).

Case 2: Read frequently, modify rarely or never


Some types of repository items (such as product catalog items) are modified rarely or never on the production site. Simple caching can be used in these situations too. This includes all items that you modify only on a staging server and that you do not modify once they are published to your live site. Developers will need to flush the cache on all Dynamo instances in the production environment whenever modifications are being pushed from the staging environment to the live site (see "Invalidating the Cache" below).

Case 3: Modifications made by one Dynamo server at a time


Some types of repository items (such as the User Profile) are consistently used by one server at a time. The data may change frequently during a visitor's session, but a session is handled by a single Dynamo server (unless a failover occurs). Locked caching is recommended when typical usage will involve modification by one server at a time. If more than one server tries to modify an item at the same time, the second server will be locked out until the first server completes its modifications. If you allow this to happen frequently, your site performance will suffer.

Locked caching is based on write-locks and read-locks. If no servers have a write-lock for an item, any number of servers may have a read-lock on that item. When a server requests a write-lock, all other servers are instructed to release their read-locks. Once an item is write-locked, no other server may get a read-lock or write-lock until the first server releases its write-lock. In other words, once a server has a write-lock on an item, all access to that item is blocked until the write is completed. A server requests a read-lock the first time it tries to access an item. Once the server has a read-lock on the item, it holds that read-lock until the lock manager notifies the server to release its read-lock. At that time, it drops the item from its cache.
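The read-lock and write-lock rules described above can be sketched in a few lines of Java. This is a simplified, single-threaded model for one repository item, not ATG's lock manager: the ItemLockManager class and its method names are hypothetical, and a refused request returns false where a real server would block and wait.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical lock manager for a single repository item; not an ATG class. */
class ItemLockManager {
    // Servers holding a read-lock (and therefore a cached copy of the item).
    private final Set<String> readLockHolders = new HashSet<>();
    // Server holding the write-lock, or null when the item is not write-locked.
    private String writeLockHolder;

    /** Grants a read-lock unless another server currently holds the write-lock. */
    boolean requestReadLock(String server) {
        if (writeLockHolder != null && !writeLockHolder.equals(server)) {
            return false; // access is blocked until the write completes
        }
        readLockHolders.add(server);
        return true;
    }

    /** Grants the write-lock and makes every other server release its read-lock. */
    boolean requestWriteLock(String server) {
        if (writeLockHolder != null && !writeLockHolder.equals(server)) {
            return false; // a second writer is locked out
        }
        // Other servers release their read-locks and drop the item from cache.
        readLockHolders.retainAll(Collections.singleton(server));
        writeLockHolder = server;
        return true;
    }

    void releaseWriteLock(String server) {
        if (server.equals(writeLockHolder)) {
            writeLockHolder = null;
        }
    }

    boolean holdsReadLock(String server) {
        return readLockHolders.contains(server);
    }
}

class LockedCachingDemo {
    public static void main(String[] args) {
        ItemLockManager profile = new ItemLockManager();
        System.out.println(profile.requestReadLock("serverA"));  // true
        System.out.println(profile.requestReadLock("serverB"));  // true: many readers allowed
        System.out.println(profile.requestWriteLock("serverA")); // true: serverB must release
        System.out.println(profile.holdsReadLock("serverB"));    // false: item dropped from cache
        System.out.println(profile.requestReadLock("serverB"));  // false: blocked during the write
        profile.releaseWriteLock("serverA");
        System.out.println(profile.requestReadLock("serverB"));  // true again after the write
    }
}
```

The sketch also shows why frequent concurrent writes hurt: every refused request in this model corresponds to a server that would be waiting in a real deployment.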

Case 4: Modification by multiple Dynamo servers


In extreme cases, in a multiple Dynamo server environment, you need the ability to notify all other servers that an item has been modified (even if those other servers are not going to modify the item themselves). Use distributed caching for items that are modified infrequently during runtime. Distributed mode works best if there is little chance that two Dynamo servers will attempt to access and change a repository item at the same time. For items that change more frequently, use locked mode. Distributed mode allows any Dynamo server to read or modify an item in cache. When one Dynamo server modifies an item, it broadcasts a JMS cache invalidation event to all servers (see "Invalidating the Cache" below). Distributed caching uses asynchronous message delivery. This means that there is a slight chance of a user getting stale data, until the invalidation event message is received by all servers: if a user logged in on one server makes a change to an item, and another user logged in on a different server requests that item after the change is made, but before the second server has received the cache invalidation event, the second user would get stale data. This mode is seldom used; in most cases, locked caching is preferable to distributed caching.
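The stale-data window described above is easy to reproduce in a small simulation. The following Java sketch models two servers with their own item caches and a queue of invalidation events that are delivered later, which stands in for asynchronous messaging. All class and method names are hypothetical, and no real JMS or ATG API is used.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class DistributedCacheDemo {
    /** Shared database, modeled as a map from item id to value. */
    static final Map<String, String> database = new HashMap<>();

    /** One server's item cache plus its inbox of undelivered invalidation events. */
    static class Server {
        final Map<String, String> cache = new HashMap<>();
        final Queue<String> pendingInvalidations = new ArrayDeque<>();

        /** Serves the item from cache when present; otherwise loads it from the database. */
        String read(String id) {
            return cache.computeIfAbsent(id, database::get);
        }

        /** Writes through to the database and queues an invalidation for the other servers. */
        void write(String id, String value, Server... others) {
            database.put(id, value);
            cache.put(id, value);
            for (Server other : others) {
                other.pendingInvalidations.add(id); // queued, not yet applied
            }
        }

        /** Simulates the moment the invalidation messages finally arrive. */
        void deliverInvalidations() {
            String id;
            while ((id = pendingInvalidations.poll()) != null) {
                cache.remove(id);
            }
        }
    }

    public static void main(String[] args) {
        database.put("item1", "old");
        Server a = new Server();
        Server b = new Server();
        b.read("item1");                      // server b caches "old"
        a.write("item1", "new", b);           // server a modifies the item
        System.out.println(b.read("item1"));  // prints "old": event not yet received
        b.deliverInvalidations();
        System.out.println(b.read("item1"));  // prints "new": cache entry was invalidated
    }
}
```

The first read on server b returns the stale value because the event is still in flight, which is exactly the trade-off that makes locked caching preferable for items that change often.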


Case 5: Modification by a non-Dynamo Application


On some sites, the data used by the web application can be modified by a third party system. In these cases, you will need to either disable caching or find a way to notify the production Dynamo servers whenever a change is made by the third party application (using messaging).

Disabling Caching
Disabled caching should be used with great caution, because it will result in database access for every page that accesses an item of this type. This potentially has a severe impact on performance. Caching should be disabled when there is a possibility that the underlying data will be changed by a non-Dynamo Repository application. For instance, if you have an on-line banking application, and the same data is accessed by other applications in addition to Dynamo, you may want to turn off caching for displaying the user's account balances. The other caching modes can only be set on a per-item-type basis, but disabled caching mode may be set on a per-property basis. If a request is made for a disabled-cache property of a cached item, the database will be queried. Example from userprofile.xml:

<property category="Login" name="password" data-type="string" required="true" column-name="password" cache-mode="disabled"/>

Invalidating the Cache


Usually cache invalidation happens automatically when repository items are changed using the Repository API. Sometimes it is necessary to force cache invalidation manually, such as when the contents of the database are changed directly by a third-party application (without going through the Repository API). One way of handling integration with a third-party application is to have the third-party application send a JMS message whenever interesting data is modified. Your Dynamo application can then receive the message and programmatically invalidate the appropriate cache. To flush all items and all queries in all caches in a specific repository, use:

atg.repository.RepositoryImpl.invalidateCaches()

To flush the caches associated with a specific item type, use:

// The following method empties the item cache for the given item
// descriptor
atg.repository.ItemDescriptorImpl.invalidateItemCache()

// The following method empties the item cache and query caches
// for this item descriptor
atg.repository.ItemDescriptorImpl.invalidateCaches()

// The following method removes a specific repository item
// from the cache
atg.repository.ItemDescriptorImpl.removeItemFromCache(String itemId)

Controlling the Cache Sizes


The size of each repository item cache is configurable as well. By default, the item cache size is 1000 items. After running your site for some time, you can get a good idea of how well the repository item caches are working by going to the repository's page in the Dynamo Administration interface. For example, the Administration interface page for the Commerce Product Catalog repository is:

http://localhost:8830/nucleus/atg/commerce/ProductCatalog

Under the heading Cache usage statistics, this page lists, for each item descriptor, the number of items and queries in the item and query caches, the cache size, the percent of the cache in use, the hit count, the miss count, and the hit ratio. If you have a high quantity of misses and no hits, you are gaining no benefit from caching, and you can probably just turn it off, by setting the cache size to 0. If you have a mix of hits and misses, you might want to increase the cache size. If you have all hits and no misses, your cache size is big enough and perhaps too big. There is no harm in setting a cache to be too big unless it will fill up eventually and consume more memory than is necessary.
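To see how the hit count, miss count, and hit ratio relate to the cache size, consider the following small Java sketch of a bounded item cache. It is a generic least-recently-used cache built on java.util.LinkedHashMap, not the repository's actual cache; the CountingItemCache class and its eviction policy are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical bounded cache that records hits and misses; not an ATG class. */
class CountingItemCache<K, V> {
    private final Map<K, V> items;
    private int hits;
    private int misses;

    CountingItemCache(final int maxSize) {
        // Access-ordered map that evicts the least recently used entry when full.
        // With maxSize 0 every entry is evicted immediately, so caching is off.
        this.items = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    /** Returns the cached value, or loads it (for example from the database) on a miss. */
    V get(K key, Function<K, V> loader) {
        V value = items.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key);
        items.put(key, value);
        return value;
    }

    int hits() { return hits; }
    int misses() { return misses; }

    double hitRatio() {
        int total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}

class CacheStatsDemo {
    public static void main(String[] args) {
        CountingItemCache<Integer, String> cache = new CountingItemCache<>(2);
        int[] requestedIds = {1, 2, 1, 3, 2};
        for (int id : requestedIds) {
            cache.get(id, key -> "item-" + key);
        }
        // With room for only two items: 1 hit, 4 misses, hit ratio 0.2
        System.out.println(cache.hits() + " hits, " + cache.misses()
            + " misses, ratio " + cache.hitRatio());
    }
}
```

In this run the item with id 2 is evicted just before it is requested again, which is the pattern that a mix of hits and misses on the statistics page suggests; a larger cache would have turned that miss into a hit.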


The cache size can be adjusted in a repository definition file as shown in the following example:

<gsa-template>
  <item-descriptor name="topic" cache-mode="locked"
                   query-cache-size="100" item-cache-size="1500">
    ...
  </item-descriptor>
</gsa-template>

There are actually two types of caches in the repository. The item cache holds items and their properties; the query cache holds query result sets, so that you don't need to hit the database to find out which items to return when the same query is issued again and again. By default, query caching is turned off (the default query-cache-size is zero).


 Simplified Transactional Control


Overview of Transactional Integrity
Web applications need to be built carefully to balance the integrity of the data they manage with their performance goals. Consider a web application that allows visitors to transfer funds between bank accounts. A "transfer" operation really involves two actions (a debit from the source account and a credit to the destination account). In order to maintain the integrity of the data, both actions must complete successfully. If anything goes wrong during the "transfer" operation, the account balances should be "rolled back" to their original amounts. A system with transactional integrity will allow a developer to group multiple actions (e.g., a debit and a credit) into a single activity that either succeeds as a whole or fails as a whole.
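The all-or-nothing behavior of the transfer can be illustrated with a short, self-contained Java sketch. It works on an in-memory map of balances (in cents) and rolls back by restoring a snapshot; a real system would rely on the database and a transaction manager instead. The TransferDemo class and its methods are hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

class TransferDemo {
    /** Moves funds between accounts; restores the original balances if any step fails. */
    static void transfer(Map<String, Long> balances, String from, String to, long cents) {
        Map<String, Long> snapshot = new HashMap<>(balances); // state to roll back to
        try {
            debit(balances, from, cents);
            credit(balances, to, cents);
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot); // roll back both actions as a unit
            throw e;
        }
    }

    static void debit(Map<String, Long> balances, String account, long cents) {
        Long balance = balances.get(account);
        if (balance == null) {
            throw new IllegalArgumentException("no such account: " + account);
        }
        if (balance < cents) {
            throw new IllegalStateException("insufficient funds in " + account);
        }
        balances.put(account, balance - cents);
    }

    static void credit(Map<String, Long> balances, String account, long cents) {
        Long balance = balances.get(account);
        if (balance == null) {
            throw new IllegalArgumentException("no such account: " + account);
        }
        balances.put(account, balance + cents);
    }

    public static void main(String[] args) {
        Map<String, Long> balances = new HashMap<>();
        balances.put("checking", 10_000L);
        balances.put("savings", 5_000L);

        transfer(balances, "checking", "savings", 3_000L);
        System.out.println(balances.get("checking") + " " + balances.get("savings")); // 7000 8000

        try {
            // The debit succeeds, the credit fails, so the debit is undone.
            transfer(balances, "checking", "missing", 1_000L);
        } catch (IllegalArgumentException expected) {
            System.out.println(balances.get("checking") + " " + balances.get("savings")); // 7000 8000
        }
    }
}
```

Without the rollback, the failed second transfer would leave the checking account debited with no matching credit, which is precisely the inconsistency that transactional integrity prevents.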

The J2EE Approach to Transactional Integrity


J2EE provides a vendor and data source independent mechanism for managing transactions called Java Transaction API (JTA). JTA allows developers to control transactional boundaries (start, commit, rollback). J2EE also defines six transaction demarcation modes (REQUIRED, REQUIRES_NEW, NOT_SUPPORTED, SUPPORTS, MANDATORY, NEVER) for specifying the scope and impact of transactions on a particular Enterprise JavaBean method. All J2EE containers must provide a UserTransaction component which exposes programmatic control of transactions to developers. J2EE paved the way for what is called declarative transactional demarcation that allows the developer to establish the transactional behavior outside of Java code. In J2EE, the transactional behavior for a particular method is specified at deployment time in a deployment descriptor (an XML file).

ATG Data Anywhere Support for Transactional Integrity


The ATG Data Anywhere Architecture supports all of the requirements of the J2EE specification.


The ATG Dynamo Application Server provides a fully J2EE-compliant TransactionManager, but if you are building on a third-party application server (such as BEA WebLogic), you can use its TransactionManager in place of ours.

Transactional boundaries can be set declaratively (for EJBs) and programmatically (using JTA).



The down side of transactional integrity is that performance of the data access functions is slowed due to the overhead of tracking data access operations occurring within transactions. The key to good overall system performance is to minimize the impact of transactions. For this reason, the ATG Data Anywhere Architecture allows page developers and Java developers to control the scope of transactions.

Advantages of the ATG Data Anywhere Approach




Page developers can use simple droplets to control transactional boundaries (without writing Java code).

Java developers can leverage the transactional demarcation modes using the provided TransactionDemarcation interface. (In J2EE, only EJB methods can use the modal demarcation technique; with ATG, this technique can be used in simple JavaBeans and Servlets as well.)




Example Page
<dsp:page>
  <dsp:importbean bean="/atg/dynamo/transaction/droplet/Transaction"/>
  <dsp:importbean bean="/atg/dynamo/transaction/droplet/EndTransaction"/>
  <dsp:importbean bean="/atg/userprofiling/Profile"/>
  <dsp:droplet name="Transaction">
    <dsp:param name="transAttribute" value="required"/>
    <dsp:oparam name="output">
      One transaction instead of three:
      <P> <dsp:valueof bean="Profile.firstName"/>
      <P> <dsp:valueof bean="Profile.lastName"/>
      <P> <dsp:valueof bean="Profile.city"/>
      <dsp:droplet name="EndTransaction">
        <dsp:param name="op" value="commit"/>
        <dsp:oparam name="successOutput">
          The transaction ended successfully!
        </dsp:oparam>
        <dsp:oparam name="errorOutput">
          Failure: <dsp:valueof param="errorMessage"/>
        </dsp:oparam>
      </dsp:droplet>
    </dsp:oparam>
  </dsp:droplet>
</dsp:page>

Default Transactional Behavior


In order to protect the integrity of data, SQL Repositories wrap a "required" mode transaction around every property read and write. This is good because transactional integrity is enforced by default. However, developers need to consider the performance implications of such granular transactional scope. Unless a developer creates a transaction of his or her own (programmatically or using droplets), a new transaction will be conducted every time the getPropertyValue or setPropertyValue methods are called on a repository item. In order to achieve good performance, developers need to be aware of this default behavior and override it when appropriate. A typical case is a dynamic page that displays multiple properties from the user profile: rather than beginning and ending a transaction for each property, it is more efficient to read all of the properties in a single transaction.
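The difference between one transaction per property and one enclosing transaction can be modeled in plain Java. The sketch below is a single-threaded illustration of "required" semantics (join the active transaction if there is one, otherwise start a new one); the TinyTransactionManager and ProfileItem classes are hypothetical and are neither the JTA API nor ATG's TransactionDemarcation class. The profile values are made-up sample data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical model of "required" demarcation; not the JTA or ATG API. */
class TinyTransactionManager {
    private boolean active;
    private int started;

    /** Runs work in the current transaction, or starts and ends one if none is active. */
    <T> T required(Supplier<T> work) {
        if (active) {
            return work.get(); // join the caller's transaction
        }
        active = true;
        started++;
        try {
            return work.get();
        } finally {
            active = false; // a real manager would commit or roll back here
        }
    }

    int transactionsStarted() { return started; }
}

/** Hypothetical item whose property reads are each wrapped in "required" mode. */
class ProfileItem {
    private final TinyTransactionManager tm;
    private final Map<String, String> values;

    ProfileItem(TinyTransactionManager tm, Map<String, String> values) {
        this.tm = tm;
        this.values = values;
    }

    String getPropertyValue(String name) {
        return tm.required(() -> values.get(name));
    }
}

class RequiredModeDemo {
    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("firstName", "Ann");
        data.put("lastName", "Lee");
        data.put("city", "Boston");

        // Default behavior: each property read starts its own transaction.
        TinyTransactionManager perProperty = new TinyTransactionManager();
        ProfileItem p1 = new ProfileItem(perProperty, data);
        p1.getPropertyValue("firstName");
        p1.getPropertyValue("lastName");
        p1.getPropertyValue("city");
        System.out.println(perProperty.transactionsStarted()); // 3

        // Explicit demarcation: the three reads join one enclosing transaction.
        TinyTransactionManager wrapped = new TinyTransactionManager();
        ProfileItem p2 = new ProfileItem(wrapped, data);
        wrapped.required(() -> p2.getPropertyValue("firstName") + " "
            + p2.getPropertyValue("lastName") + ", " + p2.getPropertyValue("city"));
        System.out.println(wrapped.transactionsStarted()); // 1
    }
}
```

This mirrors the Example Page above, where the Transaction droplet plays the role of the enclosing required call around the three profile reads.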

Recommendations


Use the Transaction droplets when displaying repository information.

When processing a form, use programmatic demarcation (typically at the start and end of your handler methods).




 Strong Built in Search Capabilities


The ATG Data Anywhere Architecture provides a powerful set of Repository searching tools. We've already examined two of the searching mechanisms provided: searching by id using the ItemLookupDroplet, and querying against a single item type within a single repository using the RQLForEachDroplet. ATG also provides a SearchFormHandler that can be configured for most of your "search page" needs. The SearchFormHandler supports several types of searching:


• Keyword searches allow you to build a search page in which visitors enter a set of keywords; the search then queries all of the item properties that hold keywords. For example, "find all products in the catalog with the keyword tools."
• Text searches allow your visitors to perform full-text searches. Dynamo can simulate full-text searching or make use of your RDBMS-specific full-text search facility (if one is available). For example, "find all products in the catalog whose description contains quality."
• Hierarchical searches allow your visitors to limit a search to a particular subset of items. For example, "find all products in the catalog with the keyword tools in the home-goods category."
• Advanced searches (also called parametric searches) allow your visitors to limit the search based on a range of property values ("find all recipes whose cook time is between 5 and 20 minutes") or based on a specific enumerated value ("find all movies with the keyword action where the rating is PG-13").
• Combination searches: any of the above search types can be combined.
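The search types above differ mainly in the kind of condition they apply, and a combination search is the conjunction of several conditions. The following plain-Java sketch illustrates that idea over an in-memory list. It is not the SearchFormHandler and uses none of its configuration; the class SearchSketch, the sample movies, and the property names (keywords, rating, runningTime) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical in-memory illustration of keyword, parametric, and combination searches. */
class SearchSketch {

    /** Keyword search: the item's "keywords" property contains the given word. */
    static Predicate<Map<String, Object>> keyword(String word) {
        return item -> {
            Object value = item.get("keywords");
            return value instanceof Collection && ((Collection<?>) value).contains(word);
        };
    }

    /** Parametric search on a range: min <= property <= max. */
    static Predicate<Map<String, Object>> range(String property, int min, int max) {
        return item -> {
            Object value = item.get(property);
            return value instanceof Integer && (Integer) value >= min && (Integer) value <= max;
        };
    }

    /** Parametric search on an enumerated value: property equals the given value. */
    static Predicate<Map<String, Object>> equalTo(String property, Object expected) {
        return item -> expected.equals(item.get(property));
    }

    /** Runs a query and returns the names of the matching items in their original order. */
    static List<String> search(List<Map<String, Object>> items, Predicate<Map<String, Object>> query) {
        return items.stream()
                .filter(query)
                .map(item -> (String) item.get("name"))
                .collect(Collectors.toList());
    }

    /** Builds one sample item. */
    static Map<String, Object> movie(String name, String rating, int minutes, String... keywords) {
        Map<String, Object> item = new HashMap<>();
        item.put("name", name);
        item.put("rating", rating);
        item.put("runningTime", minutes);
        item.put("keywords", Arrays.asList(keywords));
        return item;
    }

    /** Sample data invented for this example. */
    static List<Map<String, Object>> sampleMovies() {
        List<Map<String, Object>> movies = new ArrayList<>();
        movies.add(movie("Movie A", "PG-13", 120, "action", "adventure"));
        movies.add(movie("Movie B", "R", 95, "action"));
        movies.add(movie("Movie C", "PG-13", 100, "comedy"));
        return movies;
    }

    public static void main(String[] args) {
        // Combination search: keyword "action" AND rating PG-13.
        System.out.println(search(sampleMovies(), keyword("action").and(equalTo("rating", "PG-13"))));
    }
}
```

Because each search type is just a condition, combining them requires no new machinery beyond a logical AND, which is why a single configurable form handler can cover so many search pages.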









Searching for content across repositories and item types is an extremely powerful feature. It allows visitors and developers to find the data they need more rapidly. Quality searching tools lead to higher satisfaction, greater efficiency, and potentially more revenue. Once again, developers do not need to write Java code to include a search function in their applications. The provided form handler can be configured to perform a great variety of searches. If the provided form handler does not meet your developers' needs, they can use inheritance to extend the provided class. For example, ATG developers specialized the search form handler for searching the commerce catalog (it allows the results to be presented as matching categories followed by matching products).


 Fine-grained Access Control


The ATG Data Anywhere Architecture provides a Secured Repository system that works with the Dynamo Security System to provide fine-grained access control to repository item descriptors, individual repository items, and even individual properties through the use of Access Control Lists (ACLs). Any repository can be secured by configuring an instance of a Secured Repository Adapter on top of the repository instance. Depending on the security features you desire, some new properties (such as an owner property and an acl property) may have to be added to the underlying repository to store access control information.

Case 1: Controlling access to all items of the same type


The most basic level of access control is at the item type level. This is similar to controlling access to a particular database table. For example, you can specify that only members of the administrators group have access to user profile items.

Case 2: Controlling access to specific items


The next level of access control is on specific items. This is similar to access control of a single row in a database. For example, you can specify that members of the education managers group have access to user profile items for people who work in the education department.

Case 3: Controlling access to specific properties


You can even control who is allowed to read/write a particular property of an item. For example, you can specify that members of the human resources group can retrieve the salary property within certain user profile items.

Case 4: Limiting Query Results


You can control who can receive certain repository items as results from a repository query. For example, you can specify that only the owner can query new items until the owner previews and approves the item.
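The four cases above share one underlying check: does an ACL attached to an item type, an item, or a property grant a given right to one of the caller's principals? The plain-Java sketch below illustrates that check and uses it to filter query results. It is a generic illustration, not the Secured Repository implementation: the class AccessLevelsSketch, its methods, and the principal names are hypothetical, and the source does not describe how Dynamo combines ACLs from the different levels.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical illustration of ACL checks; not the Secured Repository implementation. */
class AccessLevelsSketch {

    /** True if the ACL grants the right to at least one of the caller's principals. */
    static boolean allowed(Map<String, Set<String>> acl, Set<String> principals, String right) {
        for (String principal : principals) {
            Set<String> rights = acl.get(principal);
            if (rights != null && rights.contains(right)) {
                return true;
            }
        }
        return false;
    }

    /** Query filtering: keep only the ids of items whose own ACL grants "list" to the caller. */
    static List<String> visibleIds(Map<String, Map<String, Set<String>>> itemAcls,
                                   Set<String> principals) {
        return itemAcls.entrySet().stream()
                .filter(entry -> allowed(entry.getValue(), principals, "list"))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    /** Builds a one-entry ACL granting the given rights to a single principal. */
    static Map<String, Set<String>> acl(String principal, String... rights) {
        Map<String, Set<String>> result = new HashMap<>();
        result.put(principal, new HashSet<>(Arrays.asList(rights)));
        return result;
    }

    public static void main(String[] args) {
        Set<String> caller = Collections.singleton("human-resources");

        // Property-level check: may this caller read a salary property?
        System.out.println(allowed(acl("human-resources", "read"), caller, "read"));

        // Query filtering: only items that grant "list" to the caller are returned.
        Map<String, Map<String, Set<String>>> itemAcls = new HashMap<>();
        itemAcls.put("item1", acl("owner", "list", "read"));
        itemAcls.put("item2", acl("human-resources", "list"));
        System.out.println(visibleIds(itemAcls, caller));
    }
}
```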


Creating a Secured Repository


1. Modify the underlying repository. For those item descriptors you want to secure, you need to make some minor modifications to the underlying data and item descriptors to add properties with which to store the Access Control List (a String or an array of Strings) and owner (a user profile) information. For example:

<item-descriptor name="account">
  <table name="Account" type="primary" id-column-name="accountId">
    <property name="accountId" data-type="string" />
    <property name="type" data-type="string" />
    <property name="ACL" data-type="string" />
    <property name="accountOwner" component-type="user" />
  </table>
</item-descriptor>



2. Configure the Secured Repository Adapter component. You need to wrap a Secured Repository component around the underlying repository. For example:

SecureAccountRepository.properties:

$class=atg.adapter.secure.GenericSecuredMutableRepository
$scope=global
# The name property is for the ACC.
name=Secure Account Repository
repositoryName=SecureAccountRepository
# The repository property refers to the underlying repository
repository=AccountRepository
configurationFile=secureAccountRepository.xml
securityConfiguration=/atg/dynamo/security/SecuredRepository/SecurityConfiguration




3. Write a secure repository definition file to spell out the access control you desire. The following example first establishes the name of the owner property (accountOwner) and the name of the property holding the access control list (ACL). It then establishes an ACL that grants read, write, and list (for queries) access to account items to members of the ACC's administrators-group.

<!DOCTYPE gsa-template
  PUBLIC "-//Art Technology Group, Inc.//DTD General SQL Adapter//EN"
  "http://www.atg.com/dtds/security/secured_repository_template_1.1.dtd">

<secured-repository-template>
  <item-descriptor name="account">
    <owner-property name="accountOwner"/>
    <acl-property name="ACL"/>
    <descriptor-acl value="Admin$role$administrators-group:list,read,write;"/>
  </item-descriptor>
</secured-repository-template>
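The descriptor-acl value in the example follows a visible pattern: a principal, a colon, a comma-separated list of rights, and a terminating semicolon. As a reading aid, the plain-Java sketch below splits such a string into its parts. It is not Dynamo's parser; the class AclStringSketch is hypothetical, and the handling of several semicolon-separated entries is an assumption, since the example shows only one entry.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical reader for ACL strings shaped like "principal:right,right;". */
class AclStringSketch {

    /** Returns a map from each principal to the set of rights granted to it. */
    static Map<String, Set<String>> parse(String acl) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        // Assumption: entries are separated (and terminated) by ';'.
        for (String rawEntry : acl.split(";")) {
            String entry = rawEntry.trim();
            if (entry.isEmpty()) {
                continue;
            }
            // The principal may itself contain '$', so split on the last ':' only.
            int colon = entry.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("missing ':' in ACL entry: " + entry);
            }
            String principal = entry.substring(0, colon).trim();
            Set<String> rights = new LinkedHashSet<>();
            for (String right : entry.substring(colon + 1).split(",")) {
                if (!right.trim().isEmpty()) {
                    rights.add(right.trim());
                }
            }
            result.put(principal, rights);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {Admin$role$administrators-group=[list, read, write]}
        System.out.println(parse("Admin$role$administrators-group:list,read,write;"));
    }
}
```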




Conclusions: ATG Data Anywhere Architecture Decreases Total Cost of Ownership


At the heart of all web applications is data access. Data access makes web sites more intelligent and thereby more useful for companies and customers. Web applications require a data access mechanism to interact with user profiling information, web site content, and enterprise data. A customer-facing web site needs a unified view of all customer interactions (including sales force interactions, call center interactions, and web experiences). This unified view of customer data leads to an integrated and coherent customer experience.

Data access for a web application is especially complex because the object-oriented world of a Java application is quite different from the structure of data in a relational database, an LDAP directory, or a file system. The way you access each of these data sources varies dramatically, so developers have to learn the tools and tips of each kind of system.

J2EE provides some support for data access in the form of JDBC and container-managed entity beans (EJBs), but most implementations of these technologies focus on mapping relational data to objects. Of course, developers can create bean-managed EJBs that interface to whatever data source they want (as long as they're willing to write a lot of code or use a tool to help). JDO is another data access standard that supports data source independence, but it still requires developers to write a Java class for each persistent type, and caching of data objects is left to the vendor implementations.

As you can see, the ATG Data Anywhere Architecture has several advantages over traditional data access mechanisms as summarized below:


• Insulates application developers from schema changes and from changes in storage mechanism (data can move from a relational database to an LDAP directory without requiring any re-coding)
• Unifies your customer data without copying it all into a central data source
• Provides intelligent caching of data objects to ensure excellent performance and timely, accurate results
• Simplifies transactional control (programmatic demarcation using modes, or droplets on a dynamic page)
• Provides powerful searching tools out-of-the-box that can span data sources and data types
• Provides fine-grained access control to data all the way down to the individual property level
• Is easier to use and more powerful than Java Data Objects (JDO) and Enterprise JavaBeans (shorter learning curve, no code required to represent a persistent type, simpler configuration, more than just relational database support)













The ATG Data Anywhere Architecture provides advantages that go well beyond the other options. Dynamo Repositories allow developers to focus on implementing business logic rather than spending time writing "wrapper classes" for each persistent data type used by the application. This focus directly improves time-to-market and significantly reduces the total cost of ownership of web applications.


Appendix: Other Sources of Information


Documentation
ATG Dynamo Application Server Programming Guide
    Part II: Repositories
ATG Dynamo Personalization Programming Guide
    Setting Up a Profile Repository
    Setting Up an LDAP Profile Repository
    Linking SQL and LDAP Repositories
    Working with the Dynamo User Directory
    Setting Up an LDAP User Directory
ATG Dynamo Administration Guide
    Using JDBC with Dynamo
    Configuring Databases
    Managing Database Servers
    Repository and Database Performance
ATG Dynamo Page Developer's Guide
    Using Search Forms
Dynamo 5 ER Diagrams

Education
See atg.com for more information about these education offerings.

Instructor-led Training
    ATG Dynamo Essentials for Java Developers (5 days)
    Utilizing Dynamo Repositories (2 days)
Self-directed learning
    Mastering Web Applications
    Mastering Personalized Applications
ATG e-Learning Connection
    Extending the User Profile (an e-Course)


This publication may not, in whole or in part, be copied, photocopied, translated, or reduced to any electronic medium or machine-readable form for commercial use without prior consent, in writing, from Art Technology Group (ATG), Inc. ATG does authorize you to copy documents published by ATG on the World Wide Web for non-commercial uses within your organization only. In consideration of this authorization, you agree that any copy of these documents which you make shall retain all copyright and other proprietary notices contained herein.

This documentation is provided "as is" without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. The contents of this publication could include technical inaccuracies or typographical errors. Changes are periodically added to the information herein; these changes will be incorporated in the new editions of the publication. ATG may make improvements and/or changes in the publication and/or product(s) described in the publication at any time without notice.

In no event will ATG be liable for direct, indirect, special, incidental, economic, cover, or consequential damages arising out of the use of or inability to use this documentation even if advised of the possibility of such damages. Some states do not allow the exclusion or limitation of implied warranties or limitation of liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you.

Acknowledgments

I would like to thank all of the people who have contributed along the way to the creation of this paper. First and foremost, thanks to Bill Morrison, ATG Product Marketing Manager, who sponsored the creation of this white paper.

Extra special thanks to the following trainers and courseware developers whose inspiration, ideas, diagrams, words, and experience have been used as source material for this white paper: Diana Carroll, Blake Crawford, Kevin Johnson, Pierre Billon, Karin Layher, and Paul Donovan. Thanks go to the following folks who reviewed the paper and provided helpful feedback: Blake Crawford, Karen Kilty, Joyce Wang, and Nathan Abramson. Final thanks go to my wife Bonnie Durante who put up with long hours spent designing, writing, and proofreading.




www.atg.com/offices

America Headquarters
Art Technology Group, Inc.
25 First Street
Second Floor
Cambridge, MA 02141 USA
Tel: +1 617 386 1000
Fax: +1 617 386 1111

North American Offices
Atlanta / Chicago / Dallas / Los Angeles / New York / Palo Alto / San Francisco / Toronto / Washington DC

European Headquarters
Art Technology Group (Europe), Ltd
Apex Plaza
Forbury Road
Reading RG1 1AX UK
Tel: +44 0 118 956 5000
Fax: +44 0 118 956 5001

European Offices
Amsterdam / Frankfurt / London / Milan / Paris / Stockholm

Asia/Pacific Headquarters
Art Technology Group, Inc.
Suite 46 Level 11 Tower B
Zenith Centre
821 Pacific Highway
Chatswood NSW 2067
Sydney Australia
+61 2 8448 2071
+61 2 8448 2010

Asia/Pacific Offices
Hong Kong / Singapore

Japan Headquarters
Art Technology Group, Inc.
Imperial Tower, 15th Floor
1-1-1 Uchisaiwaicho
Chiyoda-ku, Tokyo 100-0011, Japan

www.atg.com

6540001-01 April 2002

© 2002, Art Technology Group, Inc. ATG, Art Technology Group, the techmark, the ATG Logo, and Dynamo are registered trademarks, and Personalization Server and Scenario Server are trademarks of Art Technology Group. All other trademarks are the property of their respective holders. All specifications are subject to change without notice. Art Technology Group, Inc. cannot accept liability for any loss or damage arising from the use of information or particulars in the brochure. NASDAQ:ARTG

About ATG
A trusted, global specialist in e-commerce, ATG has spent the last decade focused on helping the world's premier brands maximize the success of their online businesses. The ATG Commerce application suite is the top-rated platform by industry analysts for powering highly personalized, efficient and effective e-commerce sites. The company's platform-neutral e-commerce optimization services can be easily added to any Web site to increase conversions and reduce abandonment. These services include ATG Recommendations and eStara Connections. For more information, please visit http://www.atg.com.
2009 Art Technology Group, Inc. ATG, Art Technology Group and the ATG logo are registered trademarks of Art Technology Group. All other trademarks are the property of their respective holders. NASDAQ:ARTG
