
Possible BI Developer Interview Questions:

1. Explain SDLC phases and types of SDLC

SDLC phases:
1. Project Initiation - generally a TA/BA will do this (gathering requirements - BRTDD)
2. Analysis Phase - cost, time, what software/technology to use
3. Design Phase - specifications etc.
4. Development Phase - goes along with the design phase
5. Testing Phase - Unit Test, Smoke Test, Integration Test, Load Test, UAT Test
6. Implementation / Production Phase: Dev Server -> Test Server -> Prod Server (USER ACCEPTANCE TEST is done here)

Most commonly known and used SDLC models:

Waterfall Model
The Waterfall Model is the oldest and most well-known SDLC model. The distinctive feature of the Waterfall Model is its sequential, step-by-step process from requirements analysis to maintenance. The major weakness of the Waterfall Model is that after project requirements are gathered in the first phase, there is no formal way to make changes to the project as requirements change or more information becomes available to the project team. What are good candidate software development projects for the Waterfall Model? Systems that have well-defined and understood requirements are a good fit for the Waterfall Model.

Spiral Model
In the Spiral SDLC Model, the development team starts with a small set of requirements and goes through each development phase (except Installation and Maintenance) for that set of requirements. Based on lessons learned from the initial iteration (via a risk analysis process), the development team adds functionality for additional requirements in ever-increasing "spirals" until the application is ready for the Installation and Maintenance phase (production). Each of the iterations prior to the production version is a prototype of the application. The advantage of the Spiral Model over the Waterfall Model is that the iterative approach allows development to begin even when all the system requirements are not known or understood by the development team. As each prototype is tested, user feedback is used to make sure the project is on track. The risk analysis step provides a formal method to ensure the project stays on track even if requirements do change. If new techniques or business requirements make the project unnecessary, it can be canceled before too many resources are wasted. In today's business environment, the Spiral Model (or its other iterative-model cousins) is the most used SDLC model. An example application development project that would be a good candidate for the Spiral Model is an online customer support system where it is not well understood what services customers want or can accomplish online. Each iterative prototype helps answer the question, "Can and will customers use this system?" The Spiral Model combines elements of the Top-Down and Bottom-Up SDLC models that are discussed in the next sections.

Top-Down Model
The Top-Down SDLC model was popularized by IBM in the 1970s, and its concepts are used in other SDLC models such as the Waterfall and Spiral Models previously discussed. In a pure Top-Down model, high-level requirements are documented, and programs are built to meet these requirements. Then the next level is designed and built. A good way to picture the Top-Down model is to think of a menu-driven application. The top-level menu items would be designed and coded, and then each sublevel would be added after the top level was finished. Each menu item represents a subsystem of the total application. The Top-Down model is a good fit when the application is a new one and there is no existing functionality that can be incorporated into the new system. A major problem with the Top-Down model is that real system functionality is not added and cannot be tested until late in the development process. If problems are not detected early in the project, they can be costly to remedy later.

Bottom-Up Model
In the Bottom-Up SDLC model, the lowest level of functionality is designed and programmed first, and finally all the pieces are integrated together into the finished application. This means that, generally, the most complex components are developed and tested first. The idea is that any project show-stoppers will surface early in the project. The Bottom-Up model also encourages the development and use of reusable software components that can be used multiple times across many software development projects. Again, think of a menu-driven system where the development starts with the lowest-level menu items. The disadvantage of the Bottom-Up model is that an extreme amount of coordination is required to be sure that the individual software components work together correctly in the finished system.

Hybrid Model
The Hybrid SDLC model combines the Top-Down and Bottom-Up models.

Rapid Prototyping
With the demand for faster software development, and because of many well-documented failures of traditional SDLC models, Rapid Application Development (RAD) was introduced as a better way to add functionality to an application. The main new tenet of RAD compared to older SDLC models is the use of prototypes. After a quick requirements gathering phase, a prototype application is built and presented to the application users. Feedback from the users provides a loop to improve or add functionality to the application. Early RAD models did not involve the use of real data in the prototype, but newer RAD implementations do use real data. The advantage of Rapid Prototyping models is that time-to-market is greatly reduced. Rapid Prototyping skips many of the steps in traditional SDLC models in favor of fast and low-cost software development. The idea is that the application software is a "throw-away": if a new version of the software is needed, it is developed from scratch using the newest RAD techniques and tools. The big disadvantage of the Rapid Prototyping model is that the process can be too fast, and, therefore, proper testing (especially security testing) may not be done. The Rapid Prototyping model is used for graphical user interface (GUI) applications such as web-based applications. Extreme Programming (XP) is a modern incarnation of the Rapid Prototyping model.

Other SDLC models include: Object-Oriented Model, Model Driven Development, Chaos Model, Agile Programming Model and many others.

2. What is Normalization and Types of normalization with examples


Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

1. A table (relation) is in 1NF if:
   a. There are no duplicated rows in the table.
   b. Each cell is single-valued (i.e., there are no repeating groups or arrays).
   c. Entries in a column (attribute, field) are of the same kind.
2. A table is in 2NF if it is in 1NF and all non-key attributes are dependent on the entire key.
3. A table is in 3NF if it is in 2NF and it has no transitive dependencies; in other words, a table is in 3NF if it is in 2NF and it doesn't have any columns that are not dependent on the primary key.
4. A table is in Boyce-Codd Normal Form (BCNF) if it is in 3NF and every determinant is a candidate key.
5. A table is in 4NF if it is in BCNF and it has no multi-valued dependencies.
6. A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF and every join dependency in the table is a consequence of the candidate keys of the table.
7. A table is in Domain/Key Normal Form (DKNF) if every constraint on the table is a logical consequence of the definition of keys and domains.

Ref: http://en.wikipedia.org/wiki/Database_normalization
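A minimal T-SQL sketch of the idea (the table and column names are hypothetical): an order table that packs a list of products into one column violates 1NF, while splitting it into related tables removes the repeating group and the redundancy.

CREATE TABLE dbo.OrderUnnormalized (
    OrderID      INT PRIMARY KEY,
    CustomerName VARCHAR(50),
    Products     VARCHAR(200)   -- comma-separated list: repeating group, violates 1NF
);

-- Normalized version: one row per order line, customer data kept in its own table
CREATE TABLE dbo.Customer (
    CustomerID   INT PRIMARY KEY,
    CustomerName VARCHAR(50)
);
CREATE TABLE dbo.OrderHeader (
    OrderID    INT PRIMARY KEY,
    CustomerID INT REFERENCES dbo.Customer(CustomerID)
);
CREATE TABLE dbo.OrderLine (
    OrderID     INT REFERENCES dbo.OrderHeader(OrderID),
    LineNumber  INT,
    ProductName VARCHAR(50),
    PRIMARY KEY (OrderID, LineNumber)
);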

3. Explain DDL / DML / DCL / TCL commands with examples and differences
DDL: Data Definition Language statements are used to define the database structure or schema.
1. CREATE - to create objects in the database
2. ALTER - to alter the structure of the database
3. DROP - to delete objects from the database
4. COMMENT - to add comments to the data dictionary
5. RENAME - to rename an object
6. DBCC - (Database Console Commands) statements check the physical and logical consistency of a database

DML: Data Manipulation Language statements are used for managing data within schema objects.
1. SELECT - retrieve data from a database
2. INSERT - insert data into a table
3. UPDATE - update existing data within a table
4. DELETE - delete records from a table (the space for the records remains)
5. MERGE - UPSERT operation (insert or update)
6. CALL - call a PL/SQL or Java subprogram
7. EXPLAIN PLAN - explain the access path to data
8. LOCK TABLE - control concurrency

DCL: Data Control Language statements are used to control the security and permissions of the objects or parts of the database(s).
1. GRANT - to allow specified users to perform specified tasks
2. DENY - to disallow specified users from performing specified tasks
3. REVOKE - to cancel previously granted or denied permissions

TCL: Transaction Control Language statements are used to manage the changes made by DML statements. They allow statements to be grouped together into logical transactions.
1. COMMIT - save the work done
2. SAVEPOINT - a point within a particular transaction to which you may roll back without rolling back the entire transaction
3. ROLLBACK - restore the database to its state as of the last COMMIT
4. SET TRANSACTION - change transaction options such as the isolation level and what rollback segment to use
Once we commit we cannot roll back; once we roll back we cannot commit. COMMIT and ROLLBACK are generally used to commit or revoke transactions involving DML commands.
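A minimal T-SQL sketch of these commands working together (the dbo.Account table and its columns are assumptions):

BEGIN TRANSACTION;
    UPDATE dbo.Account SET Balance = Balance - 100 WHERE AccountID = 1;   -- main work
    SAVE TRANSACTION BeforeFee;                                            -- savepoint
    UPDATE dbo.Account SET Balance = Balance - 10 WHERE AccountID = 1;     -- optional work
    ROLLBACK TRANSACTION BeforeFee;   -- undo only the work done after the savepoint
COMMIT TRANSACTION;                   -- persist everything that was not rolled back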

4. How to debug Stored Procedures / Views / User Defined Functions / Triggers?


Using the Debugging option in SSMS / Visual Studio and by creating breakpoints
Using SQL Profiler

5. Use of BCP command?


With BCP, you can import and export large amounts of data in and out of SQL Server databases quickly and easily. In all, BCP supports 27 switches. For example, the -e switch is handy because it will create an error file you can look at if your BCP command returns errors. You access the BCP utility from the command prompt. Here's the simple syntax:

bcp {dbtable | query} {in | out | queryout | format} datafile [-n native type] [-c character type] [-S server name] [-U username] [-P password] [-T trusted connection]

The command example I'm going to use starts with bcp followed by a fully qualified table name (database name, table or object owner, table or object name). For example, if you want to export the vSalesPerson view, which is part of the Sales schema in the AdventureWorks database, you supply the full view name as AdventureWorks.Sales.vSalesPerson. Next, you use an in or out argument to specify whether you want BCP to copy data into or out of a database. You then specify the location of the data file (C:\Data\SalesPerson.txt) on your database server.

bcp AdventureWorks.Sales.vSalesPerson out C:\Data\SalesPerson.txt -c -T

Ref: http://www.techrepublic.com/blog/datacenter/how-do-i-use-bcp-in-sql-server/319
http://searchsqlserver.techtarget.com/tip/Importing-and-exporting-bulk-data-with-SQL-Servers-bcp-utility

6. What is DTS Exec Utility?


The dtexec command prompt utility is used to configure and execute SQL Server Integration Services packages. The dtexec utility lets you load packages from three sources: a Microsoft SQL Server database, the SSIS service, and the file system. Note: When you use the version of the dtexec utility that comes with SQL Server 2008 to run a SQL Server 2005 Integration Services (SSIS) package, Integration Services temporarily upgrades the package to SQL Server 2008 Integration Services (SSIS). However, you cannot use the dtexec utility to save these upgraded changes. You can run dtexec from the xp_cmdshell prompt. EXEC xp_cmdshell 'dtexec /f "C:\UpsertData.dtsx"'
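For example, the /SQL switch runs a package stored in msdb, /DTS runs one managed by the SSIS package store, and /FILE runs one saved to the file system (the package, server and path names below are assumptions):

dtexec /SQL "\LoadSales" /SERVER "MyServer"
dtexec /DTS "\File System\LoadSales"
dtexec /FILE "C:\Packages\LoadSales.dtsx"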

7. Explain what have you done to performance tune the SSIS packages?

Data flow transformations in SSIS use memory/buffers in different ways. The way a transformation uses memory can dramatically impact the performance of your package. Transformation buffer usage can be classified into 3 categories: Non-Blocking (Conditional Split / Audit / Data Conversion etc.), Partially Blocking (Pivot / Unpivot / Merge / Union All), and Fully Blocking (Aggregate / Sort / Fuzzy Lookup). Generally speaking, if you can avoid fully blocking and partially blocking transformations, your package will simply perform better. Sort is a fully blocking transformation. An easy way around needing the Sort is to sort your source data by using a SQL command in your OLE DB Source instead of just using the drop-down box and choosing Table or View. A Merge transform requires a Sort, but a Union All does not, so use a Union All when you can.

8. How do you deploy the SSIS package?
We can deploy an SSIS package using the Deployment Wizard or using Import Packages. Using the Deployment Wizard, we have two options: 1) File System Deployment 2) SQL Server Deployment

Deploy the package:
1. While in the package designer, choose Project > [Package Name] Properties. The Configuration Manager dialog will appear.
2. Choose Deployment Utility from the tree. Change the Create Deployment Utility option from False to True.
3. Open the Solution Explorer, right-click on the .dtsx file and choose Properties. Copy the Full Path variable and use it to find the bin\Deployment folder.
4. Locate the [Package Name].SSISDeploymentManifest file.
5. Double-click on the file and follow the steps outlined by the wizard to deploy the package.

Test the deployed package:
1. Open MS SQL Server Management Studio and choose Connect > Integration Services from the UI. Choose the server and connect.
2. The packages will be saved under the Stored Packages > MSDB folder. Right-click on the package to run it.

9. Explain Incremental load with example


Time is the big reason to use Incremental Load. Destructive loads (truncating or deleting data first, then loading everything from the source to the destination) take longer. Plus, you're most likely reloading some data that hasn't changed. If the source data is scaling, you may approach a point when you simply cannot load all that data within the nightly time window allotted for such events. So let's handle only the data we need: the new and updated rows. The following figure describes the anatomy of an incremental load. It doesn't matter whether you are using T-SQL, SSIS, or another ETL (Extract, Transform, and Load) tool; this is how you load data incrementally: Read, Correlate, Filter, and Write.

Anatomy of an Incremental Load

Problem Description: Perform an incremental load using an SSIS package. There is one source table with ID (maybe the primary key), CreatedDate and ModifiedDate along with other columns. The requirement is to load the destination table with new records and update the existing records (if any updated records are available).

Solution: You can use a Lookup Transformation where you compare source and destination data based on some id/code to get the new and updated records, and then use a Conditional Split to select the new and updated rows before loading the table. However, I don't recommend this approach, especially when the destination table is very large and the volume of data is very high. You can do it in simple steps:
1. Find the maximum ID and last ModifiedDate from the destination and store them in package variables.
2. Pull the new and updated records from the source and load them into a staging table using the above variables.
3. Insert and update the records using an Execute SQL Task (a T-SQL sketch of this step follows).
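A minimal T-SQL sketch of step 3, the staging-table upsert (the table and column names dbo.DimCustomer, dbo.Staging_Customer, ID, Name and ModifiedDate are assumptions):

-- Update rows that already exist in the destination
UPDATE d
SET    d.Name = s.Name,
       d.ModifiedDate = s.ModifiedDate
FROM   dbo.DimCustomer AS d
JOIN   dbo.Staging_Customer AS s ON s.ID = d.ID;

-- Insert rows that are new to the destination
INSERT INTO dbo.DimCustomer (ID, Name, ModifiedDate)
SELECT s.ID, s.Name, s.ModifiedDate
FROM   dbo.Staging_Customer AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.DimCustomer AS d WHERE d.ID = s.ID);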

Ref: http://sql-bi-dev.blogspot.com/2010/11/incremental-load-using-ssis-package.html

http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/62063/

10. Explain SCD types with an example


The simplest explanation is: SCD compares the attributes (column values) of rows of incoming data against a reference table, using a unique key called the Business Key to identify the record to compare against. What can make it complex is the range of comparison options and possible outputs for the component. The component checks attributes for three scenarios:
1. New record - no record with that business key exists in the reference table
2. Changed attributes - a record with that business key exists and compared attributes have changed
3. Unchanged attributes - a record with that business key exists and compared attributes have not changed
Within those scenarios are a subset of possibilities to allow for the case that a changed attribute shouldn't change (Fixed attributes), that the history of changes needs to be tracked (Historical attributes), or that Inferred members are allowed. The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. Type 2 has unlimited history preservation, and Type 3 has limited history preservation.

Consider, for example, a salesperson who is transferred from one sales region to another. You could sum or average the sales by salesperson, but if you use that to compare the performance of salespeople, it might give misleading information. If the salesperson that was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new salesperson, but that creates problems also. Dealing with these issues involves SCD management methodologies referred to as Type 0 through Type 6. Type 6 SCDs are also sometimes called Hybrid SCDs.

TYPE 01 SCD EXAMPLE: In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key. The disadvantage of a Type 1 SCD is that no historical record is kept in the data warehouse; the advantage is that Type 1 SCDs are very easy to maintain.

Before the change:

Supplier_Key | Supplier_Code | Supplier_Name  | Supplier_State
123          | ABC           | Acme Supply Co | CA

After the change (Type 1 simply overwrites the old value):

Supplier_Key | Supplier_Code | Supplier_Name  | Supplier_State
123          | ABC           | Acme Supply Co | IL

TYPE 02 SCD EXAMPLE: The Type 2 method tracks historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys and/or different version numbers. With Type 2, we have unlimited history preservation as a new record is inserted each time a change is made. Type 2 SCDs are not a good choice if the dimensional model is subject to change.

Supplier_Key | Supplier_Code | Supplier_Name  | Supplier_State | Version
123          | ABC           | Acme Supply Co | CA             | 0
124          | ABC           | Acme Supply Co | IL             | 1

Another popular method for tuple versioning is to add effective date columns:

Supplier_Key | Supplier_Code | Supplier_Name  | Supplier_State | Start_Date  | End_Date
123          | ABC           | Acme Supply Co | CA             | 01-Jan-2000 | 21-Dec-2004
124          | ABC           | Acme Supply Co | IL             | 22-Dec-2004 | NULL

The null End_Date in row two indicates the current tuple version. In some cases, a standardized surrogate high date (e.g. 9999-12-31) may be used as an end date, so that the field can be included in an index, and so that null-value substitution is not required when querying.

TYPE 03 SCD EXAMPLE: The disadvantage of a Type 3 SCD is that it cannot track all historical changes, such as when a supplier moves twice.

Supplier_Key | Supplier_Code | Supplier_Name  | Original_Supplier_State | Effective_Date | Current_Supplier_State
123          | ABC           | Acme Supply Co | CA                      | 22-Dec-2004    | IL
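A hedged T-SQL sketch of a Type 2 change using effective dates (the table and column names dbo.DimSupplier and dbo.Staging_Supplier are assumptions, not part of the original text):

-- Expire the current row when the tracked attribute changes
UPDATE d
SET    d.End_Date = GETDATE()
FROM   dbo.DimSupplier AS d
JOIN   dbo.Staging_Supplier AS s ON s.Supplier_Code = d.Supplier_Code
WHERE  d.End_Date IS NULL
  AND  d.Supplier_State <> s.Supplier_State;

-- Insert a new current row for the business key that changed (or is new)
INSERT INTO dbo.DimSupplier (Supplier_Code, Supplier_Name, Supplier_State, Start_Date, End_Date)
SELECT s.Supplier_Code, s.Supplier_Name, s.Supplier_State, GETDATE(), NULL
FROM   dbo.Staging_Supplier AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.DimSupplier AS d
                   WHERE  d.Supplier_Code = s.Supplier_Code
                     AND  d.End_Date IS NULL
                     AND  d.Supplier_State = s.Supplier_State);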

Ref: http://www.bimonkey.com/2009/07/the-slowly-changing-dimension-transformation-part-1/

http://www.cozyroc.com/ssis/table-difference http://en.wikipedia.org/wiki/Slowly_changing_dimension

11. Explain cleansing operations in SSIS


Data cleansing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table or database; for example, correcting misspelled records in a given record set. Data cleansing is required in order to have consistent, relevant and accurate data.

Before data cleansing, data quality testing needs to be done. After that, data cleansing is done by parsing, data transformation, duplicate elimination and some statistical analysis. The final output of the data cleansing process is accurate, consistent and relevant data. SQL Server Integration Services (SSIS) provides components that can be used to perform data cleansing operations:
Lookup
Fuzzy Lookup
Fuzzy Grouping

Ref: http://gopika-lasitha.blogspot.com/2010/03/data-cleansing-with-ssis.html

12. What is a fact table?


Any table that you've used with a Sum or Average function in a totals query is a good bet to be a fact table. This might be order detail information, payroll records, drug effectiveness information, or anything else that's amenable to summing and averaging.

13. What is a Dimension table?


A dimension table contains hierarchical data by which you'd like to summarize. Examples would be 1) Orders table, that you might group by year, month, week, or day of receipt 2) Books table that you might want to group by genre and title.

14. Can Dimension table contain Numeric Values?


Yes, a dimension table can have numeric values; for example, the surrogate key is a numeric value used for unique identification of records in the dimension.

15. What is Degenerate Dimension Table?


A degenerate dimension is data that is dimensional in nature but is stored in the fact table rather than in a dimension table (an order or invoice number is a typical example). It eliminates the need to join to a dimension table.

16. What are ER Diagrams?


Entity Relationship Diagrams (ERDs) illustrate the logical structure of a database. Tools to generate ER diagrams: Free tools: MySQL Workbench. Proprietary tools: MS Visio, ERwin, ER/Studio etc.

17. What is VLDB?


VLDB = Very Large Database: an environment or storage space managed by a relational database management system (RDBMS) consisting of vast quantities of information. The definition of what exactly a VLDB is changes every day as hardware and software adapt, become faster and faster, and are capable of handling more and more load.

What main issues do you face with a VLDB?
Backing up the database. With a VLDB, a daily backup of everything via RMAN or hot backup is simply not possible, as you can't run the backup in 24 hours. You need to back up less often, back up only part of the DB, use hardware such as mirror splitting or deltas, or use some other trick like, say, never backing it up but having 3 standbys (I've seen it done).
Performance. You need to consider radical changes such as removing RI or designing around full table scans and ignoring the block buffer cache for the largest tables.
Maintenance tasks become a challenge in their own right. This could be stats gathering, adding columns to a table, or recreating global indexes, all of which now take more time than you can schedule in the maintenance windows (so part of the definition of a VLDB could come down to how active a database is and how small your maintenance windows are: 1 TB could be a VLDB if you can never spend more than an hour doing anything!).
GUIs are no use to you. Listing all the tablespaces in your database with OEM is a pain in the proverbial when you have 962 tablespaces. You can't keep track of all of them visually.
You can't properly test or prototype, as you cannot afford to create a full-sized test system.

18. What is Data Mart?


A data mart (DM) is a subset of the data warehouse (DW), typically focused on a single subject area or line of business.

Reasons for creating a data mart


Easy access to frequently needed data
Creates a collective view for a group of users
Improves end-user response time
Easier to create than a full data warehouse
Lower cost than implementing a full data warehouse
Potential users are more clearly defined than in a full data warehouse

19. What is Data Warehouse?


A data warehouse is a place where data is stored for archival, analysis and security purposes. A data warehouse plays a major role in a decision support system (DSS). DSS is a technique used by organizations to come up with facts, trends or relationships that can help them make effective decisions or create effective strategies to accomplish their organizational goals.

Some of the applications data warehousing can be used for are:
1. Decision support systems
2. Trend analysis
3. Financial forecasting
4. Churn prediction for telecom subscribers, credit card users etc.
5. Insurance fraud analysis
6. Call record analysis
7. Logistics and inventory management
8. Agriculture

Advantages:
1. Employees or end users can access the data warehouse and use the data for reports, analysis and decision making.
2. Helps you understand more about the environment that your business operates in.

Disadvantages:
1. It is time consuming to create and to keep operating.
2. Security might be a huge concern, especially if your data is accessible over an open network such as the internet.
3. You might also have a problem with current systems being incompatible with your data warehouse.

20. What is Data Modeling?


Data modeling is the process of designing and validating a database that will be used to meet a business challenge. There are three common types of data models. 1) Conceptual data models define and describe business concepts at a high level for stakeholders addressing a business challenge. 2) Logical data models are more detailed and describe entities, attributes and relationships in business terms. 3) Physical data models define database objects, schema and the actual columns and tables of data that will be created in the database.

21. What are the steps to create a cube?


To build a new data cube using BIDS, you need to perform these steps:
1. Create a new Analysis Services project
2. Define a data source
3. Define a data source view
4. Invoke the Cube Wizard

22. What are the types of reports you have worked with?
I have created simple as well as complex reports: table reports and matrix reports, drill-down, drill-through, parameterized and cascading reports. In one of my projects, I was given a task to work with parameterized reports where the user is given a chance of selecting values from a drop-down list. Each of these parameters cascades from the parameter before it, more like a hierarchy. For this particular requirement I had to create a stored procedure and use it in the report.

23. How do you deploy a report?


Deploying to a report server: after the report has been completed, it has to be deployed to a report server. Then the users can have a look at it through the Report Manager UI. To deploy the report, just right-click on the report and choose Deploy, where you will be asked for the TargetServerURL. This is going to be http://computername/reportserver

Deploying to SharePoint: in addition to the report server, we can also deploy reports to SharePoint. For deploying reports in a SharePoint environment, go to the Documents tab and upload a new report/document. Once this is done, you are required to define the data source for this report. Go to Data Connections documents > New Document > Report Data Source and create a new data source; finally go to the report, right-click, and using Manage Data Sources point this data source to our report.

24. How do you build a report from cube?

25. Explain the steps you followed to tune reports using SSRS?
One thing I observed is too much filtering and data modification going on in the report itself. This causes reports to slow down. To overcome this, one should do as much of this in the T-SQL query (if you are using one). For example, if you are using filters on your dataset, try to put these filters in the T-SQL query (possibly in the WHERE argument section) instead of filtering within a table or a group, etc.

26. How to create a dash board and score cards in SSRS?

27. How will you deal with a slow running query?


This is a very open ended question and there could be a lot of reasons behind the poor performance of a query. But some general issues that you could talk about would be: No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables. Some of the tools/ways that help you troubleshooting performance problems are: SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer.

28. Explain DBCC Commands with examples


The DBCC (Database Consistency Check) commands are divided into four main categories: status, validation, maintenance, and miscellaneous commands.

Status commands: The status commands are the ones you normally run first. With these commands, you can gain an insight into what your server is doing. DBCC SHOWCONTIG (the command you'll probably use the most) shows you how fragmented a table, view or index is. Fragmentation is the non-contiguous placement of data; just like on a hard drive, it's often the cause of the slowness of a system. Other examples: DBCC SHOW_STATISTICS.

Validation commands: Once you've seen the performance issues due to fragmentation or index problems, you normally run these commands next, since they will flush out the problems the various database objects (including the database itself) are having. DBCC CHECKDB is by far the most widely used command to check the status of your database. This command has two purposes: to check a database, and to correct it. Other examples: DBCC CHECKTABLE.

Maintenance commands: The maintenance commands are the final steps you normally run on a database when you're optimizing the database or fixing a problem. DBCC DBREINDEX rebuilds the indexes on a table, while DBCC INDEXDEFRAG defragments an index rather than rebuilding it.

Miscellaneous commands: These commands perform such tasks as enabling row-level locking or removing a dynamic-link library (DLL) from memory. The DBCC HELP command simply shows you the syntax of the other commands.
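A few typical calls, one from each category (the database and table names are assumptions):

DBCC SHOWCONTIG ('Sales.SalesOrderDetail')   -- fragmentation information for a table
DBCC CHECKDB ('AdventureWorks')              -- validate the physical/logical consistency of a database
DBCC DBREINDEX ('Sales.SalesOrderDetail')    -- rebuild the indexes on a table
DBCC HELP ('CHECKDB')                        -- show the syntax of another DBCC command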

29. How will you schedule jobs using SSIS?


With SQL Server Agent we can schedule a job that runs an SSIS package.

30. How will you monitor disk space on a server?


1. xp_fixeddrives
2. fn_fixeddrives()
3. WSH (Windows Scripting Host)
4. WMI (Windows Management Instrumentation)

The xp_fixeddrives system stored procedure returns a list of physical hard drives and the amount of free space on each one.

Syntax:
EXEC master.dbo.xp_fixeddrives   -- SQL Server 2000
fn_fixeddrives()                 -- SQL Server 2005 and up

Normally, using WSH (Windows Scripting Host) and WMI (Windows Management Instrumentation) is a better way of gathering disk information.

31. Explain different backup and recovery models

Backup methods (5 in total):
1. A full backup makes a complete backup of your database. Typically done every weekend.
2. A differential backup stores all changes that have occurred to the database since the last full backup. Typically done every night.
3. A filegroup backup is useful when your database is so large that a full backup would take too long.
4. A transaction log backup creates a copy of all changes made to the database that are currently stored in the transaction log file. Typically done every 30 minutes. If you use the simple recovery model, you will not have the option of transaction log backups.
5. A tail-log backup may or may not retrieve data. Typically done every 15 minutes.

Recovery Models:
1. Full recovery model
2. Bulk-logged recovery model
3. Simple recovery model
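Minimal T-SQL examples of the main backup types (the database name and file paths are assumptions):

BACKUP DATABASE AdventureWorks TO DISK = 'D:\Backups\AdventureWorks_full.bak';                   -- full
BACKUP DATABASE AdventureWorks TO DISK = 'D:\Backups\AdventureWorks_diff.bak' WITH DIFFERENTIAL; -- differential
BACKUP LOG AdventureWorks TO DISK = 'D:\Backups\AdventureWorks_log.trn';                         -- transaction log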

32. Why do we need SQL Profiler?


SQL Profiler is a tool that captures SQL Server events from the server and saves those events in what's known as a trace file. You can then analyze or use the trace file to troubleshoot logic or performance problems. This information can be used to troubleshoot a wide range of SQL Server issues, such as poorly performing queries, locking and blocking, excessive table/index scanning, and a lot more. You can use this utility to monitor several areas of server activity, such as:
Analyzing and debugging SQL statements and stored procedures
Monitoring slow performance
Stress analysis
General debugging and troubleshooting
Fine-tuning indexes
Auditing and reviewing security activity

33. What is SQL Mail?


SQL Server has an integrated mailing system. SQL Server 2000 has SQL Mail. SQL Mail allows SQL Server to send and receive e-mail by establishing a client connection with a supported mail server. Outlook installations, Messaging Application Programming Interface (MAPI) profiles, a third-party Simple Mail Transfer Protocol (SMTP) connector, and extended stored procedures are all needed for SQL Mail. More importantly, SQL Mail will degrade SQL Server performance. So, to overcome all of this, SQL Server 2005 came up with Database Mail.

34. What is the index tuning wizard?


The index tuning wizard is a utility that comes with SQL Server that makes recommendations on how indexes should be built on a database to optimize performance. These recommendations are derived based on T-SQL commands that the wizard analyzes. The wizard makes it easy to tune your indexes without any great understanding of SQL Server index structure and/or database internals. The index tuning wizard can also determine how a proposed change might affect performance. The wizard has the capability to make index changes immediately, schedule them for a later timeframe, or build a T-SQL script to create the indexes.

35. Explain the SQL Client Network Utility

The SQL Client Network Utility lets you change the way ADO connects to SQL Server and MSDE by changing the protocols used.

36. Explain Temporary Tables, Table Variables and CTEs
Temp tables behave just like normal tables, but are created in the TempDB database. They persist until dropped, or until the connection that created them disappears. They are visible in the procedure that created them. Just like normal tables, they can have primary keys, constraints and indexes, and column statistics are kept for the table. They are divided into two kinds: local temp tables (#T) and global temp tables (##T).

Table variables behave very much like other variables in their scoping rules. They are created when they are declared and are dropped when they go out of scope. They cannot be explicitly dropped. Just like temp tables, table variables also reside in TempDB. Table variables can have a primary key, but indexes cannot be created on them, and no statistics are maintained on their columns. This makes table variables less optimal for large numbers of rows, as the optimizer has no way of knowing the number of rows in the table variable.

A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing and can be referenced multiple times in the same query. SQL Server supports two types of CTEs: recursive and nonrecursive. A nonrecursive CTE is one that does not reference itself within the CTE. A recursive CTE is one that references itself within that CTE.
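A minimal sketch contrasting the three (dbo.Orders and its columns are assumptions used only for illustration):

CREATE TABLE #TempOrders (OrderID INT PRIMARY KEY, Amount MONEY);   -- local temp table, lives in TempDB

DECLARE @Orders TABLE (OrderID INT PRIMARY KEY, Amount MONEY);      -- table variable, scoped to the batch

WITH RecentOrders AS
(
    SELECT OrderID, Amount
    FROM   dbo.Orders
    WHERE  OrderDate >= '20110101'
)
SELECT * FROM RecentOrders;   -- the CTE exists only for this one statement

DROP TABLE #TempOrders;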

37. Which one is better? Temporary Tables, Table Variable and CTE?
Depends

38. How to Unit Test in SSIS


Unit testing of SSIS packages really comes down to testing of Executables in a control flow, and particularly executables with a high degree of programmability. The two most significant control flow executable types are Script Task executables and Data Flow executables.

http://politechnosis.kataire.com/2008/06/ssis-unit-testing.html

39. How do you handle errors in SSIS?


Errors typically fall into 3 categories:
1. Data conversion errors: These occur if a conversion results in the loss of significant digits, the loss of insignificant digits, or the truncation of strings. Data conversion errors also occur if the requested conversion is not supported.
2. Expression evaluation errors: These occur if expressions that are evaluated at run time perform invalid operations or become syntactically incorrect because of missing or incorrect data values.
3. Lookup errors: These occur if a lookup operation fails to locate a match in the lookup table.

At the control flow level, I will use the OnError event handler and log the error in a custom table.
At the data flow level, based on the business requirement: if I have to redirect the error rows (to a flat file or a table), I will redirect them; if I have to ignore the failure, I will ignore it; if I have to fail the component, I will fail it.
If the package fails due to network errors, I will look for the reason for the failure in the job history.

40. Types of indexes and differences


Indexes follow a Balanced Tree (B-TREE) structure, and search results are quicker with indexes. There are two types: clustered and non-clustered.

Clustered index:
All the data is stored at the leaf level, so there is no need for a heap.
You can create only 1 clustered index per table.
When you create a clustered index, the data is physically sorted on the column.
Data sorting is fast.
Cannot use the INCLUDE option.

Non-clustered index:
Instead of data, there is a pointer at the leaf level which points to the data in the heap.
SQL Server 2005 supports up to 249 nonclustered indexes per table, and SQL Server 2008 supports up to 999.
When you create a non-clustered index, the data is logically sorted on the column.
Data sorting is slower, and non-clustered indexes can take up quite a bit of space on the disk.
Can use the INCLUDE option.

You cannot create indexes on columns configured with large object (LOB) data types, such as image, text, and varchar(max). An index key allows only up to 900 bytes (the sum of the sizes of all key columns). If the sum is greater than 900 bytes, you have to use the INCLUDE option to build an index covering those columns.
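Hedged examples of creating each type, including the INCLUDE option (the table and column names are assumptions):

CREATE CLUSTERED INDEX IX_Orders_OrderID ON dbo.Orders (OrderID);
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID ON dbo.Orders (CustomerID) INCLUDE (OrderDate, TotalDue);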

41. Types of triggers and Adv & Dis-Advantages of triggers


A trigger is an automatic stored procedure which gets fired when a defined event happens. We create triggers to maintain business rules. Depending on the event that fires them, triggers are classified as DML triggers (INSTEAD OF triggers, which fire in place of the triggering insert/update/delete, and AFTER triggers, which fire after the event) and DDL triggers (which fire on events such as CREATE, ALTER and DROP). Example uses: table auditing, sending email. A trigger can be created between two tables even if the two tables are in different databases. In a data warehouse, we will not use triggers. Nesting of triggers is possible up to 32 levels.

Advantages of triggers:
1. They can catch errors in business logic at the database level.
2. They provide an alternative way to run scheduled tasks.
3. They are very useful for auditing changes to the data in a database table.
4. They provide an alternative way to check integrity.

Disadvantages of triggers:
1. Triggers execute invisibly from the client application which connects to the database server, so it is difficult to figure out what happens at the database layer.
2. Triggers run on every update made to the table, therefore they add more load to the database and can cause the system to run slow.
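A minimal sketch of an AFTER trigger used for table auditing (dbo.Employee and dbo.Employee_Audit are assumed tables, not from the original text):

CREATE TRIGGER trg_Employee_Audit
ON dbo.Employee
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Employee_Audit (EmployeeID, ChangedOn)
    SELECT d.EmployeeID, GETDATE()
    FROM   deleted AS d;   -- "deleted" holds the pre-update image of the changed rows
END;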

42. Difference between Stored Procedure and Function

Stored Procedure:
Can perform error handling.
Cannot be used as a table-valued object.
May or may not return a value.
Cannot be called from a SELECT statement.
We can call a function in a stored procedure.

Function:
Cannot perform error handling.
Can be used as a table-valued function.
Must return a value.
Can be called from a SELECT statement.
We cannot call a stored procedure in a function; only extended stored procedures can be called.

43. Difference between Stored Procedure and View

Stored Procedure:
We cannot update a stored procedure.
We cannot write SELECT * FROM a stored procedure.
We cannot join two stored procedures.

View:
We can update a view.
We can write SELECT * FROM a view.
We can join two views.

44. Difference between Stored Procedure and Trigger

Stored Procedure:
You can pass parameters to a stored procedure.
We have to explicitly call the stored procedure.
We can call a stored procedure within a trigger.
A stored procedure is written for a database.
A stored procedure may return a value.

Trigger:
You can't pass parameters to a trigger.
A trigger is implicitly fired when there is an insert/update/delete on a table or view.
We cannot write a trigger within a stored procedure.
A trigger is written on an individual table.
A trigger will not return a value.

45. Primary Key vs. Unique Key

Primary Key:
Will not allow NULLs.
We can have only one primary key on a table.
Makes sure every row is unique.

Unique Key:
Will allow only one NULL.
We can have multiple unique keys on a table.
Makes sure a set of columns is unique.

46. Truncate vs. Delete

Truncate:
TRUNCATE will reset any identity columns to the default seed value.
Cannot TRUNCATE a table that has any foreign key constraints.
TRUNCATE is also a logged operation, but in a different way: it logs the deallocation of the data pages in which the data exists.
TRUNCATE is a faster operation to perform compared to DELETE.
You can't use a WHERE clause.
Triggers will not get fired.

Delete:
DELETE will not reset any identity columns to the default seed value.
Can DELETE any row that will not violate a constraint, while leaving the foreign key or any other constraint in place.
DELETE is a logged operation on a per-row basis.
DELETE is a slower operation compared to TRUNCATE.
You can use a WHERE clause.
Triggers will get fired.

47. Identity Column Vs Primary Key


Identity column:
1. An identity column is auto-incremented.
2. An identity column can only have numeric values.
3. We can have only one identity column in a table.
4. An identity column is often used as the primary key, but it does not have to be one.
5. Identity values cannot be updated.

Primary key:
1. A primary key value is not auto-incremented by default; it has to be entered by the user (unless it is defined on an identity column).
2. A primary key column can be int, numeric or even char.
3. A primary key can be created on more than one column (composite primary key).
4. Not all primary keys are identity columns.
5. Primary key values can be updated.

48. Char Vs Varchar


Char:
1. Fixed-length storage.
2. CHAR takes up 1 byte per character.
3. Use CHAR when the data entries in a column are expected to be the same size.

Varchar:
1. Variable-length storage.
2. VARCHAR takes up 1 byte per character, plus 2 bytes to hold length information.
3. Use VARCHAR when the data entries in a column are expected to vary considerably in size.

Conclusion:
1. When storing fixed-length data in a column, such as a phone number, use CHAR.
2. When storing variable-length data in a column, such as an address, use VARCHAR.

49. Where vs. Having clause


Where clause:
1. The WHERE clause can also be used in statements other than SELECT.
2. WHERE applies to each and every single row.
3. With WHERE, the data is filtered as it is fetched, according to the condition.
4. WHERE is used before the GROUP BY clause.
Example: applying a condition to the individual rows.

Having clause:
1. HAVING is used only with the SELECT statement.
2. HAVING applies to summarized rows (summarized with GROUP BY).
3. With HAVING, the complete data is fetched first and then filtered according to the condition.
4. HAVING is used after the GROUP BY clause; it is used to impose a condition on a group function.
Example: when using the AVG function and then filtering the data, e.g. AVG(Sales) > 0.

Summary: without a GROUP BY clause, HAVING behaves like a WHERE clause.

50. VARCHAR Vs NVARCHAR


VARCHAR:
1. Storage: 1 byte per character.
2. Stands for: variable-length character string.
3. Accepts only English (non-Unicode) characters.
4. Doesn't support other languages' symbols.
5. Runs faster than NVARCHAR as it consumes less memory.
6. Use this when you develop an application for local use only.

NVARCHAR:
1. Storage: 2 bytes per character.
2. Stands for: Unicode variable-length character string.
3. Accepts both English characters and non-English symbols.
4. Supports other languages' symbols.
5. Runs slower than VARCHAR as it consumes more memory.
6. Use this when your application is used globally.

51. What are Views and Types of VIEWS


A view is a virtual table that consists of columns from one or more tables. These tables are referred to as base or underlying tables. A view serves as a security mechanism. This ensures that users are able to retrieve and modify only the data seen by them. Users cannot see or access the remaining data in the underlying tables. A view also serves as a mechanism to simplify query execution. Complex queries can be stored in the form as a view, and data from the view can be extracted using simple queries. Views ensure the security of data by restricting access to the following data:

Specific rows of the tables.
Specific columns of the tables.
Specific rows and columns of the tables.
Rows fetched by using joins.
Statistical summary of data in given tables.
Subsets of another view or a subset of views and tables.

Some common examples of views are:

A subset of rows or columns of a base table.
A union of two or more tables.
A join of two or more tables.
A statistical summary of base tables.
A subset of another view, or some combination of views and base tables.

The restrictions imposed on views are as follows:

A view can be created only in the current database.
The name of a view must follow the rules for identifiers and must not be the same as that of the base table.
A view can be created only if there is a SELECT permission on its base table.
A SELECT INTO statement cannot be used in a view declaration statement.
A trigger or an index cannot be defined on a view.
The CREATE VIEW statement cannot be combined with other SQL statements in a single batch.

SQL Server stores information on the view in the following system tables:

SYSOBJECTS stores the name of the view.
SYSCOLUMNS stores the names of the columns defined in the view.
SYSDEPENDS stores information on the view dependencies.
SYSCOMMENTS stores the text of the view definition.

There are also certain system-stored procedures that help retrieve information on views. The sp_help system-stored procedure displays view-related information. It displays the view definition, provided the name of the view is given as its parameter. The guidelines for renaming a view are as follows:

The view must be in the current database.
The new name for the view must follow the rules for identifiers.
A view can be renamed only by its owner. A view can also be renamed by the owner of the database.
A view can be renamed by using the sp_rename system stored procedure.

Manipulating Data Using Views


You can modify data through a view in only one of its base tables, even though a view may be derived from multiple underlying tables. For example, a view vwNew that is derived from two tables, table1 and table2, can be used to modify either table1 or table2 in a single statement. A single data modification statement that affects both underlying tables is not permitted. You cannot modify the following kinds of columns using a view:

Columns that are based on computed values.
Columns that are based on built-in functions like numeric and string functions.
Columns that are based on row aggregate functions (SUM / GROUP BY).

Consider a situation in which a table contains a few columns that have been defined as NOT NULL, but the view derived from the table does not contain any of these NOT NULL columns. During an INSERT operation through the view, there may be two situations:

All the NOT NULL columns in the base table are defined with default values.
Some of the NOT NULL columns in the base table are not defined with default values.

In the first case, the INSERT operation will be successful because the default values are supplied for the NOT NULL columns. In the second case, the INSERT operation will fail because default values are not supplied for the NOT NULL columns.

Difference between view and materialized view View - store the SQL statement in the database and let you use it as a table. Every time you access the view, the SQL statement executes. Materialized view - stores the results of the SQL statement in the table form in a database. SQL statement only executes once and after that every time you run the query, the stored result set is used. Pros include quick query results.
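A minimal sketch of an ordinary (non-materialized) view (the table and column names dbo.Customer and IsActive are assumptions):

CREATE VIEW dbo.vw_ActiveCustomers
AS
SELECT CustomerID, CustomerName, State
FROM   dbo.Customer
WHERE  IsActive = 1;

SELECT * FROM dbo.vw_ActiveCustomers;   -- queried like a table; the underlying SELECT runs each time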

52. Explain Cursors and types of Cursors?


Cursors: Whenever you have to do a row-by-row operation, you go with cursors. A good example is one that requires running totals. The basic syntax is DECLARE cursor_name CURSOR FOR select_statement. Cursors are reasonable on tables with around 2,000 to 10,000 records. When you apply a cursor to a table, it can lock down the table. Cursors always get their data from TempDB.

Cursors are of 3 types:
STATIC - with these cursors, we may not always work with the updated data.
DYNAMIC - before this type of cursor accesses the data in the next row of TempDB, it goes back to the main table to check if there is any change in the data; if there is a change, it is picked up accordingly.
KEYSET - changes in non-key column values are picked up, but changes in key columns are not (only the primary key and unique key qualify; if we have both, it goes with the primary key).

Cursor attributes:
1. READ ONLY: SQL Server will not lock the table if this attribute is used.
2. FAST_FORWARD: if this attribute is selected, the cursor always goes in sequence while performing a row-by-row operation (row 1, row 2, row 3, ...). The FETCH NEXT option is the only fetch available with this attribute.
3. SCROLL: if this attribute is selected, you can scroll between the rows while performing a row operation (row 1, row 30, row 4, row 77, ...).

Here are some alternatives to using a cursor (a basic cursor sketch follows below):
Use WHILE loops.
Use the ROW_NUMBER function and loop over every row.
Use table variables, temp tables or derived tables.
Use correlated sub-queries or the CASE statement.
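A minimal cursor sketch (dbo.Orders and OrderID are assumed names used for illustration):

DECLARE @OrderID INT;
DECLARE order_cursor CURSOR FAST_FORWARD FOR
    SELECT OrderID FROM dbo.Orders;
OPEN order_cursor;
FETCH NEXT FROM order_cursor INTO @OrderID;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @OrderID;                                 -- row-by-row work goes here
    FETCH NEXT FROM order_cursor INTO @OrderID;
END
CLOSE order_cursor;
DEALLOCATE order_cursor;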

53. What are different types of constraints?


There are six types of constraints:
1. Primary Key constraint
2. Unique Key constraint
3. Foreign Key constraint
4. Default constraint
5. Check / Domain constraint
6. Not Null constraint

54. Types of joins? Example of a self Join?


There are three types of joins:
1. Inner Join
2. Outer Join (Left Outer Join, Right Outer Join, Full Outer Join)
3. Cross Join
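A self join joins a table to itself; a hedged sketch using an assumed employee/manager table (dbo.Employee with a ManagerID column is not from the original text):

SELECT e.EmployeeName, m.EmployeeName AS ManagerName
FROM   dbo.Employee AS e
LEFT JOIN dbo.Employee AS m
       ON e.ManagerID = m.EmployeeID;   -- same table on both sides, aliased differently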

55. What are Transactions? (TCL Commands)


A transaction is defined as a unit of work that succeeds or fails as a batch.
BEGIN TRANSACTION
COMMIT TRANSACTION
ROLLBACK TRANSACTION
SQL Server allows you to nest transactions. Basically, this feature means that a new transaction can start even though the previous one is not complete.

56. What is a sub query? Properties of sub Queries.


A query within a query is called a sub-query. Sub-queries are divided into two types: correlated and non-correlated.
Correlated sub-query: the output of the inner query depends on the outer query.
Non-correlated sub-query: the output of the inner query does not depend on the outer query.
To increase performance, try to replace sub-queries with joins or with a WHERE EXISTS clause, and try to replace non-correlated sub-queries with correlated sub-queries.
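Hedged sketches of each kind (dbo.Orders and dbo.Customer are assumed tables):

-- Non-correlated: the inner query can run on its own
SELECT OrderID
FROM   dbo.Orders
WHERE  CustomerID IN (SELECT CustomerID FROM dbo.Customer WHERE State = 'WI');

-- Correlated: the inner query references the outer query's current row
SELECT o.OrderID
FROM   dbo.Orders AS o
WHERE  EXISTS (SELECT 1
               FROM   dbo.Customer AS c
               WHERE  c.CustomerID = o.CustomerID
                 AND  c.State = 'WI');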

57. Explain steps to migrate SS2005 to SS2008

58. What is a check point in SSIS?
SQL Server Integrated Services (SSIS) offers the ability to restart failed packages from the point of failure without having to rerun the entire package. When checkpoints are configured, the values of package variables as well as a list of tasks that have completed successfully are written to the checkpoint file as XML. When the package is restarted, this file is used to restore the state of the package to what it was when the package failed.

59. What is a break point SSIS?


Breakpoints are used in debugging an SSIS package. SSIS allows you to set up two different kinds of breakpoints: on packages, tasks and containers, and inside script objects. Once set, the execution of your SSIS package will stop at breakpoints and allow you to view the package in a paused state. Breakpoints can only be set on items in your control flow; you cannot set up breakpoints on data flow tasks. To troubleshoot problems inside a data flow, Data Viewers come into play.

60. What is a Save Point in Transactions?


Savepoints offer a mechanism to roll back portions of transactions. A user can set a savepoint, or marker, within a transaction. The savepoint defines a location to which a transaction can return if part of the transaction is conditionally canceled. SQL Server allows you to use savepoints via the SAVE TRANSACTION (SAVE TRAN) statement.

61. Difference between Star & Snow Flake Schemas?


Star schema: It contains only one fact table which is associated with numerous dimension tables (with the help of foreign keys) around it, which resembles the shape of a star. This schema depicts a high level of denormalized view of data. Star Schema cannot have a parent table. However in Snowflake schema, each dimension table is associated with many sub dimension tables and it is considered as a normalized form of star schema. Snowflake Schema can have a parent table.

Star Schema:
Definition: The star schema is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center: a single fact table (the center of the star) surrounded by multiple dimension tables (the points of the star).
Advantages: Simplest DW schema. Easy to understand. Easy to navigate between the tables due to the smaller number of joins. Most suitable for query processing.
Disadvantages: Occupies more space. Highly denormalized.

Snowflake Schema:
Definition: A snowflake schema is a data warehouse schema which consists of a single fact table and multiple dimension tables, where the dimension tables are normalized. It is a variant of the star schema where each dimension can have its own dimensions.
Advantages: The tables are easier to maintain. Saves storage space.
Disadvantages: Due to the large number of joins, it is complex to navigate.

Starflake schema - Hybrid structure that contains a mixture of (denormalized) STAR and (normalized) SNOWFLAKE schemas.

62. How will you maintain security in SQL Server?


Again this is another open ended question. Here are some things you could talk about: Preferring NT authentication, using server, database and application roles to control access to the data, securing the physical database files using NTFS permissions, using an unpredictable SA password, restricting physical access to the SQL Server, renaming the Administrator account on the SQL Server computer, disabling the Guest account, enabling auditing, using multiprotocol encryption, setting up SSL, setting up firewalls, isolating SQL Server from the web server etc.

63. Explain Locks , Types of locks


Locks ensure transactional integrity and database consistency. Locking prevents users from reading data being changed by other users, and prevents multiple users from changing the same data at the same time. If locking is not used, data within the database may become logically incorrect, and queries executed against that data may produce unexpected results.
Shared lock (S) - more than one query can access the object.
Exclusive lock (X) - only one query can access the object.
Update lock (U)
Intent shared (IS)
Intent exclusive (IX)
Lock escalation is the process of converting many low-level locks (like row locks and page locks) into higher-level locks (like table locks).

64. What is dead lock? How to avoid dead locks?


Deadlocking occurs when two user processes have locks on separate objects and each process is trying to acquire a lock on the object that the other process holds. When this happens, SQL Server identifies the problem and ends the deadlock by automatically choosing one process as the victim and aborting it, allowing the other process to continue. The aborted transaction is rolled back and an error message is sent to the user of the aborted process. Generally, the transaction that requires the least amount of overhead to roll back is the one that is aborted. Here are some tips on how to avoid deadlocking on your SQL Server (a retry sketch follows the list):
1. Ensure the database design is properly normalized.
2. Have the application access server objects in the same order each time.
3. During transactions, don't allow any user input. Collect it before the transaction begins.
4. Avoid cursors.
5. Keep transactions as short as possible. One way to accomplish this is to reduce the number of round trips between your application and SQL Server by using stored procedures or keeping transactions within a single batch. Another way of reducing the time a transaction takes to complete is to make sure you are not performing the same reads over and over again. If your application does need to read the same data more than once, cache it in a variable or an array and re-read it from there, not from SQL Server.
6. Reduce lock time. Develop your application so that it grabs locks at the latest possible time and releases them at the earliest possible time.
7. If appropriate, reduce lock escalation by using the ROWLOCK or PAGLOCK hints.
8. Consider using the NOLOCK hint to prevent locking if the data being read is not modified often.
9. If appropriate, use as low an isolation level as possible for the user connection running the transaction.
10. Consider using bound connections.
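
When a deadlock cannot be avoided entirely, the victim can simply retry. A minimal sketch, assuming a hypothetical stored procedure dbo.TransferFunds; the deadlock victim receives error number 1205 (THROW requires SQL Server 2012 or later):

DECLARE @Retry INT = 3
WHILE @Retry > 0
BEGIN
    BEGIN TRY
        EXEC dbo.TransferFunds @From = 1, @To = 2, @Amount = 100
        SET @Retry = 0                       -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205             -- this session was chosen as the deadlock victim
        BEGIN
            SET @Retry = @Retry - 1
            WAITFOR DELAY '00:00:01'         -- brief pause before retrying
        END
        ELSE
            THROW                            -- re-raise any other error
    END CATCH
END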

65. What is an execution plan?


A query execution plan outlines how the SQL Server query optimizer actually ran (or will run) a specific query. This information is very valuable when it comes time to find out why a specific query is running slowly. Execution plans show you what's going on behind the scenes in SQL Server. They can provide you with a wealth of information on how your queries are being executed, including: which indexes are being used and where no indexes are being used at all; how the data is being retrieved, and joined, from the tables defined in your query; how aggregations in GROUP BY queries are put together; and the anticipated load, and the estimated cost, that all these operations place upon the system.
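
Two common ways to look at a plan from T-SQL, sketched here against a hypothetical Person table (any query works):

SET SHOWPLAN_XML ON        -- return the estimated plan as XML instead of executing the query
GO
SELECT FirstName, Age FROM Person WHERE Age > 30
GO
SET SHOWPLAN_XML OFF
GO

SET STATISTICS PROFILE ON  -- execute the query and return the actual plan operators as extra rows
GO
SELECT FirstName, Age FROM Person WHERE Age > 30
GO
SET STATISTICS PROFILE OFF
GO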

66. What is log shipping?


Essentially, log shipping is the process of automating the backup of database and transaction log files on a production SQL Server, and then restoring them onto a standby server. But this is not all. The key feature of log shipping is that it will automatically back up transaction logs throughout the day (at whatever interval you specify) and automatically restore them on the standby server. This in effect keeps the two SQL Servers in sync. Should the production server fail, all you have to do is point the users to the new server, and you are all set.
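
What the log shipping jobs do on each cycle can be sketched manually as follows (the SalesDB database name and the file paths are hypothetical):

-- On the production server: back up the transaction log
BACKUP LOG SalesDB TO DISK = 'D:\LogShip\SalesDB_1200.trn'

-- Copy the .trn file to the standby server (normally handled by the copy job), then on the standby:
RESTORE LOG SalesDB FROM DISK = 'E:\LogShip\SalesDB_1200.trn'
    WITH STANDBY = 'E:\LogShip\SalesDB_undo.dat'    -- leaves the standby database readable between restores
-- WITH NORECOVERY can be used instead if read access to the standby copy is not needed.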

67. Explain Try-Catch Block


In versions before SQL Server 2005, exceptions were handled by checking the @@ERROR global variable immediately after an INSERT, UPDATE or DELETE, and then performing some corrective action if @@ERROR did not equal zero. From SQL Server 2005 onwards, exception handling is done through a TRY...CATCH block, as in other programming languages like Java and C#. Example:
BEGIN TRY
    RAISERROR ('A problem is raised', 16, 1)
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()   AS ERROR_NUMBER,
           ERROR_SEVERITY() AS ERROR_SEVERITY,
           ERROR_STATE()    AS ERROR_STATE,
           ERROR_MESSAGE()  AS ERROR_MESSAGE
END CATCH

ERROR_NUMBER() returns the error number.
ERROR_SEVERITY() returns the severity.
ERROR_STATE() returns the error state number.
ERROR_PROCEDURE() returns the name of the stored procedure or trigger where the error occurred.
ERROR_LINE() returns the line number inside the routine that caused the error.
ERROR_MESSAGE() returns the complete text of the error message. The text includes the values supplied for any substitutable parameters, such as lengths, object names and times.

68. What Third Party tools have you used in your previous projects?
CozyRoc's Table Difference component (used to replace the SCD transformation)
SQL Prompt 5.0 (a Red Gate tool)

69. What are the challenges have you faced in your previous projects? How did you overcome those challenges?
One of our business requirements was that the output had to be in XML format. Since SSIS does not have an XML destination component, I had to write a stored procedure using the FOR XML clause to produce the XML output, as sketched below.
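
A minimal sketch of that approach, with a hypothetical Member table and procedure name:

CREATE PROCEDURE dbo.GetMembersAsXml
AS
BEGIN
    SELECT MemberID, FirstName, LastName
    FROM dbo.Member
    FOR XML PATH('Member'), ROOT('Members')   -- one <Member> element per row, wrapped in a <Members> root
END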

70. Explain your previous project and your roles and responsibilities in it
My most recent project was with Security Health Plan, located in Marshfield, WI. It was a migration project from their legacy system to the QNXT application, with a SQL Server database as the back end. My role in this project was mainly as an SSIS Developer / Interface & Extract Developer. I designed SSIS packages to export data out of SQL Server using stored procedures. I also validated (using BIDS and the Execute Package Utility --> Validation) and deployed SSIS packages. I prepared the release notes document as well as test case scenarios for all the interfaces and extracts. I optimized existing SQL queries and fine-tuned SSIS packages by eliminating blocking transformations and replacing them with either partially blocking or non-blocking transformations. I also used cache connection managers while performing lookup operations, which increased processing speed.

71. Why do you want to join our company?


The Lacek Group's prevailing area of expertise is loyalty and retention marketing. Loyalty Program Tracking Software: REWARD and jREWARD

72. Explain Ranking Functions?


T-SQL currently supports four ranking functions: ROW_NUMBER, RANK, DENSE_RANK, and NTILE.
ROW_NUMBER() returns a sequential number starting at 1 for each row or grouping within your result set. For example: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
RANK() returns the rank of each row within the partition of a result set. When there is a tie, the same rank is assigned to the tied rows. For example: 1, 2, 3, 3, 3, 6, 7, 7, 9, 10.
DENSE_RANK() works like RANK(), except that the numbers returned are packed (they do not have gaps) and are always consecutive. For example: 1, 2, 3, 3, 3, 4, 5, 5, 6, 7.
NTILE() is used to break up a record set into a specific number of groups. In this NTILE example, I want to group my Person records into three different groups based on the Age column. To do that I would run the following T-SQL:

SELECT FirstName, Age, NTILE(3) OVER (ORDER BY Age) AS [Age Groups]
FROM Person

Here is the result set from the above T-SQL command:

FirstName   Age   Age Groups
---------   ---   ----------
Larry         5            1
Doris         6            1
George        6            1
Mary         11            1
Sherry       11            2
Sam          17            2
Ted          23            2
Marty        23            2
Sue          29            3
Frank        38            3
John         40            3

In my result set I ended up with three different age groups. The first age group goes from age 5 to age 11, the second goes from 11 to 23, and the last goes from 29 to 40. The NTILE function simply divides your record set as evenly as possible into the number of groups requested. By using the NTILE function, each record in a group is given the same ranking.
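
For comparison, a hedged sketch of the other three ranking functions run against the same hypothetical Person table:

SELECT FirstName,
       Age,
       ROW_NUMBER() OVER (ORDER BY Age) AS RowNum,      -- 1, 2, 3, ... no ties, no gaps
       RANK()       OVER (ORDER BY Age) AS AgeRank,     -- tied ages share a rank; gaps follow the ties
       DENSE_RANK() OVER (ORDER BY Age) AS DenseRank    -- tied ages share a rank; no gaps
FROM Person
ORDER BY Age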

73. Pivot and Unpivot?


PIVOT and UNPIVOT relational operators are used to change a table-valued expression into another table. PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output. UNPIVOT performs the opposite operation to PIVOT by rotating columns of a table-valued expression into column values.
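
A minimal sketch, assuming two hypothetical tables: SalesByQuarter (SalesPerson, SalesQuarter, Amount) in row form, and QuarterlySales (SalesPerson, Q1, Q2, Q3, Q4) in column form:

-- PIVOT: rotate the quarter values into columns, summing Amount for each
SELECT SalesPerson, [Q1], [Q2], [Q3], [Q4]
FROM (SELECT SalesPerson, SalesQuarter, Amount FROM SalesByQuarter) AS src
PIVOT (SUM(Amount) FOR SalesQuarter IN ([Q1], [Q2], [Q3], [Q4])) AS pvt

-- UNPIVOT: rotate the quarter columns back into rows
SELECT SalesPerson, SalesQuarter, Amount
FROM QuarterlySales
UNPIVOT (Amount FOR SalesQuarter IN ([Q1], [Q2], [Q3], [Q4])) AS unpvt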

If we PIVOT any table and then UNPIVOT that table, do we get the original table back? This is a good question. The answer is yes, but not always. When we pivot a table we use an aggregate function; if the data is actually aggregated by that function, it will not be possible to get the original data back. Ref: http://blog.sqlauthority.com/2008/06/07/sql-server-pivot-and-unpivot-table-examples/

74. Union Vs Union All


The main difference between the UNION ALL statement and UNION is that UNION ALL is much faster. The reason is that UNION ALL does not look for duplicate rows, whereas UNION checks for and removes duplicate rows, and it performs that check whether or not any duplicates actually exist.
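
A minimal sketch with two hypothetical tables, OldCustomers and NewCustomers:

SELECT CustomerID FROM OldCustomers
UNION
SELECT CustomerID FROM NewCustomers       -- duplicates removed; requires an extra distinct/sort step

SELECT CustomerID FROM OldCustomers
UNION ALL
SELECT CustomerID FROM NewCustomers       -- all rows returned, duplicates kept; faster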

75. Why should we hire you?
