• Henry Zegarra
• SaiKrishna Kondapaka
• Nazeer Ahmed
• Chaitanaya Sanjay
Definitions
Data Science
Data Analysis
Big Data
Data Mining
Analytics
Visitor Data
Segmentation of customers
Building a profile of who site visitors are
Content Optimization
Knowing how much time visitors spend on the website
Determining which pages visitors spend the least time on
Web Analytics
Avinash Kaushik
An Indian entrepreneur, author, and public
speaker.
He promotes his vision of Web Analytics 2.0
and the principle of the aggregation of marginal
gains.
Works as an advisor and associate instructor at
some universities in the US and Canada.
Awarded:
2009: the ‘Harry V. Roberts Statistical Advocate of the Year Award’ from the
American Statistical Association.
2011: the ‘Most Influential Industry Contributor’ award from the Web
Analytics Association.
Paradox of Data
The adoption in 2012 of Web Analytics
in Fortune 500 was:
Google Analytics has an entire set of Acquisition reports dedicated to categorizing users’
sources of traffic to the site.
Did they come from a search engine, a link on a social media site, or a paid
advertisement?
Traffic Sources in GA
In the first hit in a user’s session, GA looks at the browser’s Referrer value (the URL of the
previous page) to determine where the user came from to arrive at the site.
Based on this value, it assigns values for two of the dimensions used in the Acquisition
reports, Medium and Source.
A medium represents a general category or type of traffic, while the source identifies a
specific site within that category.
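The referrer-based assignment described above can be sketched roughly as follows. This is a simplified illustration, not GA's actual logic: the `classify_referrer` function and the short `SEARCH_ENGINES` table are assumptions made for the example, and GA's real list of recognized search engines is much longer.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short table of search engine hosts (assumption).
SEARCH_ENGINES = {
    "www.google.com": "google",
    "www.bing.com": "bing",
    "duckduckgo.com": "duckduckgo",
}


def classify_referrer(referrer):
    """Return a (source, medium) pair for the first hit of a session."""
    host = urlparse(referrer).hostname if referrer else None
    if not host:
        # No usable referrer: the visit is treated as direct traffic.
        return ("(direct)", "(none)")
    if host in SEARCH_ENGINES:
        # Known search engine: the medium is organic search.
        return (SEARCH_ENGINES[host], "organic")
    # Any other site: the medium is referral and the source is the domain.
    return (host, "referral")
```

For example, `classify_referrer("https://blog.example.org/post")` yields the source `blog.example.org` with the medium `referral`.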
GA categorizes traffic by default into the following mediums and sources:
GA also gives you the option to ignore certain referring sites (treating them as direct).
This is most common in the following situations:
• Certain types of third-party sites, such as PayPal.
It’s typical for a user to leave your site, go to PayPal (to complete a transaction), and then return to
your site for the final confirmation message. You’re not interested in counting the return as a
referral from PayPal.
• Cross-domain tracking
You can specify domains to treat as direct in GA’s Admin area in the property settings under
Tracking Info ➤ Referral Exclusion List. Any referrals from sites added to this list will be treated as
direct traffic.
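As a rough sketch of the effect of an exclusion list, the snippet below treats any referrer whose domain is on the list as direct. The `EXCLUDED_DOMAINS` set and both function names are hypothetical; GA applies this rule internally, and its exact matching behavior may differ.

```python
from urllib.parse import urlparse

# Hypothetical exclusion list, mirroring the PayPal example above.
EXCLUDED_DOMAINS = {"paypal.com"}


def is_excluded(referrer, excluded=EXCLUDED_DOMAINS):
    """True if the referrer's host is an excluded domain or one of its subdomains."""
    host = urlparse(referrer).hostname or ""
    return any(host == d or host.endswith("." + d) for d in excluded)


def medium_for(referrer):
    """Return "(none)" (direct) for missing or excluded referrers, else "referral"."""
    if not referrer or is_excluded(referrer):
        return "(none)"
    return "referral"
```

With this list, a return trip from `https://www.paypal.com/checkout` is reported as direct, while other sites still count as referrals.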
Campaign Tracking
• For links to your site that you control, you can
specify exactly the value you’d like GA to use for
the medium and source (as well as additional
traffic source dimensions).
• This could include many types of marketing and
advertising links:
• Paid search and display advertisements
• Social posts and paid social advertisements
• Links in email marketing, such as a newsletter
or promotion
• Links from partner or affiliate sites
• Links in offline advertising, such as print, TV, or
radio
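In GA, these values are passed as campaign parameters in the link's query string (`utm_source`, `utm_medium`, `utm_campaign`). A minimal sketch of tagging a link follows; the `tag_campaign_url` helper and the example values are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def tag_campaign_url(url, source, medium, campaign):
    """Append GA campaign parameters to a URL, keeping any existing query parameters."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query.update({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    return urlunparse(parts._replace(query=urlencode(query)))


# Example: a link placed in an email newsletter (hypothetical values).
newsletter_link = tag_campaign_url(
    "https://www.example.com/sale?ref=1", "newsletter", "email", "spring_promo"
)
```

A link tagged this way tells GA which source and medium to record, instead of leaving it to the referrer.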
Troubleshooting Traffic Sources
• Sometimes, traffic source information can go missing. Let’s examine the causes of incorrect traffic
source data and see how you can avoid pitfalls.
Redirects
• Redirects are a valuable tool to enforce consistency in URLs on a website, to provide alternative
(usually shorter) URLs, and to ensure that historical links continue to work.
• Be careful about how redirects are used on your site to ensure that you don’t lose data
about how a user arrived at the site.
• Make sure the redirect preserves any query parameters in place on the original URL.
These parameters should be visible in the URL of the final destination page.
• You can check for the appropriate behavior using your browser’s developer tools on a redirected URL.
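The same check can be scripted. The sketch below compares the query parameters of the URL a user clicked with those of the URL they ended up on after the redirect; the `lost_parameters` helper is hypothetical, and both URLs would come from your own test (for example, copied from the browser's network panel).

```python
from urllib.parse import parse_qs, urlparse


def lost_parameters(original_url, final_url):
    """Return the names of query parameters that were dropped or changed by a redirect."""
    before = parse_qs(urlparse(original_url).query)
    after = parse_qs(urlparse(final_url).query)
    return sorted(name for name, value in before.items() if after.get(name) != value)
```

An empty list means the redirect kept every parameter. A result such as `["utm_medium", "utm_source"]` means the campaign information was lost along the way.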
Self-Referrals
• One of the most common traffic source problems in GA
is seeing self-referrals: your own website appears as a
referral source. Obviously this isn’t intended. When a
user follows a link from one page on your website to
another page, that shouldn’t count as a referral; it’s
just navigating through the website!
• Why do self-referrals happen? The two most common
reasons are untagged pages and incorrect cross-domain
or subdomain tracking.
Untagged Pages
• When a user lands on a page and begins a session, GA assigns the source, medium,
and other traffic source dimensions. However, suppose you have a situation where
the user lands on a page where no GA tag fires. What happens?
• Since no GA tag fired, no session has yet begun. If the user continues to navigate
to a second page, this one with a GA tag, GA begins a session and says, “OK, where
did this user come from?” In this case, it’s from another page on your site, and GA
assigns the medium “referral” and the source as your own domain.
• In the Acquisition ➤ Traffic Sources ➤ Referrals report, you can drill down into
self-referrals to see the pages they originate from.
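The untagged-page scenario can be simulated in a few lines. This is a toy model under stated assumptions: the `session_source` function, the page-view tuples, and the example URLs are all hypothetical, and the traffic source is taken from the referrer of the first page where a tag fires.

```python
from urllib.parse import urlparse


def session_source(pageviews):
    """pageviews: list of (page_url, referrer, has_ga_tag) tuples in visit order."""
    for page_url, referrer, has_ga_tag in pageviews:
        if not has_ga_tag:
            continue  # no tag fired, so no session has begun yet
        if not referrer:
            return ("(direct)", "(none)")
        # The session starts here; the referrer of THIS page decides the source.
        return (urlparse(referrer).hostname, "referral")
    return None  # no tagged page was ever viewed


visit = [
    # Landing page without a GA tag: the real referrer is never recorded.
    ("https://www.example.com/landing", "https://partner.example.net/ad", False),
    # Second page has a tag; its referrer is the site's own landing page.
    ("https://www.example.com/products", "https://www.example.com/landing", True),
]
self_referral = session_source(visit)
```

In this model the session is attributed to `www.example.com` with the medium `referral`, which is the self-referral, and the partner site that actually sent the user is lost.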
Google Analytics reports a metric called Users, which sounds like it counts the
number of people using the site.
• Google Analytics typically counts users with the client ID, an identifier stored in a
cookie that is particular to a specific browser and device. For sites where users log
in or you can otherwise identify them, GA supports using a user ID instead for a
more accurate count of users across devices.
• User ID features are enabled in GA at the property level, choosing to use session
unification (counting hits before the user logs in) or not. Within that property, user
ID–enabled views can be created, which show only data with an associated user ID
along with additional cross-device reports.
• In GTM, the user ID is captured from the website, typically by inserting the user
ID value into the data layer. GA tags in GTM are altered to include this user ID
variable.
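To see why the two identifiers give different counts, consider the small sketch below. The hit records and field names (`client_id`, `user_id`) are made up for the example and do not reflect GA's internal data format.

```python
# Hypothetical hits: the same logged-in person (u42) uses two devices.
hits = [
    {"client_id": "phone-111", "user_id": "u42"},
    {"client_id": "laptop-222", "user_id": "u42"},
    {"client_id": "laptop-333", "user_id": "u77"},
]


def count_by_client_id(hits):
    """Count distinct browser/device cookies."""
    return len({h["client_id"] for h in hits})


def count_by_user_id(hits):
    """Count distinct logged-in users, ignoring hits without a user ID."""
    return len({h["user_id"] for h in hits if h.get("user_id")})
```

Counting by client ID reports three users here, while counting by user ID reports two, because the phone and the first laptop belong to the same person.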
Importing Data into Google Analytics
Data import allows you to fill in the data using files uploaded directly to
GA. This can be useful in situations such as the following:
• The data isn’t available to the site at the time the hit occurs—for
example, because it’s stored in a separate system. You can upload data
from such systems to GA.
• The data is sensitive and you wouldn’t want to include it on the site,
such as certain kinds of user or product data.
Data Import Process
1. You create a data set associated with a property in GA, to configure the dimensions and metrics that will be
imported.
2. You upload a text file with the data to be imported. GA takes this and processes it into the data in reports.
3. You update the data set as necessary to update the data going forward.
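Step 2 amounts to producing a delimited text file whose header row matches the data set's schema. The sketch below builds such a file in memory. The column names `ga:dimension1` and `ga:dimension2` and the row values are assumptions; the real header should be taken from the schema of the data set you created in step 1.

```python
import csv
import io


def build_import_file(header, rows):
    """Return CSV text with a header row followed by one line per record."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Assumed schema: a key dimension plus one imported dimension.
header = ["ga:dimension1", "ga:dimension2"]
rows = [("u42", "gold"), ("u77", "silver")]
upload_text = build_import_file(header, rows)
```

The resulting text can be saved to a file and uploaded to the data set. Regenerating it from the source system on a schedule covers step 3.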
2. Extended data (import several kinds of dimension or metric values to be applied to existing hits in GA):
a. User Data Import
b. Campaign Data Import
c. Geographical Data Import
d. Content Data Import
e. Product Data Import
f. Custom Data Import
3. Summary data (import metrics for data already aggregated in certain dimensions).
a. Cost Data Import
BigQuery for Big Data Analysis