CUSTOMER-360

Customer 360 is a Customer Relationship Management (CRM) best practice that aims
to improve relationships with existing customers, find new prospective
customers, and win back former customers. One of the core tenets of CRM is to
have a holistic “360-degree” view of all customer interactions and information that is
easily accessible to the business or organization. Accented solutions operate on a
seamless platform that provides 360-degree customer views across all applications and
business processes that interact with the customer, improving quality of service to
maximize customer satisfaction and increase revenue.

COMPONENTS USED

1. HDFS
2. HBASE
3. PIG
4. HIVE

HDFS (Hadoop Distributed File System)

WHAT IS HDFS?
The Hadoop Distributed File System (HDFS) is the primary data storage system used by
Hadoop applications. It employs a NameNode and DataNode architecture to
implement a distributed file system that provides high-performance access to data
across highly scalable Hadoop clusters.
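
To make the NameNode and DataNode split more concrete, here is a toy Python model. It is not the HDFS API: the hypothetical `ToyNameNode` class keeps only metadata (which blocks make up a file and which DataNodes hold each replica), while the block contents sit in per-DataNode dictionaries. The 4-byte block size, the replication factor of 2 and the round-robin placement are assumptions chosen for readability; real HDFS uses far larger blocks and a rack-aware placement policy.

```python
from itertools import cycle


class ToyNameNode:
    """Toy metadata service: file -> blocks -> DataNode names (not the HDFS API)."""

    def __init__(self, datanodes, block_size=4, replication=2):
        self.datanodes = list(datanodes)
        if replication > len(self.datanodes):
            raise ValueError("replication cannot exceed the number of DataNodes")
        self.block_size = block_size    # bytes per block (tiny, for illustration)
        self.replication = replication  # number of copies kept of every block
        self.block_map = {}             # file name -> [(block_id, [datanode, ...])]
        # Stand-in for the DataNodes' local disks: node name -> {block_id: bytes}
        self.storage = {name: {} for name in self.datanodes}
        self._next_start = cycle(range(len(self.datanodes)))

    def put(self, filename, data):
        """Split data into blocks and store each block on several DataNodes."""
        blocks = []
        for offset in range(0, len(data), self.block_size):
            block_id = f"{filename}#{offset // self.block_size}"
            chunk = data[offset:offset + self.block_size]
            start = next(self._next_start)
            # Assumed policy: replicas go to consecutive DataNodes, round-robin.
            targets = [self.datanodes[(start + i) % len(self.datanodes)]
                       for i in range(self.replication)]
            for node in targets:
                self.storage[node][block_id] = chunk
            blocks.append((block_id, targets))
        self.block_map[filename] = blocks

    def get(self, filename, failed=()):
        """Reassemble a file, skipping DataNodes listed as failed."""
        out = b""
        for block_id, targets in self.block_map[filename]:
            live = [node for node in targets if node not in failed]
            if not live:
                raise IOError(f"all replicas of {block_id} are unavailable")
            out += self.storage[live[0]][block_id]
        return out
```

Because every block has a second replica on a different DataNode, `get` can still return the full file when one DataNode is passed in `failed`, which is the fault tolerance that the rest of the Hadoop stack builds on.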

HDFS is a key part of many Hadoop ecosystem technologies, as it provides a reliable
means of managing pools of big data and supporting related big data analytics
applications.

Features of HDFS
• It is suitable for distributed storage and processing.
• Hadoop provides a command interface to interact with HDFS.
• The built-in servers of the NameNode and DataNode help users easily check the
status of the cluster.
• It provides streaming access to file system data.
• It provides file permissions and authentication.

Working with HDFS


Storing data sets in the Hadoop Distributed File System

HBase
What is HBase?
HBase is a distributed, column-oriented database built on top of the Hadoop file system.
It is an open-source project and is horizontally scalable.
Its data model is similar to Google’s Bigtable and is designed to provide quick random
access to huge amounts of structured data. It leverages the fault tolerance provided by
the Hadoop Distributed File System (HDFS).
It is the part of the Hadoop ecosystem that provides random, real-time read/write access to
data in HDFS.
Data can be stored in HDFS either directly or through HBase. Data consumers
read and access the data in HDFS randomly using HBase, which sits on top of the Hadoop
file system and provides read and write access.
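
The random access described here rests on HBase's addressing scheme, in which every value is located by a row key, a column family and a column qualifier, and rows are kept sorted by row key. The sketch below imitates that scheme with plain Python dictionaries. It does not use the HBase client API, and the class name `ToyHBaseTable`, the column families and the row keys are hypothetical examples rather than the schema used in this project.

```python
from collections import defaultdict


class ToyHBaseTable:
    """Dictionary-based imitation of HBase addressing (not the HBase API)."""

    def __init__(self, name, column_families):
        self.name = name
        self.column_families = set(column_families)
        # row key -> {"family:qualifier": value}
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Write one cell; the column family must be declared up front."""
        family, _, qualifier = column.partition(":")
        if family not in self.column_families or not qualifier:
            raise KeyError(f"unknown column family or missing qualifier: {column}")
        self.rows[row_key][column] = value

    def get(self, row_key, column=None):
        """Random read by row key: one cell, or the whole row if no column given."""
        row = self.rows.get(row_key, {})
        return dict(row) if column is None else row.get(column)

    def scan(self, start_row="", stop_row=None):
        """Yield rows in row-key order; start is inclusive, stop is exclusive."""
        for key in sorted(self.rows):
            if key < start_row:
                continue
            if stop_row is not None and key >= stop_row:
                break
            yield key, dict(self.rows[key])
```

A lookup such as `table.get("cust-0002", "demographics:age")` goes straight to one row, whereas `scan` walks a contiguous range of row keys; choosing row keys so that related records sort together is therefore an important design decision.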

Features of HBase
• HBase is linearly scalable.
• It has automatic failover support.
• It provides consistent reads and writes.
• It integrates with Hadoop, both as a source and a destination.
• It has an easy-to-use Java API for clients.
• It provides data replication across clusters.

WORKING WITH HBASE


Creating CUSTOMER360 using HBase:

PIG
What is Apache Pig?
Apache Pig is an abstraction over MapReduce. It is a tool/platform used to
analyse large data sets by representing them as data flows. Pig is generally used with
Hadoop, and all the usual data manipulation operations in Hadoop can be performed
with it.
To write data analysis programs, Pig provides a high-level language known as Pig Latin.
This language provides a range of operators, and programmers can also develop their
own functions for reading, writing and processing data.
To analyse data using Apache Pig, programmers write scripts in Pig Latin.
All these scripts are internally converted to Map and Reduce tasks. Apache
Pig has a component known as the Pig Engine that accepts Pig Latin scripts as input and
converts them into MapReduce jobs.
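
As a rough illustration of what being converted to Map and Reduce tasks involves, the sketch below runs a grouped sum (the kind of aggregation that a Pig Latin GROUP followed by SUM expresses) as explicit map, shuffle and reduce phases in plain Python. The records and the field names `card` and `amount` are made up for the example, and the real execution plan produced by the Pig Engine is considerably more involved.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for record in records:
        yield record["card"], record["amount"]


def shuffle(pairs):
    """Shuffle: collect all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single sum."""
    return {key: sum(values) for key, values in groups.items()}


def total_per_card(records):
    """Chain the three phases, as a MapReduce job would."""
    return reduce_phase(shuffle(map_phase(records)))


# Hypothetical transaction records, for illustration only.
transactions = [
    {"card": "C1", "amount": 120},
    {"card": "C2", "amount": 40},
    {"card": "C1", "amount": 30},
]
totals = total_per_card(transactions)
```

In Pig Latin the same result takes a few declarative statements, which is the practical benefit of the abstraction: the programmer states the grouping and the aggregate, and the phases above are generated automatically.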

Features of Pig
• Rich set of operators: It provides many operators to perform operations like join,
sort, filter, etc.
• Ease of programming: Pig Latin is similar to SQL, so it is easy to write a Pig script if you
are good at SQL.
• Optimization opportunities: The tasks in Apache Pig optimize their execution
automatically, so programmers need to focus only on the semantics of the
language.
• Extensibility: Using the existing operators, users can develop their own functions
to read, process and write data.

WORKING WITH PIG
Loading and Storing ‘demographics.csv’

Scanning ‘demographics.csv’
Loading and Storing ‘creditcard.csv’

Scanning ‘creditcard.csv’
Loading and Storing ‘depositaccount.csv’

Scanning ‘depositaccount.csv’
Loading and Storing ‘loanaccount.csv’

Scanning ‘loanaccount.csv’
Loading and Storing ‘savingsaccount.csv’

Scanning ‘savingsaccount.csv’
Loading and Storing ‘creditcardtrx.csv’
SUM
DISTINCT TRANSACTION TYPES

HIVE
What is HIVE?
Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It
resides on top of Hadoop to summarize big data and makes querying and analysing
easy.
Hive was initially developed by Facebook; the Apache Software Foundation later took it
up and developed it further as open source under the name Apache Hive. It is used
by many companies. For example, Amazon uses it in Amazon Elastic MapReduce.

Features of Hive
• It stores the schema in a database and the processed data in HDFS.
• It is designed for OLAP.
• It provides an SQL-like query language called HiveQL or HQL.
• It is familiar, fast, scalable and extensible.

WORKING WITH HIVE
DATA ANALYSIS
1. Select count(*) from reception where deposittype is not null;

2. Select * from reception limit 5;


3. Select * from reception where age='31' and occupation='Others' limit 7;

4. Select * from reception where gender='F' limit 5;

5. Select a.id, a.registrationdate, a.age, a.gender, b.income from accountsdept a
   join finance b on a.savingsid=b.savingsid limit 5;

6. Select * from loandept limit 5;


7. Select * from accountsdept where occupation='Business' limit 5;

8. Select savingsid,type,avgbalance + 2000 from loandept limit 7;

9. Select age, count(age) from accountsdept group by age having count(age)>9
   limit 5;

10. Select id,occupation,income from accountsdept limit 9;

11. Select savingsid,type from loandept limit 10;

12. Select loanid,cardnumber from finance limit 10;

13. Select id,income from reception limit 15;
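
Query 9 combines GROUP BY, HAVING and LIMIT, so it is worth tracing on a small data set. The sketch below runs an equivalent statement against an in-memory SQLite database from Python. SQLite stands in for Hive only because both accept this particular syntax, and the `accountsdept` rows are generated, hypothetical data rather than the project's data. An ORDER BY clause is added so that the rows kept by LIMIT are deterministic; the original query does not specify an order.

```python
import sqlite3


def ages_with_many_customers(rows, min_count=9, limit=5):
    """Return (age, count) pairs for ages shared by more than min_count customers."""
    conn = sqlite3.connect(":memory:")
    try:
        # Only the two columns the query needs; the real table has more.
        conn.execute("CREATE TABLE accountsdept (id INTEGER, age INTEGER)")
        conn.executemany("INSERT INTO accountsdept VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT age, COUNT(age) FROM accountsdept "
            "GROUP BY age HAVING COUNT(age) > ? "
            "ORDER BY age LIMIT ?",
            (min_count, limit),
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical data: 12 customers aged 31, 10 aged 45 and 3 aged 60.
sample_rows = (
    [(i, 31) for i in range(12)]
    + [(100 + i, 45) for i in range(10)]
    + [(200 + i, 60) for i in range(3)]
)
result = ages_with_many_customers(sample_rows)
```

With this sample, only the ages 31 and 45 pass the HAVING filter, because the group for age 60 contains just three rows. HAVING is applied after grouping, which is why it can test `count(age)` where a WHERE clause could not.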


CONCLUSION
Customer-360 is among the most in-demand projects in the banking, telecom and retail sectors.
Altogether, a Customer-360 view offers a detailed overview of customer data across
various marketing and customer databases. It allows marketers to see a customer’s
lifetime of activity, behaviour, interests and preferences with the company. Businesses
can generate better insights about their customers, identify top prospects and those at
risk, and add context to the various actions that customers take, both online and
offline.
Using Customer-360 or 360-degree views, we can answer important product and
customer questions, identify patterns and insights, and create new questions to answer
with the customer data.
ACKNOWLEDGEMENT
I would like to express my special thanks and gratitude to Webtek Labs for giving
me the golden opportunity to do this wonderful project. This project, which has been
an important part of my daily life for the last four weeks, has given me a lot of
valuable knowledge and insight.
I am especially grateful to Mr. Hemant, Hadoop Administrator and Technical Consultant,
who spared valuable time from his busy schedule to provide invaluable
guidance and support throughout the project, which enabled me to complete the project
duly.
REFERENCES
Websites:
• https://www.tutorialspoint.com
• https://www.apache.org
• https://vision.cloudera.com/using-bigdata-to-drive-a-true-customer-360/

Thank You
